- Re: -RMS-F-RER, file read error
09-08-2009 01:32 AM
We have OpenVMS V8.3-1H1 running on IA64, with one physical 137 GB SCSI disk drive mounted as one logical device, no partitioning.
Everything was fine until recently, when I started noticing errors while copying EXE images. For example, I have a directory with 100 EXE images (each about 50 MB). While copying the directory's contents to another one, I get one or two errors like this:
SYSTEM$ copy [.srv1]*.exe [.srv2]
%COPY-E-READERR, error reading SYS$SYSDEVICE:[USER.SRV1]CR080929.EXE;1
-RMS-F-RER, file read error
-SYSTEM-F-PARITY, parity error
%COPY-W-NOTCMPLT, SYS$SYSDEVICE:[USER.SRV2]CR080929.EXE;1 not completely copied
which results in a partially copied EXE. Afterwards, though, I can copy these 'failed' EXE files one by one, and that works in most cases.
As far as I understand, this is a hardware-related error, which leads to the question: is there any way to detect the failed hardware, or to diagnose the system/HDD to reveal the root cause?
Any input is appreciated.
Best regards,
Dmitry Sinelnikov
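When many files are copied at once, it can help to pull the list of partially copied files out of the captured COPY output so they can be retried one by one. The following is only a minimal sketch in Python, assuming the COPY messages were saved to a text file and that they look exactly like the ones quoted above; the function name `incomplete_copies` is made up for illustration.

```python
import re

# Matches the warning COPY prints for a partially copied output file, e.g.
# %COPY-W-NOTCMPLT, SYS$SYSDEVICE:[USER.SRV2]CR080929.EXE;1 not completely copied
NOTCMPLT = re.compile(r"^%COPY-W-NOTCMPLT,\s+(\S+)\s+not completely copied")


def incomplete_copies(log_text):
    """Return the output file specs that COPY reported as incomplete.

    Hypothetical helper: it only understands the message layout shown
    in the post above.
    """
    found = []
    for line in log_text.splitlines():
        match = NOTCMPLT.match(line.strip())
        if match:
            found.append(match.group(1))
    return found


# Sample taken from the messages quoted in the post.
sample = """\
%COPY-E-READERR, error reading SYS$SYSDEVICE:[USER.SRV1]CR080929.EXE;1
-RMS-F-RER, file read error
-SYSTEM-F-PARITY, parity error
%COPY-W-NOTCMPLT, SYS$SYSDEVICE:[USER.SRV2]CR080929.EXE;1 not completely copied
"""

print(incomplete_copies(sample))
```

Each file spec returned is a target file that should be deleted and copied again, keeping in mind that a successful retry does not mean the disk is healthy.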
09-08-2009 01:45 AM
The disk should have logged an error because of the parity error on read. You need to analyse SYS$ERRORLOG:ERRLOG.SYS to find out about the error and the affected LBA (logical block address).
You will most likely need to run SEA (part of WEBES) or even DECevent V3.4 (only available on OpenVMS Alpha) to decode/translate the error log entry and find this piece of information.
Volker.
09-08-2009 02:04 AM
$ show time
8-SEP-2009 13:57:53
$ dir /col=1 /date
Directory SYS$SYSROOT:[SYSERR]
ERRFMT_IPMI_SEL.DAT;1 17-JUL-2009 04:04:45.66
ERRLOG.SYS;1 17-JUL-2009 03:51:27.42
Total of 2 files.
09-08-2009 02:11 AM
Solution
There it is: ERRLOG.SYS. All hardware-related errors are appended to that binary file.
You need a tool to decode/translate the error information in that file.
OpenVMS (since V7.3-2) comes with ANAL/ERR/ELV, but this tool does not decode the details in most types of errlog entries.
You need SEA (System Event Analyzer), which comes as part of the WEBES tool suite. There is also a version of WEBES for Windows, if you don't want to install WEBES on your OpenVMS I64 system. However, I'm not sure whether SEA will decode enough of the details of those disk errors to let you obtain the LBA numbers.
Only DECevent V3.4 will do this, but you need an OpenVMS Alpha system to install and run DECevent.
Volker.
09-08-2009 02:16 AM
Note that you could also run $ ANAL/DISK/READ on this disk; this should report all read errors that occur on blocks allocated to any file on the disk.
Nevertheless, you would probably want to replace this disk before the errors spread or increase. Make sure you keep a good backup of the data, but watch the backup operation log files, as BACKUP might also have problems reading those blocks.
Volker.
09-08-2009 03:03 AM
I see numerous errors in this file. It seems to be an HDD-related problem, since ANAL/DISK/READ returns numerous parity errors like this:
%ANALDISK-W-READFILE, file (9166,1,0) ACCOUNTNG.DAT;1
error reading VBN 3450886
-SYSTEM-F-PARITY, parity error
So I have a question about the backup procedure: is it possible to run it while the system is running? I have some difficulties booting from the IA64 DVD (after executing fsn:\efi\boot\bootia64.efi the system displays two warnings and freezes with the cursor blinking, so I cannot enter DCL commands; another issue to figure out...), so I have to run the BACKUP utility on a live system.
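With 100 or more such entries, it is easier to see which files are damaged if the ANAL/DISK/READ report is summarised per file. Below is a minimal Python sketch, assuming the report was captured to a text file and that every entry has the three-line layout shown above; `bad_blocks_by_file` is a hypothetical helper name, not part of any OpenVMS tool.

```python
import re
from collections import defaultdict

# First line of an entry, e.g.
# %ANALDISK-W-READFILE, file (9166,1,0) ACCOUNTNG.DAT;1
FILE_RE = re.compile(
    r"^%ANALDISK-W-READFILE, file \((\d+),(\d+),(\d+)\)\s+(\S+)"
)
# Second line of an entry, e.g.
# error reading VBN 3450886
VBN_RE = re.compile(r"^error reading VBN (\d+)")


def bad_blocks_by_file(report_text):
    """Group unreadable VBNs by (file ID, file name).

    Assumes the message layout quoted in the post; other layouts
    are silently ignored.
    """
    result = defaultdict(list)
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        match = FILE_RE.match(line)
        if match:
            fid = tuple(int(part) for part in match.group(1, 2, 3))
            current = (fid, match.group(4))
            continue
        match = VBN_RE.match(line)
        if match and current is not None:
            result[current].append(int(match.group(1)))
    return dict(result)


# Sample taken from the output quoted in the post.
sample_report = """\
%ANALDISK-W-READFILE, file (9166,1,0) ACCOUNTNG.DAT;1
error reading VBN 3450886
-SYSTEM-F-PARITY, parity error
"""

for (fid, name), vbns in bad_blocks_by_file(sample_report).items():
    print(name, fid, len(vbns), "bad block(s)")
```

The file ID is kept in the key because the report shows only the file name, so files with the same name in different directories would otherwise be merged.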
09-08-2009 03:20 AM
>> -SYSTEM-F-PARITY, parity error
That means trouble indeed.
But be aware of another error you may get in the future, after backups and restores that involved an I/O problem:
RER, file read error
FORCEDERROR, forced error flagged in last sector read
That would be the original error carried forward as a reminder that the data is not to be trusted. It is cleared by writing the block (file).
>> Thus I have a question regarding backup procedure. Is it possible to run it while system is running ?
Yes, with only minor caveats in this case.
Obviously, activity could happen behind BACKUP's back. That may lead to stale data or inconsistencies. But in this situation you know not to expect many changes, if any. You can, and should, accept that risk in this case by specifying /IGNORE=INTERLOCK.
>> I have some difficulties booting from IA64 DVD (after executing fsn:\efi\boot\bootia64.efi system displays two warnings and freezes with cursor blinking, thus I can not enter DCL commands - another issue to figure out...) so I have to run BACKUP utility on a live system.
That suggests to me that there may be a bigger I/O problem, but it could also be the same one. If you can physically get to the box (I guess you can, since you put the DVD in), then I would power down and re-seat everything:
- memory
- disk drive
- drive cable
- pci interface
- pci cage (rx26[02]0?)
Good luck!
Hein
09-08-2009 03:36 AM
(after executing fsn:\efi\boot\bootia64.efi system displays two warnings and freezes with cursor blinking, thus I can not enter DCL commands - another issue to figure out...)
Would you care to show us those warnings? Maybe we can draw conclusions from seeing the real messages and help you further...
Volker.
09-08-2009 03:36 AM
BACKUP/IGNORE=INTERLOCK may cause corruption (read: data loss) in the saved copy when files are being written to during the backup: indexed or relative files, databases...
To minimise the risk, stop every process that may update a file, and stop databases during the backup. If it is possible to dismount the disk from the system and mount it locally, you can safely back it up.
Keep in mind, though, that files that are corrupted on disk will be backed up in that (corrupted) state.
Alternatively, think about rebooting the machine minimal to do your backup.
Do so ASAP. The disk seems broken to me. This is an error you would not want to see.
OpenVMS Developer & System Manager
09-08-2009 03:43 AM
Some more details regarding booting from the DVD:
A regular OpenVMS boot displays the same warnings:
LOADER-W-Conout device path cannot be set to multiple devices
LOADER-I-Select Console Device Paths from the Boot Manager Menu.
then it freezes for about 5-10 minutes, and then I see the system being loaded correctly.
After mounting the DVD in the EFI shell and executing 'fsn:\efi\boot\bootia64.efi', I see the same two warnings as in a regular OpenVMS boot, with the only difference being that it freezes for longer. I waited almost half an hour with no result.
P.S. ANAL/DISK/READ has finished running and shows ~100-150 parity errors (all the data on the disk is about 40 GB).
09-08-2009 03:50 AM
Please see chapter A.2 of the OpenVMS I64 V8.3-1H1 Server Upgrade and Installation manual for how to set up your consoles.
ftp://ftp.hp.com/pub/openvms/doc/BA322-90077.PDF
Volker.
09-08-2009 03:55 AM
You need to take the caches into account!
Once you've copied the files and received the parity errors, the disk block data may still be cached somewhere, most likely in the OpenVMS XFC cache. When you then COPY the files again, all the blocks will be taken out of the cache and no real disk access may happen!
This may explain what you're seeing.
Volker.
09-08-2009 04:29 AM
That would seem wrong and NOT the OpenVMS way of doing things. That would imply that a second application could silently get bad data and build on top of that, write it back, and in the process of the write 'fix' the disk through hardware bad block re-vectoring, and all evidence would be gone.
Yikes!
If no one else replies here with more insights, then that is something I'll have to try some day. We could use the LDdriver to inject a parity error, and then test with that.
Hein.
09-08-2009 04:33 AM
You may be right. The cache (XFC) theory was just a plausible explanation for what Dmitry is seeing. Maybe other caches (on the disk?) are involved as well? Or the PARITY error is somewhere else and only shows up under certain I/O load patterns?!
Volker.
09-08-2009 04:57 AM
As for the cache being involved: I'm not completely sure, but it seems that the cache is not the explanation:
1) I copy a 100 MB file; after the parity error I see only an 80 MB chunk of the original EXE image in the target directory, so the COPY operation terminated somewhere around the 80th megabyte.
2) I invoke the copy operation again and it copies all 100 MB correctly. If the file had been cached during the first copy operation, only the 80 MB read before the 'parity error' occurred could have been cached, am I right?
09-08-2009 05:02 AM
With the more detailed description you have given now, I agree that the 'cache theory' is wrong.
You need to decode the errlog entries to better understand what's going wrong.
Volker.
09-08-2009 10:10 AM
09-08-2009 09:22 PM
Please start another topic for the 'DVD booting' problem. If necessary, you can still refer to this topic by including a link.
Volker.
09-09-2009 05:59 AM
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1370165
09-09-2009 06:31 AM
LOADER-W-Conout device path cannot be set to multiple devices
LOADER-I-Select Console Device Paths from the Boot Manager Menu.
...
can be resolved after reading:
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1338643
and then the documents referenced there...
If you have hardware service for this box, call your provider now.
09-14-2009 04:59 AM
The utility copied about 5-6 GB of data and then got stuck on numerous 'read sector error' messages.
I guess this proves that the OpenVMS system drive is unrecoverable. I also assume that a BACKUP operation would fail on it too. Sad but true.
09-14-2009 05:14 AM
But regardless, a stuffed-up disk is a stuffed-up disk.
That's what host-based volume shadowing (HBVS) is for; disks can and do fail.
And disk failure rates and failure patterns might not be as expected:
http://labs.hoffmanlabs.com/node/93