
-RMS-F-RER, file read error

 

Hi all,

We have OpenVMS V8.3-1H1 running on IA64, with one physical 137 GB SCSI disk drive mounted as one logical device, no partitioning.
Everything was fine until recently, when I started noticing errors while copying EXE images. For example, I have a directory with 100 EXE images (each about 50 MB). When I copy its contents to another directory, I get one or two errors like this:

SYSTEM$ copy [.srv1]*.exe [.srv2]
%COPY-E-READERR, error reading SYS$SYSDEVICE:[USER.SRV1]CR080929.EXE;1
-RMS-F-RER, file read error
-SYSTEM-F-PARITY, parity error
%COPY-W-NOTCMPLT, SYS$SYSDEVICE:[USER.SRV2]CR080929.EXE;1 not completely copied

which results in a partially copied EXE. Afterwards, though, I can copy these 'failed' EXE files one by one, and in most cases it works fine.

As far as I understand, this is a hardware-related error, which leads to the question: is there any way to detect the failed hardware, or to diagnose the system/HDD to reveal the root cause?

Any input is appreciated.

Best regards,
Dmitry Sinelnikov
Volker Halle
Honored Contributor

Re: -RMS-F-RER, file read error

Dmitry,

The disk should have logged an error due to the parity error on read. You need to analyse SYS$ERRORLOG:ERRLOG.SYS to find out about the error and the affected LBA (logical block address).

You will most likely need to run SEA (part of WEBES) or even DECevent V3.4 (only available on OpenVMS Alpha) to decode/translate the error log entry and find this piece of information.

Volker.

Re: -RMS-F-RER, file read error

Unfortunately, there is only one log file in SYS$ERRORLOG, created two months ago:

$ show time
8-SEP-2009 13:57:53

$ dir /col=1 /date
Directory SYS$SYSROOT:[SYSERR]
ERRFMT_IPMI_SEL.DAT;1 17-JUL-2009 04:04:45.66
ERRLOG.SYS;1 17-JUL-2009 03:51:27.42

Total of 2 files.
Volker Halle
Honored Contributor
Solution

Re: -RMS-F-RER, file read error

Dmitry,

There it is: ERRLOG.SYS. All hardware-related errors are written/appended to that BINARY file.

You need a tool to decode/translate the error information in that file.

OpenVMS (since V7.3-2) comes with ANAL/ERR/ELV, but this tool does not decode the details in most types of errlog entries.

You need SEA (System Event Analyzer), which comes as part of the WEBES tool suite. There is also a version of WEBES for Windows, if you don't want to install WEBES on your OpenVMS I64 system. But I'm not sure whether SEA will decode enough of the details of those disk errors to allow you to obtain the LBA numbers.

Only DECevent V3.4 will do this, but you need an OpenVMS Alpha system to install and run DECevent.

Volker.
Volker Halle
Honored Contributor

Re: -RMS-F-RER, file read error

Dmitry,

Note that you could also run $ ANAL/DISK/READ on this disk. This should report all read errors that occur on blocks allocated to any file on the disk.

Nevertheless, you would probably want to replace this disk before the errors spread or increase. Make sure you keep a good backup of the data, but watch the backup operation log files, as BACKUP might also have problems reading those blocks.

Volker.

Re: -RMS-F-RER, file read error

Thank you, Volker.

I see numerous errors in this file. It seems to be an HDD-related problem, since ANAL/DISK/READ returns numerous parity errors like this:
%ANALDISK-W-READFILE, file (9166,1,0) ACCOUNTNG.DAT;1
error reading VBN 3450886
-SYSTEM-F-PARITY, parity error

This raises a question about the backup procedure: is it possible to run it while the system is running? I have some difficulties booting from the IA64 DVD (after executing fsn:\efi\boot\bootia64.efi the system displays two warnings and freezes with the cursor blinking, so I cannot enter DCL commands; another issue to figure out...), so I have to run the BACKUP utility on a live system.

Hein van den Heuvel
Honored Contributor

Re: -RMS-F-RER, file read error


>> -SYSTEM-F-PARITY, parity error

That means trouble indeed.

But be aware of another error you may get in the future, after backups and restores involving an IO problem:

RER, file read error
FORCEDERROR, forced error flagged in last sector read

That would be the original error carried forward as a reminder that the data is not to be trusted. It is cleared by writing the block (file).

>> Thus I have a question regarding backup procedure. Is it possible to run it while system is running ?

Yes, with only minor caveats in this case.
Obviously, activity could happen behind BACKUP's back. That may lead to stale data or inconsistencies. But in this situation you know not to expect many changes, if any. You can, and should, accept that risk in this case by specifying /IGNORE=INTERLOCK.

>> I have some difficulties booting from IA64 DVD (after executing fsn:\efi\boot\bootia64.efi system displays two warnings and freezes with cursor blinking, thus I can not enter DCL commands - another issue to figure out...) so I have to run BACKUP utility on a live system.

That suggests to me that there may be a bigger IO problem, but it could also be the same one. If you can get to (touch!) the box (I guess you can, since you stuck in the DVD), then I would power down and re-seat everything:
- memory
- disk drive
- drive cable
- pci interface
- pci cage (rx26[02]0?)

Good luck!
Hein

Volker Halle
Honored Contributor

Re: -RMS-F-RER, file read error

Dmitry,


>> (after executing fsn:\efi\boot\bootia64.efi system displays two warnings and freezes with cursor blinking, thus I can not enter DCL commands - another issue to figure out...)


Would you care to show us those warnings? Maybe we can draw conclusions from seeing the real messages and help you further along...

Volker.
Willem Grooters
Honored Contributor

Re: -RMS-F-RER, file read error

On the backup issue:

BACKUP/IGNORE=INTERLOCK may cause corruption (read: data loss) in the saved copy when files are being written to during the backup: indexed or relative files, databases...
To minimise the risk, stop every process that may update a file, and stop databases during the backup. If it is possible to dismount the disk from the system and mount it locally, you can back it up safely.
Keep in mind, though, that files that are corrupted on disk will be backed up in that (corrupted) state.

Alternatively, think about rebooting the machine with a minimal startup to do your backup.

Do so ASAP. The disk seems to be broken. This is an error you do not want to see.
Willem Grooters
OpenVMS Developer & System Manager

Re: -RMS-F-RER, file read error

The interesting thing about these 'parity errors' is that I can still easily copy the files one by one, as I wrote before. Am I right in understanding that parity errors don't necessarily mean physical disk block corruption?

Some more details regarding booting from the DVD:
A regular OpenVMS boot displays the same warnings:
LOADER-W-Conout device path cannot be set to multiple devices
LOADER-I-Select Console Device Paths from the Boot Manager Menu.
then it freezes for about 5-10 minutes, after which the system loads correctly.
After mounting the DVD in the EFI shell and executing 'fsn:\efi\boot\bootia64.efi', I see the same two warnings as in a regular OpenVMS boot, with the only difference that it freezes for longer. I waited almost half an hour with no result.

P.S. ANAL/DISK/READ has finished running and shows ~100-150 parity errors (the disk holds about 40 GB of data in total).
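
With 100-150 errors reported, it may help to tally them per file before deciding what to restore from older copies. The following is only a minimal sketch in Python, assuming the ANAL/DISK/READ output has been captured as text and that every error follows the three-line pattern quoted earlier in this thread (READFILE line, "error reading VBN" line, PARITY line). The names `summarize` and `vbn_to_offset` are made up for this illustration; the byte offset relies on OpenVMS disk blocks being 512 bytes and VBNs starting at 1.

```python
import re
from collections import defaultdict

BLOCK_SIZE = 512  # bytes per OpenVMS disk block

# Matches e.g. "%ANALDISK-W-READFILE, file (9166,1,0) ACCOUNTNG.DAT;1"
FILE_RE = re.compile(
    r"%ANALDISK-W-READFILE, file \((\d+),(\d+),(\d+)\)\s+(\S+)"
)
# Matches e.g. "error reading VBN 3450886"
VBN_RE = re.compile(r"error reading VBN (\d+)")


def summarize(lines):
    """Group unreadable VBNs by (file ID, file name).

    Keying on the file ID as well as the name keeps files with the
    same name in different directories apart.
    """
    bad = defaultdict(list)
    current = None
    for line in lines:
        m = FILE_RE.search(line)
        if m:
            fid = tuple(int(g) for g in m.group(1, 2, 3))
            current = (fid, m.group(4))
            continue
        m = VBN_RE.search(line)
        if m and current is not None:
            bad[current].append(int(m.group(1)))
    return dict(bad)


def vbn_to_offset(vbn):
    """Byte offset of a virtual block within its file (VBNs start at 1)."""
    return (vbn - 1) * BLOCK_SIZE


# Sample input copied from the output quoted in this thread.
sample = """%ANALDISK-W-READFILE, file (9166,1,0) ACCOUNTNG.DAT;1
error reading VBN 3450886
-SYSTEM-F-PARITY, parity error"""

result = summarize(sample.splitlines())
```

For the sample above, `result` holds one entry for ACCOUNTNG.DAT;1 with VBN 3450886, and `vbn_to_offset(3450886)` gives 1766853120, i.e. the bad block sits roughly 1.65 GiB into that file. Files with many bad VBNs are the ones least likely to survive a copy or BACKUP pass intact.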