Re: DISK QUESTIONS

John Pendergrass · ‎09-15-2004

History: I received 4 disk errors (noted below). Although the operation completed successfully, it would be helpful to know the following.

1) What is a SYMBOL ECC ERROR, and what causes it?

2) Is there a method to map a logical block to a file name?

Mohamed K Ahmed · ‎09-15-2004

What are the errors exactly?

Willem Grooters · ‎09-15-2004

John,
1) IIRC: ECC = Error Correction Code, a method for correcting minor errors in data. When the disk reads data, it recalculates a check code and compares it with the ECC value stored alongside the data. If they do not match, the disk firmware can correct the error using the ECC algorithm, provided the damage is small enough.
2) Yes.
If you want to see the blocks that hold the contents of a file, this is the command:

$ DUMP/HEADER/BLOCKS=COUNT=0

You'll find the retrieval pointers at the end of the output. Each one gives the logical block where a fragment starts and the number of blocks that the fragment spans.

Doing it the other way requires a scan of INDEXF.SYS. The DFU utility can do this: if you know the Logical Block Number (LBN), SEARCH /LBN= will give you the file(s) that map onto this block.
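The reverse lookup boils down to a range check against every file's retrieval pointers. Below is a minimal Python sketch of that idea, assuming the pointers have already been collected as (start LBN, block count) pairs. The function name, file names and numbers are invented for illustration, and this is not a description of how DFU works internally.

```python
from typing import Dict, List, Tuple

# Each retrieval pointer is (starting LBN, number of blocks in the fragment).
Extent = Tuple[int, int]


def files_containing(lbn: int, extents_by_file: Dict[str, List[Extent]]) -> List[str]:
    """Return the names of all files that have a fragment covering `lbn`."""
    hits = []
    for name, extents in extents_by_file.items():
        # A fragment covers LBNs start .. start + count - 1.
        if any(start <= lbn < start + count for start, count in extents):
            hits.append(name)
    return hits


# Hypothetical data, for illustration only.
example = {
    "LOGIN.COM;1": [(1200, 3)],
    "BIG.DAT;4": [(5000, 100), (9000, 50)],
}
```

With this data, files_containing(9010, example) returns ["BIG.DAT;4"], while LBN 1203 matches nothing because the first fragment ends at LBN 1202.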

Willem

Willem Grooters
OpenVMS Developer & System Manager

Uwe Zessin · ‎09-15-2004

In this context, a 'SYMBOL' means a group of bits. Sounds like the disk drive was able to correct the error, so it delivered the data and the operation completed successfully.
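To make "the drive was able to correct the error" concrete, here is a toy Python sketch of an error-correcting code. It uses a Hamming(7,4) code, which can repair one flipped bit in a 7-bit word. That is a deliberate simplification: disk drives use much stronger codes that operate on multi-bit symbols, and the function names here are made up for illustration.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over codeword positions 4, 5, 6, 7
    # Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Return (data bits, position that was corrected, or 0 if none)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three checks spell out the 1-based position of a single bad bit.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1   # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], syndrome
```

Encoding [1, 0, 1, 1], flipping any single bit of the codeword and decoding it again returns the original four bits together with the position that was repaired. Two flipped bits in one word would be beyond this toy code, which is the analogue of an uncorrectable error.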

labadie_1 · ‎09-15-2004

You should worry when you see "uncorrectable ECC error". Of course, other errors require your attention too, but I think this one is the "worst".

Guillou_2 · ‎09-15-2004

Hi,

If you have the freeware DFU, you can do:
dfu>search/lbn=

regards

Steph

John Eerenberg · ‎09-16-2004

If you have DECevent (or similar), you can look for "Recovered Error". If this recoverable error keeps recurring and the LBAs are fairly close together, then it is time to replace the disk.

You have 4 errors so far; if you get more, think about replacing the disk. At least that is what we do to be proactive (maybe a little too proactive). The frequency of the error is something you can discuss with your HP service rep.
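The "errors that are fairly close together" rule of thumb can be sketched in a few lines of Python. The window of 1000 blocks and the minimum of 3 hits are arbitrary assumptions chosen for illustration, not vendor guidance, and the function names are hypothetical.

```python
def cluster_lbas(lbas, window=1000):
    """Group LBAs so that neighbours within `window` blocks share a cluster."""
    clusters = []
    for lba in sorted(lbas):
        if clusters and lba - clusters[-1][-1] <= window:
            clusters[-1].append(lba)   # close to the previous error
        else:
            clusters.append([lba])     # start a new cluster
    return clusters


def looks_suspicious(lbas, window=1000, min_hits=3):
    """True if any cluster holds at least `min_hits` recovered errors."""
    return any(len(c) >= min_hits for c in cluster_lbas(lbas, window))
```

For example, recovered errors at LBAs 120, 130, 900 and 500000 form one cluster of three and one isolated hit, so looks_suspicious returns True with the defaults. The thresholds are the part worth discussing with the service rep.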

It is better to STQ than LDQ

Bojan Nemec · ‎09-16-2004

Hi,

There is a small command procedure to find the file containing the logical block. In fact two command procedures:
lbn.com

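$! P1 = file specification (wildcards allowed), P2 = LBN to look for
$ if f$trnlnm("file_found","lnm$job").nes."" then deassign/job file_found
$e:
$! On a warning or worse (e.g. a file access conflict), carry on with the next file
$ on warning then goto e
$l:
$ f = f$search(p1)
$ if f.eqs."" then goto end
$! Skip files whose end-of-file block is 0
$ s='f$file(f,"EOF")'
$ if s.gt.0
$ then
$! List the retrieval pointers of the file and hand them to lbn1.com
$ pipe dump/head/block=end=0 'f' | -
search/exact/match=and sys$pipe "Count:","LBN:" | -
@lbn1 'p2'
$ endif
$! lbn1.com defines the job logical FILE_FOUND when the LBN is in this file
$ if f$trnlnm("file_found","lnm$job").eqs."" then goto l
$ deassign/job file_found
$ write sys$output f
$end: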

and
lbn1.com

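$! P1 = LBN to look for; the retrieval pointer lines arrive on sys$pipe
$ sea = 'p1'
$l:
$ read sys$pipe l/end=end/error=end
$! After compressing, element 1 is the block count and element 3 the start LBN
$ l=f$edit (l,"compress,trim")
$ blocks = f$element(1," ",l)
$ lbn = f$element(3," ",l)
$ elbn = 'lbn' + 'blocks'
$! The fragment covers LBNs lbn through elbn-1
$ if sea.ge.lbn.and.sea.lt.elbn
$ then
$ write sys$output "Found ''sea' between ''lbn' and ''elbn'"
$ define/job file_found 1
$ endif
$ goto l
$end: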

Run the lbn.com command procedure with a wildcard file specification as first parameter and the logical block number as the second parameter.

To search the whole system disk type:
$ @lbn sys$sysdevice:[*...]*.*;* 123456

This will (very slowly) search all the files for that LBN. Open files will not be searched! For each of them you will receive a message like this:
%SYSTEM-W-ACCONFLICT, file access conflict
\SYS$SYSROOT:[SYSEXE]ACME$SERVER_CONFIG.TMP;1\

Bojan

Wim Van den Wyngaert · ‎09-17-2004

John,

I'm not 100% sure of this answer!

I have the impression that the disk errors are reported by the disk controller. If it says recovered, it means that the layer reporting the error has recovered it and there is no problem (e.g. by disk mirroring or RAID).

If it was not recovered, it is possible that the shadowing software corrected it.

Without shadowing, the error was passed with a "presumably correct version of the data" to the program. If the program didn't test the IO status, it is possible that it continued. It is also possible that the data was incorrect ...

To find corrupt files: anal/disk/read disk.

To repair: delete/erase or purge/erase after you have recovered the contents. I found that /noerase did not always repair the error.

Wim

Wim Van den Wyngaert · ‎09-17-2004

John,

I also had disks that gave 200 errors in a few weeks, and after that all went fine. So 4 is no problem.

Wim

Uwe Zessin · ‎09-17-2004

$ ANALYZE /DISK_STRUCTURE /READ_CHECK

will not necessarily find 'corrupt files'. It simply reads all allocated blocks twice and compares the data (according to HELP). That only ensures that the media can be read.

If the data has been 'corrupted' by some other error it can not report this, because it does not understand the logical structure of a file.
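The limitation is easy to see in a small Python sketch of a read-twice-and-compare check. This is an illustration of the principle only, not the utility's actual implementation; the function name is hypothetical, and on an ordinary operating system the second read is likely to be served from cache rather than from the media.

```python
BLOCK_SIZE = 512  # bytes per disk block on OpenVMS


def read_check(path, block_size=BLOCK_SIZE):
    """Read every block of `path` twice; return the block numbers that differ."""
    mismatches = []
    with open(path, "rb") as first, open(path, "rb") as second:
        block_no = 0
        while True:
            a = first.read(block_size)
            b = second.read(block_size)
            if not a and not b:
                break                      # both reads reached end of file
            if a != b:
                mismatches.append(block_no)
            block_no += 1
    return mismatches
```

A file whose contents are wrong but stable passes this check with an empty list, because both reads return the same wrong bytes. Only knowledge of what the file should contain can catch that kind of corruption.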

John Pendergrass · ‎09-17-2004

Thanks, guys, for all the responses. I don't know why it took so long to post, as I submitted the question a couple of months back. I'll try DFU next time it occurs.

Wim Van den Wyngaert · ‎09-17-2004

John, Uwe,

The anal/disk/read will read all blocks and check for parity errors. That's the only error I have got until now. It may be that it reads twice, but in my opinion modern disks should return the same contents twice (but again, not 100% sure).

For shadow sets this could result in disk errors, but those will most probably be corrected.

Wim

Jan van den Ende · ‎09-17-2004

Wim,

To repair: delete/erase or purge/erase after you recovered the contents. I found that /noerase did not always repair the error.

I guess this is expected behaviour:

On a DELETE (which PURGE also is) with /NOERASE (possibly implied), you just mark the disk blocks as available, clean up the header, and do everything that goes with keeping it consistent. You do NOT do anything to the contents of the disk blocks themselves, so any pattern on the disk reported as bad stays there.

If you apply /ERASE, the disk blocks ARE written to. If that can be done without errors, the error is gone. And if not, that block is relocated by the drive, and although PHYSICALLY the error is still there, it will LOGICALLY be gone. The INTERNAL disk functionality will have to move the head to the bad block replacement area to address this block, but to the world outside the drive it is PRESENTED as error-free.

Jan

Don't rust yours pelled jacker to fine doll missed aches.

Wim Van den Wyngaert · ‎09-19-2004

Jan,

But what if the "bad block flag" is set for that file? Is /erase still necessary?
I wonder which utilities set the flag when encountering a bad block ...

Wim

Keith Parris · ‎09-27-2004

If there is a Forced-Error Flag set for the sector, it will be reset when the sector is overwritten. Overwriting may take place either when you include the /ERASE qualifier on a DELETE or PURGE, or some time later when the LBN is allocated to another file and the contents are overwritten with new data at that time.

Wim Van den Wyngaert · ‎09-27-2004

Keith,

Do you mean that during delete/erase, the block is moved neither to the bad block list of the disk itself nor to that of VMS?

Wim

Keith Parris · ‎09-27-2004

I don't think the VMS bad-block mechanism (BADBLK.SYS, etc.) gets used much these days. (Maybe on floppy disks?) Modern disks tend to have their own internal bad-block revectoring mechanisms. Generally, the bad block is revectored by the drive under control of VMS when an error that is uncorrectable (or correctable, but beyond a specified error severity threshold) is first detected; the data (as best it can be reconstructed; if it can't be fixed completely, a Forced-Error Flag is included on the sector) is moved to a new sector at the time of the error. So when you rewrite it (with the erase pattern using /ERASE, or by overwriting it as you populate a new file later), you're writing to the revectored location provided by the drive.
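The sequence described above (revector on error, set a Forced-Error Flag if the data could not be fully recovered, clear the flag on the next write) can be modelled with a toy Python class. This is a sketch of the behaviour as described in this thread, not of any real drive interface; the class and method names are invented.

```python
class ToyDrive:
    """Toy model of drive-internal bad-block revectoring."""

    def __init__(self, blocks, spares):
        self.data = {n: b"" for n in range(blocks)}   # physical sectors
        self.spare_pool = list(range(blocks, blocks + spares))
        self.remap = {}             # LBN -> replacement physical sector
        self.forced_error = set()   # LBNs whose data was not fully recovered

    def _physical(self, lbn):
        return self.remap.get(lbn, lbn)

    def revector(self, lbn, recovered_fully):
        """Move an LBN to a spare sector after a serious error."""
        spare = self.spare_pool.pop(0)
        self.data[spare] = self.data[self._physical(lbn)]  # best-effort copy
        self.remap[lbn] = spare
        if not recovered_fully:
            self.forced_error.add(lbn)

    def read(self, lbn):
        if lbn in self.forced_error:
            raise IOError(f"forced error on LBN {lbn}")
        return self.data[self._physical(lbn)]

    def write(self, lbn, payload):
        """Writes land on the replacement sector and clear the flag."""
        self.data[self._physical(lbn)] = payload
        self.forced_error.discard(lbn)
```

In this model a DELETE without /ERASE never calls write, so a flagged LBN keeps raising an error on read until something overwrites it, either the erase pattern or the data of the next file allocated there.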
