CHKDSK /R Question and reported error on Windows 2003 64bit

 

I've got a LUN presented from our EVA8100 array to a Windows 2003 64-bit server. The LUN is 599GB.

Over the weekend, the LUN started reporting only 29GB of free space. However, the free space should have been ~470GB.

The LUN in question is part of a clustered resource with 2 other LUNS. None of the other LUNS reported any issues and the EVA logs show no issues that I am aware of.

Now, I unmounted the LUN (G:\ Drive) within cluster services and remounted it.

At this point, it showed free space correctly.

To give myself a "warm & fuzzy" I ran a CHKDSK /R on the LUN and ran into the following error:

C:\>chkdsk g: /r
The type of the file system is NTFS.
Volume label is Oracle02.

CHKDSK is verifying files (stage 1 of 5)...
3696 file records processed.
File verification completed.
1282 large file records processed.
0 bad file records processed.
0 EA records processed.
0 reparse records processed.
CHKDSK is verifying indexes (stage 2 of 5)...
10802 index entries processed.
Index verification completed.
5 unindexed files processed.
CHKDSK is verifying security descriptors (stage 3 of 5)...
3696 security descriptors processed.
Security descriptor verification completed.
80 data files processed.
CHKDSK is verifying file data (stage 4 of 5)...
The disk does not have enough space to replace bad clusters
detected in file 669 of name .

An unspecified error occurred.

C:\>

Our PC/LAN group advised doing a reboot and redoing the CHKDSK so I did.

The same error occurred, but with a different file number (619).

So, I'm curious what I might be looking at here. Is this a "real" error, and if so, what are some possible solutions? Anyone else seen anything like this?

We are in the process of opening up a ticket with HP support on this as well but I wanted to get your input.

Thanks,
Chris Taylor
Oracle DBA
Patrick Terlisten
Honored Contributor

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

Hello,

It seems that chkdsk is trying to replace some bad clusters. A hard drive normally has some sectors reserved for such operations, but a vdisk presented from an EVA does not. A vdisk is virtual; there are no bad sectors, clusters or anything like that. Can you run chkdsk without /R? What is the output?

Best regards,
Patrick

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

We didn't run it without the /R, and now the LUN is online and in use, so I can't check it.
Patrick Terlisten
Honored Contributor

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

Hello Chris,

Did you notice any failures in the event log of the server? For example, disk errors or something like that?

Best regards,
Patrick

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

The only thing we saw was the incorrect file allocation reporting at the OS level.

It is my understanding that clusters are a logical grouping of sectors on disks.

I wish I could draw a picture here.

So, let's say I have 512-byte clusters and a 2K file; then I will use 4 clusters in writing that file. Now, let's say each cluster is on a different physical disk in the EVA. If I'm having cluster (logical) corruption, I would assume a CHKDSK /R would repair the clusters based on the information contained on the VDISK. I think what I'm trying to say is that a CHKDSK /R should operate on a VDISK the same as on a regular disk drive when we're talking about logical file structures (vs. physical structure) - isn't that correct?

Or am I blowing smoke?
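
For reference, the cluster arithmetic in the example above (a 2K file on 512-byte clusters) can be sketched as below. This is only an illustration: the function name clusters_needed is hypothetical, and the sketch assumes every file occupies whole clusters, which ignores NTFS details such as very small files being stored inside the MFT record.

```python
def clusters_needed(file_size_bytes: int, cluster_size_bytes: int) -> int:
    """Return how many whole clusters a file of the given size occupies.

    Hypothetical helper for illustration; assumes the file is stored
    entirely in clusters (no MFT-resident data).
    """
    if cluster_size_bytes <= 0:
        raise ValueError("cluster size must be positive")
    if file_size_bytes < 0:
        raise ValueError("file size must not be negative")
    # Ceiling division: a partially filled last cluster still counts as one.
    return -(-file_size_bytes // cluster_size_bytes)


if __name__ == "__main__":
    # The example from the post: a 2K file on 512-byte clusters.
    print(clusters_needed(2048, 512))  # prints 4
```

A file one byte larger (2049 bytes) would need a fifth cluster, since the last cluster is allocated in full even if it is mostly empty.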

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

Crap. Actually, there are errors in the System log of the Event Viewer.

Give me a few minutes to pull these together.

I swear I didn't see these last night. Maybe I didn't go back far enough.

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

OK, most of this info is from our passive node (it was passive while the active node was experiencing the disk issues). This node is now the active node. We're going to bring up the other node here in a bit, and I'll check the event logs on it again.

12/25/2009 12:27:04 PM
The Microsoft Software Shadow Copy Provider service entered the running state.

12/25/2009 12:30:15 PM
Source: NTFS
EVENT ID: 55
Type: Error
The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume \Device\HarddiskVolumeShadowCopy92.


12/25/2009 12:32:17 PM
End of NTFS errors (the NTFS errors were repeated 274 times)

12/25/2009 11:22:12 PM
hpqilo2 reports warning
NetFN 0x4, command 0x2D timed out
Type: Warning
Event ID: 57
Source: hpqilo2

12/25/2009 11:22:12 PM
Failed GET SENSOR READING, sensor 20
Type: Warning
Source: hpqilo2
Event ID: 57
(Many HPQILO errors follow until 11:24)

12/27/2009 02:22:49 AM
The Distributed Link Tracking log was corrupt on volume Y: and has been re-created. This log is used to automatically repair file links, such as Shell Shortcuts and OLE links, when for some reason those links become broken.
Source: Distributed Link Tracking
Type: Error
Event ID: 12503

12/27/2009 07:36:52 PM
HP MPIO DSM for EVA4x00/6x00/8x00 family of Disk Arrays is attempting an operation on \Device\MPIODisk2. The Type is noted in the dump data.
Source: MPIO
Type: Information
Event ID: 37
(I think this is the CHKDSK /R command but not sure)

Re: CHKDSK /R Question and reported error on Windows 2003 64bit

Strange, there are no errors on the ACTIVE node during this time. The Active node was the only node with these drives mounted.