Predictive Failure

 
Pavan Callikan
Occasional Advisor

Predictive Failure

Dear All

I'm having a problem with the integrated NetRAID controller on an LH 3000 server running Windows 2000 Server. Yesterday, when I rebooted the server, it completed POST properly but then displayed the message "Incorrect NVRAM configuration". I had not changed anything on the RAID system. Luckily, I went into the NetRAID Express configuration and told it to read the configuration from disk. The machine then booted properly, and further restarts were fine.

What could have caused the problem? I checked the BIOS and the system date and time to make sure the CMOS battery hadn't died, and it is fine. However, after booting, I find that the NetRAID Assistant log file keeps recording the following error message: "NOTIFY: Check Condition on Ch 0 ID 2 with the following sense key - Fri May 17 17:20:28 2003 (dot) 70 00 0b 00 00 00 0 (dot) 28 00 00 00 00 47 0)" and then the following message keeps repeating: "Notify message: Predictive failure detected". This goes on and on. What is the problem?
Alicia White
Esteemed Contributor

Re: Predictive Failure

Generally, when the controller gives a predictive failure error, it's a good idea to pay attention to it. I think the NVRAM error you had could have been caused by the bad drive.

Have you gone into either NetRAID Assistant or NetRAID Express Tools to see if the drive has any errors on it? If you have a lot of errors on the drive, I think that would be further indication that the drive needs to be replaced.

To analyze the sense key code, you can try two web sites:
http://www.adaptec.com/worldwide/support/supporteditorial.html?prodkey=interpreting_codes

http://www.spectralogic.com/index.cfm?fuseaction=support.senseCodeForm&CatID=247

The pairs of numbers are "bytes," which are numbered 0 to 13. The important bytes here are:

00 = response code (70 = current error, 71 = deferred error)
02 = Sense Key value
12 = Additional Sense Code (ASC)
13 = Additional Sense Code Qualifier (ASCQ)

According to both the Spectralogic and Adaptec sites, the "0b" in byte 2 of the sense data is the sense key for an aborted command. In other words, for some reason the command to the device could not be completed and was aborted. I entered your error info into the SL website (bytes 02, 12 & 13), and SL recommends the following troubleshooting steps for this error:
1. check cabling & reseat connections
2. replace cable and/or terminator
3. replace drive
4. replace controller
(see http://www.spectralogic.com/index.cfm?fuseaction=support.senseCodeResult&CatID=311&senseKey=b&senseCode=47&senseQualifier=00)
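For anyone who would rather decode the bytes by hand than use the web forms, here is a minimal Python sketch of the lookup described above. It assumes the log line is fixed-format SCSI sense data, and the function name `decode_fixed_sense` is just something made up for this example. Only bytes 0, 2, 12 and 13 in the sample string come from this thread; the zeros in between are placeholders, because the log line as posted is not complete.

```python
# Sense key names as listed in the SCSI Primary Commands standard.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0x9: "VENDOR SPECIFIC",
    0xA: "COPY ABORTED",
    0xB: "ABORTED COMMAND",
    0xD: "VOLUME OVERFLOW",
    0xE: "MISCOMPARE",
}


def decode_fixed_sense(hex_dump):
    """Pull the response code, sense key, ASC and ASCQ out of a hex dump.

    hex_dump is a string of space-separated hex byte pairs, byte 0 first.
    """
    data = bytes.fromhex(hex_dump)
    if len(data) < 14:
        raise ValueError("need at least 14 bytes to reach the ASC/ASCQ")

    response_code = data[0] & 0x7F  # top bit of byte 0 is the VALID flag
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")

    sense_key = data[2] & 0x0F      # sense key is the low nibble of byte 2
    return {
        "deferred": response_code == 0x71,
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "UNKNOWN"),
        "asc": data[12],
        "ascq": data[13],
    }


# Bytes 0, 2, 12 and 13 are from the thread; the rest are placeholder zeros.
result = decode_fixed_sense("70 00 0b 00 00 00 00 00 00 00 00 00 47 00")
print(result["sense_key_name"], hex(result["asc"]), hex(result["ascq"]))
```

As far as I know, the standard ASC/ASCQ table lists 47h/00h as a SCSI parity error, which would fit with cabling and termination being at the top of the list above.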

I would recommend checking the drive for errors and then calling NetServer Tech Support to get their advice. You should talk to them about the NVRAM error you saw and the Predictive Failure error you are seeing now. I think the drive needs to be replaced.

Alicia
Alicia White
Esteemed Contributor

Re: Predictive Failure

I think the drive needs to be replaced. If the server is still under warranty, talk to Tech Support about the errors you are seeing and see if you can get the drive replaced under your warranty.

Alicia
Alicia White
Esteemed Contributor

Re: Predictive Failure

ARGGHHH!! Sorry about the multiple replies... I think the web site had a bit of a problem there.

My apologies to all.

Alicia
Pavan Callikan
Occasional Advisor

Re: Predictive Failure

Dear Alicia

Thanks a lot for answering my question so quickly. I went into NetRAID Assistant and checked the physical status of the drive attached to Channel 0 ID 2, and no errors were reported (neither predictive nor otherwise). However, the errors are still getting logged.

Is it possible that it is a false positive? Or should I give serious thought to changing the drive and keeping a hot spare ready in case of disaster?

Thanks
Alicia White
Esteemed Contributor

Re: Predictive Failure

I think there is a problem with false errors popping up with some NetRAID controllers. What would happen is that a group of errors would get "stuck" in the NVRAM of the controller. So, this same group of errors would be regenerated every time you started NetRAID Assistant.

There is a way to tell if the errors are false. Start NetRAID Assistant (NRA) and view the log. Note the time of the errors very carefully: did the last set of errors occur at the same time you started NRA? Then close NRA, start it up again, and look at the log: did you get more errors corresponding to the second time you started the utility?
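If the log is long, the comparison can be done mechanically. This is only a rough Python sketch of the check described above: it assumes you have written down the error timestamps from the log and the times you launched NRA yourself. The function name `errors_track_launches` and the two-minute tolerance are made up for illustration and are not anything NRA provides.

```python
from datetime import datetime, timedelta


def errors_track_launches(error_times, launch_times,
                          tolerance=timedelta(minutes=2)):
    """Return True if every logged error falls shortly after some launch.

    True suggests the entries are being regenerated at startup rather
    than reported by the drive as they happen. An empty error list
    returns False, since there is nothing to explain.
    """
    if not error_times:
        return False
    return all(
        any(timedelta(0) <= err - launch <= tolerance
            for launch in launch_times)
        for err in error_times
    )


# Hypothetical times, transcribed by hand for the example.
launches = [datetime(2003, 5, 17, 9, 0, 0), datetime(2003, 5, 17, 9, 30, 0)]
errors = [
    datetime(2003, 5, 17, 9, 0, 5),
    datetime(2003, 5, 17, 9, 0, 6),
    datetime(2003, 5, 17, 9, 30, 4),
]
print(errors_track_launches(errors, launches))
```

If any error shows up at a time when NRA was not being started, the check returns False, and the errors are more likely to be real.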

But when I used to support NetServers, I don't think I ever saw a case of a "false" predictive failure. The false-error issue usually involved an identical group of 6 or 7 sense key errors popping up every time NRA was started.

If you have any questions, contact Tech Support. I don't think they'll give you a problem with replacing the drive if you are still getting the predictive failure on the drive.

Alicia