08-07-2003 08:20 AM
LH4r RAID failure
Hi there,
I've got a very nasty problem here: on one of our LH4r servers we have a RAID 5 array of five 36 GB drives connected to the onboard controller. The drive in slot 5 (ID08) failed, leaving the array degraded. I inserted a brand new additional disk into slot 6 (ID09) and defined it as a hot spare. The rebuild began, and 20 minutes later the hot spare failed as well. I then replaced the originally failed drive in slot 5 with another new disk. The rebuild began, and at 13% this drive switched to failed. I inserted my last new disk into slot 5, and at 13% this one switched to failed, too.
Any hints on what to do next are HIGHLY appreciated. Our data is hanging by a thread.
Thanks,
Peter
- Tags:
- RAID
08-07-2003 11:44 AM
Re: LH4r RAID failure
We had a Dell server give us false failures a few months ago. Do you happen to have an external device (tape drive) running off the same controller? If so, try removing it and then rebuilding with one of the drives that failed.
08-07-2003 09:19 PM
Re: LH4r RAID failure
Rebuilds fail for a variety of reasons: out-of-date firmware, problems with one of the active drives (such as media or "other" errors), or even problems with the parity data on the array.
I probably don't need to tell you this but make sure you have a VERY GOOD BACKUP of your data before doing any of this trouble-shooting.
The fact that the rebuild seems to fail at the same percentage repeatedly would lead me to think that there is a problem with one of the drives that is still part of the array. If the controller has problems reading the data from that drive, any rebuild will be doomed to failure.
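To illustrate why a read error on a surviving drive would stop every rebuild at the same point, no matter which replacement disk is used, here is a minimal Python sketch of RAID 5 style XOR reconstruction. It is not the NetRAID controller's actual algorithm. The function names, the use of None to stand for an unreadable block, and the tiny 100-stripe layout are all assumptions made for illustration.

```python
from functools import reduce
from typing import List, Optional, Sequence, Tuple

# None stands for a block the drive cannot read (an assumption of this sketch).
Block = Optional[bytes]


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_drive(surviving: List[List[Block]]) -> Tuple[List[bytes], Optional[int]]:
    """Rebuild the missing drive stripe by stripe from the surviving drives.

    Returns (rebuilt_blocks, failed_stripe). failed_stripe is None when the
    rebuild completes, otherwise the index of the first stripe that could not
    be rebuilt because a surviving drive returned an unreadable block.
    """
    rebuilt: List[bytes] = []
    for stripe in range(len(surviving[0])):
        blocks = [drive[stripe] for drive in surviving]
        if any(block is None for block in blocks):
            # The new disk is fine; the source data for this stripe is not.
            return rebuilt, stripe
        rebuilt.append(xor_blocks(blocks))
    return rebuilt, None


def demo_failed_stripe(bad_stripe: int = 13, stripe_count: int = 100) -> Optional[int]:
    """Four surviving drives with made-up content and one unreadable block."""
    surviving: List[List[Block]] = [
        [bytes([(drive * 31 + stripe) % 256] * 4) for stripe in range(stripe_count)]
        for drive in range(4)
    ]
    surviving[2][bad_stripe] = None  # hypothetical media error on a surviving drive
    _, failed = rebuild_drive(surviving)
    return failed
```

In this toy model the rebuild stops at the same stripe on every attempt, because the replacement disk is only written to, never read from. With 100 stripes and the bad block in stripe 13, that corresponds to stopping at the same progress figure each time, which is the pattern described in the original post.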
To see if this is the problem, check the properties of all the drives in the array for errors. You can do this from within the NOS using NetRAID Assistant (or Novell MegaManager). If you don't have this utility installed, you can check the drive properties in NetRAID Express Tools by pressing Ctrl+M during POST.
Are there any errors on the drives? If there are, it might be possible to get around the bad blocks by connecting the drives to a regular SCSI controller and running a verify media operation on them to remap the bad blocks. But this COULD make the situation worse by actually corrupting data, since good and/or corrupt data is copied from bad blocks to good blocks.
If there are no hardware errors, then you might have a problem with the hard drive firmware. It is possible that a communication error caused by out-of-date firmware is causing the rebuild to fail. If this is the case, you can update the firmware and try the rebuild again.
You can check HDD firmware under the drive properties. You can download the CD image for the FW update utility with all possible NetServer HDD firmware files at:
http://h20004.www2.hp.com/soar_rnotes/bsdmatrix/matrix65146en_US.html
Run this utility, which will tell you whether any of the drives need a firmware update. The beauty of downloading the huge CD image file is that there is no guesswork about which of the dozens of FW files you need: the utility will find the files on the CD and update accordingly.
If the parity information is bad, there is basically nothing you can do: the parity info the controller uses to rebuild the failed drive is bad, so a rebuild will NEVER work. In this case, you have no option other than to back up the data, reformat the drives in the array, reinstall the operating system, and then restore from backup.
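Why bad parity defeats a rebuild can be sketched in a few lines of Python. In RAID 5, the XOR of all blocks in a stripe (data plus parity) is zero when the stripe is consistent, and a missing block is recomputed as the XOR of the others. This is a simplified model, not a description of how the NetRAID firmware detects or handles the condition, and the function names are made up for illustration.

```python
from functools import reduce
from typing import List, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def inconsistent_stripes(stripes: Sequence[Sequence[bytes]]) -> List[int]:
    """Return indices of stripes whose blocks (data plus parity) do not XOR to zero."""
    return [i for i, stripe in enumerate(stripes) if any(xor_blocks(stripe))]


def reconstruct_block(stripe: Sequence[bytes], missing: int) -> bytes:
    """Recompute the block at position `missing` from the other blocks in the stripe."""
    others = [block for i, block in enumerate(stripe) if i != missing]
    return xor_blocks(others)


# Three data blocks plus parity; the values are arbitrary sample bytes.
data = [b"\x10\x20", b"\x01\x02", b"\x0f\x0f"]
good_stripe = data + [xor_blocks(data)]
stale_stripe = data + [b"\x00\x00"]  # parity that no longer matches the data

print(inconsistent_stripes([good_stripe, stale_stripe]))
print(reconstruct_block(good_stripe, 0) == data[0])
print(reconstruct_block(stale_stripe, 0) == data[0])
```

With correct parity, the block for the missing drive comes back intact. With stale parity, the same arithmetic produces a block that does not match the original data, so whatever a rebuild writes for that stripe cannot be trusted.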
I hope this helps.
Alicia