09-12-2003 05:47 AM
Problem w/ Raid4SI & DS2100
Setup:
HP9000 L2000
A5856A RAID 4SI controller
Six DS2100 disk cabinets
Twenty 36GB drives
This equipment is in a managed colocation center with highly redundant environmental controls (power, cooling, etc.).
The storage is carved into two logical drives: one is RAID 0+1, the other RAID 0+5, plus three hot spares. All filesystems, including boot, are on this external storage.
Yesterday morning both logical drives went offline.
I booted from the install disk and ran the RAID configuration tool. The system was reporting SEVEN failed drives! Three of these were the hot spares, all of which were now assigned to a RAID array. The three drives they had taken over for were all marked Ready, but those must also have been marked Failed earlier, since the hot spares kicked in.
So call it 10 failed drives (the seven reported plus the three now marked Ready). Or whatever... suffice it to say that a bunch were marked "Failed" in a very short timeframe, a few minutes at most.
These failed drives were spread over at least three of the four SCSI channels on the RAID 4SI controller.
With few options, I just started forcing the "failed" drives to an "online" state, ignoring the warnings about data integrity.
Rebooted and the system seems fine. irconcheck is running now and, at 80% complete, hasn't yet reported any problems.
Questions:
1. What happened?
2. What can I do to ensure it never happens again?
TIA.