09-05-2005 07:15 PM
MSA1000 drive failure caused Pool deactivations and Volume dismounts
The servers have NetWare OES (V6.5 SP3), and the firmware on the MSA1000 was updated to the latest version (FabricOS v3.2.0a, MSA V4.48).
One of the U320 146GB drives failed last night, yet despite the MSA selecting a hot spare and starting an array rebuild as would be expected, every Pool and Volume on the NetWare server deactivated with "device failure" messages.
I was under the impression that the point of having a RAID array was that a drive failure would be seamlessly repaired and that functionality would not be impaired (only slowed a little depending on the priority of the Rebuild setting).
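For reference, the expectation above rests on how RAID5 parity works: the parity block is the XOR of the data blocks in a stripe, so any one missing block can be recomputed from the survivors. Below is a minimal Python sketch of that idea only. It is not the MSA1000 firmware's method; the function names are hypothetical, and real controllers also rotate parity across drives, which is omitted here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read_degraded(stripe, failed_index):
    """Recompute the block at failed_index from all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three data "drives" plus parity; pretend drive 1 has failed.
stripe = make_stripe([b"NetW", b"are ", b"OES!"])
rebuilt = read_degraded(stripe, failed_index=1)
print(rebuilt == stripe[1])
```

A read that hits the failed drive costs extra work (every surviving block in the stripe must be read), which is why a degraded array is slower but should still answer.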
I do not understand why Pools residing on totally separate arrays (I have defined 4 separate RAID5 arrays across the MSA cabinets) also failed, nor why the server had to be power cycled in order to allow any volumes at all to be seen and mounted by clients again.
There is no redundancy built into the SAN infrastructure (no secondary SAN switch or duplexed fibres and HBAs), but even if there were, I assume this device failure would still have occurred. It appears to have been a problem of the MSA controller failing a low-level NetWare OS disk access request when the drive failed, rather than responding with the requested data while repairing the fault in the background.
Any ideas on what is happening here?
Is this a configuration error or a fault in the MSA?
09-06-2005 02:26 PM
Re: MSA1000 drive failure caused Pool deactivations and Volume dismounts
According to a respondent on the Novell forums, this is a problem that has been heard of before with MSA1000s, and I should contact HP Support.
I have now done this; however, the HP engineer has not heard of it.
Has anyone else ever had a server lock up when a single disk in a RAID5 array failed?
This fundamental flaw means our whole DAS to SAN migration programme is halted until the problem can be determined and resolved.
As a next step the only thing I can think of currently is to deliberately pull another disk to see if the identical fault reappears.
09-13-2005 11:28 AM
Re: MSA1000 drive failure caused Pool deactivations and Volume dismounts
I pulled a disk from one of the arrays deliberately and as expected it started a rebuild using one of the hot spares. This time though there was no 'device failure' error on the host server and read and write access continued unaffected.
A relief, but it doesn't explain the first time failure.
There has been no word on the HP Support case I raised either.
Guess I will have to put it down to gamma rays.
09-13-2005 01:57 PM
Re: MSA1000 drive failure caused Pool deactivations and Volume dismounts
Very strange indeed! The fact that you won't be able to reproduce the problem means it's going to be hard to get a resolution from HP as well. In the early days of our EVA implementations, during an online Vdisk expansion presented to a Windows host, the whole Vdisk was lost, as in disappeared! No response from HP, and we were never able to reproduce this problem either. Mind you, since then I have done the same procedure heaps of times and have never had any problems.
So, not that I am helping you in any way with this message, I just wish you good luck in finding a resolution; let us know how you go with the case. Were there any other hosts (running different OSs) accessing storage from the MSA, or was the error only seen on Novell boxes? I'm just trying to establish whether it really was an issue with the MSA1000 rather than with Novell, or with the mix of Novell and MSA1000 storage.
Regards,
Saket.
09-13-2005 03:14 PM
Re: MSA1000 drive failure caused Pool deactivations and Volume dismounts
At the time there was only one NetWare OES server connected to the SAN switch and powered up. Didn't even have a configuration loaded or saved at that time.
Since then we have added a Win2003 server and an ISL to another MSA1000 system.
(so there are now about 7 hosts on the one fabric - equal mix of NetWare 6.x and Win2003)
One other change I had made since the failure was updating the NetWare ACU from V2.76 to V2.77 (the latest), although there were no listed fixes of any magnitude or import in the latter's release info.
Can't see how that would have had much of an effect although this upgrade did fix one noticeable fault, which was that every time I exited the ACU there would be one (random) disk in the cabinet flashing its fault light - disconcerting for the other staff, who would initially see it and panic, thinking a drive had failed.
06-20-2006 12:43 PM
Re: MSA1000 drive failure caused Pool deactivations and Volume dismounts
One suspicion is that since I have 23 volumes on that controller, over a dozen hosts, AND the rebuild / expand priorities were both set at HIGH, the controller ran out of resources and didn't actually disconnect hosts, but the hosts timed out from the slow response. Interesting, though, that I then rebooted the controller and the system is running like new. Has anyone heard of memory leaks associated with firmware version 4.32?
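To put rough shape on that suspicion, here is a back-of-envelope sketch in Python using Little's law (average wait = requests in flight / throughput). Everything in it is an assumption for illustration: the function names, the controller throughput, the per-host queue depth, the 30-second timeout, and the idea that rebuild priority translates into a fixed share of controller throughput. None of these figures come from the MSA1000 documentation or from this thread.

```python
def estimated_wait_seconds(outstanding_requests, controller_iops, rebuild_share):
    """Little's law estimate: average wait = requests in flight / throughput."""
    # Hypothetical model: the rebuild takes a fixed share, hosts get the rest.
    host_iops = controller_iops * (1.0 - rebuild_share)
    if host_iops <= 0:
        return float("inf")
    return outstanding_requests / host_iops


def would_time_out(outstanding_requests, controller_iops, rebuild_share, host_timeout_s):
    """True if the estimated wait exceeds the host's I/O timeout."""
    wait = estimated_wait_seconds(outstanding_requests, controller_iops, rebuild_share)
    return wait > host_timeout_s


# Made-up figures, chosen only to show the shape of the effect.
hosts, queue_depth = 12, 32
in_flight = hosts * queue_depth
for share in (0.5, 0.9, 0.99):
    wait = estimated_wait_seconds(in_flight, 1000.0, share)
    print(share, round(wait, 2), would_time_out(in_flight, 1000.0, share, 30.0))
```

With these made-up numbers the estimated wait climbs from under a second to tens of seconds as the rebuild share approaches 100%, which is the kind of slowdown that could trip a host-side timeout without the controller ever dropping a connection. It says nothing about whether a memory leak was involved.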