ProLiant Servers (ML,DL,SL)
11-07-2007 10:32 AM
DL320s P400 array controller operation with SATA
I have several DL320s servers running Red Hat Linux with a 2.6 kernel.
I have had an issue multiple times when using 12 SATA drives on a P400 array controller running firmware 2.08. When a drive fails, it doesn't fail outright. I/O to the array slows down until the server is useless. After a shutdown and POST, the array controller reports a failed drive. I replace the drive, it rebuilds, and performance is back where it ought to be.
Is there any reason the P400 doesn't detect a degraded drive during operation and kick it out of the array so the server can continue to function on parity or mirror copy? It seems to heroically attempt to retry I/O when it would make more sense to kick the drive out.
device_point='/dev/cciss/c0d2/part2' recent_max_lat=4387685us ops_out=5 oldest_op_out=10155183us (excessive)
Has anyone seen this?
Additionally, the ADU report shows an extremely high "bus faults" count. This may be the nature of SATA, but most of the fields are unpopulated: all zeros except read blocks, write blocks, and bus faults.
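For scale, the latency values in the log line above are in microseconds. A minimal sketch for pulling them out and converting to seconds, assuming the key=value layout seen in that single line (the function name `parse_latency_line` is hypothetical, and other lines from the same tool may be formatted differently):

```python
import re

# Sample line copied from the post; the field layout is assumed from this one example.
LINE = ("device_point='/dev/cciss/c0d2/part2' recent_max_lat=4387685us "
        "ops_out=5 oldest_op_out=10155183us (excessive)")

def parse_latency_line(line):
    """Return the key=value fields, converting '<n>us' values to seconds."""
    fields = {}
    # A value is either a quoted string or a run of non-space characters.
    for key, value in re.findall(r"(\w+)=('[^']*'|\S+)", line):
        value = value.strip("'")
        if re.fullmatch(r"\d+us", value):
            fields[key] = int(value[:-2]) / 1_000_000  # microseconds -> seconds
        elif value.isdigit():
            fields[key] = int(value)
        else:
            fields[key] = value
    return fields

parsed = parse_latency_line(LINE)
print(parsed)
```

Read this way, the worst recent latency is about 4.4 seconds and the oldest outstanding operation has been waiting just over 10 seconds.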
2 REPLIES
11-07-2007 03:35 PM
Re: DL320s P400 array controller operation with SATA
Hello Dave,
Refer: http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?locale=en_US&objectID=c01068337
I'm aware that you won't experience a BSOD, but updating to the latest drivers and firmware for the controller might help.
Also, "bus faults" or "SCSI bus downshift" errors are more likely caused by the cable than by the HDD.
;) Regards.
11-08-2007 05:05 AM
Re: DL320s P400 array controller operation with SATA
HappyDude,
Thanks for the suggestions. I am typically a strong believer in updating drivers and firmware. The issue you point out is a problem with the MS storport driver. I am using Linux, so I don't think it is related.
As for bus faults and speed downshifts being related to the cable, I would tend to agree. There are 3 components in the cable path from controller to drive. The P400 has a cable routed to the front of the system, and this communicates through the mainboard to the hard drive backplanes. There appear to be two backplanes. So if cables are suspected, it could be any of those components.
The issue I am experiencing is not necessarily related to bus faults. In the SCSI world, bus faults at the numbers I am talking about would result in an extremely high number of bus resets, and it would be clear that there was a cable error (the system would likely log parity errors too). I have seen high bus fault counts on every DL320s that I have looked at (at least 6).
The issue I have is an array slowing down to the point that the acknowledgement of a write is way out of bounds for any application or host to deal with. The problem seems to revolve around retries on a disk. The Smart Array controller line is advertised as predictively failing a disk using SMART data from the drive, or determining that a disk is not responding and failing it. This does not occur until I power the system off, then back on. On POST, the array controller fails the drive. There is no evidence of problems with that drive until I do a cold boot. I believe this is an issue with the array controller. I have been slowly moving my controllers to 4.06 firmware; however, the release notes for this version make no mention of this issue. They simply state SATA performance improvements.
I am using battery-backed cache on the controller in all circumstances, split 50% read / 50% write. Unless I am filling up all that cache and falling back to write-through, I would expect a prompt acknowledgement of a write.
Dave
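Since the controller gives no sign of trouble until a cold boot, the slowdown described above could be watched for from the host side instead. A rough sketch, assuming the Linux block-layer `stat` file format (one line per device under /sys/block/, where fields 1 and 5 are completed reads and writes, fields 4 and 8 are the total milliseconds spent on them, and field 9 is the number of requests in flight). The sample counter values and the function names are made up for illustration, and the exact sysfs path for a cciss logical drive is not shown because it would be a guess.

```python
def parse_stat(text):
    """Parse one line of a block device 'stat' file into the counters we need."""
    f = [int(x) for x in text.split()]
    return {
        "ios": f[0] + f[4],       # completed reads + completed writes
        "ticks_ms": f[3] + f[7],  # total ms spent on reads + writes
        "in_flight": f[8],        # requests currently outstanding
    }

def average_wait_ms(before, after):
    """Average time per completed I/O between two samples, or None if none completed."""
    a, b = parse_stat(before), parse_stat(after)
    ios = b["ios"] - a["ios"]
    if ios == 0:
        return None  # nothing completed; check in_flight to tell idle from stalled
    return (b["ticks_ms"] - a["ticks_ms"]) / ios

# Hypothetical samples taken a few seconds apart (not real data from a DL320s).
BEFORE = "1000 0 8000 5000 2000 0 16000 10000 0 9000 15000"
AFTER = "1010 0 8080 25000 2020 0 16160 60000 5 19000 85000"

wait = average_wait_ms(BEFORE, AFTER)
print(f"average wait per I/O: {wait:.0f} ms")
```

An average wait in the seconds range on a logical drive that normally answers in milliseconds would match the symptom in this thread, though it would not identify which physical disk behind the array is responsible.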