Smart Array P800 Controller Accelerator issue

 
OlofNord
Occasional Visitor

Hello there.

We are running a DL180 G6 with a Smart Array P800 disk controller and 8 hard drives in RAID 5.

When we log into the server using HP System Management Homepage, it states that the 'Accelerator' is disabled, and this accelerator has brought us lots of grief.

This particular disk controller uses two batteries to power a disk cache. Initially we interpreted the issue as coming from faulty batteries, so we replaced the first one, and then both.
This did not clear the error message.


Since that did not help, we have done the following:
Verified configuration
Cleared IML
Upgraded BIOS
Upgraded iLO
Upgraded HP server firmware+drivers
Upgraded CentOS

After restarting the server and booting it from the ACU CD:
#ctrl all show
Smart Array P800 in Slot 2

#ctrl all show status
Smart Array P800 in Slot 2
Controller Status OK
Cache Status OK
Battery/Capacitor Status OK
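
For anyone who wants to check this status output automatically rather than by eye, here is a minimal Python sketch. It assumes the text is pasted in exactly as shown above (it also tolerates a colon after "Status"). The function names `parse_status` and `failing_fields` are made up for illustration and are not part of any HP tool.

```python
# Sample text copied from the `ctrl all show status` output above.
STATUS_OUTPUT = """\
Smart Array P800 in Slot 2
Controller Status OK
Cache Status OK
Battery/Capacitor Status OK
"""


def parse_status(text):
    """Return {component: value} for lines shaped like '<component> Status <value>'."""
    result = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(" Status")
        if sep:  # lines without ' Status' (header, blanks) are skipped
            # Accept both 'Cache Status OK' and 'Cache Status: OK'.
            result[name] = value.lstrip(": ").strip()
    return result


def failing_fields(statuses):
    """List the components whose value is anything other than 'OK'."""
    return [name for name, value in statuses.items() if value != "OK"]


statuses = parse_status(STATUS_OUTPUT)
print(statuses)
print(failing_fields(statuses))
```

With the output above, every component reports OK, so the list of failing fields is empty. That matches what we saw right after the reboot.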

View controller info

#ctrl slot=2 show
Smart Array P800 in Slot
Bus Interface: PCI
Serial Number: PAFGF0M9SWF1AO
Cache Serial Number: PA82B0A9SWE37U
RAID 6 (ADG) Status: Enabled
Controller Status: OK
Hardware Revision: Rev E
Firmware Version: 7.22
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 secs
Surface Scan Mode: Idle
Queue Depth: Automatic
Monitor and Performance Delay: 60 min
Elevator sort: Enabled
Degraded Performance Optimization: Disabled
Inconsistency Repair Policy: Disabled
Wait For Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 0 secs
Cache board Present: True
Cache Status: OK
Accelerator Ratio: 25% Read / 75% Write
Drive Write Cache: Disabled
Total Cache Size: 512 MB
No-Battery Write cache: Disabled
Cache Backup Power source: Batteries
Battery/Capacitor Count: 2
Battery/Capacitor Status: OK
SATA NCQ Supported: True
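
The "Key: Value" listing above is easy to turn into a dictionary, which makes it simpler to pull out just the cache-related fields. The following is a rough Python sketch under the assumption that every field sits on its own line with a single colon separating name and value. The helper names `parse_fields` and `cache_summary` are hypothetical, and the sample text is a subset of the output above.

```python
# A subset of the `ctrl slot=2 show` output above, field names copied verbatim.
CONTROLLER_OUTPUT = """\
Cache board Present: True
Cache Status: OK
Accelerator Ratio: 25% Read / 75% Write
Drive Write Cache: Disabled
Total Cache Size: 512 MB
No-Battery Write cache: Disabled
Cache Backup Power source: Batteries
Battery/Capacitor Count: 2
Battery/Capacitor Status: OK
"""


def parse_fields(text):
    """Split each 'Key: Value' line at the first colon and return a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():  # ignore blank lines and lines without a value
            fields[key.strip()] = value.strip()
    return fields


def cache_summary(fields):
    """Pick out the fields that describe the controller cache and its batteries."""
    keys = (
        "Cache Status",
        "Accelerator Ratio",
        "Total Cache Size",
        "Battery/Capacitor Count",
        "Battery/Capacitor Status",
    )
    return {key: fields.get(key, "missing") for key in keys}


fields = parse_fields(CONTROLLER_OUTPUT)
for key, value in cache_summary(fields).items():
    print(f"{key}: {value}")
```

Any field that is absent from the pasted text shows up as "missing" instead of raising an error, which is handy when comparing output captured at different times.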


Firmware version in use is 7.24, which is the latest version.
The driver in use is cciss 3.6.28-RH3 (cciss rather than hpsa, as hpsa is not compatible).
cciss is now part of the mainline Linux kernel, so it should be upgraded by a yum update.

The latest HP Service Pack for ProLiant (2014.02.0B) is installed.
The latest BIOS is installed (2013.07.01 A).
The iLO software is updated to the latest firmware (4.26).

A yum update has been issued.

None of the above has resolved the issue.
The error message only clears shortly after a reboot, and it then recurs approximately one day later.
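
To pin down how long after a reboot the status flips, the status could be sampled periodically (for example from cron) and the samples scanned for changes. Below is a small Python sketch of just the scanning step. The sample timestamps and the "Failed" status string are invented purely for illustration and are not measurements from this server, and the function names are hypothetical.

```python
from datetime import datetime


def transitions(samples):
    """Return (timestamp, old_status, new_status) for every status change.

    `samples` is a list of (timestamp, status) pairs in time order.
    """
    changes = []
    previous = None
    for when, status in samples:
        if previous is not None and status != previous:
            changes.append((when, previous, status))
        previous = status
    return changes


def time_to_first_failure(samples):
    """Time from the first sample to the first non-'OK' sample, or None."""
    if not samples:
        return None
    start = samples[0][0]
    for when, status in samples:
        if status != "OK":
            return when - start
    return None


# Hypothetical samples (illustration only, not real data).
samples = [
    (datetime(2014, 3, 1, 8, 0), "OK"),      # shortly after a reboot
    (datetime(2014, 3, 1, 20, 0), "OK"),
    (datetime(2014, 3, 2, 8, 0), "Failed"),  # placeholder status string
]

print(transitions(samples))
print(time_to_first_failure(samples))
```

Knowing the exact time of each change would make it possible to line the failure up with entries in the IML or the system logs.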

Has anyone else experienced this issue, or does anyone have any information on how to fix it?
Any help would be much appreciated.

Regards,
Olof Nord

 

 

P.S. This thread has been moved from ProLiant Storage Systems to ProLiant Servers (ML,DL,SL). - Hp Forum Moderator

2 REPLIES
waaronb
Respected Contributor

Re: Smart Array P800 Controller Accelerator issue

Are you talking about the entry that says "Drive Write Cache: Disabled"?

I think that refers to the write-caching built into the drives themselves, not the array controller's own write caching. The array cache looks fine... batteries are okay and it's set to 25% read / 75% write.

The drive write cache is disabled by default. You can turn that on if you want in the controller settings: "Physical Drive Write Cache State"

Just be aware that a power loss or crash means anything in the drive's own write cache will be lost. Unlike the array controller's cache, it's not backed up by anything, so there is a risk involved. That's why it's disabled by default.
OlofNord
Occasional Visitor

Re: Smart Array P800 Controller Accelerator issue

Hello waaronb,

Thanks for your reply.

 

What I should have stated in my post is that we are looking into the system because HP System Management Homepage flags the accelerator as 'Temp disabled - Failed, Replace battery 2'. Please see the attached screenshot.

 

What seems to happen is that after a reboot the error is cleared for some time (up to several hours), only to come back again.

 

The ACU settings/debug information above was gathered after booting the system from the ACU 8.75 CD. Once the information was gathered and the system was booted back into CentOS, the error was cleared for some time, just as described above.