Server Management - Systems Insight Manager

Re: Disk status

 
John Oliver3
Advisor

Disk status

On one ProLiant DL380 G3, there's a Management Log entry that says "POST Error: 1786-Drive Array Recovery Needed", but nowhere in SIM or on the main SMH page for this server is there any mention of disks at all.

I could swear that at a previous employer there was something that let me know if a disk went bad. It would be ridiculous to have to log in to every SMH and comb through the logs every day to check and see if there's a disk problem.

Am I missing something? I installed the whole PSP, so I shouldn't be missing anything. I think...
Rob Buxton
Honored Contributor

Re: Disk status

At the time the event happened an SNMP Trap would have been sent to HPSIM.
If no event was received you could check that the Trap Destination has been set correctly on the server. That needs to point to the HPSIM server with a suitable community name.

If the trap destination was set correctly, check the versions of the agents on the server and check that the latest MIB Update (7.80) has been added to HPSIM.

Also check the events for the server. If the correct MIB was not loaded, the trap may have just turned up as a generic unregistered trap, but you may still be able to match it by its timing.

It is the Agents on the server that generate the trap to HPSIM. HPSIM will then process the event depending on the handling process you've set up.
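
As a rough illustration of the trap destination check: on a Linux host running Net-SNMP (an assumption; on Windows the destination is set in the SNMP service properties instead), trap destinations are normally given by `trapsink`, `trap2sink` or `informsink` lines in snmpd.conf. The sketch below only parses such lines from a config text. The host name `sim.example.com` and the community `public` are hypothetical placeholders, not values from this thread.

```python
def trap_destinations(snmpd_conf_text):
    """Return (directive, host, community) for each trap sink line."""
    sinks = []
    for raw in snmpd_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        if parts[0] in ("trapsink", "trap2sink", "informsink") and len(parts) >= 2:
            community = parts[2] if len(parts) >= 3 else None
            sinks.append((parts[0], parts[1], community))
    return sinks


def points_at(sinks, sim_host, community):
    """True if any sink sends to sim_host with the given community name."""
    return any(h == sim_host and c == community for _, h, c in sinks)


# Hypothetical config text; a real file usually lives at /etc/snmp/snmpd.conf.
SAMPLE_CONF = """
rocommunity public
# trapsink old-sim.example.com public
trapsink sim.example.com public
"""

sinks = trap_destinations(SAMPLE_CONF)
print(points_at(sinks, "sim.example.com", "public"))
```

This does a plain string comparison, so a destination written as an IP address or as host:port would need to be matched in that same form.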
Neysters, Lutz
Frequent Advisor

Re: Disk status

If you want to see everything about your disk statuses, log on to your SMH first. There should be a "Storage" box, provided by the "HP Insight Storage Agents", which run as a service.
You will see an entry for each active controller in this box, e.g. "Smart Array 6400 Controller in Slot 1". Clicking on this entry gives you an overview of the status of the controller itself and of each physical and logical disk belonging to it.
Just two things about disk status can't be found there: "File System Space Used" and "Logical Disks" are provided in the "Operating System" box. You will be notified only once, by trap, and only if you have set a threshold on your SMH. Everything will stay green on the SMH even if your data partition is 100% full.
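
Since the SMH stays green on a full partition, an independent space check can fill the gap. This is only a minimal sketch using Python's standard `shutil.disk_usage`; the 90% threshold and the function names are assumptions chosen for illustration, and it is unrelated to the HP agents themselves.

```python
import shutil


def percent_used(total, used):
    """Used space as a percentage of total (0.0 for an empty volume)."""
    return 100.0 * used / total if total else 0.0


def check_path(path, warn_at=90.0):
    """Return (percent_used, over_threshold) for the volume holding path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at


pct, over = check_path(".")
print(f"{pct:.1f}% used, over threshold: {over}")
```

Run from a scheduled task, a check like this repeats its warning on every run, unlike the one-time trap.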
David Claypool
Honored Contributor

Re: Disk status

POST stands for Power-On Self-Test; in other words, this error happened before the OS even loaded. It should be logged in the Integrated Management Log and should result in HP SIM flagging the system as being in a non-green condition, leading you to investigate. If an error happens when the system is operational and the agents have a properly set event destination, the event will show up in the HP SIM event collections.
Timothy Cusson
Valued Contributor

Re: Disk status

John,

You have mentioned several different issues but you have not provided very much detail, so, I'll provide a brief step by step procedure on how to install the HP System Management Agents on a ProLiant Windows server.

1) Verify the SNMP service is installed and running (look in Services). If not, go to Add/Remove Programs, Add/Remove Windows Components, Management and Monitoring Tools, Details, Simple Network Management Protocol.

2)
a) Verify the minimum SNMP settings are configured. You must have a community string for READ, and optionally a community string for READ/CREATE. The HP agents use the SNMP community string as a security feature, a sort of password; servers that need to talk to each other must have matching community strings.
b) If you don't add the READ/CREATE string, you can't change settings, and you can't clear the Integrated Management Log from the web-based HP System Management Homepage.
c) Configure the SNMP trap destination. This tells the local HP agents where (which server or servers) to send agent event information, for example disk failure, memory error, redundant power supply failure...
d) After making any changes to SNMP settings, you must restart the SNMP service so the changes take effect.

3)
a) Now you are ready to install the HP Product Support Pack. You can download the PSP or you can use the HP PSP CD-ROM that came with the server.
b) The product opens and creates an Excel-type list of software, drivers and agents that are available to install. NOTE: PSP will auto-magically install only what's needed and skip over the rest.
c) In the left column, you should see a red flag indicating configurable components. You must go in there and select what type of security you want the HP Management Agents to use. The simplest is "Trust-ALL"; the best one is "Trust by Certificate", but for that you need HP-SIM installed first, so HP-SIM can provide a certificate.
d) Now you can click the install button in the centre / top area. NOTE: When the PSP installs the new NIC drivers, the server will drop all network connections for 30 seconds or so! (This is called a gotcha.) Sometimes you don't have to reboot, but I find it's a good idea to reboot after installing the PSP.

4)
a) OK, we're getting closer, we have SNMP installed and configured with community strings, we have the HP Management Agents configured with a security model, probably "Trust-All" to start with. Now click on the HP Management Homepage icon, a green square one.
b) You should see the server name and model on the top right, and six or seven sub menus like "Storage", "Environment", "System", "Software" and "Version Control". NOTE: If you don't see them, it means the agents failed to start. Check the Windows event log to see what happened. Usually, SNMP is not running or is not set up right.

5)
a) Everything should be green or blue in the HP System Management Homepage (HP-SMH). To test the operation of the agents, if you have dual power supplies, unplug one of them, wait one or two minutes, then refresh the HP-SMH. You should see a yellow condition at the top of the page; click on it and it will jump you right to the failing component. Very convenient! Another test would be to unplug the iLO network cable.

b) All of the above was on the local machine. If you have a monitoring server running HP-SIM or some other "trap catching" software, the event you caused above would have generated a trap and sent it to SNMP on the local machine, which in turn would send it to the trap destination machine. So, log onto the HP-SIM machine and verify it received the trap. NOTE: If you have email notification configured on HP-SIM, you will receive an email about the trap/event.
c) If it did not receive the trap, you can troubleshoot by sending a test trap. One way is to go into Control Panel, then HP Agents, then click on "Send Test Trap". Another way is from the HP-SMH: click on Settings, SNMP, then "Send SNMP Test Trap".

6)
Log files: almost all ProLiants (not the economy 100 series) keep a log in NVRAM called the IML (Integrated Management Log). There are a few ways to view this log:
a) - Start - Programs - HP System Tools - HP ProLiant Integrated Management Log Viewer
b) - Open HP SMH, click on Logs, click on HP Integrated Management Logs
c) - Boot the server from the HP SmartStart CD and use its Integrated Management Log viewer.

7)
a) The logs will have every hardware related event since the day the server was first turned on, unless someone clears them. Normally, the first entry is "POST Error: 1785 Drive Array not configured."
b) If you want to clear the log from the HP-SMH, you must have configured the SNMP write or create community string and enabled sets.
c) If you have administrative rights, you can clear the IML from Start, Programs, HP System Tools, HP ProLiant Integrated Management Log Viewer.
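
For the test trap in step 5, one crude way to confirm that a trap actually reaches the destination machine is to listen on the SNMP trap port (UDP 162) and see whether a datagram arrives. The sketch below is an assumption-laden illustration, not part of the HP tools: it does not decode SNMP at all, the function names are made up for this example, and it cannot run while HP-SIM or another trap receiver already holds port 162.

```python
import socket


def open_trap_socket(port=162, host="0.0.0.0"):
    """Bind a UDP socket on the trap port (162 usually needs admin/root)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock, timeout=120.0):
    """Wait for one datagram; return (source_ip, size) or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, (src_ip, _src_port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return src_ip, len(data)


# Example use on the trap destination machine:
#   sock = open_trap_socket()
#   print(wait_for_datagram(sock))   # then click "Send SNMP Test Trap"
```

If nothing arrives within the timeout, the problem is likely between the agents and the network (trap destination, community, firewall) and not in the HP-SIM event handling.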

Regards,

Tim
John Oliver3
Advisor

Re: Disk status

Lutz:

In the "Storage" box I see "File System Space Used", "Standard IDE Controller on System Board" (both have green checkmarks), and "Floppy Drives" (has no symbol at all). No mention about the array controller at all. I do not have an "Operating System" box, either.

That box is running Linux, CentOS 4.4 IIRC.
John Oliver3
Advisor

Re: Disk status

David:

In SIM, this particular system is represented by a "Major" (triangle with !) under HS, "i"s under MP and SW, and a green check for "ES". In the SMH for this system, the "Major" items are "Embedded NEC98431" and "Remote Communication Options for Embedded NEC98431", probably because there's no patch cable in the iLO port. There are also question marks for the second Ethernet interface (unconfigured, unused) and the virtual interface under "NIC". Everything else with a symbol has a green checkmark.
John Oliver3
Advisor

Re: Disk status

Tim:

It's very possible that something is missing. At the moment, I have:

[root@vs1 ~]# rpm -qa | grep hp
hprsm-7.7.0-99.rhel4
hpsmh-2.1.7-168
hpdiags-7.7.0-142
rhpl-0.148.3-1
hpasm-7.7.0-115.rhel4
hpadu-7.70-12
[root@vs1 ~]# rpm -qa | grep cpq
cpqacuxe-7.70-12

I looked at the SMHes for a couple of systems running Windows, and they have items for the SmartArray that this system does not... I'm guessing I'm missing a package?
Neysters, Lutz
Frequent Advisor

Re: Disk status

quote
In the "Storage" box I see "File System Space Used", "Standard IDE Controller on System Board" (both have green checkmarks), and "Floppy Drives" (has no symbol at all). No mention about the array controller at all. I do not have an "Operating System" box, either.
That box is running Linux, CentOS 4.4 IIRC.
/quote

Sorry, John, you didn't mention your operating system.
What I wrote was all about SMH on Windows, in my case Windows Server 2003. It seems to be quite different on Linux.
If you click on your "Standard IDE Controller", it should tell you all about the disks belonging to it (I hope so). And a click on "File System Space Used" should be enlightening, too.
John Oliver3
Advisor

Re: Disk status

OK, it looks like this got resolved. Yesterday, I had noticed that there was a cpq_cciss-2.6.14-7 package that I hadn't installed, so I installed it. This morning, I got emails about trap alerts saying a drive had failed. I went to the SMH, and there is now an item for the SmartArray. I guess I had thought that the cciss package was just drivers, but it obviously contains an agent as well.
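
The fix above suggests a simple check for other Linux hosts: compare the output of `rpm -qa` against the package names that mattered here. This is a minimal sketch; the list of expected prefixes is an assumption pieced together from this thread (hpasm, hprsm, hpsmh, cpq_cciss), not an official HP list, and the function names are hypothetical.

```python
import subprocess

# Assumed from this thread only; adjust for your own PSP version.
EXPECTED_PREFIXES = ("hpasm", "hprsm", "hpsmh", "cpq_cciss")


def installed_packages():
    """Return the lines printed by `rpm -qa` (requires rpm on the host)."""
    result = subprocess.run(["rpm", "-qa"], capture_output=True,
                            text=True, check=True)
    return result.stdout


def missing_packages(rpm_qa_output, expected=EXPECTED_PREFIXES):
    """Return the expected prefixes with no matching installed package."""
    installed = [l.strip() for l in rpm_qa_output.splitlines() if l.strip()]
    return [p for p in expected
            if not any(name.startswith(p + "-") for name in installed)]


# Package list as posted earlier in the thread, before the fix.
BEFORE_FIX = """hprsm-7.7.0-99.rhel4
hpsmh-2.1.7-168
hpdiags-7.7.0-142
rhpl-0.148.3-1
hpasm-7.7.0-115.rhel4
hpadu-7.70-12
cpqacuxe-7.70-12
"""

print(missing_packages(BEFORE_FIX))  # -> ['cpq_cciss']
```

On a live system, `missing_packages(installed_packages())` would run the same check against the real package database.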