HPE 3PAR StoreServ Storage

HPE StoreServ Controller Node Issue

 
SOLVED
HP1133
Occasional Advisor

HPE StoreServ Controller Node Issue

Hi, I'm working with an HPE 3PAR StoreServ 8400 system. One of the two controller nodes in Cage 0 (Node 0) has failed (Node Offline due to failure 0xd).

Looking at the cage details, both power supplies of Cage 0 show a failed status. However, the other controller node (Node 1), which uses the same two PSUs, is working fine. The fans of both PSUs are running at high speed, although there is no environmental issue.

There are 3 expansion cages connected to this controller cage. On checking their status, the "AC State" shows failed for all of them, even though they are all functioning normally (operating via the Node 1 loop) and no PSU alarms are visible on the back panel.

Can someone please guide me on this?

12 REPLIES

HP1133
Occasional Advisor

Re: Query: HPE StoreServ Controller Node Issue

Below is the output of checkhealth -detail and showenv from Node 1, which is in the same cage as the faulty Node 0. Canister 1 shows an OK status, while canister 0 shows an Unrecoverable state.

Cabling: Missing Node for cabling check

Node  node:1  Environmental factor SBB Canister 0 CPU2 is Unrecoverable
Node  node:1  Environmental factor SBB Canister 0 memory is Unrecoverable
Node  node:1  Environmental factor SBB Canister 0 near CPU is Unrecoverable

SBB Canister 0 near CPU    ---    ---    ---    Unrecoverable
SBB Canister 0 CPU2        ---    ---    ---    Unrecoverable
SBB Canister 0 memory      ---    ---    ---    Unrecoverable
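
For longer outputs, the per-sensor lines can be grouped by canister with a short script. The following is a minimal Python sketch, assuming the line layout matches the excerpt above ("Environmental factor SBB Canister <n> <sensor> is <state>"). The function name and the sample text are illustrative and not part of any HPE tool.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the excerpt in this post:
#   Node  node:1  Environmental factor SBB Canister 0 CPU2 is Unrecoverable
LINE_RE = re.compile(
    r"Environmental factor SBB Canister (?P<canister>\d+) "
    r"(?P<sensor>.+?) is (?P<state>\w+)\s*$"
)

def group_faults_by_canister(text):
    """Return {canister number: [(sensor, state), ...]} for sensors not reported as OK."""
    faults = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match and match["state"].lower() != "ok":
            faults[int(match["canister"])].append((match["sensor"], match["state"]))
    return dict(faults)

# Sample input copied from the checkhealth lines quoted in this post.
SAMPLE = """\
Node\t\tnode:1\t\t\tEnvironmental factor SBB Canister 0 CPU2 is Unrecoverable
Node\t\tnode:1\t\t\tEnvironmental factor SBB Canister 0 memory is Unrecoverable
Node\t\tnode:1\t\t\tEnvironmental factor SBB Canister 0 near CPU is Unrecoverable
"""

if __name__ == "__main__":
    for canister, sensors in group_faults_by_canister(SAMPLE).items():
        print(f"Canister {canister}: {len(sensors)} sensor(s) not OK")
        for sensor, state in sensors:
            print(f"  {sensor}: {state}")
```

With the sample lines, every flagged sensor falls under canister 0, which matches the observation that canister 1 is OK while canister 0 is Unrecoverable.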

 

Where does the issue lie?

Mahesh202
HPE Pro

Re: HPE StoreServ Controller Node Issue

Hi HP1133

It appears that you have a failure in Cage0 of your HPE 3PAR StoreServ 8400 controller node system. Here are some steps you can take to troubleshoot the issue:

  1. Verify Power Supply Units (PSUs): Since the PSUs in Cage0 are showing a failed status, it's essential to confirm if they are functioning correctly. Check the power connections and ensure they are securely plugged in. If possible, try swapping the PSUs between Cage0 and another cage to determine if the issue is with the PSUs or the cage itself.
  2. Check Power Distribution Units (PDUs): Verify that the PDUs providing power to Cage0 are functioning correctly. Ensure they are powered on, and the power cables are securely connected. You may also want to check if there are any tripped circuit breakers or power issues in the facility that could be affecting the power supply to Cage0.
  3. Environmental Monitoring: Although you mentioned there is no environmental issue, it's worth double-checking the temperature and other environmental conditions in the data center where the system is located. Ensure there are no excessive heat or airflow problems that could be affecting the PSUs or other components.
  4. Investigate Node 0 Failure: The failed status of Node 0 could be related to the PSU issue in Cage0. Once you have addressed the PSU problem, try restarting Node 0 to see if it recovers. If the issue persists, you may need to investigate further to determine the cause of the failure, such as hardware malfunctions or software errors.
  5. Address Expansion Cage AC State: If the AC State is showing a failed status for the expansion cages, even though they are operating normally via Node 1, it could indicate a communication or configuration issue. Ensure the interconnect cables between the cages and the controller nodes are securely connected. Additionally, check the network connectivity and ensure the proper zoning and configuration settings are in place.

I would also recommend you reach out to HPE Tech support for assistance. They will have specific knowledge and experience with the HPE 3PAR StoreServ 8400 system and can provide further guidance tailored to your specific situation.

Hope this helps!

Regards
Mahesh.



I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
HP1133
Occasional Advisor

Re: HPE StoreServ Controller Node Issue

Dear Mahesh

Thank you for your detailed response. We have checked the cabling as you suggested, and everything is OK there.

Please note that the PSU LEDs are normal when checked physically, but the PSUs show a failed status in the 3PAR GUI. Also note that the same two PSUs are powering Node 1, which is working fine.

The back panel LEDs of the failed controller node show the following: Status LED blinking amber, UID LED blinking blue.

Can you please suggest any further debugging to confirm where the issue lies?

Mahesh202
HPE Pro

Re: HPE StoreServ Controller Node Issue

Hi HP1133

If the PSU LEDs are normal when checked physically but show a failed status in the 3PAR GUI, it indicates a possible communication or reporting issue between the PSU and the controller node. Here are some further steps you can take to narrow down the problem:

  1. Reset Power Supply Units (PSUs): Try resetting the PSU by power cycling them. This can be done by powering off the system, unplugging the PSUs from the power source, waiting for a few minutes, and then plugging them back in. Power on the system again and monitor if the PSU status in the GUI updates correctly.
  2. Verify Firmware and Software: Ensure that both the firmware on the controller nodes and the software running on the 3PAR system are up to date. In some cases, firmware or software bugs can cause reporting issues. Check the HPE support website for any available updates or patches that address similar issues.
  3. Check Controller Node Logs: Access the logs of the controller node that is reporting the issue (Node 0). Look for any error messages or events related to the PSUs or power subsystem. This can provide further insights into the problem and help identify the root cause.
  4. Perform a Controller Node Failover: If the PSUs are working fine for Node 1 but showing a failed status for Node 0, you can try performing a failover from Node 0 to Node 1. This will switch the active controller node from Node 0 to Node 1. If Node 1 takes over and functions properly with the same PSUs, it may indicate a problem with Node 0 itself, such as a hardware issue or configuration problem.

As suggested earlier I would request you to contact HPE Tech Support and get a detailed analysis of this issue.

Hope this helps!

Regards
Mahesh.



HP1133
Occasional Advisor

Re: HPE StoreServ Controller Node Issue

Hi Mahesh

Thank you for the suggestions. As a reminder, we have the following controller cages:

  • Controller Cage 4 has 2 nodes (Node 3, Node 2) and 2 PSUs (PCM0 and PCM1, both providing power to Node 3 and Node 2)
  • Controller Cage 0 has 2 nodes (Node 1 and the faulty Node 0) and 2 PSUs (PCM0 and PCM1, both providing power to Node 1 and Node 0)

We powered off the system and reseated the PCM power units of the affected node (the Cage 0 PCMs).

  • After starting the system, the status LED of Cage 0 (on the cage front panel), which contains the faulty Node 0, changed to solid green (OK). Previously it was solid amber.
  • The controller node LED status of Node 0 is the same as before (Status LED: blinking amber, UID LED: blinking blue).
  • In the HPE 3PAR GUI, an alert "Node 0, Interface Card not present (0x405)" was generated.
  • Node 0 is offline, so its status cannot be seen in the GUI. However, we checked the PSU status of Node 1, which uses the same PCMs as Node 0, and it shows an "AC State" of failed, the same as before the system restart. Node 1 is operating fine and providing the loop to the expansion cages that were connected to Node 0 and Node 1.
  • The PCM fans are now running at normal speed, whereas previously both fans of the Cage 0 PCMs were running at high speed.

Please advise on what to do next.

Mahesh202
HPE Pro

Re: HPE StoreServ Controller Node Issue

Hi HP1133

It seems that powering off the system and reseating the PCM power units of the affected node changed some of the LED statuses and the system behavior. You can try a few more steps to investigate and troubleshoot the issue:

  1. Verify Node 0 Connectivity: Confirm that all connections, including network and interconnect cables, are securely connected to Node 0. Check for any loose or disconnected cables that could be causing the "Interface Card not present" alert. Ensure that the interface card for Node 0 is properly seated and functioning correctly.
  2. Check Node 0 Health: Even though Node 0 is offline, you can still gather information about its health by reviewing system logs or using the command-line interface (CLI) of your HPE 3PAR system. Look for any specific error codes or log messages related to Node 0. This information might provide additional insights into the root cause of the issue.
  3. Verify Power Supply Status: Monitor the power supply status and AC state for both PCM0 and PCM1 that provide power to Node 0 and Node 1. Check if there are any alerts or error messages related to the power supplies. It's possible that the power supply status might have changed after the system restart, so ensure that the PSUs are functioning correctly.
  4. Review Controller Cage Configuration: Double-check the configuration settings for the controller cage and ensure that the correct settings are applied for Node 0 and Node 1. Verify that the interconnectivity between the controller nodes, expansion cages, and power supplies is properly configured.

It's important to exercise caution when working with hardware components, and if you are unsure about any procedures or need expert guidance, it is best to seek help from HPE Technical support for further assistance.

Hope this helps!

Regards
Mahesh.



HP1133
Occasional Advisor

Re: HPE StoreServ Controller Node Issue

@Mahesh202 

Hi Mahesh

Thank you for your continued support and timely response.

As you suggested, we tried to view the Node 0 status from the CLI using showcage, shownode, shownodeenv, showport, etc. No status information is returned for Node 0. As expected, Node 0 information is available only in the inventory commands.

No new or old alerts related to the power supplies were generated. When checking the power status of Cage 0 and the connected expansion cages from the GUI, it reports "AC State" failed. When checking the cage power supplies via the CLI, however, the AC State is OK for all of them. Moreover, executing show power supply displays PSU details only for controller Nodes 2 and 3; information for Nodes 0 and 1 is not present.

Please note that all PSU LEDs are OK.

Is there any other status we can check to help isolate the issue to the node or the PSU? Please provide guidance.

HP1133
Occasional Advisor

Re: HPE StoreServ Controller Node Issue

Hi Team

Please advise on the above.

Satish04
HPE Pro
Solution

Re: HPE StoreServ Controller Node Issue

Hi HP1133,

Collect all relevant information about the issue, including the output of the CLI commands you have run, screenshots of the GUI showing the discrepancies, any relevant logs or events, and details about your system configuration.

Then reach out to HPE Tech Support with all the information or you can create a support case on HPE Support Portal, if you have an active support contract.
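
Gathering the CLI output can be scripted. The following is a minimal Python sketch, assuming SSH access to the array's CLI. The function names, user name, host name, and output directory are hypothetical placeholders, and the command list contains only commands already mentioned in this thread.

```python
import subprocess
from pathlib import Path

# Commands already mentioned in this thread; extend the list as support requests.
COMMANDS = ["checkhealth -detail", "showcage", "shownode", "shownodeenv", "showport"]

def ssh_argv(user, host, command):
    """Build the argument list for running one CLI command over SSH."""
    return ["ssh", f"{user}@{host}", command]

def run_over_ssh(argv):
    """Run one command and return its stdout followed by its stderr, as text."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=120)
    return done.stdout + done.stderr

def collect(user, host, out_dir, runner=run_over_ssh):
    """Save the output of each command to its own text file in out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for command in COMMANDS:
        # One file per command, e.g. "checkhealth_-detail.txt".
        target = out_path / (command.replace(" ", "_") + ".txt")
        target.write_text(runner(ssh_argv(user, host, command)))
        saved.append(target)
    return saved
```

As a usage example with placeholder values, `collect("3paradm", "array.example.com", "support_case_output")` would write one text file per command into the `support_case_output` directory, ready to attach to a support case. The `runner` parameter is there so the SSH step can be swapped out, for example when testing the script without access to the array.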


Hope this helps.

Regards,
Satish



I work at HPE
emaster
Visitor

Re: HPE StoreServ Controller Node Issue

@HP1133 Hello,

Did you find a solution?

HP1133
Occasional Advisor

Re: HPE StoreServ Controller Node Issue

The node was faulty and was replaced.