- MSA 2040 Controller-B killed partner controller. (...
04-30-2020 03:59 AM
Hello,
We have an MSA 2040, and one of the controllers went down with this error:
killed partner controller. (reason pcie link recovery failed)
The controller does not accept any restart action.
The network on that controller is working fine, but we can only SSH to it; we can't access it from a browser.
What can we do to fix this?
Thanks in advance
04-30-2020 07:26 AM - last edited on 06-29-2022 04:37 AM by Sunitha_Mod
Hello,
The PCIE link is the inter-controller communication link that carries the heartbeat between the two controllers. Without a log review it is hard to tell whether Controller A lost the heartbeat from Controller B and issued a terminate to Controller B, or whether Controller B lost heartbeat communication with Controller A and took itself down.
My suggestion is to gather a set of logs first and hold onto them in case you need them later. On the controller that is down, try to issue a shutdown command through SSH. The command will likely fail, but it is good to try to shut it down correctly first. Then pull the controller about an inch out of the backplane and wait 5-10 minutes. This allows failover to occur and gives the surviving controller time to rescan the backend and perform other necessary steps. Check that all the LEDs on the removed controller have stopped flashing, then reinsert it. It should boot up and resume normal operation.
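The conditions in the reseat procedure above can be written down as a small checklist. This is an illustrative sketch only: it does not talk to the array, and the names `ReseatState` and `safe_to_reinsert` are hypothetical. It encodes just the three points from the advice: logs were gathered first, the controller has been unseated for at least 5 minutes, and its LEDs have stopped flashing.

```python
from dataclasses import dataclass

# Lower bound of the 5-10 minute wait suggested above; not an official HPE value.
MIN_WAIT_MINUTES = 5


@dataclass
class ReseatState:
    """Hypothetical record of what the operator has done and observed."""
    logs_saved: bool          # a set of logs was gathered before touching hardware
    minutes_unseated: float   # time since the controller was pulled from the backplane
    leds_flashing: bool       # any LED on the removed controller is still flashing


def safe_to_reinsert(state: ReseatState) -> bool:
    """Return True only when all three conditions from the procedure are met."""
    return (
        state.logs_saved
        and state.minutes_unseated >= MIN_WAIT_MINUTES
        and not state.leds_flashing
    )
```

For example, `safe_to_reinsert(ReseatState(True, 3, False))` is false because the controller has not been out long enough.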
There have been several fixes for PCIE link errors in newer firmware. Be sure you are running the latest firmware versions. You can check the firmware on your system here: www.hpe.com/storage/msafirmware
Also, if you wish to check other components on your array you can perform an MSA Health Check here:
www.hpe.com/storage/MSAHealthCheck
Download your MSA Log File from your MSA array
Upload the MSA Log File into the MSA Health Check website
Review Results by clicking through the tabs and saving the PDF report
Links to array, enclosure, and drive firmware will be provided
If restarting the controller does not resolve your issue, then please open a support case with HPE Support using the following web link, if the unit is under warranty: https://support.hpe.com/hpesc/public/home
If the unit is out of warranty, you can open a chat support case with HPE using the following web link to check for options: https://pg-receiver-pro.glb.itcs.hpe.com/WCLWeb/WCLEntry.aspx
Cheers,
Shawn
I work for Hewlett Packard Enterprise. The comments in this post are my own and do not represent an official reply from HPE. No warranty or guarantees of any kind are expressed in my reply.
04-30-2020 11:51 PM
Re: MSA 2040 Controller-B killed partner controller. (reason pcie link recovery failed)
Thank you for the valuable instructions.
But we still have one volume owned by Controller B.
Is it safe to change the owner first, or will that damage the data on it? (This volume is an Oracle RAC datastore.)
Or can I shut down and remove the controller even if a volume is still owned by it?
BR
05-01-2020 12:28 PM
Hello,
From the array, shutting down a controller and failing over the volume owned by the controller to the other controller is not an issue. That will happen smoothly.
More concerning is how the hosts are mapped to the system. As long as every host is mapped so that it has access to both controllers, you will not have a disruption. My suggestion is to verify the mapping, so that the database on the volume owned by Controller B is also correctly mapped through the Controller A paths. I would also confirm that multipathing is configured correctly on the host(s) associated with that database to ensure a smooth failover.
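The mapping check described above amounts to confirming that every host is mapped through both controllers. A minimal sketch, assuming you have already collected, by whatever means, which controllers each host is mapped to; the `host_paths` dictionary, the function name, and the host names are made up for illustration, and nothing here queries the array or the hosts.

```python
REQUIRED_CONTROLLERS = {"A", "B"}


def hosts_missing_paths(host_paths: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each host to the controllers it is not mapped to (fully mapped hosts are omitted)."""
    missing = {}
    for host, controllers in host_paths.items():
        gap = REQUIRED_CONTROLLERS - controllers
        if gap:
            missing[host] = gap
    return missing


# Hypothetical inventory: rac-node2 is mapped only through Controller B, so it
# would lose access once the volume fails over to Controller A.
example = {"rac-node1": {"A", "B"}, "rac-node2": {"B"}}
print(hosts_missing_paths(example))  # {'rac-node2': {'A'}}
```

An empty result means every listed host has a mapping through both controllers; it says nothing about whether the host-side multipath software is healthy, which still needs to be checked on the hosts themselves.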
Cheers,
Shawn
05-01-2020 01:42 PM
Thank you again for your help.
I tried to shut down the controller, but this error is shown:
Failed to shut down Storage Controller Controller B.
The reboot operation cannot be completed because a recovery is in progress. (2020-05-01 23:38:55)
I can't access the MSA physically until Sunday.
If it won't accept a shutdown (I can only restart it), can I remove the controller without shutting it down?
Or will this cause damage?
BR,
05-01-2020 01:57 PM
Hello,
I would always suggest you try shutting down the controller before removing it. However, if the command fails as you described, I would remove the controller and follow the rest of the instructions listed previously. Be sure to allow Controller B enough time to fully discharge before you reinsert it. This will also give Controller A and the hosts enough time to fail over properly.
Cheers,
Shawn
05-03-2020 12:40 AM
You are amazing.
Thank you, you saved my day.
I took the controller out for 15 minutes, then put it back.
Everything is working fine now.
Thanks again.