Servers - General

R3000XR or R3000h sheds power to load segment 1 immediately

 
S M Belshaw
Advisor


However, during testing, when I switch the input power at the main fuse box, everything works fine: servers shut down gracefully, and load segments shed and restore power according to the configuration. The problem only occurs on an unexpected power interruption! What tells the UPS when to shed a segment, and where is that information stored? The power-on delay is obviously held in the UPS, since no servers are running at that point, but is the power-off delay for the segment configured in the software and then downloaded to and held in the UPS? Some background is in the attached TXT file...


My final thought is this: is it likely that rapid fluctuations in external utility power provoke this reaction, or could it be a faulty power supply on the controlled equipment on that segment? I would have thought the UPS would isolate any such fault from the wider power distribution, so why do the lights flash? Or is it just a problem with LANsafe, because the UPS is told to shed a segment by the software in real time and the setting isn't stored in the UPS?
2 REPLIES
Brian Vo
Trusted Contributor

Re: R3000XR or R3000h sheds power to load segment 1 immediately

Not familiar with LanSafe but check these 2 delays:
-Shutdown delay: the time the UPS runs on battery after a power loss occurs, continuing to power the servers without yet sending the shutdown command.

-OS shutdown delay: the time the UPS waits for the OS to shut down after it has sent the shutdown command to the servers.
Once set in LanSafe, these delays would be downloaded to the UPS for its timers.

So the first one, the "shutdown delay", would cover the case where the power just flickers.
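
To illustrate the idea, here is a minimal Python sketch. The function name and the 120-second figure are made up for illustration and are not LanSafe settings; it only assumes that the shutdown command goes out once an outage has lasted as long as the shutdown delay.

```python
# Hypothetical illustration of the "shutdown delay" timer; not a LanSafe API.
def shutdown_command_sent(outage_seconds, shutdown_delay_seconds):
    """True if the outage lasts at least as long as the shutdown delay,
    i.e. the servers would be told to shut down."""
    return outage_seconds >= shutdown_delay_seconds

# Assumed example values: a 120-second shutdown delay.
print(shutdown_command_sent(5, 120))    # False: a flicker is ridden out on battery
print(shutdown_command_sent(300, 120))  # True: a long cut triggers the shutdown
```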

Brian
S M Belshaw
Advisor

Re: R3000XR or R3000h sheds power to load segment 1 immediately

Hi Brian, thanks for getting back.

The segments are configured with various delays, with segment 1 having the shortest. LanSafe is configured to shut down server A after 7 minutes and allow it 3 minutes, and to shut down server B after 8 minutes and allow it 2, giving a total power-off delay of 10 minutes for segment 1.
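
As a quick check of that arithmetic, here is a minimal Python sketch. The names are hypothetical and not part of LanSafe; it assumes the segment should lose power only once every server on it has had its full shutdown allowance.

```python
# Hypothetical model of the figures above; not a LanSafe API.
# Each server: (minutes on battery before shutdown starts, minutes allowed for OS shutdown)
segment_1 = {"server A": (7, 3), "server B": (8, 2)}

def segment_power_off_delay(servers):
    """Earliest minute at which every server has had its full shutdown allowance."""
    return max(start + allow for start, allow in servers.values())

print(segment_power_off_delay(segment_1))  # 10
```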

However, in parallel, I have contacted HP support (the new UPS is on warranty) and they state that, while the power on delay after utility power is restored is downloaded and held in the UPS, the total segment power off delay is not. They tell me that the controlling software tells the UPS when to shed the segment at the time.

Having now loaded HP Power Manager (HPPM) for this UPS, I find that circumstances seem to confirm that. While the power-on delay was picked up from the UPS and loaded into HPPM for me, no aspects of the shutdown were. Odd, I know, but that's what they tell me.

Having had a couple of further incidents, I think I have a clue to what is happening, at least some of the time, thanks to the much better data recording in HPPM. I'll log it here in case it is useful to others.

It seems that one long power cut causes a complete shutdown according to the configuration and near-complete battery depletion, since we keep one segment on as long as possible to support the phone exchange and VoIP networking gear. When power comes back, the UPS follows the staggered power-on delays: some battery capacity is restored first, then the network gear comes up, followed by the domain controllers, and finally the other servers (those on the segment being chopped). But after the domain controllers (and the UPS controller) have booted, and before these other servers have, we get another power cut of less than 2 minutes.

LanSafe can't see the servers on this segment because they are not yet up and running, so it assumes the configuration has changed and there are no controlled servers on that segment, and chops it immediately.
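
The behaviour I suspect can be sketched in a few lines of Python. This is only my guess at the logic, not actual LanSafe code, and the names are hypothetical: the segment's power-off delay is worked out from whichever servers the controller can currently see, so an empty list means no delay at all.

```python
# Hypothetical model of the suspected behaviour; not actual LanSafe logic.
def minutes_until_shed(visible_servers):
    """visible_servers maps server name -> (shutdown-after, allow) in minutes
    for the servers the controller can currently see on the segment."""
    if not visible_servers:
        return 0  # nothing to shut down gracefully, so the segment is cut at once
    return max(after + allow for after, allow in visible_servers.values())

normal = {"server A": (7, 3), "server B": (8, 2)}
still_booting = {}  # second power cut arrives before the agents have reported in

print(minutes_until_shed(normal))         # 10
print(minutes_until_shed(still_booting))  # 0
```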

This doesn't explain the times segment 1 shut down immediately during the day, on a power cut NOT following a previous graceful shutdown. The problem there was agent servers losing contact with the controller for some other reason and shutting down immediately. I have no idea why the contact was lost, or why it suddenly became an issue this June, but I have changed software now anyway.