HPE EVA Storage

EVA4400 Controller2 replacement problem

 
Adrian Graham_1
Regular Advisor


Folks,

Got yet another brand new 4400 with a failed controller 2 - it had power issues/random restarts etc. which upset the VMs connected to the enclosure. Event logs didn't tell us anything other than that the controller had restarted. For some reason the controller termination event log hasn't been populated since 30th March.

The management server event logs weren't useful either: we had the 'correct parse file not found' problem, and uploading a new parse file wipes the logs, or it did in this case.

The primary is running 09522000 (CR153Flep) and we knew we had a spare good controller running an older version (CR11EFlep). We followed the procedure here:

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c01366936

We inserted the new controller, which appeared to loop 3 times in the POSTs and then halt. WOCP showed it was getting several 0403047f errors then halting - why wasn't it automatically code-loading from controller 1?

The advisory here:

http://www11.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c01629457

does say to make sure the replacement controller is at the same XCS version, but that wasn't possible in this case and we expected a firmware upgrade on controller 2. Certainly this worked on the training courses.

Inserting C2 with no device ports connected ('this will give a soft error', which it did) resulted in C2 coming up, but it was 'unknown' to WOCP and stayed at the old XCS version.

I guess the next step is the HP 'power everything down and back up at 30 second intervals'...

Cheers!
9 REPLIES
Anthony Martin_1
Frequent Advisor

Re: EVA4400 Controller2 replacement problem

Hi Adrian,
Bad news really. The controller is not "hot swap". To replace a controller, the EVA4400 must be shut down, controller 2 replaced, and then the usual power-up sequence followed.

http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c01774626/c01774626.pdf

Cheers
Anthony
gregersenj
Honored Contributor

Re: EVA4400 Controller2 replacement problem

The EVA doesn't need to be shut down for a controller replacement.

Shut down the failed controller, then replace it.

You should do this kind of work at low-load times only.

I don't know what error 0403047f is.

But since you haven't had any entries in the CEL since March 30th, the master is busy or partly hung.
So replacing the other controller wasn't the best choice.

I believe this is why you don't get the XCS sync.

You should plan a service window, and ask for help.

Right now you need to be very careful.

BR
/jag


Adrian Graham_1
Regular Advisor

Re: EVA4400 Controller2 replacement problem

Hi Anthony,

The very first paragraph of that document says you always have access to one controller. They are warm swap: shut down and halt the failed controller, then swap.

Anyway, yes, the lack of event logging is a concern as is the 'error getting name' instead of a controller number so I've managed to get a shutdown window at 10am. Watch this space!
Adrian Graham_1
Regular Advisor

Re: EVA4400 Controller2 replacement problem

OK, rebooted the world in an HP stylee (30 secs between enclosures etc.) and the controllers still failed to see each other. CVEVA reported 'no controllers found', though the LUNs were presented OK.

Controller termination log is still at 30th March so we have a severe problem here.

I love these things.
gregersenj
Honored Contributor

Re: EVA4400 Controller2 replacement problem

The event log problem could be due to the master controller being busy, or it can happen after some Command View upgrades.

Have you upgraded CV since March 30th?
If so, you might have to rename the directory with the log files.

Do you have a connection to the EVA from CV at all by now?

I still believe you should call HP for help.

BR
/jag


Adrian Graham_1
Regular Advisor

Re: EVA4400 Controller2 replacement problem

We installed it 4 weeks ago - brand new and latest versions of everything. It went live 2 weeks ago and fell over a week later.

Currently we're back running on one controller and everything is happy. We already sent the logs to HP who said 'oh that's a known fault, just turn it off and back on again'. Sometimes I despair.

On the plus side the DB servers connected to it run 50% faster than they did when they were connected to an MSA1500.
gregersenj
Honored Contributor

Re: EVA4400 Controller2 replacement problem

They'll have to try again, harder.
:(
Kick them

BR
/jag


gregersenj
Honored Contributor

Re: EVA4400 Controller2 replacement problem

Adrian

It just so happened that I got involved in a case similar to yours yesterday.

The customer upgraded the EVA 2 or 3 weeks ago, but failed to power cycle the EVA after the upgrade, and he didn't have sufficient free space.

Controller 1 was down. 3 disks were unidentifiable, 2 of them were out of the disk group, and 1 disk reported data inaccessible.
Disk failure protection level was at single.
Available disk space was only 566 GB more than occupied, and it showed nothing in the VRAID creation possibilities.
It hadn't logged since late February.
Disks were 450 GB FC disks.
I/O modules on the A-side didn't report S/N + HW rev., which indicates that the customer didn't power cycle after the XCS upgrade.
He did FW upgrade some of the disks, but not all.
Leveling was hung.
We could not group disks.

I was on-site. We called HP and got the case elevated to level 2 support, whom we allowed remote access.

We set the protection level to none.
The level 2 supporter did some magic, which got the 2 failing disks in the disk group to revert (I was asked to reseat them + his magic).
After reverting was finished, the customer deleted 500 GB.
Now the leveling kicked in, since it now had free space.
We tried to restart the failed controller, but got the same error code as you: 0403047f.
Then we reseated the failed controller.
Now it came back.

Finally we power cycled the EVA including disk shelves, to restart the I/O modules.
This also got the last two disks back.

Then we FW upgraded the remaining disks.

Do get help.

BR
/jag


Adrian Graham_1
Regular Advisor

Re: EVA4400 Controller2 replacement problem

Morning!

Every one of these that we've installed has had a problem. Currently we have our own maintenance spare on the bench and it's giving us all kinds of weird errors: not initialising correctly, errors in CVEVA, not letting me clean the MLD etc. Waiting for HP to call back.

Glad you got yours sorted though!