
Re: Virtual connect lost OA communication & domain check point error

 
pakm11
Advisor

Virtual connect lost OA communication & domain check point error

Hi,
I'd like to know whether any of you have experienced similar problems and what you had to do to get them fixed.
Enclosure is a c7000: OAs @ 2.60 (dual OAs), VC @ 2.12 (4xVC-E + 2xVC-F), iLOs @ 1.78, server ROMs 5/10/2009 for BL680s and 5/12/2009 for BL460s.

Error(s): When logging in to VCM, a pop-up window comes up saying communication between VCM and OA is lost and/or credentials are missing.
The "domain check point pending/not valid" warning icon is also on.
OA does not have any error message in logs or from UI.
Servers are operational.

HP is asking me to go through the ritual of re-seating the OAs and, if that doesn't fix it, resetting all the VC modules, which requires an outage.

I want to hear your experiences with VC.

When it works it is the next best thing, but in my case it is failing more than it should.
We are constantly being asked to upgrade the f/w to the latest, even though HP cannot point to a specific fix in the update that would solve our problem.
As you can see, I am not running the latest f/w revs, but I am not too far behind either. The enclosure was updated within the past 4 months.
marcus1234
Honored Contributor

Re: Virtual connect lost OA communication & domain check point error

HP have advised the most logical steps. I would plan a downtime immediately and do that; if not, there may well be unforeseen issues.

If issues arise, please post here with details.

Extract the Showall report:
Go to the OA GUI>Enclosure settings > Configuration Scripts > fill the URL of the file in the URL box:
ftp://iss:tools@hprc.external.hp.com/showall.txt
and then hit the APPLY button. The commands in the file showall.txt will be downloaded and executed in the GUI, and the result will be visible in a new pop-up window. You can easily copy this into a text editor such as Notepad.

Attach it here and I will check it.

This is how we thank each other in the forum

http://forums11.itrc.hp.com/service/forums/helptips.do?#33

Enjoy:)





Terri Harris
Honored Contributor

Re: Virtual connect lost OA communication & domain check point error

You can physically reseat both OA modules, and the one OA tray that both modules install into, without any downtime. If there are no detectable OA modules in a given c7000, the fans will ramp up to maximum. This is normal, as one of the OA's responsibilities is throttling the fans.
Otherwise, can you provide the exact error message?
pakm11
Advisor

Re: Virtual connect lost OA communication & domain check point error

Warning pop-up.
pakm11
Advisor

Re: Virtual connect lost OA communication & domain check point error

Domain check point error.
pakm11
Advisor

Re: Virtual connect lost OA communication & domain check point error



This was a fully operational enclosure and no changes to configuration were made in the recent past.


* First of all, servers inside this enclosure are operational (for now) with no impact on Ethernet or SAN access.
* Checked EBIPA. VC modules have proper IP addresses. Can log in to OA and VC with no issues.
* OA does not have any error indicator for the enclosure.
* Logging in to VC, I am greeted with the attached vc_error_message pop-up window.
* Last successful domain check point was 1/15/2010 @ 10:45 AM. Check point error message is also attached.
* Domain status is in "critical" state.
* Hardware overview section has interconnect bays marked with "?". No other warning or minor level error indicators are present via VCM console.
* Enclosure overall status is failed (from VCM, OA is OK); as seen in "enclosure_status" screen shot. From there I try to test OA communication by entering OA credentials and that too fails.

From VC syslog:
********
2010-01-12T11:20:54-05:00 VCETW283001N2 vcmd: [SVR:enc0:dev3:5010:Info] Server state OK : Component fully operational
2010-01-12T11:20:54-05:00 VCETW283001N2 vcmd: [SVR:Bay 3:5002:Info] Server inserted
2010-01-15T07:51:01-05:00 VCETW283001N2 vcmd: [ENC:iwojima:2013:Critical] Enclosure state FAILED : Component is not operational due to an error
2010-01-15T07:51:01-05:00 VCETW283001N2 vcmd: [VCD:iwojima_vc_domain:1022:Critical] Domain state FAILED : Component is not operational due to an error
********

So we first had the above errors on 1/15/2010 @ 7:51 AM. The VC domain became unresponsive and we got the SIM pages.
Then at about 10:45 AM the same day came the last successful domain check point.
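
For anyone sifting through a longer VC syslog for the same pattern, here is a minimal Python sketch that pulls out the Critical entries. The line layout is inferred only from the four sample lines above, and the names `parse_line` and `events_with_severity` are made up for illustration; this is not an HP tool, and other vcmd message shapes may exist that it will simply skip.

```python
import re
from datetime import datetime

# Layout inferred only from the four sample lines in this thread:
#   <ISO timestamp> <host> vcmd: [<TYPE>:<name>:<code>:<Severity>] <message>
# Lines that do not match this shape are skipped, not treated as errors.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+vcmd:\s+"
    r"\[(?P<type>[^:\]]+):(?P<name>[^\]]*):(?P<code>\d+):(?P<severity>\w+)\]\s*"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one syslog line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    try:
        # Timestamps in the sample carry a UTC offset, e.g. -05:00.
        entry["ts"] = datetime.fromisoformat(entry["ts"])
    except ValueError:
        return None
    entry["code"] = int(entry["code"])
    return entry


def events_with_severity(lines, severity="Critical"):
    """Keep only parsed entries whose severity matches (case-insensitive)."""
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed if e and e["severity"].lower() == severity.lower()]


# The excerpt posted above, used as sample input.
SAMPLE = """\
2010-01-12T11:20:54-05:00 VCETW283001N2 vcmd: [SVR:enc0:dev3:5010:Info] Server state OK : Component fully operational
2010-01-12T11:20:54-05:00 VCETW283001N2 vcmd: [SVR:Bay 3:5002:Info] Server inserted
2010-01-15T07:51:01-05:00 VCETW283001N2 vcmd: [ENC:iwojima:2013:Critical] Enclosure state FAILED : Component is not operational due to an error
2010-01-15T07:51:01-05:00 VCETW283001N2 vcmd: [VCD:iwojima_vc_domain:1022:Critical] Domain state FAILED : Component is not operational due to an error
""".splitlines()

if __name__ == "__main__":
    for e in events_with_severity(SAMPLE):
        print(e["ts"].isoformat(), e["type"], e["name"], e["code"], e["message"])
```

Matching on the bracketed severity field rather than on the word "FAILED" keeps the filter usable for other levels by passing a different `severity` value.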

As far as I can see, this is going to require us to take downtime and re-seat everything in the enclosure.

What do you think about the reliability/availability/serviceability of your VC enclosure setups?
Do you all have these kinds of sporadic failures with your VC-enabled enclosures?
Do you think HP is asking for too many re-seats/resets from their customers?

And I am not buying the "upgrade your f/w and you'll be good" line anymore. Been there, done that.
HP cannot point me to a particular fix in these upgrades that would resolve my issue.
Last time I worked with HP on another enclosure issue, the support person told me they could not guarantee that the "then" current f/w upgrade would be a stable platform to run on.
There may be undiscovered bugs, etc.

OK, let me hear what you have to say...


Terri Harris
Honored Contributor

Re: Virtual connect lost OA communication & domain check point error

This is an issue that HP has seen in the past. Physically reseating the OA modules & tray and/or the Virtual Connect modules one time typically resolves it.
If it does not, HP needs to collect the logs and review this enclosure.
pakm11
Advisor

Re: Virtual connect lost OA communication & domain check point error

Thanks for the reply.
I have an outstanding case with HP.
I have provided them all the info they'll need:
config scripts, showall output, etc.
First-line support asked me to reseat.
And while at it, I am told, I should upgrade FW.
He indicated I may not get this escalated until I upgrade FW.
I am holding off on that for now.
Meanwhile I circled back with post-sale tech support from Prosysis/HP.
Waiting to hear back from them as well.
What is the norm for FW upgrades?
Should we schedule one every quarter, or what?
It seems like VC modules will require a lot of "tending to".
Thanks
marcus1234
Honored Contributor

Re: Virtual connect lost OA communication & domain check point error

Hmm, on the occasions when I have seen this type of issue,
almost all of the time the firmware updates had "jumped" a few versions at once.

I suspect that reseating the interconnects and then the OA causes the OA to do a "full" rescan of all inventory, so that it is once again aware of the components it oversees.
:)
Terri Harris
Honored Contributor

Re: Virtual connect lost OA communication & domain check point error

I do agree with Mark. If the OA/VC firmware is allowed to get too many revisions behind and major upgrades are required, this can also create the situation you are currently experiencing. But otherwise it is a one-time physical reseating of the components unless the enclosure has been moved.