StoreVirtual Storage


 
efirkey
Advisor

Losing connection to our Lefthand

We keep losing the connection to our LeftHand and are getting the following alerts.

 

So here is what is going on. Our SAN 2 holds our company data as well as several of our VM servers. Since yesterday we have had to reboot this SAN 4 times, and each time it lasts about 5 hours before failing again. I don't have any amber or red lights on the SAN, power connections are good, and data connections seem to be OK. This SAN connects to a data switch, and a few servers each have 2 network connections to the same data switch so they can reach the SAN. We did have a space issue, which was the first suspicion yesterday morning, and we were able to clear up around 700 GB, so this does not seem to be the issue, but I could be wrong.

 

So I will try changing out the data cables to see if it happens again. I don't know where the logs that might help identify the problem would be.

 

I get the following 6 alerts:

 

Management Group:  'xenfarm349': storage server status = no quorum

Value does not equal up

 

Management Group: xenfarm349; storage server: 'n349d002' Status = manager down

Value does not equal up

 

Management Group: xenfarm349; storage server: 'n349d002' Maximum Latency = 1285.69. Value exceeds 60

 

Volume status = offline. In management group xenfarm349, the following volumes are offline (different volumes listed here)

 

Volume status = normal. value changed.

 

Management group: xenfarm349 Storage Server  status = storage overloaded.  Value does not equal up.

2 REPLIES
Gediminas Vilutis
Frequent Advisor

Re: Losing connection to our Lefthand

How many storage systems and manager processes do you have in your SAN? 
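The manager count matters because of the "no quorum" alert. As a rough sketch, assuming the usual majority rule (more than half of the configured managers must be running for the management group to stay online), the arithmetic looks like this. The function names are made up for illustration and are not part of any HP tool.

```python
def quorum_size(total_managers: int) -> int:
    """Smallest number of running managers that forms a majority."""
    if total_managers < 1:
        raise ValueError("a management group needs at least one manager")
    return total_managers // 2 + 1


def has_quorum(total_managers: int, running_managers: int) -> bool:
    """True when a majority of the configured managers is still running."""
    return running_managers >= quorum_size(total_managers)


# Hypothetical two-manager group: losing either manager loses quorum.
print(has_quorum(total_managers=2, running_managers=1))  # False
# Hypothetical three-manager group: one manager can fail and quorum holds.
print(has_quorum(total_managers=3, running_managers=2))  # True
```

Under that assumption, a group with an even number of managers tolerates no more failures than the next smaller odd count, which is why the number of managers is worth checking when a single "manager down" event takes volumes offline.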

 

The message 'Server status = storage overloaded' is usually an early symptom of a failing disk. If you have HP support, send the management group support bundle to support; they will tell you which disk to replace.

 

Gediminas

ryan_1212
Advisor

Re: Losing connection to our Lefthand

Storage overload means that one (or more) nodes in the cluster is performing slower than the others for some reason. When this happens, the offending node is kicked out of the cluster temporarily (I think for 15 or 20 minutes) so it does not slow down the entire cluster, and then the cluster tries to resync the data again.

 

This turns into a vicious cycle. In my case, it was a failing disk controller on a LeftHand/Dell PE2950 kit, then a faulty replacement controller, and finally a good controller, which resolved the issues.
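To narrow down which node is the slow one, the "slower than the others" idea can be sketched as a comparison of per-node latency readings. This is only an illustration, not how the SAN itself decides: the threshold of 60 comes from the "Value exceeds 60" alert (unit assumed), the ratio of 5 is an arbitrary choice, and the readings for n349d001 and n349d003 are hypothetical. Only the n349d002 value is taken from the alert in the original post.

```python
from statistics import median

# Taken from the "Value exceeds 60" alert text; the unit is an assumption.
ALERT_THRESHOLD = 60.0


def slow_nodes(latency_by_node, threshold=ALERT_THRESHOLD, ratio=5.0):
    """Return nodes whose latency is over the threshold and at least
    `ratio` times the median latency of the other nodes."""
    flagged = []
    for node, latency in latency_by_node.items():
        others = [v for name, v in latency_by_node.items() if name != node]
        if not others:
            continue  # nothing to compare a single node against
        baseline = median(others)
        if latency > threshold and latency >= ratio * baseline:
            flagged.append(node)
    return sorted(flagged)


# n349d002 is the reading from the alert; the other two are hypothetical.
samples = {"n349d001": 12.4, "n349d002": 1285.69, "n349d003": 15.1}
print(slow_nodes(samples))  # ['n349d002']
```

A node that stands out like this is the one to inspect first for a failing disk or controller.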