HPE EVA Storage

AIX servers using XP512 disk developing disk errors.

 
SFSupport
New Member

We've recently encountered a rash (about 8 servers over the last 2 months) of AIX servers reporting drive errors on some, but not all, of the LUNs assigned to them. Out of the blue, anywhere from 1 to 8 LUNs will appear in an offline/error status to the server. So far this only happens on one adapter, so our built-in redundancy is working, but it is chewing up a lot of valuable time troubleshooting HBAs, fibre, switch ports, etc. So far, the only solution has been to allocate new, clean LUNs to the server; the admins then copy the data/file systems over and everyone is happy again.

What could the root cause of this be? I suspect it has something to do with the way AIX handles the drives but, of course, the server admins are pointing the finger at the SAN. I'd really like to get to the bottom of this, no matter what the source of the issue is. I can supply further info (HDLM version, HBA driver version, array firmware, etc.) to anyone willing to help.

Thanks
2 REPLIES
Ivan Ferreira
Honored Contributor

Re: AIX servers using XP512 disk developing disk errors.

If you obtain statistics from the switch, do you see errors on the ports connected to the offline disks?

You can disable the autonegotiation on the switch port and manually configure the speed. We had problems with the autonegotiation on a server (Alpha + Tru64).

The best option you have is to demonstrate that you are not getting errors at the fabric level.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
SFSupport
New Member

Re: AIX servers using XP512 disk developing disk errors.

The errors we're seeing are a high number of loss of sync and enc out. There are no accompanying loss of signal or link failure errors.
I think you are on to something, though, with the autonegotiate setting. We do have 2 Gb HBAs running into 1 Gb switch ports in this instance. I'll definitely be following up on that. Thank you.
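
For anyone wanting to check for the same pattern, here is a rough sketch of how two snapshots of per-port error counters could be compared to pick out ports where loss of sync or enc out is climbing while loss of signal and link failure stay flat. The counter names, the port labels, and the snapshot layout (a dict per port) are all assumptions made for illustration; they are not the output format of any particular switch, so the counters would need to be collected and mapped by hand or by a separate parser.

```python
from dataclasses import dataclass

# Hypothetical counter names chosen for this sketch, not a switch CLI format.
SOFT_COUNTERS = ("loss_of_sync", "enc_out")
HARD_COUNTERS = ("loss_of_signal", "link_failure")


@dataclass
class PortDelta:
    port: str
    soft: dict  # growth of each soft counter between the two snapshots
    hard: dict  # growth of each hard counter between the two snapshots

    @property
    def soft_errors_only(self) -> bool:
        """True when a soft counter grew and no hard counter did."""
        soft_grew = any(v > 0 for v in self.soft.values())
        hard_grew = any(v > 0 for v in self.hard.values())
        return soft_grew and not hard_grew


def counter_deltas(before: dict, after: dict) -> list:
    """Return per-port counter growth between two snapshots.

    Each snapshot maps a port name to a dict of counter name -> value.
    Ports missing from either snapshot are skipped; missing counters
    are treated as zero.
    """
    deltas = []
    for port in sorted(before.keys() & after.keys()):
        old, new = before[port], after[port]
        soft = {c: new.get(c, 0) - old.get(c, 0) for c in SOFT_COUNTERS}
        hard = {c: new.get(c, 0) - old.get(c, 0) for c in HARD_COUNTERS}
        deltas.append(PortDelta(port, soft, hard))
    return deltas


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the input.
    before = {
        "port3": {"loss_of_sync": 10, "enc_out": 100, "loss_of_signal": 0, "link_failure": 0},
        "port4": {"loss_of_sync": 2, "enc_out": 5, "loss_of_signal": 1, "link_failure": 0},
    }
    after = {
        "port3": {"loss_of_sync": 250, "enc_out": 9000, "loss_of_signal": 0, "link_failure": 0},
        "port4": {"loss_of_sync": 2, "enc_out": 5, "loss_of_signal": 1, "link_failure": 0},
    }
    for delta in counter_deltas(before, after):
        if delta.soft_errors_only:
            print(delta.port, "soft errors growing, link never dropped:", delta.soft)
```

A port flagged this way is only a candidate for a closer look at its speed and negotiation settings; the check does not prove where the fault lies.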