Disk device path suddenly changed and telnet errors=hung system

Jim Wallace
Frequent Advisor

Hi. Possibly two issues here, but they happened together so I am somewhat suspicious. Am looking for any clues please;

Yesterday we had a customer machine 'hang': an rx7620 running HP-UX 11.23. (Single partition only. Boots from local disks and has two extra VGs on an HP SAN.)

Currently logged in users just froze, whilst new telnet connection attempts got an error about missing telnet device drivers (see note below from syslog).

After forcing a reboot via iLO, I noted that one of the vg's also failed to activate.

syslog revealed error regarding telnet was:
"Fatal error: Telnet device drivers missing: No such device." I've seen in other ITRC posts that this can relate to missing pty device entries, but that doesn't seem to be the case here (I counted them, ran insf -e, counted again, and the number was the same).

As for the failed vg ... Ran ioscan -fnCdisk and saw paths to SAN attached volumes as being c14t0d0 and c14t0d2 (when this second one should be c14t0d1!).

I then did a vgexport of the old vg, created new vg and group entries and was then able to vgimport from the new c14t0d2 path and activate the vg. Quick edit of fstab and we were able to mount the lvol once again (we also had to run fsck on the volume before it would mount [to be expected given it was open/active at point of the hang]).

I can't see anything in syslog reporting disk/vg issues prior to the hang/reboot.

Customer has assured me no changes were made (and no errors reported) on the SAN side of things. FC interconnects and switches all in order too.

I have seen similar, infrequent problems before on other customers' machines, where SAN and local disk device paths/names changed for no good reason, but we never got to the bottom of those either!

Your thoughts/comments would be most welcomed!
Thanks,
Doris.
9 REPLIES
Noé
Valued Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

Hello.

In my opinion, if the path has changed from c14t0d1 to c14t0d2, it must be due to some change at the SAN level (probably a change to a port on the switches).

I think the telnet error is a consequence of the disk problem.

Regards.
Noé
Jim Wallace
Frequent Advisor

Re: Disk device path suddenly changed and telnet errors=hung system

Thanks for that Noe,

I've asked customer if they can check the logs on the switches.

Puzzled if it could be the switches though, as both paths go through same physical controller and switch connections. I would have expected any change in switch config/status to have changed the overall device instance number/paths and we would have lost both VG's?
Noé
Valued Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

Hello Jim.

Both paths go through the same physical controller, but each port of the physical controller has a different port on the switch. I think something happened on the switch port of the c14t0d1 disk.

Regards.
TTr
Honored Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

The disk unit number, the last part of the device name indicated by the d#, is controlled by the storage system. Check whether any changes were made in the storage array that caused the disk to disappear and then appear again with a different unit number. The server freezing may have something to do with the disk disappearing.
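
To make the naming concrete, here is a minimal sketch (not an HP-UX tool) that splits a legacy cXtYdZ device name into its three fields and reports which field differs between the two names from this thread. The names DiskPath and parse_device are hypothetical, chosen only for illustration.

```python
import re
from typing import NamedTuple


class DiskPath(NamedTuple):
    """Fields of a legacy HP-UX disk device name such as c14t0d2."""
    controller: int  # c#: interface card instance, assigned by the host
    target: int      # t#: target address on that interface
    lun: int         # d#: unit (LUN) number presented by the array


_DEVICE_RE = re.compile(r"^c(\d+)t(\d+)d(\d+)$")


def parse_device(name: str) -> DiskPath:
    """Parse 'c14t0d2' or '/dev/dsk/c14t0d2' into its numeric fields."""
    base = name.rsplit("/", 1)[-1]  # drop any directory prefix
    match = _DEVICE_RE.match(base)
    if match is None:
        raise ValueError(f"not a cXtYdZ device name: {name!r}")
    return DiskPath(*(int(group) for group in match.groups()))


old = parse_device("c14t0d1")
new = parse_device("/dev/dsk/c14t0d2")

# List the fields that differ between the old and new names.
changed = [f for f in DiskPath._fields if getattr(old, f) != getattr(new, f)]
print(changed)  # ['lun']
```

Only the d# field differs here, which is the part set by the array rather than by the host.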
Jim Wallace
Frequent Advisor

Re: Disk device path suddenly changed and telnet errors=hung system

Thanks TTr!

Customer has assured me there have been no changes on the SAN side of things. I've currently got them checking SAN switch logs for any events too.
TTr
Honored Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

I still say it's the disk array. The server cannot change the UNIT number, and neither can the switches. The UNIT number is the LUN number that the array presents. The customer should check the array; maybe something happened that they are not aware of.

Refer to page 13 of http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c02019269/c02019269.pdf
Shibin_2
Honored Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

It won't change automatically from the server side. It will change only when somebody changes the port at the switch and re-zones.
Regards
Shibin
Torsten.
Acclaimed Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

The change from c14t0d1 to c14t0d2 cannot happen on its own.

The d1 and d2 value represent the LUN number (e.g. LUN1 vs. LUN2); so this has NOTHING to do with switch ports, zoning etc ...


An array presents each LUN with a LUN number. Here that number has changed, and such changes are made by storage administrators (or possibly by a misconfiguration).
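
As a rough illustration of how such a renumbering could be spotted, the sketch below compares two lists of device names and reports, per controller/target pair, which LUN numbers vanished and which appeared. It assumes a "before" list was saved somewhere (for example from an older ioscan listing), which this thread does not show; the function names are hypothetical.

```python
import re


def luns_by_target(names):
    """Group LUN numbers by (controller, target) for cXtYdZ device names."""
    grouped = {}
    for name in names:
        match = re.fullmatch(r"c(\d+)t(\d+)d(\d+)", name)
        if match is None:
            continue  # ignore anything that is not a cXtYdZ name
        controller, target, lun = map(int, match.groups())
        grouped.setdefault((controller, target), set()).add(lun)
    return grouped


def lun_changes(before, after):
    """Return {(controller, target): (vanished_luns, appeared_luns)}."""
    old, new = luns_by_target(before), luns_by_target(after)
    changes = {}
    for key in sorted(set(old) | set(new)):
        vanished = sorted(old.get(key, set()) - new.get(key, set()))
        appeared = sorted(new.get(key, set()) - old.get(key, set()))
        if vanished or appeared:
            changes[key] = (vanished, appeared)
    return changes


# Device names as described in the original post.
before = ["c14t0d0", "c14t0d1"]
after = ["c14t0d0", "c14t0d2"]
print(lun_changes(before, after))  # {(14, 0): ([1], [2])}
```

A result like this, with the controller and target unchanged and only one LUN number swapped, is the pattern described above.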



It would help to know the array model and to see the output of ioscan -fn, etc ... just more details.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!
Torsten.
Acclaimed Contributor

Re: Disk device path suddenly changed and telnet errors=hung system

... and I agree with TTr: if somebody takes LUN1 away from the server, the server "loses" the disk and can no longer read from or write to it, hence it will likely hang.


>> Customer has assured me no changes were made ...




... as expected. ;-)

Hope this helps!
Regards
Torsten.
