
VA7410 - the lun switch problem

Carl Houseman
Super Advisor

VA7410 - the lun switch problem

I've read a couple of other threads about VA7400 and VA7410 lun switch problems. The only solution discussed was switching to high performance mode. I'm running A140 firmware, which I believe to be the latest. Is there no other solution?

Jul 13 08:43:26 hpux vmunix:
Jul 13 08:43:26 hpux vmunix: SCSI: Write error -- dev: b 31 0x070100, errno: 126, resid: 8192,
Jul 13 08:43:26 hpux vmunix: blkno: 163948632, sectno: 327897264, offset: 379674624, bcount: 8192.
Jul 13 08:43:26 hpux vmunix: SCSI: Read error -- dev: b 31 0x070100, errno: 126, resid: 2048,
Jul 13 08:43:26 hpux vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Jul 13 08:43:26 hpux vmunix: LVM: Performed a switch for Lun ID = 0 (pv = 0x0000000055ab8040), from raw device 0x1f070100 (with priority: 0, and current flags: 0x40) to raw device 0x1f040100 (with priority: 1, and current flags: 0x0).
Jul 13 08:43:26 hpux vmunix:
Jul 13 08:43:47 hpux vmunix: LVM: vg[1]: pvnum=0 (dev_t=0x1f040100) is POWERFAILED
Jul 13 08:43:47 hpux vmunix: LVM: Restored PV 0 to VG 1.
Jul 13 08:43:52 hpux vmunix: LVM: Recovered Path (device 0x1f070100) to PV 0 in VG 1.
Jul 13 08:43:52 hpux vmunix: LVM: Performed a switch for Lun ID = 0 (pv = 0x0000000055ab8040), from raw device 0x1f040100 (with priority: 1, and current flags: 0x0) to raw device 0x1f070100 (with priority: 0, and current flags: 0x80).
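
For anyone trying to map the raw device numbers in these messages back to device files, here is a minimal Python sketch. It assumes the legacy HP-UX dev_t layout (8-bit major, then a 24-bit minor laid out as 0xIITLOO: card instance, target, lun, option bits); that layout is an assumption on my part and is not stated in the log itself, and `decode_dev` is just an illustrative helper name.

```python
def decode_dev(dev_t: int) -> str:
    """Split a raw dev_t into major number and a cXtYdZ-style name.

    Assumes the legacy layout: 8-bit major, 24-bit minor = 0xIITLOO
    (II = card instance, T = target, L = lun, OO = option bits).
    """
    major = (dev_t >> 24) & 0xFF
    minor = dev_t & 0xFFFFFF
    instance = (minor >> 16) & 0xFF
    target = (minor >> 12) & 0xF
    lun = (minor >> 8) & 0xF
    return f"major {major}: c{instance}t{target}d{lun}"


# The two raw devices named in the LVM switch messages above.
for raw in (0x1F070100, 0x1F040100):
    print(hex(raw), "->", decode_dev(raw))
```

Under that assumption the two paths involved in the switch would be c7t0d1 (priority 0) and c4t0d1 (priority 1), and the major number 31 matches the "dev: b 31" in the SCSI error lines.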
4 REPLIES
tkc
Esteemed Contributor

Re: VA7410 - the lun switch problem

Hi Carl,

The array appears to have experienced some slowness at the time the errors were reported. If there is no hardware issue with the array, the cause may be an incorrect primary/alternate link configuration in the o/s, the array configuration itself, or insufficient free space in the array.
Torsten.
Acclaimed Contributor

Re: VA7410 - the lun switch problem

Did you check "armdsp -a" already?
The array may have a hardware problem like a bad controller/battery/disk.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

Carl Houseman
Super Advisor

Re: VA7410 - the lun switch problem

This particular adventure in VA arrays has gotten stranger and stranger.

I've created only two luns, one in each redundancy group (rg). And there are two vg's, one for each lun: vg1 on lun 1 (rg1) and vg2 on lun 2 (rg2).

And then I've been careful to set the primary/secondary link ordering so that vg1's primary is the device associated with controller 1, and vg2's primary is the device associated with controller 2.

Looking at armlog -e, I find repeating errors, all of which indicate problems with drives in rg1 (any and all of them). Those errors are "target device events" as follows:

Event Description = DEV_ABORTED_COMMAND_EH The drive returned CHECK CONDITION status indicating an aborted I/O, which will be retried by the PDD's error recovery algorithms.

and

Event Description = The disk drive returned a status of CHECK_CONDITION, UNIT_ATTENTION, PWR_ON_RESET. The PDD could not retry this I/O.

and then there are these "controller events" which implicate M/C1:

Event Description = BACKEND_SCSI_EVENT_EH An error occurred on the disk interface. These may occur during disk hotplug.

Also get these "device events" which implicate the FRU as JA0/C1:

Event Description = The host Pass through request was not successfull. This could be due to a LIP or Hot Reset of the device.

There are also some rebuild events that go along with those errors.

A small number of these errors, maybe less than 10%, will show up in syslog. And a console terminal hooked up to either VA7410 controller will show the array going into Warning> and then back to Ready> on a regular basis. It's an almost instant transition from Ready to Warning and back to Ready.
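
To put a number on how many of these events reach the host as path switches, the syslog entries can be tallied with a short script. This is only a sketch in Python: the regular expression is modelled on the two "Performed a switch" messages quoted in my first post, `count_switches` is a made-up helper name, and the sample lines are shortened copies of those messages rather than new data.

```python
import re
from collections import Counter

# Pattern modelled on the LVM switch messages quoted earlier in the thread.
SWITCH_RE = re.compile(
    r"LVM: Performed a switch for Lun ID = (\d+) .*?"
    r"from raw device (0x[0-9a-fA-F]+) .*?"
    r"to raw device (0x[0-9a-fA-F]+)"
)


def count_switches(lines):
    """Count path switches per (from_device, to_device) pair."""
    counts = Counter()
    for line in lines:
        match = SWITCH_RE.search(line)
        if match:
            counts[(match.group(2), match.group(3))] += 1
    return counts


# Shortened copies of the two switch messages from the first post.
sample = [
    "Jul 13 08:43:26 hpux vmunix: LVM: Performed a switch for Lun ID = 0 "
    "(pv = 0x0000000055ab8040), from raw device 0x1f070100 (with priority: 0, "
    "and current flags: 0x40) to raw device 0x1f040100 (with priority: 1, "
    "and current flags: 0x0).",
    "Jul 13 08:43:52 hpux vmunix: LVM: Performed a switch for Lun ID = 0 "
    "(pv = 0x0000000055ab8040), from raw device 0x1f040100 (with priority: 1, "
    "and current flags: 0x0) to raw device 0x1f070100 (with priority: 0, "
    "and current flags: 0x80).",
    "Jul 13 08:43:47 hpux vmunix: LVM: Restored PV 0 to VG 1.",
]

for (src, dst), n in count_switches(sample).items():
    print(f"{src} -> {dst}: {n}")
```

Feeding it a copy of the full syslog instead of `sample` would give a count to compare against the number of events in armlog -e.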

The vendor replaced the M/C1 controller and when that didn't help, the entire VA7410 chassis (with different controllers), but the problem persists. So I'm thinking, if it isn't the VA itself, it must be the FC connection or the FC card connected to M/C1. But there's NOTHING in syslog to indicate FC issues.

In any event, what's left is to switch the FC card connections to the VA7410, and/or connect it to an entirely different machine.

Is there anything I've overlooked? This same host and its FC connections were previously connected to a VA7110 which operated flawlessly, but as we know, the VA7110 with only one RG and only one performance path doesn't use both controllers at once. So it's possible I never put any load on the host controller which is now associated with RG1 in the VA7410.

BTW this is all private loop topology - 1 host, two FC cards connected to the VA7410, which is connected to a single DS2405. Using host port 1 on each VA7410 controller and back port 1 on each to connect the DS2405. All ports running at 2 Gbps. armlog -a says that *everything* is "good".
Carl Houseman
Super Advisor

Re: VA7410 - the lun switch problem

Disconnecting the DS2405 and re-formatting the VA7410 solved all problems. Apparently there is something not-quite-right in this DS2405.