System Administration

11.31 agile path, load balancing doesn't seem to be working

SOLVED
Charles McCary
Valued Contributor

11.31 agile path, load balancing doesn't seem to be working


I have a volume group set up with an agile path:

/dev/disk/disk25

Here's the ioscan -fnNC output:

# ioscan -fnNC disk
Class I H/W Path Driver S/W State H/W Type Description
===================================================================
disk 10 64000/0xfa00/0x2 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk10 /dev/rdisk/disk10
disk 11 64000/0xfa00/0x3 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk11 /dev/rdisk/disk11
disk 12 64000/0xfa00/0x4 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk12 /dev/rdisk/disk12
disk 13 64000/0xfa00/0x5 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk13 /dev/rdisk/disk13
disk 14 64000/0xfa00/0x6 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk14 /dev/disk/disk14_p2 /dev/rdisk/disk14 /dev/rdisk/disk14_p2
/dev/disk/disk14_p1 /dev/disk/disk14_p3 /dev/rdisk/disk14_p1 /dev/rdisk/disk14_p3
disk 15 64000/0xfa00/0x7 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk15 /dev/disk/disk15_p2 /dev/rdisk/disk15 /dev/rdisk/disk15_p2
/dev/disk/disk15_p1 /dev/disk/disk15_p3 /dev/rdisk/disk15_p1 /dev/rdisk/disk15_p3
disk 16 64000/0xfa00/0x8 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk16 /dev/rdisk/disk16
disk 17 64000/0xfa00/0x9 esdisk CLAIMED DEVICE HP DG0300FARVV
/dev/disk/disk17 /dev/rdisk/disk17
disk 19 64000/0xfa00/0xa esdisk CLAIMED DEVICE Optiarc DVD RW AD-5590A
/dev/disk/disk19 /dev/rdisk/disk19
disk 20 64000/0xfa00/0x13 esdisk CLAIMED DEVICE HP LOGICAL VOLUME
/dev/disk/disk20 /dev/rdisk/disk20
disk 23 64000/0xfa00/0x14 esdisk CLAIMED DEVICE HP P2000 G3 FC
/dev/disk/disk23 /dev/rdisk/disk23
disk 25 64000/0xfa00/0x15 esdisk CLAIMED DEVICE HP P2000 G3 FC
/dev/disk/disk25 /dev/rdisk/disk25


Here's the ioscan -m dsf output:
Persistent DSF Legacy DSF(s)
========================================
/dev/pt/pt3 /dev/rscsi/c3t0d0
/dev/pt/pt4 /dev/rscsi/c4t0d0
/dev/rdisk/disk10 /dev/rdsk/c1t1d0
/dev/rdisk/disk11 /dev/rdsk/c1t4d0
/dev/rdisk/disk12 /dev/rdsk/c1t5d0
/dev/rdisk/disk13 /dev/rdsk/c1t3d0
/dev/rdisk/disk14 /dev/rdsk/c1t2d0
/dev/rdisk/disk14_p1 /dev/rdsk/c1t2d0s1
/dev/rdisk/disk14_p2 /dev/rdsk/c1t2d0s2
/dev/rdisk/disk14_p3 /dev/rdsk/c1t2d0s3
/dev/rdisk/disk15 /dev/rdsk/c1t7d0
/dev/rdisk/disk15_p1 /dev/rdsk/c1t7d0s1
/dev/rdisk/disk15_p2 /dev/rdsk/c1t7d0s2
/dev/rdisk/disk15_p3 /dev/rdsk/c1t7d0s3
/dev/rdisk/disk16 /dev/rdsk/c1t0d0
/dev/rdisk/disk17 /dev/rdsk/c1t6d0
/dev/rdisk/disk19 /dev/rdsk/c2t0d0
/dev/rdisk/disk20 /dev/rdsk/c0t0d0
/dev/rdisk/disk23 /dev/rdsk/c5t0d0
/dev/rdsk/c6t0d0
/dev/rdisk/disk25 /dev/rdsk/c5t0d1
/dev/rdsk/c6t0d1

I've set up multiple copies to/from a filesystem within that volume group.

When I run the copies, I see the following in iostat:

# iostat -L 3 500 | grep disk25
disk25_lunpath15 0 0.0 1.0
disk25_lunpath16 17 0.1 1.0
disk25_lunpath16 184760 193.0 1.0
disk25_lunpath16 374784 366.0 1.0
disk25_lunpath16 371859 363.5 1.0
disk25_lunpath16 370628 362.3 1.0
disk25_lunpath16 373199 364.5 1.0
disk25_lunpath16 372292 363.9 1.0
disk25_lunpath16 371874 366.8 1.0
disk25_lunpath16 375595 370.1 1.0
disk25_lunpath16 368744 363.7 1.0
disk25_lunpath16 370853 365.8 1.0
disk25_lunpath16 371516 366.8 1.0
disk25_lunpath16 374138 369.0 1.0
disk25_lunpath16 263551 289.7 1.0
disk25_lunpath16 259241 253.8 1.0
disk25_lunpath16 353301 345.3 1.0
disk25_lunpath16 353998 347.0 1.0
disk25_lunpath16 354319 346.3 1.0
disk25_lunpath16 353335 345.4 1.0
disk25_lunpath16 352447 344.2 1.0
disk25_lunpath16 350283 342.4 1.0
disk25_lunpath16 240112 249.2 1.0
disk25_lunpath16 52066 51.2 1.0
disk25_lunpath16 51878 50.7 1.0
disk25_lunpath16 52391 51.2 1.0
disk25_lunpath16 52736 52.2 1.0
disk25_lunpath16 50189 50.3 1.0
disk25_lunpath16 50184 50.0 1.0
disk25_lunpath16 52050 50.8 1.0
disk25_lunpath16 52731 51.5 1.0
disk25_lunpath16 54272 53.3 1.0
disk25_lunpath16 54432 53.2 1.0
disk25_lunpath16 58128 56.8 1.0
disk25_lunpath16 49844 48.7 1.0
disk25_lunpath16 58221 56.9 1.0
disk25_lunpath16 51710 50.5 1.0
disk25_lunpath16 52397 52.5 1.0
disk25_lunpath16 51030 49.8 1.0
disk25_lunpath16 52731 51.5 1.0
disk25_lunpath16 54432 53.2 1.0
disk25_lunpath16 52050 50.8 1.0
disk25_lunpath16 51541 50.3 1.0
disk25_lunpath16 53411 52.2 1.0
disk25_lunpath16 48148 47.0 1.0
disk25_lunpath16 51883 50.7 1.0
disk25_lunpath16 55608 54.3 1.0
disk25_lunpath16 54417 54.5 1.0
disk25_lunpath16 54275 53.3 1.0
disk25_lunpath16 49669 48.5 1.0
disk25_lunpath16 51370 50.2 1.0
disk25_lunpath16 51716 73.3 1.0
disk25_lunpath16 54432 53.5 1.0
disk25_lunpath16 50861 49.7 1.0
disk25_lunpath16 57685 56.3 1.0
disk25_lunpath16 52895 51.7 1.0
disk25_lunpath16 52907 51.7 1.0
disk25_lunpath16 54259 55.6 1.0
disk25_lunpath16 57834 56.5 1.0
disk25_lunpath16 53248 52.0 1.0
disk25_lunpath16 49329 48.2 1.0
disk25_lunpath16 55112 53.8 1.0
disk25_lunpath16 53234 52.0 1.0
disk25_lunpath16 53751 52.5 1.0
disk25_lunpath16 53248 52.0 1.0
disk25_lunpath16 24547 26.2 1.0


One lunpath (16) seems to be used for all of the copies; this is not what I expected.

Am I missing something?
7 REPLIES
Charles McCary
Valued Contributor

Re: 11.31 agile path, load balancing doesn't seem to be working

Also, here's the scsimgr get_attr output for that disk:

# scsimgr -v get_attr -D /dev/rdisk/disk25

SCSI ATTRIBUTES FOR LUN : /dev/rdisk/disk25

Generic Attributes :

name = state
current = ONLINE
default =
saved =
desc = state of the LUN (only valid for disk LUNs)

name = dev_type
current = Direct_Access
default =
saved =
desc = type of the device

name = device_file
current = /dev/rdisk/disk25
default =
saved =
desc = default DSF

name = class
current = disk
default =
saved =
desc = Class of the LUN

name = instance
current = 25
default =
saved =
desc = LUN instance number

name = hw_path
current = 64000/0xfa00/0x15
default =
saved =
desc = hardware path of the LUN

name = wwid
current = 0x600c0ff00010f57881d0d34c01000000
default =
saved =
desc = LUN World Wide Identifier (WWID)

name = serial_number
current = "00c0ff10f578000081d0d34c01000000"
default =
saved =
desc = serial number of the device

name = vid
current = "HP "
default =
saved =
desc = vendor identifier

name = pid
current = "P2000 G3 FC "
default =
saved =
desc = product identifier

name = firmware_rev
current = "T201"
default =
saved =
desc = firmware revision of the device

name = scsi_protocol_rev
current = 5
default =
saved =
desc = SPC protocol revision

name = total_path_cnt
current = 2
default =
saved =
desc = total number of lun paths

name = alias
current =
default =
saved =
desc = LUN alias

name = transient_secs
current = 120
default = 120
saved =
desc = Delay in seconds before failing back IOs after a LUN transitions out of online state

name = leg_mpath_enable
current = true
default = true
saved =
desc = Enable or disable multi-pathing of legacy device special file corresponding to that LUN

name = max_q_depth
current = 8
default = 8
saved =
desc = maximum queue depth

name = lpt_lockdown
current =
default =
saved =
desc = LUN path used for IO when load balance policy is path_lockdown

name = uniq_name
current = "0x01.0x00.0x03.0x600c0ff00010f57881d0d34c01000000"
default =
saved =
desc = LUN unique name

name = lpt_to_lockdown
current =
default =
saved =
desc = LUN path to be used for IO when load balance policy is path_lockdown

name = priority
current = 0
default = 0
saved =
desc = LUN Priority

name = hwp_abs_alias
current =
default =
saved =
desc = LUN Hardware path absolute alias

name = hwp_comp_alias
current =
default =
saved =
desc = LUN Hardware path component alias

name = encl_instance
current =
default =
saved =
desc = Enclosure instance number

name = encl_alias
current =
default =
saved =
desc = Enclosure alias

Driver esdisk Attributes :

name = capacity
current = 2577869568
default =
saved =
desc = Capacity in number of blocks


name = block_size
current = 512
default =
saved =
desc = Block size in bytes


name = path_fail_secs
current = 120
default = 120
saved =
desc = Timeout in secs before declaring a LUN path offline


name = load_bal_policy
current = round_robin
default = round_robin
saved =
desc = IO load-balancing policy


name = infinite_retries_enable
current = false
default = false
saved =
desc = Enable or disable IO infinite retries


name = preferred_path
current =
default =
saved =
desc = LUN path to use preferrably when I/O load balancing policy is set to preferred_path

name = alua_enabled
current = true
default = true
saved =
desc = Asymmetric logical unit access enabled

name = pref_tport
current =
default =
saved =
desc = Preferred target port when I/O load balancing policy is set to pref_tport

name = retry_delay_enabled
current = true
default = true
saved =
desc = Enable/disable device specified retry delay codes

name = format_secs
current = 86400
default = 86400
saved =
desc = FORMAT command Timeout in secs


name = start_unit_secs
current = 60
default = 60
saved =
desc = START UNIT command timeout in secs


name = max_retries
current = 45
default = 45
saved =
desc = Maximum number of IO retries


name = esd_secs
current = 30
default = 30
saved =
desc = Maximum time in secs for the transmission of an IO


name = disable_flags
current = WCE
default = WCE
saved =
desc = Set of bit flags disabling task management and other functions


name = ping_type
current = basic
default = basic
saved =
desc = Ping type policy


name = ping_recovery
current = immediate
default = immediate
saved =
desc = Ping path recovery policy


name = ping_count_threshold
current = 0
default = 0
saved =
desc = Ping path recovery count threshold


name = ping_time_threshold
current = 0
default = 0
saved =
desc = Ping path recovery time threshold in seconds


name = congest_max_retries
current = 90
default = 90
saved =
desc = Maximum number of retries of control IOs during device congestion


name = priority_type
current = none
default = none
saved =
desc = Priority type policy
Ismail Azad
Esteemed Contributor

Re: 11.31 agile path, load balancing doesn't seem to be working

Hi Charles,

What I understand from your output is that there are no redundant lunpaths. If there were, you would see multiple legacy DSFs for a particular persistent DSF; here, all the mappings between legacy and persistent DSFs are one-to-one. Legacy DSFs are based on the lunpath, while a persistent DSF represents the LUN itself, so each redundant lunpath shows up as an additional legacy DSF for the same persistent DSF. This is what you are missing! Once you get the other lunpaths up, you can use the scsimgr command to set a load-balancing policy such as round_robin, closest path, or weighted round robin. Hope this resolves your issue. Do assign points!
Read, read and read... Then read again until you read "between the lines".....
Charles McCary
Valued Contributor

Re: 11.31 agile path, load balancing doesn't seem to be working

ioscan -m dsf

/dev/rdisk/disk25 /dev/rdsk/c5t0d1
/dev/rdsk/c6t0d1
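In other words, disk25 does map to two legacy DSFs (one per lunpath); the second one appears on a continuation line of the ioscan -m dsf output. As a toy illustration only (plain Python, not an HP-UX tool), here is how those continuation lines group under their persistent DSF, using the lines quoted above:

```python
# Toy parser: attach "ioscan -m dsf" continuation lines (legacy DSFs with
# no persistent-DSF column) to the preceding persistent DSF.
sample = """\
/dev/rdisk/disk23    /dev/rdsk/c5t0d0
                     /dev/rdsk/c6t0d0
/dev/rdisk/disk25    /dev/rdsk/c5t0d1
                     /dev/rdsk/c6t0d1
"""

paths = {}
current = None
for line in sample.splitlines():
    fields = line.split()
    if line.startswith("/dev/"):   # new persistent DSF entry
        current = fields[0]
        paths[current] = fields[1:]
    elif current:                   # continuation line: extra legacy DSF
        paths[current].extend(fields)

for dsf, legacy in paths.items():
    print(f"{dsf}: {len(legacy)} lunpath(s) -> {', '.join(legacy)}")
```

Two legacy DSFs per persistent DSF means two lunpaths exist, so the question remains why only one of them carries the IO.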

Ismail Azad
Esteemed Contributor

Re: 11.31 agile path, load balancing doesn't seem to be working

Charles,

Adding to that, do an ioscan -m hwpath and see whether the virtualized hardware address 64000/0xfa00/LUNID maps to more than one FC lunpath hardware path or legacy hardware path. That would confirm whether or not you have multipathing on your server. Typically, if I have four lunpaths, then four legacy DSFs will be present for the same persistent DSF.

Regards
Read, read and read... Then read again until you read "between the lines".....
Duncan Edmonstone
Honored Contributor
Solution

Re: 11.31 agile path, load balancing doesn't seem to be working

Charles,

The long answer:

In this case the device you are talking to is a LUN on a P2000. I believe the P2000 is an ALUA (Asymmetric Logical Unit Access) device, which means that although the LUN is presented on both controller ports of the disk array, only one controller is actually capable of servicing IO for any given LUN at any one time. The HP-UX MPIO stack detects this and sets the flag alua_enabled to true (which you can see in your scsimgr output). This means your IOs all go to one port unless that port stops servicing IOs. Of course, another LUN could be serviced by the other controller. You can confirm this by running "scsimgr lun_map -D /dev/rdisk/disk25"; you should see one lunpath as ACTIVE and one as STANDBY.

Short Answer:
The HP-UX MPIO stack is capable of sending IOs to a LUN down multiple paths, but your disk array is not capable of receiving IOs from multiple paths (at least not whilst maintaining performance).
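To make that concrete, here is a toy model (plain Python, not HP-UX code, and the path states are illustrative) of why a round_robin policy still drives a single path on an ALUA LUN: only paths in the ACTIVE state are eligible for normal IO, and STANDBY paths are held back for failover.

```python
def pick_paths(paths, n_ios):
    """Round-robin IOs across the ACTIVE paths only."""
    active = [p for p, state in paths if state == "ACTIVE"]
    return [active[i % len(active)] for i in range(n_ios)]

# disk25 as described: one ACTIVE lunpath, one STANDBY.
alua_lun = [("lunpath16", "ACTIVE"), ("lunpath15", "STANDBY")]
print(pick_paths(alua_lun, 4))   # every IO lands on lunpath16

# A symmetric LUN with both paths active would alternate between them.
sym_lun = [("lunpath16", "ACTIVE"), ("lunpath15", "ACTIVE")]
print(pick_paths(sym_lun, 4))
```

So the round_robin setting is working; there is simply only one eligible path for it to rotate over.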

HTH

Duncan
Charles McCary
Valued Contributor

Re: 11.31 agile path, load balancing doesn't seem to be working

Duncan,

Thanks, that's it.

c
Ismail Azad
Esteemed Contributor

Re: 11.31 agile path, load balancing doesn't seem to be working

Charles,

I didn't see that redundant path. Check the output of ioscan -P health and see whether the status of the other lunpath is online.

Regards
Read, read and read... Then read again until you read "between the lines".....