
Linux Native Multipath

SOLVED
Simon_G
Occasional Advisor

Linux Native Multipath

I have used multipath on RHEL 5, and I think on RHEL 6 it has some advanced features. Could someone explain what settings/features we need to use in different storage environments (Symmetrix, HP, HDS, etc.)? What path grouping policy works best? How do we enable load balancing? I would really appreciate any help.

 

Thanks

 

Simon.

6 REPLIES
Matti_Kurkela
Honored Contributor

Re: Linux Native Multipath

The latest recommended settings can normally be found at each storage manufacturer's compatibility information database:

  • EMC has powerlink.emc.com, which includes interoperability matrices and the "E-lab Interoperability Navigator": a web service where you select your server OS, HBA model, storage system model and the models of any other relevant SAN components, and the service lists all the recommended patch levels, firmware versions, settings and notes/limitations EMC knows about for that particular set of hardware and software.
  • HP has SPOCK: Single Point of Connectivity Knowledge. http://h20272.www2.hp.com/
  • I'm not familiar with the support resources of HDS, can someone else fill in?

That said, RedHat has collected the recommended settings from most major storage system manufacturers and made the multipath subsystem use them by default whenever possible. So if your storage system is older than the current installed dm-multipath patch level, the chances are that the multipath subsystem will automatically recognize the storage system type and choose reasonable default settings for you. You'll find a list of all the automatic defaults in the /usr/share/doc/dm-multipath-<version> directory on your system.
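If you want to see which built-in defaults the running multipath subsystem actually applies (rather than reading the doc file), you can dump the effective configuration. A minimal sketch, assuming a RHEL 6 system with multipathd running; the grep pattern "HITACHI" is just an example vendor string:

```
# Dump the full effective configuration, including all built-in
# per-vendor device stanzas compiled into dm-multipath:
multipathd -k"show config"

# Or browse the shipped list of defaults directly:
less /usr/share/doc/device-mapper-multipath-*/multipath.conf.defaults

# Quickly check whether your array model has a built-in stanza:
multipathd -k"show config" | grep -A8 'vendor "HITACHI"'
```

Anything you put in /etc/multipath.conf overrides the matching built-in stanza, so you only need to configure what differs from these defaults.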

 

Your question about "the best" path group policy cannot be answered without knowing some details about your set-up. Are the controllers in your storage systems active/active or active/passive? If active/active, then all paths should be equal and can be used for round-robining or other load balancing strategies, so there will be no need for multiple path groups.

 

If active/passive, it is essential that the multipath subsystem can identify the active controller and all the paths going to it and group them accordingly. Therefore, the next question on an active/passive storage system will be: Does the storage system supply the standard ALUA information properly, or are the manufacturer's proprietary protocols needed/better for this storage model?
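For an ALUA-capable active/passive array, the grouping is typically expressed in a device stanza like the one below. This is only an illustrative sketch: the vendor/product strings are placeholders, and the actual values should always come from your array manufacturer's compatibility guide.

```
# Illustrative /etc/multipath.conf fragment -- placeholder values,
# not a recommendation for any specific array model.
devices {
    device {
        vendor                "EXAMPLE"
        product               "EXAMPLE-ARRAY"
        path_grouping_policy  group_by_prio   # group paths by controller priority
        prio                  alua            # read priorities via standard ALUA
        path_checker          tur             # TEST UNIT READY path health check
        failback              immediate       # return to the preferred group ASAP
        no_path_retry         12              # queue I/O for 12 polling intervals
    }
}
```

With group_by_prio plus an ALUA prioritizer, all paths to the active controller end up in the highest-priority group, and dm-multipath only trespasses to the passive controller when that whole group fails.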

 

Dm-multipath will automatically do load balancing whenever possible, but the choice of LB algorithms may depend on your setup and workload. If all the HBAs in your system have equal performance, a simple round-robin may do well enough; but if e.g. some HBAs are newer/faster than others, the more advanced LB algorithms available in RHEL 6 are likely to produce a noticeably better result. You may have to run some tests on your actual workload (or an accurate enough replica of one) to find the algorithm that is the absolute best for your setup.
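The selector is chosen per multipath device with the path_selector parameter. A sketch of the three selectors available in RHEL 6 (RHEL 5 only has round-robin):

```
# Illustrative /etc/multipath.conf fragment -- pick exactly one selector.
defaults {
    # path_selector "round-robin 0"   # alternate I/Os across paths evenly
    # path_selector "queue-length 0"  # prefer the path with fewest in-flight I/Os
    path_selector "service-time 0"    # prefer the path with the lowest
                                      # estimated service time (size/throughput)
}
```

The two adaptive selectors (queue-length, service-time) are the ones that can route around an unequal or degraded path, which round-robin cannot do.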

MK
Simon_G
Occasional Advisor

Re: Linux Native Multipath

Matti

 

Thanks much!  Since RHEL 6 has some better features, our management wants to test dm-multipath as a replacement for PowerPath. I know we have a major advantage when it comes to patching with this move, but does implementing native Linux multipath have any significant disadvantages in production environments, especially with Oracle ASM and Oracle RAC?  BTW, in a standard server I would expect all HBAs to be the same (speed etc.), and we have both active/active and active/passive arrays.

 

Regards

 

Simon.

Matti_Kurkela
Honored Contributor

Re: Linux Native Multipath

I have not heard any complaints regarding dm-multipath from our DBAs.

 

I understand PowerPath is now sold for operating systems with built-in multipathing functionality mainly to enable centralized storage management: an EMC consultant praised PowerPath as allowing for a more seamless storage migration. Dm-multipath has no such extra features.

 

I once had a FibreChannel connection degrade so that its actual performance was only a small fraction of its normal speed, but the link did not fail. This was with an up-to-date RHEL 5, so the round-robin algorithm was the only option. Since the algorithm did not take into account the actual performance nor queue length on the adapter, it kept directing 50% of all I/O operations to the degraded link... thus effectively causing significant slowdown in production, as the slow link ended up dictating the overall I/O speed. A RHEL6 system with a non-round-robin LB algorithm selected could have directed most of the workload to the fully functional link, minimizing the slowdown.

 

I first implemented a quick workaround by disabling the degraded path (multipathd -k, show paths, del path <degraded path>), then set about contacting the SAN admins for troubleshooting.

Once the problem was fixed, reactivating the disabled path without a service interruption was as easy as disabling the path.
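The workaround above, sketched as a multipathd interactive session. The device name sdc is just an example; substitute the path reported by "show paths" as degraded:

```
# Open the multipathd interactive console (as root):
multipathd -k

multipathd> show paths          # find the degraded path, e.g. sdc
multipathd> del path sdc        # remove the path from the multipath map
# ... SAN admins repair the link ...
multipathd> add path sdc        # re-add the path; no service interruption
multipathd> quit
```

Both operations act only on the multipath map, so I/O continues on the remaining path(s) throughout.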

 

Lesson learned: even if all the HBAs perform equally in a normal situation, they may not do so when there is a fault present.

MK
brian944us
Occasional Visitor

Re: Linux Native Multipath

Matti

 

Very true! I have a similar situation where I see a lot of SCSI aborts and the DB was crashing. I increased the SCSI timeout value to 120 seconds and it seems to help. Is there a special configuration I will have to run with native multipath for load balancing etc. on RHEL 6 to avoid the SCSI abort messages, or in cases where we see such messages? I turned on SCSI debugging also.

 

Brian

Simon_G
Occasional Advisor

Re: Linux Native Multipath

Matti

 

Thanks! I was just reading the DM-Multipath guide. I am trying to set this up as close as possible to PowerPath (although I accept PP is more feature-rich).  What, in your opinion, are the settings I should configure for DM-Multipath on RHEL 6.2 to get the best performance? Also, this is a test box with two different storage arrays attached (HP and Hitachi). I just want to test this completely; do you have any ideas? Thanks in advance for your help!

 

Gary Simon

Matti_Kurkela
Honored Contributor
Solution

Re: Linux Native Multipath

Check the compatibility guides of each storage manufacturer for recommended settings for your specific storage models, especially if your storage models are new. RedHat has already integrated the recommended settings for many storage models into the built-in defaults for dm-multipath (see /usr/share/doc/device-mapper-multipath-<version>/multipath.conf.defaults for the full list).

 

Having said that, most available settings seem to be about compatibility and behaviour in the face of path failures, not so much about performance.

 

  • settings like path_checker, prio and path_grouping_policy must match the behavior of the storage system, or multipathing might not work at all (or might cause excessive trespass events in an active/passive storage, completely ruining the storage performance). You should absolutely match the storage manufacturer's recommendations here.
  • many other parameters are related to how dm-multipath detects and handles path loss (polling cycles, timeout values etc.) and should not have significant performance impact when there are no path failures. If you are running a cluster, make sure the multipath timeouts and cluster timeouts will trigger in an appropriate order (i.e. multipathing should have enough time to try all paths to complete an I/O operation before cluster failover is triggered). In a cluster, you *don't* want the node to retry the I/O forever, but to eventually report the failure to the upper layers, so that the failure can be detected by the cluster system and the application failover can be started.
  • the default path_selector in RHEL 6 is still round-robin: consider the other path selector algorithms, especially if your paths can have unequal performance
  • the only "performance-related" parameters I can see are rr_min_io (for kernels older than 2.6.31) and rr_min_io_rq (for newer kernels, like in RHEL 6). Depending on how your storage system implements its read-ahead and caching, tweaking this might allow some performance advantages to be gained. But any possible gains are probably going to be highly specific to your configuration and/or access pattern, so you will have to test some different values and see if there is any significant effect.
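A sketch of how the rr_min_io tuning looks in practice; the values shown are only starting points for experimentation, not recommendations:

```
# Illustrative /etc/multipath.conf fragment.
defaults {
    # Kernels older than 2.6.31 (e.g. RHEL 5): I/Os sent down one path
    # before switching to the next.
    # rr_min_io      100

    # Kernels 2.6.31+ with request-based multipath (e.g. RHEL 6):
    rr_min_io_rq     1
}
```

A larger value keeps sequential I/O on one path longer (which may help array read-ahead); a smaller value spreads load faster. Since the effect depends on the array's caching and your access pattern, benchmark with your real workload before settling on a value.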
MK