SLES11 and Infiniband ConnectX on DL785G5

 
SOLVED
Stefan John
Occasional Advisor

Hi,

We currently have a problem installing SLES11 with the OFED bundle on a DL785G5 with an InfiniBand HCA (Mellanox MT25418 ConnectX DDR). With only 16 cores enabled everything is fine, but with all processor boards turned on (32 cores) the mlx4 driver fails with the message:

Aug 5 16:08:47 seismo9 kernel: mlx4_core: Mellanox ConnectX core driver v1.0 (April 4, 2008)
Aug 5 16:08:47 seismo9 kernel: mlx4_core: Initializing 0000:41:00.0
Aug 5 16:08:47 seismo9 kernel: vendor=1166 device=0140
Aug 5 16:08:47 seismo9 kernel: mlx4_core 0000:41:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Aug 5 16:08:47 seismo9 kernel: mlx4_core 0000:41:00.0: setting latency timer to 64
Aug 5 16:08:49 seismo9 kernel: mlx4_core 0000:41:00.0: command 0x13 failed: fw status = 0x1
Aug 5 16:08:49 seismo9 kernel: mlx4_core 0000:41:00.0: SW2HW_EQ failed (-5)
Aug 5 16:08:49 seismo9 kernel: mlx4_core 0000:41:00.0: Failed to initialize event queue table, aborting.
Aug 5 16:08:49 seismo9 kernel: vendor=1166 device=0140
Aug 5 16:08:49 seismo9 kernel: mlx4_core 0000:41:00.0: PCI INT A disabled
Aug 5 16:08:49 seismo9 kernel: mlx4_core: probe of 0000:41:00.0 failed with error -5
Aug 5 16:08:49 seismo9 kernel: NET: Registered protocol family 27

Does anyone have an idea?
Believing in miracles can solve many, many problems.
6 REPLIES
Steven E. Protter
Exalted Contributor

Re: SLES11 and Infiniband ConnectX on DL785G5

Shalom,

Check with the hardware maker for more current SLES 11 drivers.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
rick jones
Honored Contributor
Solution

Re: SLES11 and Infiniband ConnectX on DL785G5

I asked around internally and was told the workaround is to disable use of MSI-X interrupts by the driver via:

options mlx4_core msi_x=0

added to /etc/modprobe.conf. I was also told that an official fix would be part of OFED 1.4.2.
there is no rest for the wicked yet the virtuous have no pillows
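
For readers applying the workaround above, here is a minimal sketch of one way to add the option line without duplicating it on repeated runs. The function name `add_mlx4_option` is hypothetical and not part of OFED or the driver. The default path is the /etc/modprobe.conf named in the reply; some distributions expect local changes in a different file, so the path is taken as an argument.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): append the MSI-X workaround
# to a modprobe configuration file, but only if it is not already there.
add_mlx4_option() {
    # Default path is the one named in the reply; pass another as $1.
    conf="${1:-/etc/modprobe.conf}"
    line="options mlx4_core msi_x=0"
    # grep -q: quiet, -x: match the whole line, -F: fixed string.
    # If the file is missing or lacks the line, append it.
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Example (as root): add_mlx4_option /etc/modprobe.conf
```

The option only takes effect when mlx4_core is next loaded, so the module has to be reloaded or the machine rebooted afterwards. If the driver exposes the parameter under /sys/module/mlx4_core/parameters/, reading msi_x there should show 0 once the setting is active.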
Stefan John
Occasional Advisor

Re: SLES11 and Infiniband ConnectX on DL785G5

Hi Rick,

You made my day ;-)

Thanks a lot - that's it. After activating that option for the mlx4_core module, everything looks fine.
Believing in miracles can solve many, many problems.
Stefan John
Occasional Advisor

Re: SLES11 and Infiniband ConnectX on DL785G5

The best forum in the world ;-)
Believing in miracles can solve many, many problems.
rick jones
Honored Contributor

Re: SLES11 and Infiniband ConnectX on DL785G5

You are quite welcome. If you are interested and have the time, it might be worthwhile to run some netperf tests on your setup. The feedback links on www.netperf.org should reach me if you wish to discuss it.
there is no rest for the wicked yet the virtuous have no pillows
Stefan John
Occasional Advisor

Re: SLES11 and Infiniband ConnectX on DL785G5

Thanks Rick,

it would take some time to do real performance tests -
Believing in miracles can solve many, many problems.