<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Infiniband Part 2 ! in Operating System - Linux</title>
    <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258058#M81064</link>
    <description>Thanks to Tim I now have IB interfaces on my blade servers.&lt;BR /&gt;&lt;BR /&gt;If I start a subnet manager on one of them I can ping them both, so happy with that part!&lt;BR /&gt;&lt;BR /&gt;My issue is that I have to connect them into another switch in an Exadata rack.&lt;BR /&gt;&lt;BR /&gt;So from the HP BLc 4X QDR IB Switch in the enclosure I have cables connected to the switch in the Exadata.&lt;BR /&gt;&lt;BR /&gt;These do not show an active link at either end.&lt;BR /&gt;&lt;BR /&gt;Do I have to enable something on the switch for it to become active?&lt;BR /&gt;&lt;BR /&gt;This is the first time I have dealt with IB and also blade enclosures, so sorry if this is a noddy question!</description>
    <pubDate>Thu, 07 Oct 2010 07:31:46 GMT</pubDate>
    <dc:creator>KevB_1</dc:creator>
    <dc:date>2010-10-07T07:31:46Z</dc:date>
    <item>
      <title>Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258058#M81064</link>
      <description>Thanks to Tim I now have IB interfaces on my blade servers.&lt;BR /&gt;&lt;BR /&gt;If I start a subnet manager on one of them I can ping them both, so happy with that part!&lt;BR /&gt;&lt;BR /&gt;My issue is that I have to connect them into another switch in an Exadata rack.&lt;BR /&gt;&lt;BR /&gt;So from the HP BLc 4X QDR IB Switch in the enclosure I have cables connected to the switch in the Exadata.&lt;BR /&gt;&lt;BR /&gt;These do not show an active link at either end.&lt;BR /&gt;&lt;BR /&gt;Do I have to enable something on the switch for it to become active?&lt;BR /&gt;&lt;BR /&gt;This is the first time I have dealt with IB and also blade enclosures, so sorry if this is a noddy question!</description>
      <pubDate>Thu, 07 Oct 2010 07:31:46 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258058#M81064</guid>
      <dc:creator>KevB_1</dc:creator>
      <dc:date>2010-10-07T07:31:46Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258059#M81065</link>
      <description>If I am reading this right... sorry if I am shooting blind.&lt;BR /&gt;&lt;BR /&gt;My IB connections did not show a ready/active link until I loaded the drivers and configured the interfaces.&lt;BR /&gt;&lt;BR /&gt;That is, whether I looked at the lights manually or checked the switch port status, there were no link lights until I configured the HCA from the OS.&lt;BR /&gt;&lt;BR /&gt;You can review the HCA status either with the ibstat diagnostics or with cat /sys/class/infiniband/mlx4_0/ports/*/state</description>
      <pubDate>Fri, 08 Oct 2010 13:15:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258059#M81065</guid>
      <dc:creator>Tim Nelson</dc:creator>
      <dc:date>2010-10-08T13:15:30Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258060#M81066</link>
      <description>Tim&lt;BR /&gt;&lt;BR /&gt;Sorry I haven't got back sooner!&lt;BR /&gt;&lt;BR /&gt;Found the issue: when someone built the hardware for me they decided to put the cables in the switch upside down!&lt;BR /&gt;&lt;BR /&gt;OK, so next problem!&lt;BR /&gt;&lt;BR /&gt;When I start up IB on the blade servers it causes the subnet manager on the switch to die with a mem segfault, but I can run the subnet manager on one of the servers and it is all OK?&lt;BR /&gt;&lt;BR /&gt;Would ideally like to sort this out.&lt;BR /&gt;&lt;BR /&gt;The problem started when I moved cables from the blade switch to other switches in the Exadata for resilience.&lt;BR /&gt;&lt;BR /&gt;I have powered off/on all the switches involved.&lt;BR /&gt;&lt;BR /&gt;Now it is getting annoying!</description>
      <pubDate>Wed, 13 Oct 2010 09:26:18 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258060#M81066</guid>
      <dc:creator>KevB_1</dc:creator>
      <dc:date>2010-10-13T09:26:18Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258061#M81067</link>
      <description>I do not have the answer for you, just some thoughts.&lt;BR /&gt;&lt;BR /&gt;I was initially testing with a server as the subnet manager. I then realized that if the server went down the whole IB network would stop. Not sure why anyone would want that.&lt;BR /&gt;&lt;BR /&gt;So I have my switches (Voltaire 2046s) be the subnet manager and do not let the server(s) start a subnet manager. Disable with chkconfig opensm (I think).&lt;BR /&gt;&lt;BR /&gt;I would not think that starting an SM on a server connected to switches that already have a master/slave SM running would affect them, but you never know.&lt;BR /&gt;&lt;BR /&gt;These are just thoughts, and may only lead you to a solution.</description>
      <pubDate>Wed, 13 Oct 2010 16:55:57 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258061#M81067</guid>
      <dc:creator>Tim Nelson</dc:creator>
      <dc:date>2010-10-13T16:55:57Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258062#M81068</link>
      <description>"When I start up the IB on blade servers it causes the subnet manager to die on the switch with a mem segfault - but can run the subnet manager on one of the servers and it is all ok ?"&lt;BR /&gt;&lt;BR /&gt;Is this the switch in the Exadata rack, or the HP switch?  &lt;BR /&gt;&lt;BR /&gt;Regardless, perhaps a tangent, but in my opinion, if indeed the subnet manager on the switch dies with a mem segfault, you should go ahead and exercise your support contract and get a defect filed.  Of course, the first question out of the support folks will probably be to ask if the switch is running the latest bits...</description>
      <pubDate>Wed, 13 Oct 2010 17:04:52 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258062#M81068</guid>
      <dc:creator>rick jones</dc:creator>
      <dc:date>2010-10-13T17:04:52Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258063#M81069</link>
      <description>Tim&lt;BR /&gt;&lt;BR /&gt;I don't start the SM on the server. As soon as IB runs on the blade servers it kills the SM on all 3 of the Exadata switches, so IB goes down and I have to run the SM up on a server to get the fabric up.&lt;BR /&gt;&lt;BR /&gt;If I take IB down on the 2 blade servers the SM on the switches springs to life.&lt;BR /&gt;&lt;BR /&gt;So far I have tried 3 versions of IB on the blades and all of them cause the issue, so at least it is consistent.&lt;BR /&gt;&lt;BR /&gt;Rick&lt;BR /&gt;&lt;BR /&gt;Don't think it is an issue with the Exadata unfortunately, as it works OK when the blades are not running IB.&lt;BR /&gt;&lt;BR /&gt;The problem seems to have started after I moved 2 of the blade cables from one of the Exadata switches to put 1 in each of the other 2 switches in the Exadata.&lt;BR /&gt;&lt;BR /&gt;Is there some sort of ARP/routing table that needs resetting on the switch?</description>
      <pubDate>Thu, 14 Oct 2010 07:39:17 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258063#M81069</guid>
      <dc:creator>KevB_1</dc:creator>
      <dc:date>2010-10-14T07:39:17Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258064#M81070</link>
      <description>I don't know much about the switches here, but on first principles, a bit of software in one device (e.g. a switch) should not segfault because another device is connected. That sounds like a recipe for a denial of service.&lt;BR /&gt;&lt;BR /&gt;That said, I could see where, say, dueling subnet managers might cause one or another to decide to take itself offline, but I see that as being very different from going down with a segfault.</description>
      <pubDate>Thu, 14 Oct 2010 16:11:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258064#M81070</guid>
      <dc:creator>rick jones</dc:creator>
      <dc:date>2010-10-14T16:11:15Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258065#M81071</link>
      <description>I don't have enough experience with these to come up with any other ideas, sorry.&lt;BR /&gt;&lt;BR /&gt;Maybe check out the docs on the switches and the Exadata config?</description>
      <pubDate>Thu, 14 Oct 2010 16:11:39 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258065#M81071</guid>
      <dc:creator>Tim Nelson</dc:creator>
      <dc:date>2010-10-14T16:11:39Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258066#M81072</link>
      <description>Just to close this.&lt;BR /&gt;&lt;BR /&gt;Found this util on one of the Exadata servers: /opt/oracle.SupportTools/ibdiagtools/verify-topology&lt;BR /&gt;&lt;BR /&gt;When I ran it, it showed an error that I had 2 external switches connected together, i.e. the blade switch and the spine switch in the Exadata.&lt;BR /&gt;&lt;BR /&gt;I disconnected this connection and hey presto, the SM on the switches is now working!&lt;BR /&gt;&lt;BR /&gt;Thanks for all your input.</description>
      <pubDate>Fri, 22 Oct 2010 07:17:27 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258066#M81072</guid>
      <dc:creator>KevB_1</dc:creator>
      <dc:date>2010-10-22T07:17:27Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband Part 2 !</title>
      <link>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258067#M81073</link>
      <description>You mean there was a loop?</description>
      <pubDate>Fri, 22 Oct 2010 16:51:23 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/infiniband-part-2/m-p/5258067#M81073</guid>
      <dc:creator>rick jones</dc:creator>
      <dc:date>2010-10-22T16:51:23Z</dc:date>
    </item>
  </channel>
</rss>

