Udi Karni
Occasional Advisor

NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

We have 2 servers with NC522SFP 10Gbit cards running Windows 2003 SP2.

The 2 cards on the 2 servers are connected via crossover cables - so no network, latency, etc.

We can't seem to push more than 2-3 Gbit/sec even though it appears we are using the correct 64-bit drivers, etc.

??????????????
rick jones
Honored Contributor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

A litany of the usual sorts of boilerplate questions:

*) What sort of server hardware are you running?

*) are you using the latest firmware and drivers from www.hp.com? I think the driver is up to 4.0.505-1 - at least in the Linux space.

*) what are you using to measure transfer rate?

*) is any one CPU in your server(s) saturated when you are measuring transfer rates?

*) are there any packet losses reported for the protocol you are using for transfer? (e.g. TCP)

*) before you switched from 1 Gig to 10G, how saturated was your most heavily saturated CPU while doing transfers over the 1Gig link?
there is no rest for the wicked yet the virtuous have no pillows
Udi Karni
Occasional Advisor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

Thanks for picking this up.

The servers are DL785 running Win-2003 SP2 x64. They are running the latest firmware and drivers - 4.0.505.0 from 8/20/09.

We have Oracle 11g on both machines and we copy tables across an Oracle db-link. This is a CPU intensive process that maxes out one of the 32 CPUs while pushing about 35 MByte/sec. So - 3-4 of these max out a 1 Gbit connection, and since we want to run many of them concurrently - we are testing 10 Gbit. However - no matter how many we pile on - the curve turns horizontal past 2 Gbit and doesn't increase by much beyond that. We check the speed via perfmon - both for network and disk write speed. The disks on the receiving side can write well over 1 Gbyte/sec - so that's not a problem.
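As a rough sanity check on the numbers above, here is a minimal sketch of how many CPU-bound streams it would take to fill a link. It assumes decimal units (1 MByte = 10^6 bytes) and perfectly linear scaling, neither of which is guaranteed; the function name is only illustrative.

```python
import math

def streams_to_fill(link_mbit, per_stream_mbyte=35):
    """Streams needed to fill a link, assuming each stream scales linearly."""
    per_stream_mbit = per_stream_mbyte * 8  # 35 MByte/s -> 280 Mbit/s
    return math.ceil(link_mbit / per_stream_mbit)

print(streams_to_fill(1_000))   # 1 Gbit link
print(streams_to_fill(10_000))  # 10 Gbit link
```

Under those assumptions it takes 4 streams to fill 1 Gbit (in line with the 3-4 observed) and 36 to fill 10 Gbit, so a plateau near 2 Gbit corresponds to only about 7 streams' worth of traffic.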

Also - every card has 2 ports, and the card as a whole won't exceed 2-3 Gbit. With both ports in use, that total just gets divided in half between them.

I don't think there are packets getting lost.

Before we switched we would run 4 concurrent jobs over 1 Gbit and see around 120 Mbyte/sec transfer rates - so that works flawlessly.

One thing to note - the card is on an X4 instead of the recommended X8 - we have no X8 available. However - we were led to believe that with an X4 we should still expect around 5 Gbit, but 2 Gbit is a little low.

Increasing Jumbo frames to 9000 added maybe 10%. Increasing the max number of Jumbo frames, number of receiving and sending buffers - and setting TCP parms in the registry - 1323=1 and Windowsize as high as 1/4GB didn't help at all.

help !
rick jones
Honored Contributor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

I've not had occasion to play with an NC522SFP in a DL785, but I have been playing with one recently on a DL585 G6 (which G is your 785 and which frequency?) using netperf. I am able to achieve rather more than 2-3 Gbit/s - 8 concurrent netperf TCP_STREAM tests have been getting 9+ Gbit/s using RHEL5.3 and the "505" driver from www.hp.com. I am using an x16 (PCIe1.1 IIRC) slot even though it is an x8 (PCIe 2.0) card. I'm using a 1MB requested socket buffer size and 64KB sends. The destinations are spread across a pair of other DL585 G6's with HP AD386A Single-port PCIe 10G NICs (those are only "supported" on Integrity servers).

PCIe1.1 x4 is "the same bandwidth" as a PCI-X 133 MHz slot. The very best you can expect from such a slot is ~7 Gbit/s.
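To put rough numbers behind that comparison, here is a small sketch of the theoretical per-direction link rates. It assumes the standard figures of 2.5 GT/s per lane for PCIe 1.x and 5 GT/s for PCIe 2.0, both with 8b/10b encoding, and a 64-bit PCI-X bus; the function names are illustrative.

```python
def pcie_gbit(lanes, gt_per_s=2.5):
    """Per-direction data rate of a PCIe 1.x/2.0 link after 8b/10b encoding."""
    return lanes * gt_per_s * 8 / 10

def pcix_gbit(clock_mhz=133, width_bits=64):
    """Peak rate of a PCI-X bus (shared and half-duplex)."""
    return clock_mhz * width_bits / 1000

print(pcie_gbit(4))       # PCIe 1.1 x4
print(pcie_gbit(8))       # PCIe 1.1 x8
print(pcie_gbit(4, 5.0))  # PCIe 2.0 x4
print(pcix_gbit())        # PCI-X 133 MHz, 64-bit
```

These are ceilings before PCIe packet framing and DMA overhead, so a real card in an x4 Gen1 slot should be expected to land somewhat below the 8 Gbit/s link rate. On paper an x4 Gen2 link matches an x8 Gen1 link, but only if both the slot and the card actually negotiate Gen2.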

Can you tell how many CPUs are taking interrupts from the NIC? Does Windows 2003 SP2 have support for MSI-X functionality to make that happen on more than one core? If not, then all the interrupts will be handled by one core and it won't matter at that point how many more concurrent connections you are running - that core will become the bottleneck.

In broad handwaving terms, within the context of the "technology" and not specifics of implementation, 10 GbE does little to nothing to make it easier to transfer data than it was for 1 GbE, or for that matter than it was for 100 BT or 10 BT. It still takes just as many CPU cycles to transfer a KB of data with each. Now, there certainly have been implementation details, outside the scope of the IEEE specs, that mean the 100 BT cards might have had better programming models than the 10 BT cards, and the 1 Gig cards have added CKO and TSO to make them "easier" than 100BT. The main thing that 10G NICs add beyond 1 Gig NICs is the ability to spread interrupts across more than one core. Hence the interest in 2003 SP2's support for "MSI-X" and whether or not interrupts from the NIC are being spread - especially on the receiver.

There is still latency when connected back to back - it should though be small :) It likely doesn't matter here since you have several connections, but even back to back, a TCP window size that could hit link-rate over 1 GbE may not be sufficient to hit link-rate over 10 GbE... but we aren't there yet with this situation :)
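The window-size point is just the bandwidth-delay product. A minimal sketch, assuming a round-trip time of 100 microseconds purely for illustration (it is not a measurement from either setup):

```python
def window_bytes(link_mbit, rtt_us):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    # Mbit/s times microseconds gives bits; divide by 8 for bytes.
    return link_mbit * rtt_us // 8

RTT_US = 100  # assumed round-trip time, not measured

print(window_bytes(1_000, RTT_US))   # 1 GbE
print(window_bytes(10_000, RTT_US))  # 10 GbE
```

With that assumed RTT a single connection needs about 12.5 KB in flight at 1 GbE but about 125 KB at 10 GbE, which is more than a 64 KB window without window scaling can cover.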
there is no rest for the wicked yet the virtuous have no pillows
rick jones
Honored Contributor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

Correction - my netperf testing thus far has been with the NC522SFP in an x8 slot in the DL585 G6.
there is no rest for the wicked yet the virtuous have no pillows
Udi Karni
Occasional Advisor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

Rick - thanks. It's a DL785G5, and the slot is an X4 gen1.

I am kicking off 14 transfers - and each one saturates a processor - so on a 32-way - each fully used processor equals 3% CPU - so I do in fact see around 42%-43% CPU utilization. However - I am not seeing the 35 MByte/sec per process I would normally see if I kicked off only 1 or 2 or 3 of them.
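A quick sketch of the arithmetic behind those figures, assuming decimal units and that the 2-3 Gbit/s plateau is shared evenly across the 14 transfers (an assumption, since per-transfer rates were not reported):

```python
def cpu_percent(busy_cpus, total_cpus=32):
    """Overall CPU utilization if busy_cpus cores are fully saturated."""
    return 100 * busy_cpus / total_cpus

def per_stream_mbyte(aggregate_gbit, streams):
    """MByte/s per stream when an aggregate rate is split evenly."""
    return aggregate_gbit * 1000 / 8 / streams

print(cpu_percent(14))           # percent of a 32-way box
print(per_stream_mbyte(2, 14))   # MByte/s per transfer at 2 Gbit/s total
print(per_stream_mbyte(3, 14))   # MByte/s per transfer at 3 Gbit/s total
print(14 * 35 * 8 / 1000)        # Gbit/s if all 14 ran at 35 MByte/s
```

So 14 saturated cores is about 43.75% of the box, matching what perfmon shows, but each transfer would be moving only roughly 18-27 MByte/s instead of 35, against an ideal aggregate of about 3.9 Gbit/s.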

All that doesn't really answer your question of whether the interrupts are handled by many CPUs or there's one bottleneck CPU that's taking all the hits. I'm still scratching my head how I look that up.

Are you an HP employee? We routinely have HP employees RDP to our servers so they can see for themselves - using the HP virtual room software. If you have the time or inclination - that would be fabulous. If you can't - I still appreciate any and all suggestions you have - and I will try and pursue them.
rick jones
Honored Contributor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

Yes, I am "HP" although almost all of what I've been saying thus far has been in the guise of my alter-ego "Mr. Netperf." Your post caught my eye because I happen to be measuring NC522SFP performance on the DL585 G6 under Linux.

I don't think I am sufficiently Windows-savvy to make my RDP'ing into your system worthwhile, sorry.

Meanwhile, your mentioning having used an x4 slot rather than an x8 slot got me curious, so I've shifted my NC522SFP from an x8 slot (#5) of the DL585 G6 to an x4 slot (#6) and will see what comes of it. Sending to two separate systems seems to peak at about 5400 Mbit/s; the receive figures aren't measured yet. Since we have different HW and very different OSes, YMMV :)
there is no rest for the wicked yet the virtuous have no pillows
Udi Karni
Occasional Advisor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

So shifting to an X4 cuts performance in half to 5Gbit? Before you were mentioning 7Gbit for X4 - so I'm trying to make sure I am not getting all confused with all the different numbers.

Also - since your server is a G6 - are these X4 gen 2? Those are really like X8 gen 1, right? If so - that's even more curious because it would mean that the "4" after the "X" trumps the "gen". Am I getting all confused? I thought that an X8 gen 1 should give you full performance and by extension - X4 gen 2 should as well.

?????????????????????????
Udi Karni
Occasional Advisor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

If this is in fact our card -

http://forums.techarena.in/web-news-trends/1161416.htm

Then it would appear that it supports MSI-X and spreads the interrupt load around. No?
rick jones
Honored Contributor

Re: NC522SFP 10Gbit card only pushes 2-3 Gbit - Win-2003 x64

There is what the slot can do, and then what a card does in the slot.

My PCIe 1.1 x4 ~= PCI-X 133 handwaving arriving at 7 Gbit/s was based on a different PCI-X 10G card getting 7 Gbit/s in an Integrity rx4640 and that maxing-out the PCI-X 133 MHz slot on that system.

The first couple inbound tests on my DL585 G6 RHEL 5.3 setup have completed, and they have hit ~6200 Mbit/s. That is getting closer to the 7 Gbit/s handwaving I did.

Also, depending on where things are running on your DL785, and which slot is in use by the NIC, the NC522SFP may have to reach as far as 4 HT hops away to fetch the data to transmit (one from the HT2PCIe chip to the first CPU and as many as three to reach memory hanging from another CPU). In the DL585 G6, since it is only a 4P configuration, the most it has to reach-out is three HT hops. The farther away the memory, well you get the idea. That Integrity rx4640 was a bus-based system so there was only ever one "hop" essentially.

As for G6 being PCIe 2.0: it is my understanding that the Intel-based G6 systems have PCIe 2.0 slots (the DL380 G6 at least; I have one on internal order for just that very reason). I *do* know that a DL785 G6 is still PCIe 1.1. I ass-u-me, but do not know, that a DL585 G6 also remains PCIe 1.1; the DL585 G6 Quickspecs seem to be mum on the topic.
there is no rest for the wicked yet the virtuous have no pillows