
tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

Tom O'Toole
Respected Contributor

tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

I'm trying to use an AB287A PCI-X NIC which supposedly gets 900MB/s with HP-UX. I have tried both Multinet and TCPIP Services (NFS server, CIFS, vanilla FTP, and other clients) with varying degrees of success, from lackluster to horrible. The best I have been able to push has been around 200MB/s, with multiple FTP clients writing out to different systems. I'm using a 2-core rx2660 with VMS 8.4, and when trying to get big throughput the system seems capped on CPU, in particular interrupt stack time of over 100% (out of the 200% available)...

 

(It doesn't seem to be the NIC driver. For what it's worth, I ran a single FTP client with TCPIP Services and did SDA PRF sampling, with the following result:

TCPIP Services: 38%
X-frame driver (the NIC): 8%
SYSTEM_PRIMITIVES: 12.3%
SYSTEM_SYNCHRONIZATION_MIN: 15%)
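
For anyone who wants to watch the same thing on their own box, this is roughly all I was doing; a minimal sketch, and the interval is illustrative:

$ ! Watch per-mode CPU time while a transfer runs. The "Interrupt State"
$ ! line is what pegs near 100% here, i.e. one of the two cores saturated.
$ MONITOR MODES/INTERVAL=5

The module percentages above came from the PRF sampler under SDA (ANALYZE/SYSTEM); its exact command syntax seems to vary by version, so check the help there.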

 

Anybody have any ideas, or can anybody confirm (or deny) that they get bad performance too? I can try jumbo frames, but it's a hassle to have it set on all the switches if it isn't going to make a difference.
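
For reference, if I do get to it: on the VMS side, jumbo frames are a LANCP setting; a minimal sketch, assuming the 10GbE device is EWA0:

$ MCR LANCP
LANCP> SET DEVICE EWA0/JUMBO      ! takes effect immediately
LANCP> DEFINE DEVICE EWA0/JUMBO   ! persists across reboots
LANCP> EXIT

Every switch and far-end NIC in the path would have to accept the larger MTU too, which is the hassle part.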

I'm disappointed because we bought this NIC on the renew market and it wasn't cheap...

Can you imagine if we used PCs to manage our enterprise systems? ... oops.
7 REPLIES
Steven Schweda
Honored Contributor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

   With my antique collection, I barely have GB Ethernet on my VMS
hardware, so I know nothing, but ...

> [...] for what it's worth, I ran a single ftp client with tcpip
> services [...]

   It's not worth much to me.  This is one system talking to itself, or
what?  FTP doing what, exactly?  I'd expect an ASCII transfer to have
more overhead than a binary one, for example.

> [...] I can try jumbo frames, [...]

   My dim recollection is that that helped when I installed the GB cards
in my XP1000 systems.  It'd be one of the first things I'd do.  (But
what do I know?)  You weren't using jumbo frames, with the old GB
interfaces?

> [...] It's a hassle to have it set on all the switches, [...]

   My switches are all unmanaged/stupid, so that part was easy here.

Hoff
Honored Contributor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

Ring up HPE support and ask. What little documentation is available here is fairly old, unfortunately. (There are also some discussions and links to Linux performance.)

Device interrupts for a particular device always go to a single core.   Interrupts for different devices can be spread out across cores. If you're at 100% from one device, you've maxed out what OpenVMS and its driver and the device can do.  There's not much you can do to make that faster, short of offloading other activity and/or trying to avoid or to reduce the interrupts with (for instance) larger packets.
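
If the 10GbE driver is Fast Path capable on this release, the device's interrupt servicing can at least be steered off the busiest core; a hedged sketch, assuming the device is EWA0 and the FAST_PATH system parameter is enabled:

$ SET DEVICE EWA0/PREFERRED_CPUS=1   ! move this device's fast-path work to CPU 1

That redistributes the load across the two cores; it does not reduce it.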

But I wouldn't be disappointed in the NIC itself (it very clearly works at close to its rated speed), but rather in OpenVMS and the driver and IP stack here. OpenVMS IP networking performance can be substantially slower than RHEL's, per some comp.os.vms discussions, for instance.

FTP? FTP isn't a network performance test tool. It's, among other things (and ignoring its insecurity), a decent test of storage I/O performance. Not network performance.

Usual recommendations: check and upgrade firmware, apply OpenVMS patches, confirm the duplex settings all match, work through the available tuning to ensure pool and buffers and related are available, and then contact HPE support. And consider avoiding FTP as a performance test tool, as it adds disk I/O into the mix, and disk I/O can throttle the aggregate available network transfer speed.
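
A first pass at those checks, as a sketch (the device name is illustrative):

$ MCR LANCP SHOW DEVICE EWA0/CHARACTERISTICS   ! link speed, duplex, jumbo state
$ TCPIP SHOW INTERFACE                         ! the IP stack's view of the NIC
$ SHOW MEMORY/POOL                             ! nonpaged pool headroom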

Tom O'Toole
Respected Contributor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

Thanks so much for your responses. I don't think (but don't know for certain either) that it is interrupt servicing that is swamping it; the stuff in the driver code and primitives is a fraction of it. I think there is other non-device stuff running at elevated IPL that is the primary culprit...

The reason for using FTP: of all the things I tried, it gave the fastest throughput I could attain. I was doing multiple client outward pushes to different systems, and got just shy of 200MB/s... my storage is way faster than that, especially with multiple threads pulling... that is not the bottleneck, but... agreed, FTP is for sure testing a bunch of things, not just the network... probably I should use some sort of app like a TCPIP equivalent of DECnet mirror...

I probably will follow up with HPE, but my main aim in posting here is to see what kind of throughput VMS people are getting on 10GbE hardware (and to stimulate discussion...). If nobody is getting good numbers... how come?

I am inclined to be disappointed in the TCPIP stack itself; that appears to be where a lot of cycles are going.

This project is basically about replacing tape with network-based backup, and this same system with just two CPUs routinely transfers a sustained 1GB/s using four LTO4 drives... with idle time to spare. It rocks. <200MB/s is nowhere close to cutting it; I basically need this to run the card at a decent percentage of its rated speed...

Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Hoff
Honored Contributor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

Why aren't folks getting performance with OpenVMS? AFAICT, the folks at HPE have not done any systemic performance-related work in the IP and driver stacks. Not in some years. That, and the comparatively complex OpenVMS network driver stack and kernel design isn't all that helpful when going fast is required. Interrupts, buffer copies, and even caching can all hammer performance.

For some details on what previously-speed-appropriate operating systems are now encountering with newer network hardware and newer requirements, have a look at the network performance info linked earlier, and specifically at the Linux info that was linked from there (such as this); Linux has seen more than a little work in this area, and some of what is being encountered is discussed.

From conversations with some of the developers, VSI is starting to look at some of this area, and has a separate VSI IP network stack in development.

There's a whole pile of NAS-related support missing from OpenVMS, and I expect that's on the list of potential new features over at VSI.

FTP should be expunged from all systems.

iperf and jperf ports are around.
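
iperf runs memory-to-memory, which takes storage out of the picture entirely. A minimal sketch, assuming the port is installed and defined as a foreign command (path and address are illustrative):

$ iperf :== $SYS$TOOLS:IPERF.EXE
$ iperf -s                        ! on the receiving node
$ iperf -c 10.0.0.5 -t 30 -P 4    ! on the sender: 30 seconds, 4 parallel streams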

Tom O'Toole
Respected Contributor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

Yeah... judging from the lack of "I'M GETTING GREAT PERFORMANCE" replies, nobody is. (Or nobody reads this, or nobody cares :-)

My very quick and exceedingly filthy "FTP profiling" (yes, for what it's worth) makes me think the inefficiencies are primarily in the TCPIP stack, and maybe only secondarily in the device driver area... I will attempt to try/profile iperf (not expecting much difference, since it's still IP)... I may have a chance to test MSCP using 10G soon, and if I find anything interesting, will post it here. (MSCP also uses storage, but the storage I will test is speedy.) I will also test DECnet (mc ncp loop node xxx)... time permitting...
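
For the DECnet loop test, it'd be something along these lines; the node name, count, and size are purely illustrative:

$ MCR NCP
NCP> LOOP NODE XXX COUNT 1000 LENGTH 1400
NCP> EXIT

That one is memory-to-memory as well, so it should isolate the stack from the storage.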

Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Jack Trachtman
Super Advisor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

I went through a similar process some months ago, trying to substitute a disk-to-disk backup scheme for our present disk-to-tape scheme. Wound up buying a 10GbE card to test on one of our rx2800-i2 systems, since the throughput using the native 1Gb/s hardware was abysmal. (OpenVMS V8.4, latest patches.)

Unfortunately, I was seeing the same max throughput that you are seeing: about 200MB/s.

I was testing with both FTP & NFS, and had some cases open with HP.  I was told that NFS was much less efficient than FTP, but FTP wasn't that great either.

Maybe VSI's new TCPIP stack will be more efficient.

Tom O'Toole
Respected Contributor

Re: tcpip performance using openvms with 10gbe - anybody getting good throughput? I'm not.

I still haven't tested MSCP, and probably won't; I have too much other stuff going on. I do think anything using TCPIP is going to face a bottleneck there, with both Multinet and TCPIP Services...

After more than enough "due diligence", I ended up going "in another direction", as they say... I'm presenting SAN snapshots of the VMS storage as block devices on a Linux server. The backup software the company is using can deduplicate blocks on the Linux client, so that's a big win for incremental backups... (block-based backups after the first one appear to send only changed blocks across). The throughput through the 10GbE is also way higher; I'm hitting 700MB/s with around 3-4 parallel jobs... Of course, to do a restore we would have to restore to identically sized volume(s) presented to the/a Linux box, which would then be moved over to the VMS system. That seems OK enough for me.

A restore would only be used in the event of "I need a file back from a week ago", or some sort of very bad data corruption incident that would have been propagated to the DR replica at the remote site...

Can you imagine if we used PCs to manage our enterprise systems? ... oops.