network connection that transpired between 2 servers

apple · 11-24-2008

Dear HP-UX gurus,
How can I find out what transpired between 2 servers? Replication is running between these 2 servers, which are in the same network zone.
When we run the replication from the production site, the replication rate is twice as fast. When we run the same program at the disaster site, it is much slower. The server capacity is the same. How can I check the network from the server? With tcpdump? I would appreciate it if you could shed some light. Thank you.

Hein van den Heuvel · 11-24-2008

You should probably focus on the network and ask your network support guys what to expect.
Have you done the fundamental tests?
Base transfer speeds and latencies between all nodes involved? Everything pinging and singing at the right pitch? No surprise routes?

Please refine 'same network zone'.

Is that replication from 'adjacent' production nodes, and adjacent DR nodes, or from production to DR over some significant distance? How far?

fwiw,
Hein.

Bill Hassell · 11-25-2008

Unless your disaster site is located in the same building, or you have enormous amounts of money to purchase a very high speed link, you cannot expect replication to run at the same speed as it does on a local network. Your local network is probably 100 Mbit, which is roughly 65 to 100 times faster than a typical T1 or DSL link (about 1-1.5 Mbit). Perhaps you have a T3 link (about 45 Mbit), which is a lot more expensive but will give you less than half the speed of your 100 Mbit link. And if your local network is running at 1000 Mbit, you will see a horribly big speed difference. For replication to run at the same speed at both sites, the slowest link in the network between your two sites must be at least as fast as your local network.
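
To put rough numbers on that comparison, here is a minimal Python sketch that estimates the best-case time to move a batch of data at each nominal line rate. The 10 GB batch size is an arbitrary illustration, the rates are nominal figures (T1 at 1.544 Mbit/s, T3 at 44.736 Mbit/s), and protocol overhead and latency are ignored, so real transfers will be slower.

```python
# Nominal line rates in megabits per second (illustrative assumptions).
LINK_MBIT = {
    "T1": 1.544,
    "T3": 44.736,
    "100 Mbit LAN": 100.0,
    "1000 Mbit LAN": 1000.0,
}


def transfer_seconds(size_bytes, link_mbit):
    """Best-case seconds to move size_bytes over a link, ignoring overhead."""
    bits = size_bytes * 8
    return bits / (link_mbit * 1_000_000)


if __name__ == "__main__":
    size = 10 * 10**9  # hypothetical 10 GB replication batch
    for name, mbit in LINK_MBIT.items():
        minutes = transfer_seconds(size, mbit) / 60
        print(f"{name:>14}: {minutes:8.1f} minutes")
```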

Bill Hassell, sysadmin

apple · 11-25-2008

Dear sir,
Thank you.
By "same network zone" I mean there is no firewall restriction between them. However, our network infrastructure is not the same at these 2 locations.
I tried to sftp one file between the 2 servers that have the slow replication:
/tmp/toss_dbv_oracle.ksh 100% 486 0.5KB/s 0.5KB/s 00:00
Max throughput: 0.5KB/s

What do you think of the throughput? While the replication is running, we will ask the network team to capture the network log, but what is best to capture from the server during that time? No bad route is defined in the routing table. Hope to hear from you. Thank you.

apple · 11-25-2008

The headquarters site also shows the same 0.5KB/s. Sir, do I need to transfer a big file to know the latency between these servers? The file that I tried to transfer is 486 kbyte. Thank you.

Suraj K Sankari · 11-25-2008

Hi,

Did you check these things?
1. Network card speed/duplex
2. Speed/duplex of the switch port this server is connected to

Do the same for the other server.

Suraj

Bill Hassell · 11-25-2008

0.5 KB/sec is modem speed, about 4800 baud. This is so slow that you need to call your network department and ask about the link between your disaster site and your local site. There is nothing you can do on the computer to improve this.
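
For reference, the conversion behind that comparison can be sketched in Python. It assumes that sftp's "KB" means 1024 bytes and, for the serial-line figure, that each byte costs 10 bits on the wire (8 data bits plus start and stop bits). Both are assumptions, not something stated in the thread.

```python
def kb_per_sec_to_bits_per_sec(kb_per_sec, bits_per_byte=8):
    """Convert a KB/s reading (assuming 1 KB = 1024 bytes) to bits per second."""
    return kb_per_sec * 1024 * bits_per_byte


# Raw data rate, 8 bits per byte.
print(kb_per_sec_to_bits_per_sec(0.5))
# Approximate async serial line rate, 10 bits per byte (assumption).
print(kb_per_sec_to_bits_per_sec(0.5, bits_per_byte=10))
```

Either way the result lands in the same few-kilobit range as a 4800 baud modem.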

Bill Hassell, sysadmin

rick jones · 11-26-2008

One of the limits to the performance of a TCP connection is the window size divided by the round-trip time. So, if your TCP connection had, say, a window size of 32768 bytes and a round-trip time of half a second, it would never go any faster than 32768/0.5 or 65536 bytes per second, no matter how much CPU horsepower you had and no matter what the "link speed" was between the two end points.
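
As a minimal sketch of that limit, the two helpers below compute the ceiling for a given window and round-trip time, and the window needed to reach a target rate. The function names are made up for illustration, and the 50 ms round-trip time in the second call is a hypothetical value, not a measurement from this thread.

```python
def max_tcp_throughput(window_bytes, rtt_seconds):
    """Upper bound in bytes per second: one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes / rtt_seconds


def window_needed(target_bytes_per_sec, rtt_seconds):
    """Window in bytes required to sustain the target rate at this RTT."""
    return target_bytes_per_sec * rtt_seconds


# The example from the post: 32768-byte window, half-second round trip.
print(max_tcp_throughput(32768, 0.5))

# Window needed to fill a 100 Mbit/s path (12.5 MB/s) at a hypothetical 50 ms RTT.
print(window_needed(12_500_000, 0.05))
```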

There are other factors. For example, the "effective" window size will be the least of that TCP window, the 'SO_SNDBUF' size on the sender, the "congestion window" on the sender, and how much data the application puts out into the connection at one time.

If there are packet losses between the two endpoints, that will reduce the "congestion window", so while the TCP receiver may advertise a window of 32768 bytes, the TCP sender may not send more than, say, 26280 bytes at a time, and the throughput would be only 52560 bytes per second.

Similarly, if the application only sends, say, 8KB of data at one time without waiting for a response from the remote, the throughput would only be 8192/0.5 or 16384 bytes per second.
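
The "least of" rule can be sketched as a small Python helper that reports which limit is binding and the throughput it implies. The function and the keyword names are illustrative only; the figures are the ones used in the post, with a half-second round-trip time.

```python
def limiting_factor(rtt_seconds, **limits_bytes):
    """Return (name, bytes_per_second) for the smallest of the given limits."""
    name = min(limits_bytes, key=limits_bytes.get)
    return name, limits_bytes[name] / rtt_seconds


# Packet loss has shrunk the congestion window below the advertised window.
print(limiting_factor(
    0.5,
    receive_window=32768,
    send_buffer=32768,
    congestion_window=26280,
    app_write=32768,
))

# The application only writes 8 KB at a time.
print(limiting_factor(
    0.5,
    receive_window=32768,
    send_buffer=32768,
    congestion_window=32768,
    app_write=8192,
))
```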

A tcpdump trace taken at both the sending and receiving systems would be useful. From it one could discern the window sizes, whether there was packet loss, the round-trip time (latency) and the like. It would be best if the trace were only of the connection of interest. If you know the port number in advance, you can use that in the tcpdump filter statement. I don't know the well-known port number for an sftp data connection, but you can look it up I suspect.

Otherwise, you might install netperf (http://www.netperf.org) on both ends and use that instead of sftp for testing your link. You can experiment with different window/socket buffer sizes and you can set the port number for the data connection explicitly.

So, something like:

Pick a port number, say 12345, then

On what will be the sending system do

tcpdump -i <interface> -w /tmp/sending.raw port 12345

on the receiving system do

tcpdump -i <interface> -w /tmp/recving.raw port 12345

then while those are running, on the sending system do:

netperf -H <receiver> -l 10 -- -P 12345

When the netperf command completes, save the output and terminate the two tcpdump commands.

You can then "post process" the raw tcpdump traces with either tcptrace or even wireshark or tcpdump itself:

tcpdump -r <file> -v > <file>.cooked

Depending on how fast the netperf runs, the file may be larger than should be posted here. We will want to see the "SYN" segments from connection establishment - the ones with "S" in the cooked output (that is why tcpdump must be started first on both ends) and then you will need to go through the "cooked" files looking at the sequence numbers - that is how TCP identifies the bytes it is sending and retransmitting. You will see an initial large value shown for the sequence (and ACK) numbers on the SYN segments, and then tcpdump will "normalize" those to zero and show much smaller numbers for each as the data flows. You want to see if the sequence numbers repeat or overlap in the trace - that indicates retransmissions.
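
The "repeat or overlap" check can be sketched in Python once the sequence ranges have been pulled out of a cooked trace. This is only a sketch under assumptions: the segments are given as (first, last) pairs of relative sequence numbers for one direction, in capture order, with the end exclusive as in tcpdump's first:last notation. Extracting those pairs is left out because the text format differs between tcpdump versions. Reordered packets would be flagged too, and sequence number wraparound is ignored.

```python
def find_retransmissions(segments):
    """Return the segments that cover bytes already seen earlier in the trace."""
    highest_seen = None
    suspects = []
    for start, end in segments:
        if end <= start:
            continue  # no payload (for example a pure ACK)
        if highest_seen is not None and start < highest_seen:
            suspects.append((start, end))
        highest_seen = end if highest_seen is None else max(highest_seen, end)
    return suspects


# Hypothetical trace: the second segment is sent again after the third.
trace = [(1, 1449), (1449, 2897), (2897, 4345), (1449, 2897), (4345, 5793)]
print(find_retransmissions(trace))
```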

Another way to go without going to tcpdump first is to run ping and/or a netperf TCP_RR test between the endpoints to get the round-trip-time, and then run netstat -s -p tcp before your netperf command (or sftp) and then again after:

netstat -s -p tcp > before
netperf ...
netstat -s -p tcp > after

and then run before and after through 'beforeafter' from ftp.cup.hp.com under dist/networking/tools/

It won't tell us the window size (but netperf will report the socket buffer sizes, which will be the same thing if this is UX on both ends), and we can see if there are reported retransmissions in the netstat statistics.
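
If 'beforeafter' is not at hand, the same idea can be approximated with a short Python sketch. This is not the beforeafter tool. It assumes each statistics line starts with a counter followed by a description, compares only that leading counter, and replaces any digits inside the description with "#" so that lines with embedded byte counts still match up. The sample text is invented for illustration and is not real netstat output from these servers.

```python
import re

# A leading integer counter followed by its description.
COUNTER_LINE = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def parse_counters(text):
    """Map each description to its leading counter value."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            key = re.sub(r"\d+", "#", match.group(2))
            counters[key] = int(match.group(1))
    return counters


def counter_deltas(before_text, after_text):
    """Return only the counters that changed, as after minus before."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }


# Invented sample snapshots.
before = "tcp:\n    100 packets sent\n        5 data packets retransmitted\n"
after = "tcp:\n    350 packets sent\n        9 data packets retransmitted\n"
print(counter_deltas(before, after))
```

In practice the two inputs would be the contents of the before and after files saved above.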

there is no rest for the wicked yet the virtuous have no pillows
