Telnet/SSH intermittent hanging on new Itanium servers

 
roose
Regular Advisor

Hi Guys,

Just wanted to check with the forum whether anybody else has already run into the issue we are having with our new Itanium servers and, if they were able to resolve it, what they did.

We are currently in the midst of migrating our Alpha ES80 servers (OVMS 7.3-2) to rx7640 servers (OVMS 8.3-1H1, TCPIP v5.6ECO5). Our rx7640 servers are in a 4-node cluster configuration, whereas our 4 ES80 servers are currently in a 2-node cluster configuration. Also, our Alpha servers have 100 Mbps NICs, whereas our rx7640 servers have 1 Gbps NICs. We have also implemented LAN failover on our rx7640s, with 2 NICs on our LLA0 device; we had nothing like that on our Alphas.

The problem now is that whenever we telnet or ssh directly to our new Itanium servers, we experience intermittent session hangs of 1-2 seconds, after which the sessions continue. So far, sessions have never been disconnected. However, if we telnet to one of our Alpha servers first and then do a set host from there to one of our Itanium servers, we do not see this problem.

We have already set our network switch ports to 1 Gbps full duplex, and our Itanium servers to the same, but we are still seeing this problem.

I opened a case with HP last month, but we are still doing problem isolation as of today.

Has anyone encountered a similar problem before with a setup similar to ours? If so, were you able to resolve it? What did you do to resolve it?
16 REPLIES
roose
Regular Advisor

Here is one of our Alpha server's LANCP info.
Volker Halle
Honored Contributor

roose,

from your description, this sounds like a problem affecting TELNET and SSH, but not SET HOST. For further isolation, you would need to tell us which protocols you're running over which network interfaces, if you believe this problem to be related to the NICs.

Do you, by any chance, have any NICs configured for TCPIP but not connected to the network?

Can you reproduce the problem using $ TCPIP PING/NUMBER_PACKETS=0 (continuous stream of ping packets)?

If you use TELNET (instead of SET HOST) from your Alphas, does the problem also occur?

Volker.
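
When trying to reproduce hangs like these, it can help to put numbers on them rather than judging by eye. The following is only a minimal sketch, assuming you have somehow logged the arrival time (in seconds) of each chunk of session output on the client side; the function name `find_stalls` and the 1.0 s threshold are hypothetical and not part of any OpenVMS or TCPIP Services tool.

```python
from typing import List, Tuple


def find_stalls(timestamps: List[float], threshold: float = 1.0) -> List[Tuple[float, float]]:
    """Return (time_of_last_output, gap_length) for every gap >= threshold.

    `timestamps` is assumed to be sorted arrival times in seconds.
    """
    stalls = []
    # Compare each arrival time with the one before it.
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap >= threshold:
            stalls.append((prev, gap))
    return stalls


if __name__ == "__main__":
    # Made-up sample data: steady output, then a 1.5 s pause.
    sample = [0.0, 0.25, 0.5, 2.0, 2.25]
    for start, gap in find_stalls(sample):
        print(f"stall of {gap:.2f} s after t={start:.2f} s")
```

Running the same analysis on logs from a direct TELNET session and from a SET HOST session would show whether the pauses cluster around a fixed length, which is useful information for the later isolation steps.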
Oswald Knoppers_1
Valued Contributor

Can you also show the output of 'mc lancp sho dev/counter'?

Oswald
roose
Regular Advisor

Volker - Yes, we do have some additional NICs that are not connected to the network. I tried the TCPIP PING command you gave, but I was not able to reproduce the problem. With Telnet from the Alpha, we also encounter this problem.

Oswald - I attached the result of the mc lancp show dev/counter for one of the nodes.
J Asson
Advisor

Is this upon login only, or during normal telnet sessions?

Not specific to your setup but I have seen issues with the config of name servers where delays were introduced to telnet based logins only.

In some cases the lack of an operational name server (meaning hosts had to be locally defined) presented quicker telnet login and when using a name server there were delays.
roose
Regular Advisor

J - this happens during a normal telnet session, not during login. What do you mean by name server? Is it the DNS server? We have already verified with our DNS server admin that our DNS entries are correct.
J Asson
Advisor

Yes - Name Server = DNS.

The only time I have ever seen such delays with in-progress telnet sessions was when we had a faulty switch. What was odd in my case was that remote access was faster than those at the site (despite the last meter of cabling being common) and the issue was then isolated to the switch.

Since you see the issue across all new systems, could you isolate the new kits and confirm the same results with direct network connections to each? I mean no other connections to the systems (one by one) other than your laptop or PC, with no fail-over connections live either. As you can see, I am all for breaking it down into simple chunks.

I assume the usual network port checks on your servers have been performed with no issues identified.

Oswald Knoppers_1
Valued Contributor

The counters look clean to me.

But the 1-2 second delay does indicate packet loss. And this only happens when you use the direct path from your workstation to the new system (not via the Alpha).

Somewhere in the new path this packet loss happens. So let your network people check the path from your workstation to the new system for errors.

Oswald
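
Why would a single lost packet show up as a pause of about a second? A rough illustration, assuming the standard TCP retransmission timer described in RFC 6298 (a 1-second floor on the timeout, doubled after each expiry). The timers actually used by TCPIP Services for OpenVMS are not confirmed anywhere in this thread and may differ, so treat the numbers as illustrative only; the function names are made up for this sketch.

```python
def rto_estimator(rtt_samples, min_rto=1.0, granularity=0.001):
    """Return the retransmission timeout (seconds) after each RTT sample.

    Follows the RFC 6298 update rules; min_rto and granularity are the
    assumed values for this sketch, not measured OpenVMS settings.
    """
    alpha, beta = 1 / 8, 1 / 4
    srtt = rttvar = None
    timeouts = []
    for r in rtt_samples:
        if srtt is None:
            # First measurement initialises both estimators.
            srtt, rttvar = r, r / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
            srtt = (1 - alpha) * srtt + alpha * r
        rto = srtt + max(granularity, 4 * rttvar)
        timeouts.append(max(min_rto, rto))
    return timeouts


def stall_after_losses(rto, losses):
    """Total wait if the same segment is lost `losses` times in a row."""
    return sum(rto * 2 ** i for i in range(losses))


if __name__ == "__main__":
    lan_rtts = [0.002, 0.002, 0.002]  # made-up 2 ms LAN round trips
    rto = rto_estimator(lan_rtts)[-1]
    print("RTO:", rto, "s")
    print("stall after 1 loss:", stall_after_losses(rto, 1), "s")
    print("stall after 2 losses:", stall_after_losses(rto, 2), "s")
```

Under these assumptions, the timeout on a LAN sits at the 1-second floor, so one dropped segment in an interactive session would produce a pause in the range roose describes. Many TCP stacks use different floors, so this only shows that the symptom is consistent with packet loss, not that loss is the cause.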
Volker Halle
Honored Contributor

Oswald,

the TELNET session hangs ALSO happen between the Alpha and the Itanium, according to the response from roose! And they do NOT happen when using SET HOST from Alpha to Itanium.

This indicates that there must be some TCPIP- or TELNET-related problem in the path to the Itanium system!

roose,

please report the output of TCPIP SHOW INT from the Itanium system. I'd like to see on how many and which interfaces TCPIP is configured. If there is more than one, is any of them NOT connected to the LAN?

Volker.
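
The isolation argument above can be written down as a small table of what has been reported in this thread so far. This is only a sketch: the rows restate the observations as given by roose (the ssh observation is assumed to come from the workstation), and the helper names are made up for illustration.

```python
from collections import defaultdict

# (source of the connection, protocol used, hang observed?)
observations = [
    ("workstation", "TELNET", True),
    ("workstation", "SSH", True),
    ("alpha", "TELNET", True),
    ("alpha", "SET HOST", False),
]


def outcomes_by_factor(index, rows):
    """Group the observed outcomes by the value of one factor (column)."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[index]].add(row[2])
    return dict(groups)


def discriminates(index, rows):
    """True if every value of the factor always gives the same outcome."""
    return all(len(seen) == 1 for seen in outcomes_by_factor(index, rows).values())


if __name__ == "__main__":
    print("source explains the hangs:  ", discriminates(0, observations))
    print("protocol explains the hangs:", discriminates(1, observations))
```

With these four rows, the source of the connection does not separate hanging from non-hanging sessions (the Alpha shows both), while the protocol does. That is why the suspicion shifts from the network path to something specific to TCPIP on the Itanium side.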
roose
Regular Advisor

Volker - Yes, we currently have 6 network ports on the server (3 combo cards), and only 3 of them (EIB0, EID0 and EIF0) actually have a physical connection to the LAN. The other 3 (EIA0, EIC0 and EIE0) are open. I am attaching the tcpip show int command output for your reference.
Volker Halle
Honored Contributor

Roose,

o.k. - only ONE interface (LLA0, LAN Failover) is configured for TCPIP. And I guess you're using the SAME interface for DECnet as well (please check with $ MC NCL SHOW CSMA-CD STAT * COMM PORT - it should show LLA). Then this pretty much rules out any physical or data link layer problems.

If you log in to your Itanium systems via SET HOST from the Alpha and then do a TELNET LOCALHOST on the Itanium system, do you see the hangs as well?

Volker.
Cass Witkowski
Trusted Contributor

Itaniums support auto-negotiation. Please make sure that your LAN switch is configured to auto-negotiate.

roose
Regular Advisor

Volker - I have attached the NCL command result and it does show LLA. I tried what you recommended, doing a set host from the Alpha and then a Telnet localhost, but the response time is so severely degraded that I am no longer able to observe whether the session hangs at all.

Cass - Let me clarify: Are you suggesting that we leave the network interfaces at autonegotiate? Currently, we have asked our network admin to set the switch ports that our Itanium servers are connected to to 1 Gbps full duplex. This is due to our earlier experience with our Alphas, where autonegotiation on the switch ports and network interfaces also introduced some session hangs for interactive processes. Next week, I am asking an HP engineer to come in and help us statically set the NICs on the servers to 1 Gbps full duplex from the console level.
Volker Halle
Honored Contributor

roose,

a TELNET localhost on your rx7640 shows bad performance? Could you try this from the OPA0: console terminal of the rx7640 to find out if this problem is a local TCPIP issue?

Note that you have configured DECnet to run on 4 LAN interfaces in parallel; according to your previous responses, EIB, EID and EIF are connected to the network. So whenever you connect to that system using SET HOST, you really don't know which LAN interfaces are being used! For TCPIP you know it's only LLA0! This makes troubleshooting a little bit more complicated, because there could be a problem on LLA0 which you might not see when using DECnet, which may use another interface!

Have you configured LAT on the rx7640? If not, why not start LAT on the rx7640 on just ONE LAN interface (start with LLA0: with DEF LAT$DEVICE LLA before @LAT$STARTUP), then SET HOST/LAT from the Alpha, then try TELNET localhost. You could then easily try each of the other LAN interfaces with LAT as well.

Volker.
Steven Schweda
Honored Contributor

> [...] This is due to our experience before
> with our Alpha's [...]

Were those gigabit Ethernet interfaces? I
thought that auto-negotiation was a required
feature in gigabit Ethernet.

> Please make sure that your LAN switch is
> configured to auto-negotiate.

I'm with him. Same for the VMS systems. If
you actually observe a gigabit Ethernet link
failing because of an auto-negotiated
mis-match, _then_ you might think about
trying to out-smart the automation. Until
then, I'd trust the automation.
Proliant VMS San Mgrs
Frequent Advisor

Roose,

What is the switch at the other end?

I agree with Steven & Cass. The NICs used in AlphaServers are often based on a very old Digital chip, and that had serious problems with auto-negotiation; Alphas were almost always configured with 100 Mbps/Full Duplex at both ends.

I have always had good luck with autonegotiation for any gigabit interface, and I thought that setting was required. I'm somewhat surprised that your network staff did not discuss this with you in detail...

Let us know if you test the autonegotiation for gigabit, and if that fixes the problem. Thanks.

Carl
Problems Solved