remsh hang after first run

SOLVED
jerry1
Super Advisor

remsh hang after first run

We have a Sun box that has moved out of our
network to another corporate network.
When we run remsh host1 -l user1 from it to an HP server, it works on the
first try and then hangs on the second. Only
the HP servers have this problem. The Suns
here still work with remsh over the WAN to
each other, but from Sun to HP it does not.
Has anyone had this problem, or does anyone
know what is causing it?
Nothing has changed except that communication is now over
the WAN. Telnet, SMTP, and FTP work fine, but not
remsh. Looking for r-commands patches.

22 REPLIES
Ganesan R
Honored Contributor

Re: remsh hang after first run

Hi,

Make sure port 514 is open between the servers on every switch, router, and firewall in the path.

If you need other services like rexec or rlogin, you should open these ports also.

Port 512/tcp : rexec (passwd reqd)
Port 513/tcp : rlogin (passwd reqd)
Port 514/tcp : remsh (no passwd reqd)

You can find more details in /etc/services. Confirm with your network people.
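
A minimal sketch of that /etc/services lookup, assuming the usual service names exec, login, and shell for ports 512, 513, and 514 (check your own file; the function name rcmd_ports is only for illustration):

```shell
# Print the service name and port/protocol of the r-command entries
# found in a services file (defaults to /etc/services).
rcmd_ports() {
    awk '$1 == "exec" || $1 == "login" || $1 == "shell" { print $1, $2 }' \
        "${1:-/etc/services}"
}

# On a real host, query the local services database if it is readable.
if [ -r /etc/services ]; then
    rcmd_ports /etc/services
fi
```

This only shows which ports the services are registered on; it does not prove that a firewall between the two networks lets them through.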
Best wishes,

Ganesh.
jerry1
Super Advisor

Re: remsh hang after first run

Oh, this problem shows up with any server
here running HP-UX 11.0 or 11i.


James R. Ferguson
Acclaimed Contributor
Solution

Re: remsh hang after first run

Hi:

You probably need to add the '-n' switch:

# remsh host -l user -n thecommand

See the manpages for more information.

Regards!

...JRF...
Jim Butler
Valued Contributor

Re: remsh hang after first run

hp to sun - use remsh
sun to hp - use rsh.

If it's a permissions thing, you may need to look at either your netgroup file (check authentication, YP, etc.) or simply /etc/hosts.equiv.

Since you mentioned network changes, if IP changes were involved, it sounds like it may be the netgroup thing. Names in netgroup need to match the IP for the group to be trusted. (Did any DNS changes take place?)

Good luck
Man The Bilge Pumps!
jerry1
Super Advisor

Re: remsh hang after first run

I tried using different switches in
inetd.conf for remshd, but they did not work. It does however work with the -n
on the command line like you said. Why??

Why does communicating over the WAN
affect remsh? This has been working okay
for over 10 years.
jerry1
Super Advisor

Re: remsh hang after first run

Jim, you need to read more closely. This is not
a failure every time due to a denial. It fails only on
the tries after the first one, until the first socket is finally closed.

James R. Ferguson
Acclaimed Contributor

Re: remsh hang after first run

Hi (again):

> It does however work with the -n
> on the command line like you said. Why??

You haven't divulged what constitutes the command. If you look at the manpages, as I suggested:

/* begin quote */

By default, remsh reads its standard input and sends it to the remote command because remsh has no way to determine whether the remote command requires input. The -n option redirects standard input to remsh from /dev/null. This is useful when running a shell script containing a remsh command, since otherwise remsh may use input not intended for it. The -n option is also useful when running remsh in the background from a job control shell, /usr/bin/csh or /usr/bin/ksh. Otherwise, remsh stops and waits for input from the terminal keyboard for the remote command.

/* end of quote */

Regards!

...JRF...
jerry1
Super Advisor

Re: remsh hang after first run

James, it doesn't matter what the command is.

remsh host1 -l user1

or

rsh host1 -l user1


They all hang after the first successful
remsh or rsh.

truss output on the Sun shows it's waiting
for a response from the HP.
The HP server is waiting for a response back
from the Sun. I can see the SYN_SENT in
netstat. It does time out after about a
minute. Then you can run another remsh
successfully once that socket connection
is gone. If you try to run one after
that and there is still a TIME_WAIT on that
port, then remsh hangs.

If there was a switch I could put into
the server to fix it, that would be great.
Now we have to look at putting -n into
the code running on a lot of Sun boxes that
access the HP server for info.


Peter Nikitka
Honored Contributor

Re: remsh hang after first run

Hi Jerry,

- it matters in which environment you call the command
- it doesn't matter if it's Sun or HP; my Solaris 10 acts identically.

The first loop below does NOT work without '-n', because remsh/rsh "eats" all input: you just see the first host. The option '-n' prevents rsh from reading stdin, and the loop is executed as expected.

ef3nip00@ansbach[111] uname -a
SunOS ansbach 5.10 Generic_118833-36 sun4u sparc SUNW,A70

ef3nip00@ansbach[112] cat /tmp/fh
forth
ansbach
frankfurt
ef3nip00@ansbach[113] while read h
> do rsh $h 'uname -n; uptime'
> done < /tmp/fh
forth
7:34pm up 21 day(s), 14:02, 0 users, load average: 0.25, 0.58, 0.55
ef3nip00@ansbach[114] while read h
> do rsh -n $h 'uname -n; uptime'
> done < /tmp/fh
forth
7:35pm up 21 day(s), 14:03, 0 users, load average: 0.93, 0.73, 0.61
ansbach
7:35pm up 56 day(s), 13 hr(s), 1 user, load average: 0.55, 0.76, 0.59
frankfurt
7:35pm up 21 day(s), 14:02, 0 users, load average: 0.44, 0.47, 0.48
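
The same effect can be reproduced without any remote hosts. This is a minimal sketch in which fake_rsh is a made-up stand-in that, like rsh without '-n', reads its standard input to end-of-file; redirecting its stdin from /dev/null mimics what '-n' does according to the manpage text quoted earlier in this thread:

```shell
# Hypothetical stand-in for rsh: drains stdin, then reports the "host".
fake_rsh() {
    cat > /dev/null
    echo "ran on $1"
}

hosts='forth
ansbach
frankfurt'

# Without a redirect, the first call swallows the remaining host names,
# so the loop body runs only once.
without_n=$(printf '%s\n' "$hosts" | while read -r h; do fake_rsh "$h"; done)

# With stdin taken from /dev/null (the effect of -n), every host is visited.
with_n=$(printf '%s\n' "$hosts" | while read -r h; do fake_rsh "$h" < /dev/null; done)

echo "without redirect:"; echo "$without_n"
echo "with </dev/null:";  echo "$with_n"
```

Note that this only explains loops and scripts that share stdin with rsh; it does not by itself explain a hang when the same command is typed twice at a prompt.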

mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
jerry1
Super Advisor

Re: remsh hang after first run

We are still looking at the problem, but
everyone responding to this forum question
is not reading it correctly.

It works the first time you issue remsh
from Sun to HP. It does not work on the
second try of the same command and will
not work until the connection is torn
down.
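
A rough sketch for reproducing and timing this from the Sun side. The helper name time_runs is made up for illustration, and it assumes a date command that supports +%s, which older Solaris and HP-UX versions may not. Standard input is deliberately left alone, since redirecting it would have the same effect as -n:

```shell
# Run a command COUNT times, printing exit status and elapsed whole
# seconds for each attempt. Usage: time_runs COUNT command [args...]
time_runs() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        start=$(date +%s)
        "$@" > /dev/null 2>&1      # stdin is inherited on purpose
        status=$?
        end=$(date +%s)
        echo "attempt $i: exit=$status elapsed=$((end - start))s"
        i=$((i + 1))
    done
}

# Harmless local stand-in; on the affected host this might instead be:
#   time_runs 3 remsh host1 -l user1 pwd
time_runs 2 true
```

If the behaviour matches the description, the first attempt should return quickly and the later ones should stall until the earlier connection has been torn down.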

We are not going from host 1 to host 2 to
host 3 from a single host. Just the same
host to the same host. Always from Sun to
HP.

We suspect the WAS we have on both
ends of the WAN, but will have to
coordinate a turnoff to verify whether that is
causing the problem.

It does matter. I can issue the same
commands from Sun to Sun and it works
every time. Just not from Sun to HP
every time.


Just saying use -n as a fix all is not an
option for us. The code is FAA controlled
and it is a long drawn out process to get
changes done.

We would still like to find out what is
causing the problem just by moving it from
LAN to WAN. We think there is a latency
causing a problem with the teardown of the
initial connection, perhaps due to the WAS.



OldSchool
Honored Contributor

Re: remsh hang after first run

Jerry1 said:

"remsh host1 -l user1

or

rsh host1 -l user1


They all hang after the first successful
remsh or rsh."

I take it this was from the command line (since you didn't say so).
If that is correct, then what happens when the "-n" switch is used?

====================================================================================

"Just saying use -n as a fix all is not an
option for us. The code is FAA controlled
and it is a long drawn out process to get
changes done."


That may be, but it doesn't mean anything as far as identifying and resolving the problem goes.

You seem to be able to reproduce the problem at will (see the first quote). If so, it can be readily tested as well...

The "-n" switch *may* be your only option. Discarding it without testing is, well,......
jerry1
Super Advisor

Re: remsh hang after first run

Of course it does not solve the problem.
Come on.

We are looking at the wireshark output
and the connection is trying to use the
same port again and it is busy until it
times out.
Dennis Handly
Acclaimed Contributor

Re: remsh hang after first run

>We are looking at the wireshark output and the connection is trying to use the same port again and it is busy until it times out.

If you think there is something wrong with networking, you should be talking to the Response Center.
Peter Nikitka
Honored Contributor

Re: remsh hang after first run

Hi jerry1,

one last try:

- do you execute the rsh commands as commands at a shell prompt?
- which shell?
- what is the time difference between the execution of the two commands?
- does the hang show for different from-hosts or the same to-host or vice versa?
- what, if you use a HP as the from-host as well?
- what is the difference in the wireshark output when you supply '-n'?

mfG Peter
jerry1
Super Advisor

Re: remsh hang after first run

> - do you execute the rsh commands as commands at a shell prompt?
> - which shell?
> - what is the time difference between the execution of the two commands?
> - does the hang show for different from-hosts or the same to-host or vice versa?
> - what, if you use a HP as the from-host as well?
> - what is the difference in the wireshark output when you supply '-n'?

1. The commands are executed from the command
line and from within a binary program.
Both fail the same way.
2. Fails in any shell.
3. 1 sec via shell script or program.
1-5 sec manually.
4. Sun to Sun okay over WAN.
HP to Sun okay over WAN.
Sun to HP not okay over WAN.

HP 10.20 to HP 11i from remote WAN not okay.
HP 11i to HP 10.20 on remote WAN okay. ???

Any system is okay on the LAN.


I also get this in syslog:

Apr 28 16:00:22 dbsvr1 remshd[10055]: connect second port: Connection timed out

And this in netstat on a good connection:

tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT
tcp 5 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 0 dbsvr1.1020 ictp4.1022 TIME_WAIT
tcp 0 0 dbsvr1.shell ictp4.1023 TIME_WAIT
tcp 0 0 dbsvr1.1020 ictp4.1022 TIME_WAIT
tcp 0 0 dbsvr1.shell ictp4.1023 TIME_WAIT


I get this on a hung session:

tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT
tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT
tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT
tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT
tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT
tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED




We have not really done a Wireshark capture on both
ends yet. I did one here. Each host is waiting
for a response from the other,
which is why there is a SYN_SENT repeating
for about a minute.
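
The syslog message ("connect second port") and the SYN_SENT lines both concern the second connection, which remshd opens from the server back to the client to carry the remote command's standard error. A small sketch for tallying connection states by their last column; the sample lines are copied from the hung-session output above, and on a live host you would feed it something like the output of netstat -a filtered for the peer instead:

```shell
# Two sample lines taken from the hung-session netstat output.
sample='tcp 0 0 dbsvr1.shell ictp4.1023 ESTABLISHED
tcp 0 1 dbsvr1.1020 ictp4.1022 SYN_SENT'

# Count connections per TCP state (the last field of each line);
# sort makes the order of the summary stable.
summary=$(printf '%s\n' "$sample" |
    awk '{ count[$NF]++ } END { for (s in count) print s, count[s] }' |
    sort)

echo "$summary"
```

A SYN_SENT entry that persists on the server side while the shell connection stays ESTABLISHED is consistent with the return connection never being answered.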


jerry1
Super Advisor

Re: remsh hang after first run

See docid NR0503KBRC00012549 that describes
this problem.
jerry1
Super Advisor

Re: remsh hang after first run

From all the testing I have done, it
is narrowed down to a problem with
the HP-UX 11.x TCP stack, I believe.
I tried three other servers running 11.0
and 11i. HP 10.20 to HP 10.20 with remsh
works fine over the WAN without the -n
option.

For now, so production can do work, the
program jumps over to a Linux box with
remsh and executes the command, which runs
an SQL script there that connects to the remote
Oracle server (HP-UX 11i) to get the data.

Corp has decided to go with Sun as their
global platform. They are not going to
recompile their code to fix HP's bug.
And getting HP to fix or even look at this
problem would be impossible. Wish the
developers had not used remsh when they
developed this custom program.
Wish I could find the difference between HP-UX 10.20
and 11i that broke this.
Dennis Handly
Acclaimed Contributor

Re: remsh hang after first run

>See docid NR0503KBRC00012549 that describes this problem.

This mentions the -m option of remshd. Did this help you? If not, it may not be related.
OldSchool
Honored Contributor

Re: remsh hang after first run

"Wish the developers had not used remsh when they developed this custom program."

Especially if you're subject to IT audits. Auditors usually want remote services to be disabled. In your case, it sounds like that would also break the app.
jerry1
Super Advisor

Re: remsh hang after first run

No, It did not work. I had high hopes.
I don't think it is related either.

I took the remshd binary from an HP-UX
10.20 box that does not have the problem
and put it on the HP-UX 11i box.
It still exhibits the same hang after the
second attempt of:

# remsh host1 pwd


Has to be a TCP stack problem or other.

I am not that good at reading Wireshark
output either. I can't see anything that
sticks out other than two statements
when I run the remsh with the -n option,
when it works okay.

TCP ZeroWindow
TCP Window Full
jerry1
Super Advisor

Re: remsh hang after first run

No, the -m did not help.

jerry1
Super Advisor

Re: remsh hang after first run

Turns out to be the WAS servers on
either end of the WAN modifying the packets
for data compression over the WAN.

The NOC put in filters for the two IP
addresses on either end that had the
problem.