Scott Linden_1
Occasional Contributor

rcp questions

HP-UX 11.11
2 Node MC Serviceguard clusters

I have a user with a script that uses rcp to copy a file from server B to server A. The script runs on server A as the user oracle. It then does a remsh from server A to server B and removes the original file.

The problem is that the first time he runs the rcp he receives the error "rcmd:Lost Connection", but any r-service command he runs afterwards works, even if it is another rcp.

There are no messages in the local or remote syslog.log files. The .rhosts and hosts.equiv files are set up properly.

Does anyone have any ideas?
6 REPLIES
RAC_1
Honored Contributor

Re: rcp questions

When it fails for the first time, do you get anything in server B's syslog.log file?

Also make sure /etc/hosts.equiv is configured properly. For normal users (everyone except root), the r commands check /etc/hosts.equiv first and then .rhosts. .rhosts can override /etc/hosts.equiv.

Also make sure the entries for rlogin, exec, shell are in place in /etc/inetd.conf and /etc/services.
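
A quick way to check the inetd.conf side of this is to look for uncommented lines whose first field is the service name. The sketch below assumes the usual service names login, shell and exec (served by rlogind, remshd and rexecd respectively). The function name active_rservices is made up for illustration, and the file is passed as an argument so it can be pointed at a copy rather than the live /etc/inetd.conf.

```shell
#!/bin/sh
# Report which r-service entries are active (not commented out) in an
# inetd.conf-style file. Usage: active_rservices /etc/inetd.conf
active_rservices() {
    conf=$1
    for svc in login shell exec; do
        # Field 1 is the service name. A disabled line starts with '#',
        # so its first field is "#shell" or "#" and will not match.
        if awk -v s="$svc" '$1 == s { found = 1 } END { exit !found }' "$conf"; then
            echo "$svc: active"
        else
            echo "$svc: missing or commented out"
        fi
    done
}
```

The same function can be run against /etc/services, since the service name is the first field there as well. Remember that inetd only rereads its configuration when told to (inetd -c), so an edited file is not necessarily what the running daemon is using.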

Anil
There is no substitute to HARDWORK
A. Clay Stephenson
Acclaimed Contributor

Re: rcp questions

The puzzling aspect of this is the first-time failures with subsequent successes. The first thing that I would do is enable logging on inetd. That may give you some clue. Are you running normal remshd or are you using tcp_wrappers to invoke remshd?
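
For reference, going by the HP-UX inetd(1M) manual page, connection logging is off by default and running inetd -l against an already running inetd toggles it. A minimal sketch follows. The wrapper function toggle_inetd_logging and the INETD variable are only there for illustration, and /usr/sbin/inetd is assumed to be the daemon's location.

```shell
#!/bin/sh
# Path to inetd. /usr/sbin/inetd is assumed here; override INETD if needed.
INETD=${INETD:-/usr/sbin/inetd}

# Toggle connection logging on the running inetd (run as root).
# Calling it a second time switches logging back off.
toggle_inetd_logging() {
    "$INETD" -l
}
```

With logging on, each incoming connection should be reported through syslog, which on a default HP-UX setup ends up in /var/adm/syslog/syslog.log on the machine accepting the connection. That should show whether the first rcp ever reaches remshd on the remote node.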

My first cut at this would be to apply PHNE_40333 on both boxes.

Although this sounds strange, do an ftp put/get on a 10MB file or so in both directions and note the transfer rates. Are the values reasonable? A duplex mismatch would work just well enough that telnets and small copies would be fine, but larger transfers would be severely impacted. This is just wacky enough to be your problem.
If it ain't broke, I can fix that.
Scott Linden_1
Occasional Contributor

Re: rcp questions

RAC,
There are no messages in the syslog.log file on either machine.

Here is my hosts.equiv:

clust1a.bluebunny.com oracle
clust1b.bluebunny.com oracle
clust2a.bluebunny.com oracle
clust2b.bluebunny.com oracle
wbbprd.bluebunny.com oracle

clust1a/b are the node(s) the script is being run from.
clust2a/b are the node(s) the script is connecting to and have the hosts.equiv listed above.
wbbprd is a package/alias on the clust1a/b node(s).

Here is oracle's .rhosts file on clust2a/b:

clust1a.bluebunny.com oracle
clust1b.bluebunny.com oracle
clust2a.bluebunny.com lawson
clust2a.bluebunny.com oracle
clust2b.bluebunny.com lawson
clust2b.bluebunny.com oracle
testapp1.bluebunny.com oracle
wbbprd.bluebunny.com oracle

I realize that some of this is redundant.
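
For checking files like the two above, a small helper can confirm that a given host and user pair is literally present. This is only a sketch: has_rhosts_entry is a made-up name, and it does exact string matching on the first two fields. The real lookup also understands + and - entries and netgroups, and it compares against whatever name the server resolves for the caller's address, which this does not attempt.

```shell
#!/bin/sh
# Succeed (exit 0) if "host user" appears as an entry in a hosts.equiv or
# .rhosts style file. Usage: has_rhosts_entry FILE HOST USER
has_rhosts_entry() {
    awk -v h="$2" -v u="$3" \
        '$1 == h && $2 == u { ok = 1 } END { exit !ok }' "$1"
}

# Example (the home directory path is a placeholder):
#   has_rhosts_entry ~oracle/.rhosts clust1a.bluebunny.com oracle && echo present
```

If the entry is present but the connection is still refused, it may be worth confirming on clust2a/b what name the caller's IP address actually resolves to, since that name is what gets matched.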

The entries for login, exec and shell are in place and activated in the /etc/inetd.conf and /etc/services files.

A. Clay Stephenson,
How do you set up logging on inetd?

We are not using tcp_wrappers.

We do not have the PHNE_40333 patch installed, and when I search for it on the ITRC patch site I do not get any hits. Where can I get this patch?

I did ftp puts and gets with a 13MB file from and to both nodes and had times of less than 2 seconds on the puts and less than 1 second on the gets.
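
As a sanity check on those timings, the implied throughput can be worked out directly. The helper below is a sketch (xfer_rate is a made-up name) and takes 1 MB as 10^6 bytes. Since the times were "less than" 2 seconds and 1 second, the results are lower bounds.

```shell
#!/bin/sh
# Print throughput for a transfer of SIZE_MB megabytes taking SECS seconds.
# Usage: xfer_rate SIZE_MB SECS
xfer_rate() {
    awk -v mb="$1" -v s="$2" \
        'BEGIN { printf "%.1f MB/s (%.0f Mbit/s)\n", mb / s, mb * 8 / s }'
}

xfer_rate 13 2   # puts: prints 6.5 MB/s (52 Mbit/s)
xfer_rate 13 1   # gets: prints 13.0 MB/s (104 Mbit/s)
```

Rates at that level in both directions are hard to square with a duplex mismatch, which would be expected to drag larger transfers down severely.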

Thanks for your responses.
Brian Hackley
Honored Contributor

Re: rcp questions

Scott,

Check out ITRC doc 4000045712. It describes this issue as possibly being related to kernel parameters, and mentions a fix in the PHNE_27777 "R-commands" patch that addresses this defect:

JAGae38108
remshd fails, dumps core if maxssiz and maxssiz_64bit set to 4000000

There are a couple of other documents out there that speak to some other kernel params being possibly related, but this appears to be the "typical cause".

Hope that helps,

Brian Hackley
Ask me about telecommuting!
Scott Linden_1
Occasional Contributor

Re: rcp questions

Brian Hackley,
Thanks for your response.

Both systems already have patch PHNE_27777 installed and both the maxssiz and maxssiz_64bit are set to 31250000.

I could not find anything in the ITRC Knowledge Base that pertained to this problem.

Scott
Brian Hackley
Honored Contributor

Re: rcp questions

Scott,

Bummer, well that was the easy stuff...?

Maybe you need to tusc inetd until you get a failure.
http://hpux.cs.utah.edu/hppd/hpux/Sysadmin/tusc-7.7/
To attach to a running process, e.g. PID 2340:
./tusc -flvp -T "" -ccc -o /var/tmp/tusc.out 2340

Notes on reading the output:
If the process is inside a syscall at attach time, tusc prints that syscall first.
If no other calls are traced, the process is either spinning in userspace or hung/looping/sleeping in the kernel.
A syscall summary is printed at the end if the -c option (one or more c's) is used.
Look for large gaps in wall clock time.
Look for unusually large spikes in syscall system CPU time for a particular syscall.
Look for excessive numbers of a particular syscall in the summary.
Look for error replies to syscalls; some are normal, some are not.
tusc writes a trace entry after 2 seconds if the syscall has not returned, just to let you know what it is still waiting on. The return code will say [sleeping]. This is not a new system call for the process.

Hope that helps,

-> Brian Hackley

Ask me about telecommuting!