
Re: rpcd failure

 
Jerome Salyers
Advisor

rpcd failure

yesterday on a B2000 the date was set back a month and then forward again, due to an NNM problem... then the machine was rebooted... since then, the rpcd won't start... the /etc/rc.log doesn't tell me anything...

any suggestions on what to check or, better yet, what can be done?

thanks
jerome
10 REPLIES
Steven Sim Kok Leong
Honored Contributor

Re: rpcd failure

Hi,

What happens when you run rpcd with the -f option? Running rpcd with the -f option starts the dced or rpcd process in the foreground. The default is to run in the background.

Hope this helps. Regards.

Steven Sim Kok Leong
Michael Tully
Honored Contributor

Re: rpcd failure

Hi,

One thing databases really hate is having the date changed. To a lesser extent this can apply to certain operating system processes. Try to restart the daemon(s) concerned, if you can, from the /sbin/init.d/ directory. Hopefully running the command in the foreground will give you some indication of the problem.

-Michael
Anyone for a Mutiny ?
Jerome Salyers
Advisor

Re: rpcd failure

hi, again...

running rpcd -f gives me no errors, in fact, no output whatsoever, and no results...

i tried booting into single user mode, then init 1 and then init 2... by init 2, it hangs on trying the RPC daemon if needed... nothing happens... i have to reboot...

maybe this sheds more light... dunno...

jerome
Steven Sim Kok Leong
Honored Contributor

Re: rpcd failure

Hi,

rpcd -f doesn't exit to the prompt with an error?

A quick workaround is to execute:

# nohup rpcd -f &

Check your startup scripts
- /sbin/init.d/dce
- /sbin/init.d/Rpcd
- /sbin/rc2.d/S570dce
- /sbin/rc2.d/S590Rpcd

# grep START_RPCD /etc/rc.config.d/Rpcd
# If START_RPCD is 1, the DCE RPC daemon (/opt/dce/sbin/rpcd) will
START_RPCD=1

Try inserting debugging print/echo statements in /sbin/init.d/Rpcd and /sbin/init.d/dce to identify at which phase of startup it hung.
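
One way to add such debugging statements is a small logging helper, sketched below. The trace function and the TRACE_LOG path are hypothetical names chosen for illustration, and the placement shown in the comments is only a guess at where the daemon gets started; the real init script will look different.

```shell
# Hypothetical helper for temporary use inside /sbin/init.d/Rpcd or /sbin/init.d/dce.
# TRACE_LOG is an assumed location; any writable path will do.
TRACE_LOG=${TRACE_LOG:-/tmp/rpcd_trace.log}

trace() {
    # Append a timestamped marker so the last line in the log shows
    # how far the script got before it hung.
    echo "$(date '+%H:%M:%S') $*" >> "$TRACE_LOG"
}

# Example placement (commented out; adapt to the actual script):
# trace "before starting rpcd"
# /opt/dce/sbin/rpcd
# trace "after starting rpcd"
```

After the next hang, the last marker written to the trace file should show which step never returned.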

Hope this helps. Regards.

Steven Sim Kok Leong
Alex Glennie
Honored Contributor

Re: rpcd failure

check here =>

a) /var/opt/dce/svc/ : check the timestamps on the 4 log files and look at any that have been written to recently.

b) check your syslog.log too !

c) ps -ef | grep rpc ?
ps -ef | grep dce ?

d) look around for a core file : possibly in /opt/dce/sbin ?

post the details .....
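
Checks a) and b) can be wrapped in two small helpers, sketched here. The function names are made up for illustration, and /var/adm/syslog/syslog.log is assumed to be the syslog location on this machine; pass other paths as arguments if yours differ.

```shell
newest_log() {
    # Print the name of the most recently modified file in the log directory.
    # Defaults to the directory named in the post above.
    ls -t "${1:-/var/opt/dce/svc}" 2>/dev/null | head -n 1
}

daemon_messages() {
    # Print the last 20 syslog lines that mention rpcd or dced.
    # The default path is an assumption about where syslog.log lives.
    grep -E 'rpcd|dced' "${1:-/var/adm/syslog/syslog.log}" | tail -n 20
}
```

Calling newest_log with no argument shows which of the log files was written last, and daemon_messages shows what the daemons reported around that time.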
Alex Glennie
Honored Contributor

Re: rpcd failure

Also of use :

First check that dced and/or rpcd are no longer running.

Before going further, back up at least the 2 .db files listed below :

cd /opt/dcelocal/var/dced

rm Ep.db Llb.db

Then start up rpcd/dce again ?
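
A slightly safer version of the backup-then-remove step might look like the sketch below. reset_endpoint_db is a hypothetical helper name and the default backup location under /var/tmp is an assumption; the idea is simply that each file is removed only after its copy has succeeded. Run it only once rpcd and dced are confirmed stopped.

```shell
reset_endpoint_db() {
    # $1: directory holding Ep.db and Llb.db (default taken from the post above)
    # $2: backup directory (default is an assumed location under /var/tmp)
    dir=${1:-/opt/dcelocal/var/dced}
    backup=${2:-/var/tmp/dced_backup.$(date '+%Y%m%d%H%M%S')}
    mkdir -p "$backup" || return 1
    for f in Ep.db Llb.db; do
        if [ -f "$dir/$f" ]; then
            # Copy first; remove the original only if the copy succeeded.
            cp -p "$dir/$f" "$backup/$f" && rm "$dir/$f" || return 1
        fi
    done
    # Print where the copies went so they can be restored if needed.
    echo "$backup"
}
```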

Jerome Salyers
Advisor

Re: rpcd failure

first of all, thanks for the help so far...

here is some of the information that i've been able to find:

startup scripts are okay, and nohup rpcd -f & doesn't give me any output...

dce isn't running on the machine, but it's not running on any of the machines we are using, and rpcd runs just fine on them...

no core files in /opt/dce/sbin found...

i think the telling thing is that there are NO .db files to be found in the /opt/dcelocal/var/dced directory...

i'm going to see what i can do about that...

i'll let you all know and assign points as appropriate just as soon as i figure out what worked and what didn't... =)

jerome
Jerome Salyers
Advisor

Re: rpcd failure

okay... found out that it deletes the .db files upon startup... the error it gives in the /var/opt/dce/svc/fatal.log file is the following:

dced FATAL dhd general main.c 719 0x7f5b25a0 Cannot use '*all*' protocol sequence, No such file or directory...

any ideas on how to proceed?

thanks
jerome
Alex Glennie
Honored Contributor
Solution

Re: rpcd failure

If the entry in the fatal log is current (check the timestamps), then ....

Problems with rpcd/dced Startup - some known problems:
-----------------------------------------------
-> No network is available, faulty network cable, faulty switch, ...
Corresponding error message (eg. in fatal.log) looks like:
199.09.20.14:23:48.325 .... dced FATAL dhd general main.c 710 0x7afbbe78
Cannot use '*all*' protocol sequence, No such file or directory.
=> Check your network connection and network components,
to solve this problem.

-> Another process is already using Port 135
Corresponding error message looks like:
1997.05.19.20:16:13.959 .... dced FATAL dhd general main.c 699
0x7afb6e00 Cannot use "*all*" protocol sequence, Address already in use

=> Use programs like "ps" and "lsof" to find out which program is already using Port 135 on your system.
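
As a rough sketch of the port check: the helper below reads `netstat -an` output and reports whether any local socket is bound to a given port. port_in_use is a hypothetical name, and the parsing assumes the local address sits in the fourth column, written either as "*.135" or as "0.0.0.0:135"; check that against the netstat output on your own system.

```shell
port_in_use() {
    # Reads `netstat -an` output on stdin.
    # Succeeds if a tcp or udp line has a local address ending in ".PORT" or ":PORT".
    awk -v p="$1" '
        ($1 ~ /^(tcp|udp)/) && ($4 ~ ("[.:]" p "$")) { found = 1 }
        END { exit !found }
    '
}

# Example use (commented out):
# netstat -an | port_in_use 135 && echo "something is bound to port 135"
# lsof -i :135    # if lsof is installed, this names the owning process
```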

The working directory for dced/rpcd is /var/opt/dce/dced. Check that the directory exists and has the right permissions.

# ls -ald /var/opt/dce/dced
drwxr-xr-x 2 root bin 1024 May 21 00:01 /var/opt/dce/dced
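
The two fatal.log signatures listed in this reply can be told apart mechanically. The sketch below is only an illustration: diagnose_fatal is a made-up helper, and the hint texts simply restate the two known problems above; it does not recognise any other failure.

```shell
diagnose_fatal() {
    # Reads fatal.log lines on stdin and prints one hint per recognised message.
    while IFS= read -r line; do
        case $line in
            *"protocol sequence, No such file or directory"*)
                echo "network: check cable, switch, and interface" ;;
            *"protocol sequence, Address already in use"*)
                echo "port: another process is using port 135" ;;
        esac
    done
}

# Example use (commented out):
# diagnose_fatal < /var/opt/dce/svc/fatal.log
```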
Jerome Salyers
Advisor

Re: rpcd failure

in the end, god help me, it was a faulty cable... bah... the switch was telling me that everything was fine, but in the end, it wasn't...

thank you all for your help and time... i learned an enormous amount today that will be very helpful in the future and will give points accordingly...

jerome