A Pandey
Frequent Advisor

system down

I am getting the following on my console. There is no telnet/ssh access, no response to ping, etc.

May 18 08:36:11 wschp sendmail[18658]: NOQUEUE: SYSERR(UID0): fill_fd: disconnew

It keeps repeating every 5 minutes.

Thanks in advance for your help.
Steven E. Protter
Exalted Contributor

Re: system down

Shalom,

Perhaps you can log onto the console and run some diagnostics.

tail -f syslog.log
bdf
ping anotherhost

I suggest you restart the system cleanly from the console if possible. If that is not possible, power-cycle it.

Watch the startup, check rc.log and run bdf, looking for full filesystems.
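
A rough sketch of how the bdf check could be automated, assuming the usual bdf column layout (Filesystem, kbytes, used, avail, %used, Mounted on). The function name full_filesystems and the 90% default threshold are my own choices for illustration, not anything standard. It reads the last two fields of each line, so it should also cope with entries that bdf wraps onto a second line when the device name is long.

```sh
# full_filesystems [THRESHOLD]
# Reads bdf-style output on stdin and prints "mountpoint %used" for every
# filesystem at or above THRESHOLD percent (default 90, an arbitrary choice).
full_filesystems() {
    awk -v limit="${1:-90}" '
        # Only consider lines whose second-to-last field looks like "NN%".
        NF >= 5 && $(NF-1) ~ /^[0-9]+%$/ {
            pct = $(NF-1)
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $NF, $(NF-1)
        }'
}

# On the HP-UX console this would be used as:
#   bdf | full_filesystems 90
```

Anything it prints is a candidate for cleanup before or after the restart.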


SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Andrey Tumanov
Frequent Advisor

Re: system down

1. Check cabling
2. Verify that your netadmin hasn't changed your switch port configuration
3. Make sure inetd is running
4. Make sure there is no duplicate IP configured in your network
5. Review the last changes you made
6. Check /etc/rc.config.d/netconf to make sure it's properly configured

Any of those could help... good luck.
Peter Godron
Honored Contributor

Re: system down

Hi,
From what you write, the server is not actually down but is trying to restart?
If so, bring the system to single-user mode and comment out any sendmail parts of the rc?.d scripts.

Secondly:
You can also force-skip any rc script that may be stuck by pressing Ctrl+\ on the console, but handle with care.
Florian Heigl (new acc)
Honored Contributor

Re: system down

I'd just suggest You create a system dump in case You have to fully reset the system.

(Issue TC at the GSP prompt)

If the problem persists, this will allow You or HP to gather more information about it.

yesterday I stood at the edge. Today I'm one step ahead.
Jaime Bolanos Rojas.
Honored Contributor

Re: system down

Hello there!

Please check all the name resolution files on the system, such as the hosts file, and if you use DNS, check those configuration files too.
Try an nslookup with the IP address and an nslookup with the hostname.
If they do not resolve correctly, a resolution file may be misconfigured.
Can you ping the IP address of the server or telnet to it?
Work hard when the need comes out.
Robert-Jan Goossens
Honored Contributor

Re: system down

Hi,

Check this doc.

Title: HP-UX Sendmail messages related to HDD performance issue
Document ID: 4000075001

US
http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000079989949

Europe
http://www5.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000079989949

Regards,
Robert-Jan
A Pandey
Frequent Advisor

Re: system down

The system went down last evening, and we are running iPlanet 6.0 SP5. This is the main college web server, so things got a bit hairy for a while. Thanks to all those who responded. I will summarize when I know more.

As of now I have reset/rebooted the server and things seem fine. I am sending the .out files to HP-UX support.

I got the following iPlanet error from last night. Could someone enlighten me about it? I will also search Google for it.

[17/May/2006:18:03:36] failure (15732): Error accepting connection -5970, oserr=23 (PR_SYS_DESC_TABLE_FULL_ERROR)
Matti_Kurkela
Honored Contributor

Re: system down

I don't know iPlanet's error messages exactly, but "PR_SYS_DESC_TABLE_FULL_ERROR" sounds like your system might have hit the maximum number of open files (or network sockets), i.e. the kernel parameter nfile.

When you hit the nfile limit, it is a server-wide problem: each process on the server will receive an error whenever it tries to open a file or use a network socket. Many programs don't handle this type of error well. This may cause things like sshd/inetd/syslogd to die, which makes the recovery more complicated.

Web servers usually start up a large number of processes and/or threads to cope with loads of simultaneous requests. You should check iPlanet's configuration to see when it will start limiting the number of requests. The maximum number of processes/threads multiplied by the number of files each thread needs to access must always be less than the nfile value. If not, you must either increase the nfile value or decrease the web server's load limits.

A public web server is an easy target for Denial-of-Service attacks, many of which are based on creating a huge number of requests. You should always set the various limits so that the web server hits its application-specific limits first, leaving some reserve before the system-wide limit is overrun. When the web server hits its own internal limit, it should start replying to requests with HTTP error 503 (Service Unavailable) and recover gracefully when the load gets lighter again.
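
As a rough illustration of that sizing rule, here is a small shell sketch. The function names, the thread count, the files-per-thread figure and the reserve are all made-up values for the example; only the 3300 is the nfile value mentioned in this thread. Check your real iPlanet limits before drawing conclusions.

```sh
# worst_case_files MAX_THREADS FILES_PER_THREAD
# Worst-case number of file table entries the web server could need.
worst_case_files() {
    echo $(( $1 * $2 ))
}

# fits_under_nfile MAX_THREADS FILES_PER_THREAD NFILE RESERVE
# Succeeds only if the worst case plus a reserve for the rest of the
# system (sshd, inetd, syslogd, ...) stays below the system-wide limit.
fits_under_nfile() {
    demand=$(( $1 * $2 + $4 ))
    [ "$demand" -lt "$3" ]
}

# Hypothetical example: 512 threads, 4 files each, reserve of 500.
if fits_under_nfile 512 4 3300 500; then
    echo "web server limit is reached before the system-wide limit"
else
    echo "raise the system-wide limit or lower the web server limits"
fi
```

With these example numbers the demand is 2548, which fits under 3300. Doubling the thread count to 1024 would not fit.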
MK
Florian Heigl (new acc)
Honored Contributor

Re: system down

I also suggest You raise nproc and/or nfile. This might do the trick; on a web server a fairly high nfile value (>4096) might be needed.
yesterday I stood at the edge. Today I'm one step ahead.
A Pandey
Frequent Advisor

Re: system down

Yes, it seems that I need to raise nfile from my current 3300 to 4100+.

The formula is:

(16*(NPROC+16+MAXUSERS)/10+32+2*(NPTY+NSTRPTY+NSTRTEL))

Are there any repercussions to increasing any of these parameters, especially

NPTY (number of pseudo ttys)
NSTRPTY (max number of stream based pty)
NSTRTEL (number of telnet session device files)
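
To see how much each parameter moves the result, the formula can be sketched as a shell function. The function name and the sample values below are made up for illustration and are not this system's settings, and I am assuming the division is integer division.

```sh
# default_nfile NPROC MAXUSERS NPTY NSTRPTY NSTRTEL
# Evaluates the formula quoted above with integer arithmetic (assumption).
default_nfile() {
    echo $(( 16 * ($1 + 16 + $2) / 10 + 32 + 2 * ($3 + $4 + $5) ))
}

# Hypothetical values: NPROC=1024, MAXUSERS=64, NPTY=NSTRPTY=NSTRTEL=60
default_nfile 1024 64 60 60 60    # prints 2158

# Same values, but NPTY raised by 100: the result grows by 2*100
default_nfile 1024 64 160 60 60   # prints 2358
```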

Regards,

Abhimanyu.

Florian Heigl (new acc)
Honored Contributor

Re: system down

Ummm...

There are no side effects from these formulas.

And these are only relevant to interactive users:
NPTY (number of pseudo ttys)
NSTRPTY (max number of stream based pty)
NSTRTEL (number of telnet session device files)

Make nfile about three times the maximum number of web sessions You want to support.
(Each socket connection uses up a file handle or so.)
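
As a quick sketch of that rule of thumb (the function name and the 1500 sessions are just an example, and the factor of three is only the rough estimate given above):

```sh
# nfile_for_sessions MAX_SESSIONS [HANDLES_PER_SESSION]
# Rule-of-thumb estimate; HANDLES_PER_SESSION defaults to 3.
nfile_for_sessions() {
    echo $(( $1 * ${2:-3} ))
}

nfile_for_sessions 1500   # prints 4500
```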

You could also decrease some TCP parameters using ndd, mostly the *WAIT* ones, but this is unsupported by HP for some reason.
yesterday I stood at the edge. Today I'm one step ahead.