ALPHA taking too much time at startup

Rashid Ashraf
Occasional Advisor

Hi all,

We have a couple of COMPAQ AlphaServer DS20E machines. One of them takes hours to boot after a normal shutdown. We don't normally shut the servers down, but sometimes we need to restart them for various reasons.
Can you tell me whether it writes log files anywhere for error diagnosis? The messages file only has entries from after system boot-up. We are running Compaq Tru64 UNIX V5.1A.

Looking for help
Thanks
RMA
25 REPLIES
Ivan Ferreira
Honored Contributor

You should identify the point where it stops. Maybe some service configured to start at boot is not working correctly.

What is the last message you see before the long delay?

Log files are in /var/adm/syslog.dated
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Khairy
Esteemed Contributor

Hi Rashid,

You could watch the monitor while the server boots, especially while it is starting the services. Is NTP configured, or do you have NFS mounts?

Let us know.

DCBrown
Frequent Advisor

Messages and binlog entries are cached internally in a FIFO until the boot process is far enough along to write them out. If the FIFO is too small, the messages file only picks up late in the boot, or after the boot has essentially completed.

You can increase the internal buffering for messages (and binlog entries, if needed) by changing the /etc/sysconfigtab generic section. For example,

generic:
binlog-buffer-size = 128000
msgbuf_size = 512000
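
To see what is currently set before editing the file, the stanza can be pulled out with a few lines of awk. This is only a sketch: show_stanza is a made-up helper name, not a Tru64 command, and it assumes the usual layout of a stanza name ending in a colon followed by "attribute = value" lines.

```shell
# Hypothetical helper: print the "attribute = value" lines of one stanza
# in a sysconfigtab-style file.
# Usage: show_stanza <stanza-name> <file>
show_stanza() {
    awk -v want="$1" '
        /^[^ \t#][^ \t]*:[ \t]*$/ {          # stanza header such as "generic:"
            name = $1
            sub(/:$/, "", name)
            in_stanza = (name == want)
            next
        }
        in_stanza && /=/ && !/^[ \t]*#/ {    # attribute line, not a comment
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)  # trim leading/trailing blanks
            print line
        }
    ' "$2"
}
```

For example, "show_stanza generic /etc/sysconfigtab" would list whatever is set in the generic section.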

Pieter 't Hart
Honored Contributor

You can try setting the console variable "console=serial" and capturing the output with a terminal emulator connected to the serial console port.
This displays more of the boot process than the graphics console, which only starts after the memory tests and "graphics init" have shown on the OCP display.
Rob Leadbeater
Honored Contributor

Hi,

Slow boots are frequently caused by the server looking for something on the network that is no longer present... NFS mounts are a common culprit...

Cheers,

Rob
Pieter 't Hart
Honored Contributor

Apart from the messages file there are the logs under
/var/adm/syslog.dated/

You can also try "sysman event_viewer",
which combines the messages file, the binary event log and the EVM logs.
Rashid Ashraf
Occasional Advisor

Thanks to all for their valuable input.

At noon today I restarted the server so that I could report what it shows and update this thread. I restarted it at 1205hrs; it is now 1750hrs and time to go home, and it still hasn't come up. This is what is on the screen:

Testing the System.
Testing the Memory.
Testing the Disks.
Testing ei devices.

xxxxxxxxxxxxx
xxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx

Loading vmunix...
Loading at xxxxxxxxxxx (Memory locations)

Sizes:
Text: 6891200
Data: 1304384
bss: 1786400

Starting at xxxxxxxx (Memory location)

Loading vmunix Symbol table [1706632]

The above is the last line that I get. There is heavy hard disk activity going on, but :-(

Please help.

RMA
Rashid Ashraf
Occasional Advisor

Please note that I am not using NFS, or DHCP for IP assignment. Also, the previously suggested commands are not showing anything unusual.

Thanks
RMA
Pieter 't Hart
Honored Contributor

It could be a full file system or something similar.

Try booting into single-user mode:
boot -fl "s"
and see if this boots in a reasonable time.
As the next step, run
bcheckrc
to mount the base file systems (/, /usr and /var) read/write,
and check that you have enough free space on these file systems.
mount -a
mounts all other file systems named in fstab (ignore the messages about /, /usr and /var).
init 2
enables network support (from this host, not to this host).
init 3
boots to multi-user mode.
Or manually execute all the scripts /sbin/rc.d/S one at a time
to record what takes so much time.
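
Running the startup scripts one at a time could be scripted roughly like this. It is a sketch, not a tested Tru64 procedure: run_rc_scripts is a hypothetical helper, the directory is passed in as an argument (on Tru64 the run-level 3 scripts are normally under /sbin/rc3.d, but check your own system), and it assumes each script takes a "start" argument.

```shell
# Hypothetical helper: run each S* script in a directory in name order,
# printing a timestamp before and after, so a slow script stands out.
# Usage: run_rc_scripts <directory>
run_rc_scripts() {
    for script in "$1"/S*; do
        [ -f "$script" ] || continue      # the glob matched nothing
        name=`basename "$script"`
        stamp=`date '+%H:%M:%S'`
        echo "$stamp start $name"
        sh "$script" start
        status=$?                         # save it before date overwrites $?
        stamp=`date '+%H:%M:%S'`
        echo "$stamp done $name exit=$status"
    done
}
```

Capturing the output on a serial console or in a file on an already mounted file system keeps a record of which script the time goes into.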
Rashid Ashraf
Occasional Advisor

Hi

Thanks for the quick reply.
This is my test server and it has more than 39GB left on the / drive. /usr and /var are directories and not mount points. It is now 1845hrs and the information on the screen is the same.

Regards

RMA
Rob Leadbeater
Honored Contributor

Hi,

If it's hanging at:

Loading vmunix Symbol table [1706632]

then the kernel is only just being loaded, so you can probably rule out the previous suggestions...

I would guess that you've got a broken or nearly broken boot disk... What's the hardware configuration?

Cheers,

Rob
Vladimir Fabecic
Honored Contributor

Can you boot into single-user mode normally?
In vino veritas, in VMS cluster
Ivan Ferreira
Honored Contributor

Where is your console directed?

From the OS, run:

consvar -g console

You should see more startup messages.

Also run:

dsfmgr -v

And post the output of:

hwmgr status component

And also attach /var/adm/messages
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Rashid Ashraf
Occasional Advisor

Hi

Thanks for the replies.
Finally, on the 30th at about 1am, my server started :-D. It took more than 37 hours to boot and I don't know if it would ever restart again, so until I have to, I am not restarting it. Attached are the messages file and the output of hwmgr. I don't see anything special in them. To reduce the size of the messages file, I removed everything before the last three dates in it.

Thanks and regards

RMA
Vladimir Fabecic
Honored Contributor

Mar 26 19:42:59 test vmunix: /: file system full
Mar 26 20:49:42 test vmunix: /: file system full


Please post output of:
# df -k
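
To see how often this shows up and for which file system, something like the following could be run against the messages file. It is a rough sketch: fs_full_summary is a hypothetical helper, and it assumes every such line ends in "<mount point>: file system full" as in the two lines quoted above.

```shell
# Hypothetical helper: summarise "file system full" messages per file system.
# Prints: <mount point> <number of messages> <date and time of the last one>
# Usage: fs_full_summary <messages-file>
fs_full_summary() {
    awk '/file system full/ {
        fs = $(NF - 3)                  # e.g. "/:" in "... vmunix: /: file system full"
        sub(/:$/, "", fs)
        count[fs]++
        last[fs] = $1 " " $2 " " $3     # month, day, time of this line
    }
    END { for (fs in count) print fs, count[fs], last[fs] }' "$1"
}
```

Usage would be "fs_full_summary /var/adm/messages"; the order of the output lines is not defined if more than one file system appears.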
In vino veritas, in VMS cluster
Pieter 't Hart
Honored Contributor

From the output of hwmgr and messages:

65: test online available Unconfigured-device-()-at-pci3slot3

pci3 (primary bus:0 subordinate bus:3) at pci0 slot 9
Mar 26 12:04:02 test vmunix: ee1: Autonegotiated, 100 Mbps full duplex
Mar 26 12:04:02 test vmunix:
Mar 26 12:04:02 test vmunix: i2o0 at pci3 slot 4

It looks like a combo card with an I2O controller. Is this a RAID controller?
What is connected to this device?
Could the unconfigured device be the network interface (ee1), or is there something more on this combo card
that the system doesn't recognize?

Please post the output of "hwmgr -view hierarchy".
Rashid Ashraf
Occasional Advisor

Hi

Attached is the file with the hardware hierarchy information.
About the hard disk space: the disk space issue on the 26th came up when I was trying to copy some data. This is the current situation:

# df -k
Filesystem        1024-blocks      Used  Available  Capacity  Mounted on
root_domain#root    123604088  83445432   39601128       68%  /
/proc                       0         0          0      100%  /proc
usr_domain#usr        3169360   2232686     867640       73%  /usr
usr_domain#var        3169360     45202     867640        5%  /var
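
For a quick check of which file systems are above a given fill level, the df output can be filtered with awk. This is just a sketch: df_over is a made-up helper name, and it assumes the capacity is the second-to-last column and the mount point the last, as in the output above.

```shell
# Hypothetical helper: read "df -k" output on stdin and print the mount
# points whose capacity is at or above the given percentage.
# Usage: df -k | df_over <percent>
df_over() {
    awk -v limit="$1" 'NR > 1 {         # NR == 1 is the header line
        pct = $(NF - 1)
        sub(/%$/, "", pct)
        if (pct + 0 >= limit + 0) print $NF, pct "%"
    }'
}
```

With the figures above, "df -k | df_over 70" would list /proc and /usr; /proc always reports 100% and can be ignored.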

Please keep in mind that the hard disk stays busy for most of the boot time and seems to be reading something.

Thanks and regards

RMA
Pieter 't Hart
Honored Contributor

I would advise investigating the "unconfigured device" further.

From the output I conclude that the ee1 interface is on a combo card, but a different one from the I2O card:
25: connection pci0slot8
52: bus pci2
53: connection pci2slot4
59: scsi_adapter itpsa1
60: scsi_bus scsi5
75: tape bus-5-targ-4-lun-0 tape0
55: connection pci2slot5
61: network ee1
57: connection pci2slot6
62: graphics_controller comet0
27: connection pci0slot9
63: bus pci3
64: connection pci3slot3
65: unconfigured_hardware Unconfigured-device-()-at-pci3slot3
66: connection pci3slot4
68: i2o_controller i2o0

Possibly the I2O card has multiple buses, one of which is faulty, so that the original i2o0 is now missing and an i2o1 has been renamed to i2o0.
Maybe you have documentation for this system to verify this.
If these were designed as redundant paths to the same disks, or if the controller was configured with RAID members on both buses, you do have a hardware problem.
Possibly the OS is able to work around this, but needs a lot of time to resolve I/O to these devices.
Rashid Ashraf
Occasional Advisor

Hi

Thanks a lot for your efforts.
I have another AlphaServer with exactly the same configuration. We bought both of them together, one as the live server and the other for test purposes. I am having problems with the test server; the live server is running fine. Attached is the hardware hierarchy of the live server. It shows the same unconfigured device. Also, our server doesn't have a built-in RAID controller; it has a SCSI controller, and the hard disk is attached directly to it. There is a PCI card for the external storage cage. It is not in use on either server and is installed in slot 3 (counting from my right when looking from the back).

Again, I really appreciate your efforts.

Thanks
RMA
Pieter 't Hart
Honored Contributor

Thanks for the feedback.
Your info is inconsistent.

From the last post and the hwmgr output I conclude that your production server has 3 disks:
7: connection pci1slot7
13: scsi_adapter itpsa0
14: scsi_bus scsi0
69: disk bus-0-targ-0-lun-0 dsk0
76: disk bus-0-targ-1-lun-0 dsk5
77: disk bus-0-targ-2-lun-0 dsk6
9: connection pci1slot8

while your test server only has one disk?
7: connection pci1slot7
11: scsi_adapter itpsa0
12: scsi_bus scsi0
69: disk bus-0-targ-0-lun-0 dsk0
9: connection pci1slot9

So the hardware is different.

Your info:
/usr and /var are directories and not mount points.
and:
# df -k
Filesystem        1024-blocks      Used  Available  Capacity  Mounted on
root_domain#root    123604088  83445432   39601128       68%  /
/proc                       0         0          0      100%  /proc
usr_domain#usr        3169360   2232686     867640       73%  /usr
usr_domain#var        3169360     45202     867640        5%  /var
shows that you do use mount points.

Could you please give the correct info about the test system compared to the production system?
Rashid Ashraf
Occasional Advisor

Hi

Sorry about that.

Yes, my live system has three HDDs while the test system has only one. Also, the location of the optical network card differs between the two.

About the file system configuration: I sent the output of "df -k" from the test system. It has /var and /usr as mount points. I just confused it with another server of mine.

Regards

RMA
Ivan Ferreira
Honored Contributor

What I can see is SCSI messages with a hard error detected for HWID=75, COMPAQ SDT-10000 (tape drive).

Mar 18 15:21:21 test vmunix: MEDIUM ERROR - Nonrecoverable medium error
Mar 26 12:04:00 test vmunix: Alpha boot: available memory from 0x226c000 to 0x3ff52000

As you can see in those messages, the last "MEDIUM ERROR" was logged at 15:21 on Mar 18, and the next entry is the "Alpha boot:" message at 12:04 on Mar 26.

So the first thing I would try is to remove the SCSI tape drive, or check its cabling and termination.
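
To pull out the same "last entry before each boot" pairs from a long messages file, a small awk filter along these lines may help. It is a sketch only: last_before_boot is a hypothetical helper, and it assumes each boot logs one line containing "Alpha boot:" as in the excerpt above.

```shell
# Hypothetical helper: for every "Alpha boot:" line, print the line that
# was logged just before it, then the boot line itself, then a blank line.
# Usage: last_before_boot <messages-file>
last_before_boot() {
    awk '/Alpha boot:/ && prev != "" { print prev; print $0; print "" }
         { prev = $0 }' "$1"
}
```

Usage would be "last_before_boot /var/adm/messages"; a large gap between the two timestamps of a pair marks a shutdown or a slow boot.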
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Rashid Ashraf
Occasional Advisor

Hi

Thanks for the reply.
During boot there is no visible activity on the tape drive. After boot the tape drive works fine, and I took a backup yesterday on two data tapes.
It seems to be doing some media recovery during startup. The hard disk lights stay on most of the time while the system boots up.

Regards and thanks

RMA
Pieter 't Hart
Honored Contributor

To determine what's wrong during the boot process you do have to reboot!

From all the posts so far you can't determine whether it's a hardware or a software problem.
Software is most likely. This can be confirmed by first doing a minimal startup,
such as booting into single-user mode,
followed by activating the startup step by step.
It could be a wrong parameter setting in sysconfigtab which upsets the system.

So again I advise booting to single-user mode. If this also takes unreasonably long, abort the startup and try booting genvmunix (in single-user mode) to detect whether something has gone wrong with the running kernel.
You can also boot from the UNIX CD to find out whether the problem is unrelated to the boot disk.

Your first post asked about log files other than the messages file.
I mentioned "sysman event_viewer" as a tool to read info from the different log files.
I also mentioned the /var/adm/syslog.dated/*/* log files and the binary error log.
Have you tried these options?
The binary error log is read with "uerf -R | more", or with "dia -R | more" if you have DECevent installed.