Operating System - Tru64 Unix

bootstrap code problem

Hello,

I got myself into some trouble, and I would really appreciate it if some guru could help me out.

I wanted to mount an old Ultrix UFS root partition on my Tru64.

When I plugged in the disk, the box froze.
Then I realized that the ID switches on the old disk were swapped, so its SCSI ID was already in use by another device (freezing was a normal reaction to that).

Then I halted the box and set the Ultrix disk's SCSI ID properly.

I started the boot, but there were some strange SCSI errors regarding disk 5 (the Ultrix one) and the box was extremely slow.
I switched off the box.

Then I disconnected the Ultrix drive and wanted to start my previously perfectly working Tru64 system... without any success :(

It cannot boot any more. It halts with the following message:

initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code
can't open osf_boot

halted CPU 0

halt code = 5
HALT instruction executed
PC = 20000030
>>>

It seems the bootstrap code has been overwritten somehow. How can I restore my old bootstrap code?

Any idea would be appreciated.
Thank you in advance.

Regards,
Zoltan Arpadffy
19 REPLIES 19
Michael Schulte zur Sur
Honored Contributor

Re: bootstrap code problem

Hi,

you could boot from cdrom and check/restore the system drive. Maybe due to the new disk the disk names have changed.

greetings,

Michael
Johan Brusche
Honored Contributor

Re: bootstrap code problem


Zoltan,

First, it would help us to respond more precisely if you had specified the Tru64 version.

Second, the displayed message shows that the primary bootblock is still intact, but the secondary bootstrap file osf_boot in the a-partition is apparently lost.

As Michael said in the previous entry, you should boot from CD, try to mount the filesystem in the a-partition of the bootdef_dev, and use the commands verify and fixfdmn to see if it can be recovered. If not, vrestore is the only option.

BTW, under V5.x disk names are based on WWIDs and stay the same whether you change the drive slot or add another drive.

Rgds,
__ Johan.

_JB_

Re: bootstrap code problem

Hello,

thank you for the prompt response. Unfortunately, I have been unable to fix it so far.

I was able to boot from the install CD. Unfortunately, the install procedure could not recognize the old root partition.

However, I am able to start a shell from the install procedure.

I wish I had enough knowledge to perform the task that you described, but I do not even know how to access my dsk0 from this state.

Below you can find one-year-old information from the box (it actually had more than one year of uptime last night, Saturday, when I sc*d up).

# uerf -R -o full
uerf version 4.2-011 (122)


********************************* ENTRY 1. *********************************

----- EVENT INFORMATION -----

EVENT CLASS OPERATIONAL EVENT
OS EVENT TYPE 300. SYSTEM STARTUP
SEQUENCE NUMBER 0.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Sat Feb 26 21:36:19 2005
OCCURRED ON SYSTEM tru64
SYSTEM ID x0006000D CPU TYPE: DEC 7000
SYSTYPE x0000002A
MESSAGE Alpha boot: available memory from
_0x1166000 to 0xfffe000
Compaq Tru64 UNIX V5.1B (Rev. 2650);
_Thu Feb 24 00:03:25 CET 2005
physical memory = 256.00 megabytes.
available memory = 238.59 megabytes.
using 943 buffers containing 7.36
_megabytes of memory
Firmware revision: 7.0
PALcode: UNIX version 1.46
AlphaStation 255/233
DECchip 21071
82378IB (SIO) PCI/ISA Bridge
pci0 (primary bus:0) at nexus
Loading SIOP: script 800000, reg
_82040000, data 4116c000
scsi0 at psiop0 slot 0 rad 0
isa0 at pci0
gpc0 at isa0
gpc1 not probed
ace0 at isa0
ace1 at isa0
lp0 at isa0
fdi0 at isa0
tga0 at pci0 slot 13
tga0: depth 8, map size 4MB, 1024x768
tga0: ZLXp-E
tu0: DECchip 21040: Revision: 2.4
tu0 at pci0 slot 14
tu0: DEC TULIP (10Mbps) Ethernet
_Interface, hardware address:
_00-00-F8-21-CF-61
tu0: console mode: selecting 10BaseT
_(UTP) port: half duplex
kernel console: tga0
dli: configured
NetRAIN configured.
Random number generator configured.

Filesystem        1k-blocks     Used  Available  Use%  Mounted on
root_domain#root     393216   188572     196384   49%  /
usr_domain#usr      5949912  2158962    3760072   36%  /usr
var_domain#var      2150400    29977    2113720    1%  /var

#iostat
      tty      dsk0      dsk1        cpu
 tin tout   bps  tps   bps  tps  us ni sy id
   0    8   160   11     0    0  20  0 12 68

Thank you very much for taking this case seriously.

Best regards,
Z
Johan Brusche
Honored Contributor

Re: bootstrap code problem


Z,

First check with "hwmgr show scsi" that you see the disk on SCSI bus 0, at some target, LUN 0, e.g. PATH [0/0/0] or [0/1/0].

Then run "dsfmgr -vVF" to check all special device files are present.

Now run:
cd /etc/fdmns
mkdir tmproot
cd tmproot
ln -s /dev/disk/dsk0a
cd /
mount tmproot#root /mnt

If the root fileset is badly damaged, it might not mount and could cause an AdvFS panic.
In that case you will have to recreate it with the commands mkfdmn and mkfset.

If you do not have another running Tru64, you still can consult the manpages on:
http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V51B_HTML/REF_LIB.HTM
Or on your laptop, via the doc-CD that was delivered with Tru64 V5.1B.

Good luck,
__ Johan.

_JB_
Stiwi Wondrusch
Trusted Contributor

Re: bootstrap code problem

Hi Zoltan

After you boot from CD ROM:
Give us output from:
hwmgr show scsi
disklabel dskX (where X is the number of the boot disk)

rgds Stiwi

Re: bootstrap code problem

Hello,

thank you for providing good ideas, but I have not been able to restore my boot disk.

Here is the output of the commands:
(I am typing by hand, so the information might not be exactly as it appears on the screen)

#hwmgr sh scsi

HWID  DEVICEID  TYPE   SUBTYPE  OWNER  PATH  DEVICE FILE  VALID PATH
31    0         disk   none     0      1     dsk0         [0/1/0]
32    1         disk   none     0      1     dsk1         [0/3/0]
33    2         cdrom  none     2      1     cdrom0       [0/4/0]

#disklabel dsk0
#/dev/rdisk/dsk0c
type: SCSI
... lot of other data - I guess you are interested in partitions.

8 partitions:
#        size  fstype
a:     786432  AdvFS
b:     786432  swap
c:             unused
d:             unused
e:             unused
f:             unused
g:    4300800  AdvFS
h:   11899836  AdvFS

Also checked my root_domain

# ls /etc/fdmns
.advfslock_fdmns root_domain

# ls /etc/fdmns/root_domain
dsk0a

#mount root_domain#root /mnt
root_domain#root on /mnt: Device does not contain a valid ADVFS file system

I simply cannot believe that plugging in a new disk could have such disastrous consequences.

Thank you for your help. Please do not give up; I have not.

Best regards,
Z
Stiwi Wondrusch
Trusted Contributor

Re: bootstrap code problem

Hi Zoltan

_Maybe_ "man salvage" is your last hope.

rgds stiwi
Michael Schulte zur Sur
Honored Contributor

Re: bootstrap code problem

Hi,

please post
disklabel -r dsk0
ls -l /etc/fdmns/root_domain
/sbin/advfs/advscan dsk0

thanks,

Michael

Re: bootstrap code problem

hello,

here is the requested data:

#disklabel -r dsk0
#/dev/rdisk/dsk0c
type: SCSI
... lot of data - I guess you are interested in partitions (please specify if other)

8 partitions:
#        size  fstype
a:     786432  AdvFS
b:     786432  swap
c:   17773500  unused
d:    5400212  unused
e:    5400212  unused
f:    5400212  unused
g:    4300800  AdvFS
h:   11899836  AdvFS

#ls -l /etc/fdmns/root_domain
dsk0a->/dev/disk/dsk0a

#/sbin/advfs/advscan dsk0
Scanning devices /dev/rdisk/dsk0

Found domains:

*unknown*
Domain ID some number
Created Feb 23 2005
Domain volumes 1
/etc/fdmns links 0
Actual partition found dsk0g*

*unknown*
Domain ID some number
Created Feb 23 2005
Domain volumes 1
/etc/fdmns links 0
Actual partition found dsk0h*
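
A rough cross-check of the two listings, sketched in Python. It assumes that disklabel reports sizes in 512-byte sectors (an assumption, although the numbers are consistent with it), so halving each size gives 1k-blocks that can be compared with the df output posted earlier. The helper names and the 16-block tolerance are made up for illustration. On that reading, the two *unknown* domains that advscan found on dsk0g and dsk0h would appear to be the old var and usr domains, and only the a-partition (root) no longer shows up as an AdvFS domain.

```python
# Assumption: disklabel sizes are counted in 512-byte sectors.
SECTOR_BYTES = 512

# Sizes copied from the "disklabel -r dsk0" output above (AdvFS partitions only).
DISKLABEL_SECTORS = {"a": 786432, "g": 4300800, "h": 11899836}

# 1k-block totals copied from the year-old df output earlier in the thread.
DF_1K_BLOCKS = {
    "root_domain#root": 393216,
    "usr_domain#usr": 5949912,
    "var_domain#var": 2150400,
}


def sectors_to_kib(sectors):
    """Convert a sector count to 1 KiB blocks."""
    return sectors * SECTOR_BYTES // 1024


def match_partitions(labels, filesets, tolerance_kib=16):
    """Pair each partition with the fileset whose size is within the tolerance."""
    matches = {}
    for part, sectors in labels.items():
        kib = sectors_to_kib(sectors)
        for fileset, blocks in filesets.items():
            if abs(kib - blocks) <= tolerance_kib:
                matches[part] = fileset
    return matches


if __name__ == "__main__":
    for part, fileset in match_partitions(DISKLABEL_SECTORS, DF_1K_BLOCKS).items():
        print(f"dsk0{part} -> {fileset}")
```

This only compares sizes; it says nothing about whether the data inside each domain is intact.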

Thank you.

Regards,
Z

Re: bootstrap code problem

Does anybody have an idea... or do I need to reinstall the whole box? :(

Thank you.

Regards,
Z
david puzycki
New Member

Re: bootstrap code problem

Please help, I am in the same mess as described in this thread. I also swapped disk drives and now encountered the same error message "can't open osf_boot".

I have booted from the CD-ROM and have run "hwmgr show scsi":

HWID Device ID
59 0 disk dsk0 [0/1/0]
60 1 disk dsk1 [0/2/0]
61 2 disk dsk2 [0/3/0]
62 3 disk dsk3 [0/4/0]
63 4 disk cdrom0 [0/1/0]

The command /sbin/advfs/advscan dsk0
displays: "can't open device rdsk0a"

disklabel on rdisk shows 8 partitions, but disklabel on dsk0 or dsk0a returns an error.

If anyone can please help me out of this jam it would be appreciated!

Thanks Dave

Re: bootstrap code problem

Dave,

I do not want to disappoint you, but none of the ideas listed above helped.

I was forced to reinstall the box from scratch and restore data from my latest backup.

Sorry about the bad news. I could not believe either that this could be the only solution, but nothing else worked. :(
jim owens_1
Valued Contributor

Re: bootstrap code problem

First check to see if you have an ID conflict:

59 0 disk dsk0 [0/1/0]
63 4 disk cdrom0 [0/1/0]

The 0/1/0 is the bus/target/lun and cannot be the same for two real SCSI devices. However, this would be OK if the cdrom is an IDE device, because then hwmgr is just reporting a fake ID.

Use the console "show config" to see.
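
The same check can be sketched in Python for a longer listing. This is only an illustration, assuming each row ends with the device name followed by the path in [bus/target/lun] form, as in the rows Dave typed; the function name and the regular expression are made up for the example and are not part of hwmgr.

```python
import re
from collections import defaultdict

# Rows copied from the hand-typed "hwmgr show scsi" output of the failing system.
HWMGR_ROWS = """\
59 0 disk dsk0 [0/1/0]
60 1 disk dsk1 [0/2/0]
61 2 disk dsk2 [0/3/0]
62 3 disk dsk3 [0/4/0]
63 4 disk cdrom0 [0/1/0]
"""

# Device name followed by a [bus/target/lun] path.
ROW = re.compile(r"(?P<name>\S+)\s+\[(?P<bus>\d+)/(?P<target>\d+)/(?P<lun>\d+)\]")


def find_path_conflicts(text):
    """Return {(bus, target, lun): [device names]} for paths listed more than once."""
    by_path = defaultdict(list)
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            key = (int(match["bus"]), int(match["target"]), int(match["lun"]))
            by_path[key].append(match["name"])
    return {path: names for path, names in by_path.items() if len(names) > 1}


if __name__ == "__main__":
    for path, names in find_path_conflicts(HWMGR_ROWS).items():
        print("shared bus/target/lun", path, "->", ", ".join(names))
```

For the rows above it reports dsk0 and cdrom0 sharing 0/1/0. As noted, a shared path is only a real conflict if both devices are actual SCSI devices.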
david puzycki
New Member

Re: bootstrap code problem

Jim,

Interesting, the working system shows the following:

HWID Device ID
59 0 disk dsk0 [0/0/0]
60 1 disk dsk1 [0/2/0]
61 2 disk dsk2 [0/3/0]
62 3 disk dsk3 [0/4/0]
63 4 cdrom cdrom0 [1/0/0]

So there is apparently a conflict between dsk0 and the cdrom on the non-working system. Can you help me resolve it?

Best Regards...Dave

Re: bootstrap code problem

Dave,

try plugging the cdrom device into another bus, or change its SCSI ID to 5 or 6 (with the jumpers on the cdrom).
This should resolve your problem.

Regards,
Z
Johan Brusche
Honored Contributor

Re: bootstrap code problem


There is no conflict at all: the cdrom is on bus #1, the other disks are on bus #0.

Get your latest vdump tape for root, boot from CD, recreate the domain and fileset with mkfdmn (root_domain) and mkfset (root), and then vrestore.

Rgds,

___ Johan.

_JB_
Michael Schulte zur Sur
Honored Contributor

Re: bootstrap code problem

Johan,

the second output was the one that is working. The first output looks a bit different. There it seems there is a collision.

Michael
david puzycki
New Member

Re: bootstrap code problem

Yes, I believe there is a SCSI conflict, based on the hwmgr output of the system that is working.

How can I resolve my SCSI ID conflict? Is there a jumper on the cdrom to set a specific ID, or should I physically move the cdrom to a different location?

Tks..Dave
jim owens_1
Valued Contributor

Re: bootstrap code problem

The answer to change slot/cable/jumper depends on the hardware configuration and you have not told us what it is, not even the system model. I really want to see the full console "show config" output.

With no info to go on, I might guess that the easiest option is to move dsk0 to another slot so that its SCSI ID changes.