
Re: Cannot form DSA0:

 
SOLVED
FOX MULDER_2
Frequent Advisor

Re: Cannot form DSA0:

Hi,
I am sending the output.


Thanks
Jur van der Burg
Respected Contributor

Re: Cannot form DSA0:

00000000 0000353A 04000180 00000000

The 04000180 is a flags longword, and it indicates that the dumpfile write has not completed. It could be that a full dump instead of a partial dump was selected (system parameter dumpstyle) and that the dumpfile is too small. You may want to check that bit 0 is set in dumpstyle (enable partial dumps), and/or you may need to increase the dumpfile size.

Jur.

Andreas Vollmer
Valued Contributor

Re: Cannot form DSA0:

Hello Fox,

The boot flags (fl) you used are unusual, at least to me.
Please explain what they mean.

Please try the following:
b -fl 0,1 dkb0
- See what happens.
- If the system crashes again, try the following:

b -fl 0,1 dkb0
at the sysboot prompt:
show /cluster
show shadow

If possible, disable shadowing for troubleshooting purposes:
set shadowing 0
set startup_p1 "min"
cont

If the system is now bootable, then we have to check the system parameters.

The attachment contains an example of the mentioned system parameter settings.

--------------------------
Here are some references for the SRM boot options.

http://h18002.www1.hp.com/alphaserver/docs/userguide/WebHelp/boot_flags_settings_table.htm

http://h18002.www1.hp.com/alphaserver/docs/userguide/WebHelp/boot_osflags.htm

Flags Value (hex)   Bit Number   Meaning
1                   0            Bootstrap conversationally (enables you to modify SYSGEN parameters in SYSBOOT).
2                   1            Map XDELTA to a running system.
4                   2            Stop at initial system breakpoint.
8                   3            Perform diagnostic breakpoints.
10                  4            Stop at the bootstrap breakpoints.
20                  5            Omit header from secondary bootstrap file.
80                  7            Prompt for the name of the secondary bootstrap file.
100                 8            Halt before secondary bootstrap.
10000               16           Display debug messages during booting.
20000               17           Display user messages during booting.
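
To read a flags value against the table above without counting bits by hand, a small decoder can help. This is only a sketch in Python, meant to be run on any workstation rather than on the VMS system: the function name decode_boot_flags is made up for illustration, and it assumes the usual SRM form "-fl root,flags" with both fields in hexadecimal.

```python
# Boot flag bits copied from the table above (values are hexadecimal).
BOOT_FLAGS = {
    0x1: "Bootstrap conversationally",
    0x2: "Map XDELTA to a running system",
    0x4: "Stop at initial system breakpoint",
    0x8: "Perform diagnostic breakpoints",
    0x10: "Stop at the bootstrap breakpoints",
    0x20: "Omit header from secondary bootstrap file",
    0x80: "Prompt for the name of the secondary bootstrap file",
    0x100: "Halt before secondary bootstrap",
    0x10000: "Display debug messages during booting",
    0x20000: "Display user messages during booting",
}


def decode_boot_flags(fl):
    """Split a 'root,flags' string (both hex) into (root, list of meanings).

    Hypothetical helper for illustration; bits that are not in the
    table above are silently ignored.
    """
    root_text, flags_text = fl.split(",")
    root = int(root_text, 16)
    flags = int(flags_text, 16)
    meanings = [text for bit, text in BOOT_FLAGS.items() if flags & bit]
    return root, meanings


if __name__ == "__main__":
    # The value suggested in this post: root 0, conversational boot.
    print(decode_boot_flags("0,1"))
```

With "0,1" this reports system root 0 and a conversational bootstrap, which is what gets you to the SYSBOOT prompt.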

Regards
Andreas
OpenVMS Forever!
Volker Halle
Honored Contributor

Re: Cannot form DSA0:

Fox,

04000180

This indicates:

ERRLOGCOMP - errlog write complete
WRITECOMP is not set - dump write not completed
DUMPSTYLE = 1

Did you halt the system while the dump was being written ?

Volker.
FOX MULDER_2
Frequent Advisor

Re: Cannot form DSA0:

Hi,
No, a halt was not issued at any time.

The sysgen parameters are:
SYSGEN> SH DUMP
Parameter Name    Current   Default      Min.      Max.  Unit     Dynamic
--------------    -------   -------   -------   -------  ----     -------
DUMPSTYLE               1         0         0        -1  Bitmask
DUMPBUG                 1         1         0         1  Boolean
SYSGEN>

Thanks
Steve-Thompson
Regular Advisor

Re: Cannot form DSA0:

Hi Fox

(good choice of name, you've got us all foxed).

We're going to have to introduce a "variable" here to see if we can get a different clue.

The system was working .... it crashed ... and since then, nothing!

What architecture are the Alpha and the disk system? I'll assume it's "old", say an A1200, given you are using 6.2.

Did you try ...
Booting from a different system disk and mounting your "problem pair" as a shadow?
(See Wim's addition about experiencing a similar crash under 7.3. This test would avoid changing parameters at "sysboot" time.)

Failing that, here are some ideas that may help you ...
Did you add any other HW to the system beforehand?
Reduce the disks on the bus to a minimum (the 2 VMS members)!
Check the SCSI bus termination!
Can you put the disks on a "different" bus (a different system would be better) and retest?
Does a Backup of one or both disks work? (I.e. no dodgy tracks.)

Something to keep you going over the weekend.

Good luck
FOX MULDER_2
Frequent Advisor

Re: Cannot form DSA0:

Hi,

The system is an Alpha 2100 and has HSZ40 controllers (2).

It was working fine until it crashed.
I also tried booting from the second disk; that did not work.

Next, keeping the second member offline, I restored the system disk image from tape to the primary disk. The restore was successful, but the same error was encountered.
Also, I booted from CD and tried
Mount/over=(id,shadow) DKB0:
........
But the same error....

No hardware or other changes were made to the system recently.

When booted using
shadow_sys_disk 0
the system works perfectly.
The malfunction is seen only when shadow_sys_disk is set to 1.
As the system is at a remote location,
reducing it to 2 members and making other hardware changes may be difficult.

As this is a production system, not much experimentation is possible either.

I am planning to patch this weekend, but that is not yet approved.

Thanks


Volker Halle
Honored Contributor

Re: Cannot form DSA0:

Fox,

from your console printout of the crash, it looks like writing the dump has worked. But the WRITECOMP bit is not set, so SDA complains. I have a program to re-validate dumps, maybe we could try that, if everything else fails. The key for identifying the problems is in the dump.

You could also set DUMPSTYLE=3, this would give more console output from the crash (including registers and stack).
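
As a rough aid for reading DUMPSTYLE values, here is a minimal Python sketch that decodes only the two bits discussed in this thread. Bit 0 (partial dumps) is as described earlier in the thread; the meaning given for bit 1 is an assumption inferred from the DUMPSTYLE=3 suggestion above. Any higher bits are deliberately left undecoded, and the name decode_dumpstyle is hypothetical.

```python
# Only the two DUMPSTYLE bits mentioned in this thread are listed.
# Bit 1's meaning is an assumption inferred from "DUMPSTYLE=3 gives
# more console output"; other bits are intentionally not decoded.
DUMPSTYLE_BITS = {
    0: "partial dump enabled",
    1: "more console output from the crash (registers and stack)",
}


def decode_dumpstyle(value):
    """Return the descriptions of the known bits set in a DUMPSTYLE value."""
    return [text for bit, text in DUMPSTYLE_BITS.items() if value & (1 << bit)]


if __name__ == "__main__":
    for value in (1, 3):
        print(value, decode_dumpstyle(value))
```

Read this way, the current setting of 1 has only the partial-dump bit set, and 3 adds the extra console output.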

Volker.
Volker Halle
Honored Contributor

Re: Cannot form DSA0:

Fox,

there are one or two known possible problems, which may lead to a SSRVEXCEPT crash in STACONFIG.EXE

Please do a DIR/SIZ=ALL DKB0:SYSMGR.DIR - what is the size of this directory ?

Please do an ANAL/IMAGE/INT of all SYS$LIBRARY:*$ICBM.EXE files and check the system version array information (1st page). Is any Current System value greater than the Image value ?

Volker.
John Koska
Advisor

Re: Cannot form DSA0:

A long shot, but boot from DKB0: and then do $ANAL/DISK/REPAIR on it to make sure the disk structure is ok.

Then defrag the disk using either online defrag, if available, or boot from CD and do tape backup/image and restore to defrag.

Boot from DKB0: again and mount the other system disk shadow member as an unshadowed disk and likewise $ANAL/DISK/REPAIR to see if there is consistent disk structure.

Then boot to try to form shadowed system disk. If still a problem, boot from DKB0: and see if SYSMAN> IO AUTO runs clean and to completion.

Cleaning up the system disk may provide for a bit different timing in configuring devices and what not, to get by the problem.

If it does not fix the problem, at least you have a cleaned-up and defragmented system disk, so things will run a bit better after you find the root cause.

:) jck
Volker Halle
Honored Contributor

Re: Cannot form DSA0:

jck,

Fox has reported that the copy of the system disk as restored from tape showed the same problem. This is as good as defragmenting the disk.

ANAL/DISK is a good check for any unusual problems in the file system.

Volker.
John Koska
Advisor

Re: Cannot form DSA0:

My bad on missing the tape restore in the thread.

I suppose a fresh operating system install on one of the unshadowed disks, and copying over all the site-specific files from the other would be a possibility also, but a lot of work.

:) jck
FOX MULDER_2
Frequent Advisor

Re: Cannot form DSA0:

Hi,
The output is :
Directory DKB0:

SYSMGR.DIR;1 130/162

Total of 1 file, 130/162 blocks.


I have also tried anal/disk/repair
but no luck.

Thanks
Ian Miller.
Honored Contributor

Re: Cannot form DSA0:

Have you checked the HSZ40 to see if it has reported any errors?

RUN FMU
SHOW LAST ENTRY

(or something like that)
____________________
Purely Personal Opinion
Volker Halle
Honored Contributor

Re: Cannot form DSA0:

Fox,

>>> SYSMGR.DIR;1 130/162 <<<

That's one of the possible culprits ! SYSMGR.DIR exceeding 128 blocks !

Try to get rid of logfiles and temporary files in that directory and compress it using DFU !

Volker.
Volker Halle
Honored Contributor

Re: Cannot form DSA0:

Fox,

some background information about STACONFIG:

STACONFIG is used to dynamically load the class drivers for MSCP-served disks and tapes. It also loads port drivers when in a cluster or booting from a HBS shadowed system disk. It loads PEDRIVER and LAN drivers when required to join a cluster. It uses the primitive file system.


Assuming you're not running in a cluster, this would explain, why you don't get the crash when booting with SHADOW_SYS_DISK=0 - STACONFIG would not be used then.

The primitive file system drivers in OpenVMS Alpha V6.2 seem to have a problem with handling directory file sizes beyond 128 blocks. This would also explain, why the problem persists after restoring the system disk from your backup tape and also why the problem has 'suddenly' shown up (once the SYSMGR.DIR exceeded 128 blocks).
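
For anyone who wants to screen saved DIR/SIZE=ALL output for this condition, a minimal Python sketch follows. It assumes the "name used/allocated" line layout shown in Fox's listing and assumes that the used size (the first number) is the one that matters; the thread does not say whether used or allocated is decisive. The names parse_dir_size and exceeds_limit are hypothetical.

```python
import re

# Threshold named in this thread for the V6.2 primitive file system.
DIR_LIMIT_BLOCKS = 128


def parse_dir_size(line):
    """Return (name, used, allocated) from a 'NAME;ver used/alloc' line."""
    match = re.match(r"\s*(\S+)\s+(\d+)/(\d+)\s*$", line)
    if match is None:
        raise ValueError("not a DIR/SIZE=ALL file line: %r" % line)
    return match.group(1), int(match.group(2)), int(match.group(3))


def exceeds_limit(line, limit=DIR_LIMIT_BLOCKS):
    """True if the used size is above the limit (assumption: used, not allocated)."""
    _, used, _ = parse_dir_size(line)
    return used > limit


if __name__ == "__main__":
    sample = "SYSMGR.DIR;1 130/162"  # the line posted earlier in the thread
    print(parse_dir_size(sample), exceeds_limit(sample))
```

The line posted earlier (130/162) is flagged, while a directory at exactly 128 used blocks would not be, matching the "exceeding 128 blocks" wording.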

Just hoping that this scenario fits your problem.

Volker.
Andreas Vollmer
Valued Contributor

Re: Cannot form DSA0:

Hello Fox,

I assume Volker is right.
A long time ago I experienced that kind of problem, but with OpenVMS V6.2-1H4.
DFU was my lifebelt.
I thought that problem had been fixed!
That is why this problem didn't cross my mind.
Here is the link for downloading:
http://www.digiater.nl/dfu.html

As Volker mentioned, purge the files and remove the log files.
You will need a spare disk; if you have one, install DFU on that disk.
Mount the system disk with MOUNT /OVER=(ID,SHADOW) $1$DKB0:
Then use DFU to defragment the directory file
SYSMGR.DIR.
DFU has a self-explanatory interface. But, as always, make a safety backup of that disk first, just in case.
Regards
Andreas
OpenVMS Forever!
Jur van der Burg
Respected Contributor

Re: Cannot form DSA0:

DFU may be an option, but if you don't have it installed you can defrag the dir by hand:

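$! Create an empty replacement directory
$ crea/dir/own=system [sys0.sysmgr1]/prot=(gr:re,wo:re,ow:rwe,sy:rwe)
$! Move every version of every file into the new directory
$ rename [sys0.sysmgr]*.*;* [sys0.sysmgr1]
$! Grant delete access on the old directory file so it can be removed
$ set prot=wo:rwed [sys0]sysmgr.dir
$! Delete the old, now empty, oversized directory file
$ delete [sys0]sysmgr.dir;
$! Give the new directory the original name
$ rename [sys0]sysmgr1.dir sysmgr.dir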

Don't do this with activity on the system.

Jur.
Andreas Vollmer
Valued Contributor

Re: Cannot form DSA0:

Hi Fox,

In general, an absolutely correct suggestion from Jur, but...
If you have sub-directories under SYSMGR, these might NOT be renamed to the new directory.
DIRECTORY/FULL or DIRECTORY/SECURITY gives you more detailed information about the SYSMGR directory. This helps you verify the "repair" actions below.
Therefore I suggest:

$ DIRECTORY /FULL [sys0]sysmgr.dir
$ DIRECTORY /SECURITY [sys0]sysmgr.dir
$ create /direc /owner=system [sys0.sysmgr1] /prot=(gr:re,wo:re,ow:rwe,sy:rwe)
$ rename [sys0.sysmgr...]*.*;* [sys0.sysmgr1]
$ set prot=world:rwed [sys0]sysmgr.dir
$ delete /LOG /CONFIRM [sys0]sysmgr.dir;
$ rename /LOG [sys0]sysmgr1.dir sysmgr.dir
$ DIRECTORY /SECURITY [sys0]sysmgr.dir
$ DIRECTORY /FULL [sys0]sysmgr.dir

Regards
Andreas
OpenVMS Forever!
Jur van der Burg
Respected Contributor

Re: Cannot form DSA0:

>In case you have some sub-directories SYSMGR these might NOT be renamed to the new directory.

Eh? Not true. The rename command will move any subdirectories as well. The only thing you might run into is a protected subdir, so do it from the system account or an account with BYPASS privilege and it will work.

Jur.