Operating System - OpenVMS
Backup Issues

 
SOLVED
Hoff
Honored Contributor

Re: Backup Issues

An errant MOUNT or BACKUP command risks either failing to make a proper backup or overwriting existing disk data.

This is one of the most arcane and cryptic areas of OpenVMS, and there are very few "blade guards" here; data can get clobbered. BACKUPs can get trashed.

For purposes of a hand-entered BACKUP command, you need not use the logical name on the MOUNT command.

The logical name parameter on MOUNT is useful within a command procedure however, as it can be used as the target for all subsequent device references within the procedure. MYCORP_TARGET and MYCORP_SOURCE could be used as the logical names for the output and input devices, for instance.
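As a rough sketch of that pattern, a procedure might mount both devices with logical names and then refer only to those names. The device names DKA100: and MKA500:, the label USERDISK, and the save-set name NIGHTLY.BCK are hypothetical placeholders, not taken from the system discussed in this thread. With /FOREIGN the label is not checked, so any placeholder text can fill the label position ahead of the logical name.

```dcl
$! Sketch only: DKA100:, MKA500:, USERDISK and NIGHTLY.BCK are hypothetical.
$! Try this on a test configuration first, never on production volumes.
$ MOUNT DKA100: USERDISK MYCORP_SOURCE        ! input disk, label must match
$ MOUNT/FOREIGN MKA500: SCRATCH MYCORP_TARGET ! output tape, label position is a placeholder
$ BACKUP/IMAGE/VERIFY MYCORP_SOURCE: MYCORP_TARGET:NIGHTLY.BCK/SAVE_SET
$ DISMOUNT MYCORP_TARGET:
$ DISMOUNT MYCORP_SOURCE:
```

Every later device reference goes through the two logical names, so changing hardware means editing only the two MOUNT lines.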

The volume label must match the label recorded on the volume, unless you use MOUNT /OVERRIDE=IDENTIFICATION or (as is often the case with BACKUP) MOUNT /FOREIGN.
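The three cases might look roughly like this. These are alternatives, not a sequence to run in order, and the device names and label are hypothetical.

```dcl
$! Sketch only: DKA200:, MKA500: and USERDISK are hypothetical names.
$ MOUNT DKA200: USERDISK                  ! label given must match the volume's label
$ MOUNT/OVERRIDE=IDENTIFICATION DKA200:   ! skips the label check
$ MOUNT/FOREIGN MKA500:                   ! volume is not treated as a Files-11 volume
```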

Are there any bound-volume sets here?

Again, this area is data-hazardous. Please take the time to read and understand the DCL command syntax, and please consider practicing in a testing configuration where an errant command can trash data without harm to production. Do also look at the BACKUP command examples in the back of the BACKUP manual, as these provide many examples of the various sequences with BACKUP, and you can find and choose the particular command associated with what you want to do.

There are example BACKUP command procedures around that can be used as starting points, as well.

As for another potential hazard here, I see Rdb referenced. (Rdb itself isn't hazardous, but there are specific RMU commands needed to perform a successful and restore-able backup of an Rdb database. You can't use OpenVMS BACKUP directly on an Rdb database and expect to restore the database.)

And FWIW, the existing backup archives here potentially (probably?) contain silent data corruptions, too. (Those file interlocks that are being overridden were implemented for a reason, after all. Not because the engineers wanted to force folks to use another qualifier keyword on BACKUP.)

This whole area is comparatively ancient technology -- and not all that much past what RSX11M+ implemented. The UI and the tools are such that it is accordingly very easy to unintentionally corrupt critical data.

Stephen Hoffman
HoffmanLabs LLC
odwillia
Frequent Advisor

Re: Backup Issues

This is what I get when I do a show dev d:

DPA300: (NRCAVB) MntVerifyTimeout 3598 DEVDISK1 3758040 39 1
DPA301: (NRCAVB) MntVerifyTimeout 3598 DEVDISK2 646806 15 1

Volker Halle
Honored Contributor

Re: Backup Issues

So you're running 'HP RAID Software for OpenVMS'. DPA devices are software RAID devices.

You need to use appropriate $ RAID SHOW commands to find out about the structure of your RAID sets and the physical disks involved.

You can use $ RAID ANALYZE/ERROR_LOG to find RAID-related errlog entries.

There should also be a SYS$MANAGER:RAID$DIAGNOSTICS_nodename.LOG file with diagnostic messages.
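A minimal sketch of collecting that information, assuming the diagnostics file sits in SYS$MANAGER under the name pattern given above (the wildcard stands in for the node name):

```dcl
$! Sketch only: the wildcard replaces the node name in the file name.
$ RAID ANALYZE/ERROR_LOG
$ DIRECTORY/DATE SYS$MANAGER:RAID$DIAGNOSTICS_*.LOG
$ TYPE/PAGE SYS$MANAGER:RAID$DIAGNOSTICS_*.LOG
```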

Volker.
odwillia
Frequent Advisor

Re: Backup Issues

I have attached the show raid results.
Volker Halle
Honored Contributor

Re: Backup Issues

This looks like a RAID 0 array configured over 6 disks of a Mylex controller, which has been partitioned into 3 virtual units.

All the units have had a lot of errors reported against them.

Please look at the diagnostics file mentioned earlier and try to find out what happened when.

The HP RAID Software for OpenVMS - Guide to Operations can be found here:

http://h30266.www3.hp.com/odl/vax/sysman/raidv30/raid_ops_guide.pdf

You will at least need to dismount the DPA: devices, which are in MntVerifyTimeout, then re-mount them with the MOUNT commands to be found somewhere in your system startup procedures. But first try to find out what happened.

Consider obtaining qualified help to prevent damage to your data if you're uncertain what to do.

Volker.
odwillia
Frequent Advisor

Re: Backup Issues

$ SEARCH SYS$STARTUP:*.COM MOUNT,DEVSCRATCH1/MATCH=and
%SEARCH-I-NULLFILE, file SYS$SYSROOT:[SYSMGR]ADDOPER.COM;2 contains no records
%SEARCH-I-NULLFILE, file SYS$SYSROOT:[SYSMGR]ADDSYS.COM;2 contains no records

I got this error when I attempted to remount.

******************************
SYS$SYSROOT:[SYSMGR]NRC_MOUNT_DISKS.COM;21

$ Mountxx/noass/sys/rebuild DPA302: DEVSCRATCH1 DEVSCRATCH1

******************************
SYS$COMMON:[SYSMGR]SYSHUTDWN.COM;18

$ dismountxx/abort/over=check DISK$DEVSCRATCH1 !dpa302:
$ Mountxx/noass/sys/rebuild DPA302: DEVSCRATCH1 DEVSCRATCH1
%MOUNT-F-MEDOFL, medium is offline
Volker Halle
Honored Contributor

Re: Backup Issues

This may be a software or hardware problem. Use RAID ANAL/ERR, OPERATOR.LOG and SYS$MANAGER:RAID$DIAGNOSTICS_nodename.LOG to find out when and how this problem started. This may tell you what has failed and when.

Can you currently access DPA300: and DPA301: without problems? Is only DPA302: giving you problems?

Did you look at the drives behind the Mylex controller? Any yellow or red lights?

Volker.
odwillia
Frequent Advisor

Re: Backup Issues

No yellow or red lights. I'm looking for the error logs now.
Volker Halle
Honored Contributor

Re: Backup Issues

When re-reading your previous replies, I see that all 3 DPA devices are/were in MntVerifyTimeout. This most likely indicates a problem with the Mylex controller itself or maybe the shelf the disks are in (power failure?).

As you are using a partitioned RAID 0 stripeset, the failure of ANY physical DRA disk will cause the whole array to become inoperative!

First make sure to check for your last GOOD backup of these 3 DPA: devices!

Volker.
Guenther Froehlin
Valued Contributor

Re: Backup Issues

I recommend doing a DISMOUNT/CLUSTER/ABORT for all three DPA devices.

If that succeeds, do a RAID UNBIND of the array.

Did all DRA devices dismount? If not, issue DISMOUNTs for the DRA devices not dismounted yet.

Mount all DRA devices with MOUNT/OVER=ID/NOASSIST. If that fails, fix the underlying DRA problem.

If that worked for all DRA devices, dismount them and do the RAID BIND command (parameters are somewhere in SYSTARTUP:*.COM - hopefully).
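Put together, the sequence above might look roughly like the following. MYARRAY and DRA0: are placeholders; the real array ID and member device names have to come from RAID SHOW and the site's startup procedures, and the RAID BIND parameters are deliberately left out because they are site-specific.

```dcl
$! Sketch only: MYARRAY and DRA0: are placeholders, not names from this system.
$ DISMOUNT/CLUSTER/ABORT DPA300:
$ DISMOUNT/CLUSTER/ABORT DPA301:
$ DISMOUNT/CLUSTER/ABORT DPA302:
$ RAID UNBIND MYARRAY
$ SHOW DEVICE DRA                                ! are any member disks still mounted?
$ MOUNT/OVERRIDE=IDENTIFICATION/NOASSIST DRA0:   ! repeat for each member disk
$ DISMOUNT DRA0:                                 ! repeat for each member disk
$! Then RAID BIND, using the parameters from the startup procedures.
```

If any member disk fails to mount, stop there and investigate that disk before attempting the bind.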

/Guenther