Shadowing on a system disk.

 
Hoff
Honored Contributor

Re: Shadowing on a system disk.

TKelly, please post the errors.

Correctly-functioning shadowing does not log errors, and initializing disks for inclusion into a shadowset is not particularly relevant to the generation of errors.

Please post the errors.

If the errors are on the system disk or a critical data disk, please ensure you have a restorable copy, created with the system disk off-line (that is, while booted from the distro media or from a spare system disk).

Please post the errors.

This could be a simple case of an incorrectly-terminated SCSI bus or a bad cable or such, or there could be a disk error or controller error or other lower-level issue lurking. Shadowing does put a substantial I/O load on the hardware, and does tend to expose lower-level hardware problems.

Please post the errors.

Volker Halle
Honored Contributor

Re: Shadowing on a system disk.

Jim,

BACKUP/PHYSICAL requires downtime, as the source disk needs to be mounted /FOREIGN.

TKelly,

INIT/ERASE is not generally necessary before adding a new disk to a shadowset, though an INIT disk: SCRATCH before adding it to a shadowset generally makes sense.

In this case, where you do not trust DKA0: anymore, INIT/ERASE DKA0: SCRATCH would at least write to ALL the blocks of the replacement disk to make sure the disk is o.k.

Volker.
Jim_McKinney
Honored Contributor

Re: Shadowing on a system disk.

> BACKUP/PHYSICAL requires downtime, as the source disk needs to be mounted /FOREIGN.

Are you sure about this? Seems to work ok here on a 7.3-2 system...

A$ moun/fore $1$DKE300
A$ back/phys/igno=inte sys$sysdevice: $1$DKE300:
^T
XXXA::MCKINNEYJ 08:21:49 BACKUP CPU=00:00:03.06 PF=5197 IO=13918 MEM=960
Input: physical LBN 241727 (of 17773524)
Output: saveset volume 0, block 0 (33040 byte blocks)


Volker Halle
Honored Contributor

Re: Shadowing on a system disk.

Jim,

in your case, $1$DKE300: was not mounted and you therefore can mount it foreign. That works.

But TKelly wants to add a new member to the system disk shadowset. To do a BACKUP/PHYSICAL to that new member first, the system disk would have to be dismounted to allow a MOUNT/FOREIGN.

Or is there some misunderstanding?

Volker.
Jim_McKinney
Honored Contributor

Re: Shadowing on a system disk.

Only the target disk need be mounted foreign. The currently mounted system disk DSA device can be used as the source, and the foreign-mounted new member can be the target. Once the backup is complete, the target disk can be dismounted and then placed into the shadow set (after scrambling the SCB). The following works for me... VMS 7.3-2.

$ sh dev sys$sysdevice

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  ALPHA73       12188700   615   1
$1$DKA100:    (XXXX)    ShadowSetMember      0  (member of DSA0:)

$ mou/for $1$dka200:
$ back/phys/ign=inter dsa0: $1$dka200:
^T
XXXX::MCKINNEYJ 09:22:26 BACKUP CPU=00:00:07.93 PF=6103 IO=182807 MEM=1024
Input: physical LBN 23551 (of 17773524)
Output: saveset volume 0, block 0 (33040 byte blocks)
Volker Halle
Honored Contributor

Re: Shadowing on a system disk.

Jim,

sorry, you're right.

The manual says: For a BACKUP/PHYSICAL operation, the input disk needs to be mounted /FOREIGN or you must have LOG_IO or PHY_IO privilege.

I tried MOUNT/FOR on the already mounted input disk and this - of course - failed.

Volker.
TKelly_1
Occasional Advisor

Re: Shadowing on a system disk.

Hello all,

Thanks for the valuable input. I only have a limited window each day allocated to this so apologies for my delays replying.

OK, first a small clarification, as I can see that getting all the detail 100% correct is very important. The machine is a DS20E running OpenVMS V7.3-1. Apologies if that makes a material difference.

Re init/ERASE:
I ran an init/ERASE on DKA0 yesterday. It took around 30 minutes. No errors were reported and the error count on the disk did not increase. So I guess that's positive.

Re "Posting the Errors":

This morning I ran the following mount command (and received the OS responses that follow):


$ mount/system/confirm dsa0: /shadow=($1$dka0:) alphasys
%MOUNT-F-SHDWCOPYREQ, shadow copy required
Virtual Unit - _DSA0: Volume Label - ALPHASYS
Member Volume Label Owner UIC
_$1$DKA0: (T1) ALPHASYS1 [IT,TAYLODGE]
Allow FULL shadow copy on the above member(s)? [N]:y
%MOUNT-I-MOUNTED, ALPHASYS mounted on _DSA0:
%MOUNT-I-SHDWMEMCOPY, _$1$DKA0: (T1) added to the shadow set with a copy operatn
%MOUNT-I-ISAMBR, _$1$DKA100: (T1) is a member of the shadow set

...so that part looks good.

The copy operation starts, gets to about 3% and then the Error Count on the disk starts rising rapidly (in the first couple of minutes the Error Count on the disk jumps from 46407 to 55000). See below:

T1:SYS> sh dev dka0

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  ALPHASYS       5143917   729   1
$1$DKA0:      (T1)      ShadowCopying    55036  (copy trgt DSA0:   3% copied)
$1$DKA100:    (T1)      ShadowSetMember      0  (member of DSA0:)

The Error Count continues to rise...

I then did a "dia/cont" while the errors were being registered, and chose one of the many DKA0 errors I saw (they all appear to be more or less the same). Here it is:

**** V3.4 ********************* ENTRY 42 ********************************


Logging OS 1. OpenVMS
System Architecture 2. Alpha
OS version V7.3-1
Event sequence number 26472.
Timestamp of occurrence 13-NOV-2009 09:54:21
Time since reboot 24 Day(s) 19:20:08
Host name T1

System Model COMPAQ AlphaServer DS20E 666 MH

Entry Type 1. Device Error


---- Device Profile ----
Unit $1$DKA0
Product Name BD0096826B
Vendor COMPAQ

-- Driver Supplied Info -
Device Firmware Revision HPB4
VMS SCSI Error Type 5. Extended Sense Data from Device
SCSI ID x00
SCSI LUN x00
SCSI SUBLUN x00
Port Status x00000001 NORMAL - normal successful completion
SCSI Command Opcode x2A Write (10 byte command)
Command Data
x00
x00
x08
x3E
xE1
x00
x00
x7F
x00

SCSI Status x02 Check Condition
Remaining Byte Length 18.

--- Device Sense Data ---

Error Code x70 Current Error
Segment # x00
Information Byte 3 x00
Byte 2 x00
Byte 1 x00
Byte 0 x00
Sense Key x0B Aborted Command
Additional Sense Length x0A
CMD Specific Info Byte 3 x00
Byte 2 x00
Byte 1 x00
Byte 0 x00
ASC & ASCQ x4700 ASC = x0047
ASCQ = x0000
SCSI Parity Error
FRU Code x03
Sense Key Specific Byte 0 x00 Sense Key Data NOT Valid
Byte 1 x00
Byte 2 x00

----- Software Info -----
UCB$x_ERTCNT 16. Retries Remaining
UCB$x_ERTMAX 16. Retries Allowable
IRP$Q_IOSB x0000000000000000
UCB$x_STS x18020810 Online
Software Valid
Volume is Valid on the local node
Unit supports the Extended Function bit
IRP$L_PID x8288A910 Requestor "PID"
IRP$x_BOFF 0. Byte Page Offset
IRP$x_BCNT 65024. Transfer Size In Byte(s)
UCB$x_ERRCNT 63212. Errors This Unit
UCB$L_OPCNT 404302. QIO's This Unit
ORB$L_OWNER x00010004 Owners UIC
UCB$L_DEVCHAR1 x1C4D4008 Directory Structured
File Oriented
Sharable
Available
Mounted
Error Logging
Capable of Input
Capable of Output
Random Access


I can see various pieces of information that look like errors here, i.e.:
VMS SCSI Error Type 5
SCSI Parity Error
Sense Key Data NOT Valid
... but unfortunately those errors don't mean a lot to me at the moment.
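
For readers in the same position, the three items listed above can be read directly from the sense bytes in the log entry. The following is a minimal Python sketch, not a DIAGNOSE replacement: it assumes the standard fixed-format SCSI sense layout (error code x70), rebuilds the 18 sense bytes from the values shown in the entry, and uses deliberately partial lookup tables. The function name `decode_sense` and the table names are made up for illustration.

```python
# Fixed-format SCSI sense data (error code x70), rebuilt from the error log
# entry above. 18 bytes, matching "Remaining Byte Length 18." in the entry.
SENSE = bytes([
    0x70,                    # byte 0: error code (x70 = current error)
    0x00,                    # byte 1: segment number
    0x0B,                    # byte 2: sense key in the low nibble
    0x00, 0x00, 0x00, 0x00,  # bytes 3-6: information
    0x0A,                    # byte 7: additional sense length
    0x00, 0x00, 0x00, 0x00,  # bytes 8-11: command-specific information
    0x47,                    # byte 12: ASC
    0x00,                    # byte 13: ASCQ
    0x03,                    # byte 14: FRU code
    0x00, 0x00, 0x00,        # bytes 15-17: sense-key-specific bytes
])

# Partial tables only: just enough entries for this thread (illustrative).
SENSE_KEYS = {0x03: "Medium Error", 0x04: "Hardware Error", 0x0B: "Aborted Command"}
ASC_ASCQ = {(0x47, 0x00): "SCSI Parity Error"}


def decode_sense(sense: bytes) -> dict:
    """Decode the fields of fixed-format sense data that matter here."""
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sense[2] & 0x0F
    asc, ascq = sense[12], sense[13]
    return {
        "sense_key": SENSE_KEYS.get(key, f"key x{key:02X}"),
        "asc_ascq": ASC_ASCQ.get((asc, ascq), f"x{asc:02X}{ascq:02X}"),
        # Bit 7 of byte 15 (SKSV) says whether bytes 15-17 carry extra detail.
        "sks_valid": bool(sense[15] & 0x80),
        "fru": sense[14],
    }


print(decode_sense(SENSE))
```

Read this way, "VMS SCSI Error Type 5" only says that the device returned extended sense data, and "Sense Key Data NOT Valid" only says that the optional sense-key-specific bytes are unused. The informative part is the sense key (Aborted Command) together with the ASC/ASCQ pair x47/x00 (SCSI Parity Error).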

Thanks and regards,
Tom.
Volker Halle
Honored Contributor

Re: Shadowing on a system disk.

Tom,

the error is on the destination disk (DKA0), so that explains why your shadow copy gets stuck there! Looks like a SCSI parity error on a WRITE command.

So there is some hardware problem with this disk or the SCSI path to this disk.
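
The failing WRITE can be decoded from the log entry as a cross-check. This is a small Python sketch under two assumptions: that the nine "Command Data" bytes in the entry are bytes 1 to 9 of the WRITE(10) command block, in order, and that the disk uses 512-byte blocks. The helper name `decode_write10` is made up for illustration.

```python
# WRITE(10) command block rebuilt from the error log entry: opcode x2A
# followed by the nine "Command Data" bytes (assumed to be CDB bytes 1-9).
CDB = bytes([0x2A, 0x00, 0x00, 0x08, 0x3E, 0xE1, 0x00, 0x00, 0x7F, 0x00])

BLOCK_SIZE = 512  # assumption: 512-byte disk blocks


def decode_write10(cdb: bytes) -> tuple:
    """Return (starting LBA, block count) of a 10-byte SCSI WRITE command."""
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) command block")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


lba, blocks = decode_write10(CDB)
print(f"LBA {lba}, {blocks} blocks, {blocks * BLOCK_SIZE} bytes")
# prints: LBA 540385, 127 blocks, 65024 bytes
```

The 65024 bytes agree with IRP$x_BCNT in the same entry, which supports the byte-order assumption. A 127-block write is what a large sequential copy transfer would look like, and a parity error that aborts such a transfer happens on the bus between controller and drive, not on the disk surface.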

Volker.
TKelly_1
Occasional Advisor

Re: Shadowing on a system disk.

Gah - hardware! Folks, many thanks for taking the time to read through this. I've learned a bit (thanks Jim/Volker for the discussion on the 'mount' above).

I'll pass the errors etc. on to the lad that installed the disk in the first place.

Thanks,
Tom.
TKelly_1
Occasional Advisor

Re: Shadowing on a system disk.

APOLOGIES!

Shoot. We run a points system on a forum in work, and it's "points out of ten". Total. For a thread. I assumed that was the same here...

I allocated points above to 'divide up my ten available points'... only afterwards when I saw that I could still assign points to the remaining posts did I think I may have made a mistake.

One quick search later reveals the itrc points system:
o N/A: The answer was simply a point of clarification to my original question
o 1-3: The answer didn't really help answer my question, but thanks for your assistance!
o 4-7: The answer helped with a portion of my question, but I still need some additional help!
o 8-10: The answer has solved my problem completely! Now I'm a happy camper!

I'm sorry. I really appreciate you all taking the time to answer. Allocating 1 point here and 2 there above wasn't meant to indicate a lack of appreciation for the replies, I just misunderstood the points system.

Regards to all,
Tom.