Trying to simulate a hardware error on OpenVMS server
04-12-2010 09:06 AM
We are using the HP Operations Manager for Windows VMS Smart Plug In (SPI) to monitor hardware on our VMS servers.
I'm trying to confirm the SPI works by increasing the error count on one of the devices - a tape drive.
Is anyone aware of a command I can run on a VMS server to increase the error count for a device?
I know "set device/reset=(error,operation)" can be used to clear the error count, so I'm hoping there's something similar to increase it so that I can simulate a hardware issue on our VMS server's tape drive.
Any assistance would be greatly appreciated.
Thanks in advance.
Tom Wolf
04-12-2010 09:22 AM
Re: Trying to simulate a hardware error on OpenVMS server
http://www.digiater.nl/lddriver.html#LD%20V9.4
It can now inject errors for the LDdriver, and I assume it can do so for the LMdriver.
hth,
Hein
04-12-2010 09:38 AM
Re: Trying to simulate a hardware error on OpenVMS server
You could use zdec:
http://www.decuslib.com/decus/vmslt02a/vu/zdec-src.txt
or likely better, clear_errors:
http://www.decuslib.com/decus/freewarev80/clear_errors
Or use SDA on the console to locate the error count for a device in virtual address space, then halt the box, "bomb core" (err, deposit), and continue from the SRM console.
Or use the SDA data to generate a targeted version of this brute-force tool:
http://labs.hoffmanlabs.com/node/815
Or briefly pull the Ethernet connection and plug it back in.
Or load an older magtape and perform a BACKUP.
All of these assume you have a testing server. While these are unlikely to crash the system, I would not suggest any of them on a production server.
Some of these (such as the halt-continue) are specific to (most) Alpha boxes and will not work on Integrity.
04-12-2010 09:42 AM
Re: Trying to simulate a hardware error on OpenVMS server
Do you know how this piece of software monitors 'hardware errors' on OpenVMS?
Just by looking at the device error counts? Or maybe by watching the ERRLOG.SYS file or declaring an error log mailbox?
There is no OpenVMS command to increase the error count on a device.
Using LD (or LM), you can induce QIO errors, but I doubt that you can increase the error count of LD (or LM) devices.
Would this software monitor the link state change of a LAN interface? Maybe briefly unplug one of the LAN cables?
Volker.
04-12-2010 10:17 AM
Re: Trying to simulate a hardware error on OpenVMS server
I'd hope that this SPI widget tapped into the OpenVMS error reporting mechanisms and the system service API for that, but I'd tend to assume not. That the tool polls the error count displays is more likely.
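How the SPI actually collects its data is not stated anywhere in this thread, so the following is only a minimal sketch of the "poll the error count display" idea, written in Python for illustration. It parses SHOW DEVICE-style text and reports devices whose error count went up between two snapshots. The function names `parse_error_counts` and `increased_error_counts` are hypothetical, and the sample text is copied from the SHOW DEVICE output quoted later in this thread. A real monitor running on OpenVMS would more plausibly read the count directly (for example through $GETDVI with the DVI$_ERRCNT item) than scrape display text.

```python
import re

def parse_error_counts(show_device_output):
    """Map device name -> error count from SHOW DEVICE-style text (hypothetical helper)."""
    counts = {}
    for line in show_device_output.splitlines():
        # A device line starts with a name ending in ':' plus an optional (NODE).
        match = re.match(r"\s*(\S+:)\s+(?:\(\S+\)\s+)?(.*)$", line)
        if not match:
            continue  # header lines, prompts and blank lines are skipped
        device, rest = match.groups()
        # Assumption: the first all-digit token after the status words is the error count.
        for token in rest.split():
            if token.isdigit():
                counts[device] = int(token)
                break
    return counts

def increased_error_counts(before, after):
    """Return {device: (old, new)} for every device whose count rose."""
    return {
        dev: (before.get(dev, 0), new)
        for dev, new in after.items()
        if new > before.get(dev, 0)
    }

BEFORE = """\
Device Device Error Volume Free Trans Mnt
Name Status Count Label Blocks Count Cnt
$12$DKA0: (THEALP) Mounted 0 THEALP_V83 1773616 368 1
$12$DKA1: (THEALP) Online 0
$12$DKA400: (THEALP) Online wrtlck 0
"""

AFTER = BEFORE.replace("Mounted 0 ", "Mounted 16 ")

changes = increased_error_counts(parse_error_counts(BEFORE), parse_error_counts(AFTER))
for device, (old, new) in changes.items():
    print(device, "error count rose from", old, "to", new)
```

If the tool works this way, any method that raises the count (including the direct memory patch shown later in the thread) should trigger it. If it reads the error log instead, raising the count alone would not.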
04-12-2010 09:23 PM
Solution
LM does not allow one to induce an error (yet).
If you want to set an arbitrary count you can do this:
thealp> sh dev dk
Device                  Device           Error    Volume          Free  Trans Mnt
 Name                   Status           Count     Label        Blocks  Count Cnt
$12$DKA0:     (THEALP)  Mounted              0  THEALP_V83     1773616    368   1
$12$DKA1:     (THEALP)  Online               0
$12$DKA3:     (THEALP)  Online               0
$12$DKA400:   (THEALP)  Online wrtlck        0
thealp> ana/sys
OpenVMS system analyzer
SDA> sh dev dka0
$12$DKA0 [THEALP$DKA0] RZ28 UCB: 81C7A780
Device status: 18021810 online,valid,unload,lcl_valid,exfunc_supp,fast_path
Characteristics: 1C4D4008 dir,fod,shr,avl,mnt,elg,idv,odv,rnd
01010201 clu,nnm,nlt,scsi
SUD Status 00000001 path_available
DK Flags 1430401A first_attn_seen,disconnect,synchronous,hbs_check,port_cmdq,cmdq,port_autosense,clusq
DK Flags 2 00000030 sectors_via_ms,trk_cyl_via_ms
Owner UIC [000001,000004] Operation count 6010 ORB address 81C7ACC0
PID 00000000 Error count 0 DDB address 81C7A580
Alloc. lock ID 0100007D Reference count 129 DDT address 81992DA0
Alloc. class 12 Online count 1 SUD address 81C7ABC0
Class/Type 01/80 Retry cnt/max 16/16 VCB address 81CB9740
Def. buf. size 512 BOFF 00000A00 CRB address 81C7A600
DEVDEPEND 0A231063 Byte count 00000200 I/O wait queue 81C7A838
DEVDEPND2 00000000 SVAPTE FFDFC148
DEVDEPND3 01000001 DEVSTS 00000004
FLCK index 3A
DLCK address 81C7A680
Preferred CPUDB 81C6B680
Preferred CPUID 001
-- Device Path Information --
UCB: 81C7A780 Path: PKA0.0
*** PORT I/O queue is empty ***
*** DEVICE I/O queue is empty ***
*** I/O request queue is empty ***
Press RETURN for more.
SDA> ev ucb+ucb$l_errcnt
Hex = FFFFFFFF.81C7A898 Decimal = -2117621608 UCB+00118
SDA> Exit
thealp> r sys$share:delta
OpenVMS Alpha DELTA Debugger
Exit 00000001
80088F18! LDQ R28,#X0008(SP) 1;m
00000001
10001:FFFFFFFF81C7A898/00000000 10
exit
thealp> sh dev dk
Device                  Device           Error    Volume          Free  Trans Mnt
 Name                   Status           Count     Label        Blocks  Count Cnt
$12$DKA0:     (THEALP)  Mounted             16  THEALP_V83     1773616    368   1
$12$DKA1:     (THEALP)  Online               0
$12$DKA3:     (THEALP)  Online               0
$12$DKA400:   (THEALP)  Online wrtlck        0
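So, at the DELTA prompt enter this:
1;m
10001:ffffffff81c7a898/
then enter the new value followed by a return. DELTA reads the value as hexadecimal, which is why the 10 deposited above shows up as an error count of 16.
And don't try it on a production system unless you really know what you are doing.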
Jur.
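The address arithmetic in the session above can be checked with a short Python sketch. It only reproduces the numbers already shown by SDA and DELTA in this post. The constant and function names are illustrative, and the offset 0x118 is simply what SDA reported on this particular system, so read it from your own `ev ucb+ucb$l_errcnt` output rather than assuming it.

```python
# Values copied from the SDA output in this post (specific to this system).
UCB_ADDRESS = 0x81C7A780      # "UCB: 81C7A780" from SDA> sh dev dka0
UCB_L_ERRCNT_OFFSET = 0x118   # "UCB+00118" from SDA> ev ucb+ucb$l_errcnt

def sign_extend_32(value):
    """Return (64-bit unsigned form, signed decimal) of a 32-bit value."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        # High bit set: the upper 32 bits are all ones after sign extension.
        return value | 0xFFFFFFFF00000000, value - (1 << 32)
    return value, value

unsigned64, signed = sign_extend_32(UCB_ADDRESS + UCB_L_ERRCNT_OFFSET)

# Format the way SDA prints it: two 32-bit halves separated by a dot.
sda_style = "{:08X}.{:08X}".format(unsigned64 >> 32, unsigned64 & 0xFFFFFFFF)
print("Hex =", sda_style, "Decimal =", signed)

# The value typed at the DELTA prompt is hexadecimal.
deposited = int("10", 16)
print("Error count shown by SHOW DEVICE:", deposited)
```

This matches the `Hex = FFFFFFFF.81C7A898 Decimal = -2117621608` line from SDA and the error count of 16 in the second SHOW DEVICE display.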
04-13-2010 09:22 AM
Re: Trying to simulate a hardware error on OpenVMS server
Volker Halle wrote:
> Or maybe by watching the ERRLOG.SYS file or declaring an error log mailbox?
If it watches ERRLOG.SYS, as does WEBES (SEA) via ELMC, install ELMC and use the ELMC test. Both can be downloaded from http://www.compaq.com/support/svctools/webes/webesdownloads.html
HTH,
Michelle
04-14-2010 04:58 AM
Re: Trying to simulate a hardware error on OpenVMS server
I can confirm that the VMS SPI works very well (in our environment: VMS v8.3 on ES47s) and has notified us of hardware events such as a network switch rebooting (the Ethernet device error count went up). If you use Volume Shadowing (and the devices are set up in the config file), it "notices" when shadow membership is reduced. I use it to tell that the daily backups are proceeding, because the notifications of missing shadow disks get automatically acknowledged and are removed from the display.
Cheers,
Art