Operating System - OpenVMS
Trouble uninstalling WebSphere MQ 5.3 on OpenVMS

 
SOLVED
lalo57
Advisor

I want to know what other steps I should take to uninstall the WebSphere MQ client 5.3 from my OpenVMS system manually. I've tried the PCSI "PRODUCT REMOVE" command, but when I used it, the system hung completely.

This server is clustered with another Alpha and shares a disk with it. When I ran "PRODUCT REMOVE", both servers hung instantly.

Our OS is OpenVMS 7.3-2. My idea is to uninstall the current MQ client (v5.3) and install MQ Client for OpenVMS v5.1 instead. Is that possible?

Thanks for your ideas,
Eduardo
Paul Gotkin
Occasional Visitor
Solution

Re: troubles for uninstall webspheremq 5.3 on open vms

Hi Eduardo, I don't know much about MQ deinstalls, but the fact that your other node hangs indicates an issue with the lock manager: probably a lock on a common SYSUAF or RIGHTSLIST, a root lock on a shared disk, or, if the other node is also running MQ, a shared MQS lock resource block. I suspect a hung UAF/RIGHTSLIST or a shared system disk root lock is your problem, as these are the most common causes of remote node hangs.

When your PRODUCT REMOVE hangs, do you know what it is running (probably an image)? I suspect one of the deinstallation images is locking your UAF/RIGHTSLIST or acquiring the volume allocation lock and is hanging. I would guess that you are booting both nodes from a common system disk where MQS is installed?

Can you Control/Y out of the PRODUCT REMOVE session (make sure you have all privs and set control=y)? If control/y, control/c and control/t all fail, you are stuck in the lock manager above supervisor mode, probably either EXEC or KRNL mode leading to your cluster hang. It would be most useful to know what image you are running at the PRODUCT REMOVE that leads to the hang.
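As a rough sketch of that setup, the session below enables privileges and Control/Y and Control/T before starting the removal. The product name is a hypothetical placeholder; use whatever PRODUCT SHOW PRODUCT reports on your system.

```dcl
$ ! Enable all privileges the account is authorized for
$ SET PROCESS/PRIVILEGES=ALL
$ ! Make sure Ctrl/Y interrupts and Ctrl/T status lines are enabled
$ SET CONTROL=(Y,T)
$ ! List installed products to find the exact name PCSI uses
$ PRODUCT SHOW PRODUCT
$ ! <product-name> is a placeholder, not the real MQ kit name
$ PRODUCT REMOVE <product-name>
$ ! While it runs, press Ctrl/T: the status line includes the current image name
```

If Ctrl/T still answers after the hang starts, the image name on its status line is the one to report back.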

If you are stuck on a shared uaf or rightslist lock, you will be able to execute various dcl commands up to the point that you need uaf/rightslist access. So commands like show system will hang (need authorization info). Also, any command that creates a subprocess will hang since process creation needs the uaf/rightslist.

Cluster hangs are always caused by the distributed lock manager (barring usual suspects such as quorum loss, hardware, etc). These hangs track back to shared authorization file locks or a common device lock (such as a disk/io issue or locked volume). If MQS is on a shared system disk and you lock the root volume, all nodes booted from that system disk will hang waiting for the root device lock to DEQ below PW or some other non-shared lock manager state. You are either locking a shared volume at low-level, or the issue is a UAF/RIGHTSLIST lock.

If this is an authorization issue, one way to avoid the 2nd node hang is to use a local SYSUAF & RIGHTSLIST. If the issue is a lock on a shared UAF file, this should allow the 2nd node to continue running when the 1st node deinstalling MQ hangs. You may then be able to use clusterwide system services to see what the hung process is doing on the remote node via $GETJPI. If you are using a shared system disk and the MQ deinstall is locking the boot disk, you are stuck unless you boot the 2nd node from another system disk.
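To check whether the two nodes really share the authorization files, and to peek at the hung process from the other node, something like the following might help. It assumes the standard SYSUAF and RIGHTSLIST logical names, and the PID shown is a made-up example; F$GETJPI is the DCL lexical front end to $GETJPI.

```dcl
$ ! Where does this node find its authorization files?
$ ! If these logicals are undefined, the defaults are
$ ! SYS$SYSTEM:SYSUAF.DAT and SYS$SYSTEM:RIGHTSLIST.DAT
$ SHOW LOGICAL SYSUAF
$ SHOW LOGICAL RIGHTSLIST
$ ! Details of the system disk, to compare between the two nodes
$ SHOW DEVICE/FULL SYS$SYSDEVICE
$ ! From the node that is still running: query the hung process
$ ! 20200123 is a hypothetical PID; substitute the real one
$ WRITE SYS$OUTPUT F$GETJPI("20200123","IMAGNAME")
$ WRITE SYS$OUTPUT F$GETJPI("20200123","STATE")
```

Run the SHOW commands on both nodes and compare; if both resolve to the same files on the same disk, the UAF/RIGHTSLIST is shared.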

The key now is to find out what in the PRODUCT REMOVE is hanging that node. This will point you to the root cause of the cluster hang. If you know how to use SDA, get into SDA from another privileged process on the local PRODUCT REMOVE node before attempting the PCSI remove, and use SDA to see what files are open in the process removing MQS. If you are lucky and your SDA session does not get hung, set context to the PRODUCT REMOVE process and issue the command SHOW PROCESS/LOCK, which will definitively show you whether you are waiting for a UAF or a device lock. You can also do this from the remote node if you can find any hung process on the remote node, so I would advise running SDA from both nodes before attempting the PRODUCT REMOVE again.
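A minimal SDA session along those lines could look like this. Start it before the PRODUCT REMOVE, since image activation may hang afterwards; the PID is a placeholder to be read from the SHOW SUMMARY output.

```dcl
$ ! Start SDA on the running system (needs CMKRNL privilege)
$ ANALYZE/SYSTEM
SDA> ! List all processes to find the one running PRODUCT REMOVE
SDA> SHOW SUMMARY
SDA> ! Set context to that process; <pid> is a placeholder
SDA> SET PROCESS/INDEX=<pid>
SDA> ! Channels show which files and devices the process has open
SDA> SHOW PROCESS/CHANNEL
SDA> ! Locks owned or waited for by the process
SDA> SHOW PROCESS/LOCKS
SDA> EXIT
```

In the lock display, look for a lock that is waiting rather than granted; the resource name should hint at whether it is a file or a volume lock.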

If you are still stuck, let me know, I may have an image or two that will get you closer to the solution. Also, don't forget AMDS if you are running it (if I remember the name correctly). AMDS may allow your remote node to get unhung if used correctly.

Happy hunting, Paul G
lalo57
Advisor

Re: troubles for uninstall webspheremq 5.3 on open vms

thanks