Operating System - Tru64 Unix

Panic during backup

 
Hazem Tolba_1
Advisor

Hi All,

Our environment consists of Alpha GS140 and ES40 servers running Tru64 Unix V5.1A PK6, with an EVA5000 on a SAN. Backups are done with Data Protector 5.1 to a SAN-connected MSL6060 library.
Recently, several different servers panicked during the scheduled nightly backups.
The following message was found in the messages file and the binary.errlog file:
"panic (cpu 4): lw_remove: light weight wiring(s) found"
A lot of CAM-SCSI errors were also logged for the tape just before the crash.
One of the servers was upgraded to V5.1B-3, and it hit the same problem but with a different message:
"panic (cpu 3): u_map_deallocate: u_map_delete failed while deallocating map"

We tried moving the library between different FC switches ("edge switch 2/24") and swapping SFPs, but nothing changed.

Both of the crashed servers run Oracle DB.

Could anyone who has already faced a similar problem help us?
Thanks.
4 REPLIES
Michael Schulte zur Sur
Honored Contributor

Re: Panic during backup

Hi,

you may want to set new_wire_method to zero in /etc/sysconfigtab on 5.1A.
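
To audit several hosts for this setting, a small script can read the stanza file and report the value. The following is only a sketch under assumptions: the thread does not say which subsystem owns new_wire_method (the vm stanza is assumed here), the stanza layout (a "name:" header followed by indented "attribute = value" lines) and the "#" comment handling are also assumed, and read_attribute is a hypothetical helper name.

```python
def read_attribute(text, subsystem, attribute):
    """Return the value of `attribute` in stanza `subsystem`, or None.

    Assumed layout: a stanza starts with an unindented "name:" line,
    followed by indented "attribute = value" lines.
    """
    wanted = attribute.replace("-", "_")  # accept new-wire-method as well
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # assumption: '#' starts a comment
        if not line.strip():
            continue
        if not line[0].isspace() and line.endswith(":"):
            current = line[:-1].strip()  # entering a new stanza
            continue
        if current == subsystem and "=" in line:
            name, value = line.split("=", 1)
            if name.strip().replace("-", "_") == wanted:
                return value.strip()
    return None


if __name__ == "__main__":
    # Path taken from the post; the "vm" subsystem is an assumption.
    with open("/etc/sysconfigtab") as fh:
        print(read_attribute(fh.read(), "vm", "new_wire_method"))
```

A result of None means the attribute is not listed in that stanza, so the running kernel is using whatever its default is. The file only shows the boot-time setting; the value currently in effect should be confirmed on the system itself.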

Michael
Erich Wimmer
Valued Contributor

Re: Panic during backup

Hi,
Although it would be helpful to know whether somebody else has faced the same problem, you should also report it to your local HP service representative (hotline). Because Tru64 V5.1B-3 already has the latest patch kit, the crash data and log files have to be examined by an expert.
The CAM errors may be the cause of the kernel malfunction and the panic, so you could try to avoid them by changing the tape library hardware, tape media etc., but in my opinion this is only a workaround.

Regards, Erich
Hazem Tolba_1
Advisor

Re: Panic during backup

Hi,

I have checked new_wire_method on both systems. It is already set to 0 on V5.1B-3, which is the one that matters now, as the other systems will be upgraded to that version by next week.
So the issue now is mainly with the ES40 running V5.1B-3, as mentioned. The panic message is different, as I described.
Again, the ES40 hit the same problem on both versions, but with different panic messages. Any more help?
Regarding HP, there is no hotline for Alpha servers as far as I know.
Erich Wimmer
Valued Contributor

Re: Panic during backup

Hi,
a colleague of mine has given me the following hint:
We had a similar case at another customer site that might match your problem:
After a hardware problem with the media changer device, the "Use direct library access" flag set for the tape drives in Data Protector caused the Data Protector Media Agents to try to access the jukebox directly. This is done using a program called "devbra", and that caused the crash. Of course Tru64 Unix should not crash in this case, but we found a quick workaround: disable "Use direct library access" in Data Protector for the tape drives (go to the tape drive's properties/settings/advanced and uncheck the box).

You will recognize this as the problem if the current process in /var/adm/crash/crash-data.x is "devbra", or if there are devbra processes running.
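
If there are several crash dumps to go through, the check can be scripted. This is a rough sketch, assuming the crash-data.x files are plain text and that a simple substring match on "devbra" is good enough to flag a dump for closer reading; the function names and the glob pattern are illustrative, not part of any Tru64 or Data Protector tool.

```python
import glob


def lines_mentioning(text, needle="devbra"):
    """Return the stripped lines of `text` that contain `needle`."""
    return [ln.strip() for ln in text.splitlines() if needle in ln]


def scan_crash_data(pattern="/var/adm/crash/crash-data.*"):
    """Map each matching crash-data file to its lines that mention devbra."""
    hits = {}
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes
        with open(path, errors="replace") as fh:
            found = lines_mentioning(fh.read())
        if found:
            hits[path] = found
    return hits


if __name__ == "__main__":
    for path, lines in scan_crash_data().items():
        print(path)
        for ln in lines:
            print("   ", ln)
```

A match only shows that devbra appears somewhere in the file. Whether it was the current process at the time of the panic still has to be read from the surrounding lines.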

If you unset "use direct library access" for a drive, the systems backing up to that drive no longer try to fail over access to the jukebox mechanism to the local host. This might be a disadvantage if the host defined as owner of the library does not have access to the jukebox mechanism. It does not change the fact that the media agents write directly to the tape; only the jukebox manipulation (loading and unloading tapes) is then done only by one

To get a patch that prevents this crash, you have to contact an official HP service representative. If you are interested, I can make the connection, but I need at least some contract data, name, address and so on. To keep anonymity, you can send this data to supportteam.austria@hp.com for this problem; they will refer you to the responsible service center.

Hope this helps, Erich.