StoreEver Tape Storage

Re: TLB Exception causes E1200 to reboot

 
David Ruska
Honored Contributor

Re: TLB Exception causes E1200 to reboot

Bernhard,

You mention that HP has not been able to resolve your issue. Have you provided trace logs to HP for analysis?

If you update to 5.6.69 and still see issues related to network connectivity, I'll be happy to have the engineering team review the router logs to see if we can determine the cause.
The journey IS the reward.
David Ruska
Honored Contributor

Re: TLB Exception causes E1200 to reboot

Ian,

I've confirmed that the TLB you had on 1/8/05 with 5.4.25 has been fixed in 5.6.06.

The second TLB (different issue) with 5.6.06 has not been reported before. If that problem reoccurs (with 5.6.69) we would be interested in trace logs so we can investigate further.
The journey IS the reward.
Ian Grobler
Frequent Advisor

Re: TLB Exception causes E1200 to reboot

David, Many thanks for all the effort on this so far. I have upgraded the router that was on 5.4.25 to 5.6.06 about 2 weeks ago and so far this has been stable.
I will upgrade the other router that is on 5.6.06 to the new 5.6.69 firmware as soon as possible and give it a test. After the two SCSI bus resets on the 5.6.06 router earlier in April, we removed one of the 'dodgy' tapes from the backup cycle and gave the drive a good cleaning, and it has been holding up since.
Will do the upgrade and we'll take it from there. Much appreciated!! Ian
David Ruska
Honored Contributor

Re: TLB Exception causes E1200 to reboot

Ian,

That's good to hear. Keep us posted if you have any issues with 5.6.69.

Here's the problem that was fixed in 5.6.06:

* Previously, if the buffered tape write option was used and the write command was aborted just before the drive responded with an error, then the router might reboot with a TLB message recorded in the router Event Log. This issue has been resolved.

A bad tape or other issue causing drive write problems needed to occur to expose the potential for that TLB.
The journey IS the reward.
Ian Grobler
Frequent Advisor

Re: TLB Exception causes E1200 to reboot

I upgraded the router from 5.6.06 to 5.6.69 this morning after another TLB exception last night.
I have also attached the router report file done just after the upgrade. I will monitor and let you know how it goes. Thanks for the feedback so far.
Marino Meloni_1
Honored Contributor

Re: TLB Exception causes E1200 to reboot

Hi Ian,
Have you run an "acceptance test" on your Ultrium drive? It reports sense code 04 44 00, which seems to be an internal error and may be the source of your trouble.

marino
Ian Grobler
Frequent Advisor

Re: TLB Exception causes E1200 to reboot

Using the new 3.5 SR2 LTT tools I ran an "LTO Drive Assessment Test" and it passed the drive without any errors. I also ran a "Connectivity test" and this passed without any errors as well.
I am quite happy that the drive is OK at this stage. The occasional errors occur during high backup I/O, and the fact that the E1200 reboots is the primary concern. I am hoping the new 5.6.69 firmware helps resolve this; even when a problem occurs writing to the tape, the router should be rock solid and not reboot. We will be re-using a suspect tape this evening just to give it a good go ;-)
David Ruska
Honored Contributor

Re: TLB Exception causes E1200 to reboot

Ian,

As Marino said, the 0x04/0x4400 sense data from the drive says "Internal Target Failure", which typically means the drive encountered some hardware error, or a hardware condition (e.g. interrupt) that it was not expecting. There were a few cases of that error being reported, with the most recent at 04/21/2005 00:24:55. That does not indicate a restart of the router around that time.
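For anyone reading along, the sense triplet above can be decoded with a small lookup. This is only a minimal sketch: the function name `decode_sense` and the tables are made up for illustration, and the tables hold just a few common sense keys plus the one ASC/ASCQ pair discussed here, not the full SCSI lists.

```python
# Minimal sketch: decode a "key asc ascq" sense triplet such as "04 44 00".
# The tables are deliberately tiny and illustrative, not complete.
SENSE_KEYS = {
    0x0: "No Sense",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
    0xB: "Aborted Command",
}

ASC_ASCQ = {
    (0x44, 0x00): "Internal Target Failure",
}


def decode_sense(triplet: str) -> str:
    """Turn a hex triplet like '04 44 00' into a readable description."""
    key, asc, ascq = (int(part, 16) for part in triplet.split())
    key_text = SENSE_KEYS.get(key, f"Unknown sense key 0x{key:02X}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ 0x{asc:02X}/0x{ascq:02X}")
    return f"{key_text}: {detail}"


print(decode_sense("04 44 00"))
```

Pairs missing from the table fall back to the raw hex values, so an unknown code is shown rather than guessed at.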

One possible cause of this drive error is multiple hosts attempting to communicate with the drive at the same time. This can happen if the drives are mapped to more than one host, and a host not performing the backup has an application or service running that polls the devices. There's an issue with Win2003 that causes Test Unit Ready commands to be sent down. Do you have any Win2003 systems on the SAN that see the library? Windows RSM should also be disabled for these devices (unless needed by the backup app on the backup server).

The report page you provided did show a unit restart on 04/20/2005 22:19:43, but there were no errors prior to that. Is that the reboot you are referencing?
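If it helps when going through future reports, checking whether any drive error falls shortly before a restart can be done with a few lines of script. This is a rough sketch only: it assumes the timestamps have already been pulled out of the report by hand in the MM/DD/YYYY HH:MM:SS form seen in this thread, and the function name and the 10-minute window are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta

# Timestamp layout as it appears in this thread (assumed to be MM/DD/YYYY).
FMT = "%m/%d/%Y %H:%M:%S"


def errors_near_restart(restarts, errors, window_minutes=10):
    """Return (error, restart) pairs where the error precedes the restart
    by no more than window_minutes."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for restart in restarts:
        restart_time = datetime.strptime(restart, FMT)
        for error in errors:
            error_time = datetime.strptime(error, FMT)
            # Only count errors logged at or before the restart, within the window.
            if timedelta(0) <= restart_time - error_time <= window:
                hits.append((error, restart))
    return hits


# The two timestamps mentioned in this thread: the drive error comes
# after the restart, so no pair is reported.
print(errors_near_restart(["04/20/2005 22:19:43"], ["04/21/2005 00:24:55"]))
```

An empty result for the two timestamps above matches what the report showed: no drive error was logged just before that restart.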

It's possible that the previous trace log would have captured some useful info on this, but the firmware update cleared it out.

Should you have another failure, capturing the full router report page and collecting an LTT support ticket from the drive as soon as possible would give us the best information to help identify the cause.
The journey IS the reward.
Ian Grobler
Frequent Advisor

Re: TLB Exception causes E1200 to reboot

There are three Win2003 hosts that have access to the library (as mapped on the router). All 3 hosts have the registry keys set to disable TUR and the RSM services are stopped and disabled (The backup application is Data Protector 5.1). One of the Win2003 hosts is a NAS4000 appliance which may/may not have some additional components on it causing this - it's configured "as is" as shipped by HP. Otherwise there should not be any additional access and the one other host (SMA) is zoned not to see the library.
Yes, the unit restart on 04/20/2005 22:19:43 is the latest reboot I was referencing.
There was no problem last night, and if another issue occurs I will capture the full router report page and collect an LTT support ticket as requested. Thanks for all the help on this so far.
Marino Meloni_1
Honored Contributor

Re: TLB Exception causes E1200 to reboot

Another component that usually polls the SAN and can disrupt backups is the Insight Agents. Version 7.20 (currently the latest) can stop polling tapes while all other agents stay active. (In previous versions you had to disable all the FC components or make specific registry changes.)
So if you have the agents installed, I suggest upgrading to the latest release and stopping the related component.