StoreEver Tape Storage

Re: Driver issue with Ultrium 448 SAS

 
SOLVED
Curtis Ballard
Honored Contributor

Re: Driver issue with Ultrium 448 SAS

You might have picked up another patch with a nasty change. See http://support.microsoft.com/kb/932755 for details and look for "BusyRetryCount". Microsoft added a new registry setting for how systems should behave when a device is busy processing another command and that new setting causes problems. The default value is 20 but 75 is a better setting.
Hernan Ruiz
Advisor

Re: Driver issue with Ultrium 448 SAS

I haven't got the "storport" folder in that location in the registry, not even after installing the hotfix. Do I have to create the folder and the three registry key values mentioned in it?

Thanks a lot!
Lenny

------------------------------
You can configure the following registry key values:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Enum\SCSI\\\DeviceParameters\Storport\
Value - BusyRetryCount
Type - DWORD
Data - 20 Decimal (default)
Range - number of retries

Value - BusyPauseTime
Type - DWORD
Data - 250 Decimal (default)
Range - number of milliseconds

Value - QueueFullWaitIoPercentage
Type - DWORD
Data - 25 Decimal (default)
Range - 1 to 100 percentage of time
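
For anyone who does have the Storport key, the three values quoted above could be written out as a .reg fragment. What follows is only a minimal sketch in Python. The function name `storport_reg_fragment` is made up for illustration, and `key_path` has to be supplied by the caller because the device-specific part of the path is missing from the quote above. The 75 for BusyRetryCount is the value suggested earlier in this thread; the other two defaults are the ones from the quoted text.

```python
def storport_reg_fragment(key_path, busy_retry_count=75,
                          busy_pause_time=250,
                          queue_full_wait_io_percentage=25):
    """Render a .reg fragment for the three Storport DWORD values.

    key_path is the full registry key for the device's Storport parameters.
    It is NOT guessed here (hypothetical input): the device-specific part
    differs on every system.
    """
    if not 1 <= queue_full_wait_io_percentage <= 100:
        raise ValueError("QueueFullWaitIoPercentage must be 1 to 100")
    values = [
        ("BusyRetryCount", busy_retry_count),
        ("BusyPauseTime", busy_pause_time),
        ("QueueFullWaitIoPercentage", queue_full_wait_io_percentage),
    ]
    lines = ["Windows Registry Editor Version 5.00", "", "[%s]" % key_path]
    for name, value in values:
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("%s does not fit in a DWORD" % name)
        # .reg files store DWORDs as 8 hex digits: 75 decimal -> 0000004b
        lines.append('"%s"=dword:%08x' % (name, value))
    return "\r\n".join(lines) + "\r\n"
```

The function only builds text; it does not touch the registry. Review the output and back up the key before importing anything.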
Curtis Ballard
Honored Contributor

Re: Driver issue with Ultrium 448 SAS

If you can't find a Storport registry entry then you probably aren't running the Storport driver and shouldn't be affected by the problem with the BusyRetryCount.
Hernan Ruiz
Advisor

Re: Driver issue with Ultrium 448 SAS

Ok, so what else can I try?
Thanks.
Hernan Ruiz
Advisor

Re: Driver issue with Ultrium 448 SAS

I still get the same error. Is there anything else I can do? When I first installed the server it worked perfectly for six months. Formatting the server is not an option. What can I do? Help!!!!!!
Thanks.
Curtis Ballard
Honored Contributor

Re: Driver issue with Ultrium 448 SAS

The error you posted decodes to SAS I/O aborted (host issued ABORT TASK task management function) after the command was sent to the target.

That means that a command timed out and the host sent an abort to clear the command. What command timed out is the question.

Most of the time when I have seen this error it is because two different software applications tried to use the tape drive at the same time. One application is reading or writing, or running some other command that takes a little time to finish, and another application sends a command, like a log sense, that it expects to come back quickly. The log sense or other command is sent with a short timeout and times out before it ever gets processed.

Do you have anything else running on the system that might be talking to the tape drive? Is RSM disabled? Is plug-and-play polling for the tape drive disabled? If you have any storage management software, even the HP storage agent, try shutting it down and see if that helps.

If you can't track it down that way then we'll probably have to get a SCSI bus activity trace. There are some software tools that can do that.
Hernan Ruiz
Advisor

Re: Driver issue with Ultrium 448 SAS

First of all, thanks for your reply.
I've been trying to identify whether there's some software trying to access the tape drive while I'm running the backup.
I can't disable RSM because I use it to run the backup. This has never been a problem before.
The problem is that I don't know how to be certain that no application is interfering.
I've been looking for a program that can get a SCSI bus activity trace, as you suggested, but I couldn't find any. I've downloaded the software for the HBA that is available at the same URL where I can also get the drivers for the HBA, but apparently it doesn't do that.
Do you know the name of any particular software that would provide me with the trace?
Thank you.
Hernan Ruiz
Advisor

Re: Driver issue with Ultrium 448 SAS

I changed the server to a DL320 G5 and moved the HBA, so it's the same Ultrium and the same HBA. To set everything up, I installed 4 executable files downloaded from the HP website:
-the latest driver for the HBA
-the latest driver for the tape drive
-the newest firmware upgrade for the HBA
-the newest firmware upgrade for the tape drive.

I get exactly the same error.

Event Type: Error
Event Source: Lsi_sas
Event Category: None
Event ID: 11
Date: 12/28/2008
Time: 2:28:59 PM
User: N/A
Computer: OPERA2
Description:
The driver detected a controller error on \Device\RaidPort0.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0010000f 00680001 00000000 c004000b
0010: 31140000 00000000 00000000 00000000
0020: 00000000 00000000 00000000 00000000
0030: 00000000 c004000b

and immediately after,

Event Type: Error
Event Source: PlugPlayManager
Event Category: None
Event ID: 12
Date: 12/28/2008
Time: 2:28:59 PM
User: N/A
Computer: OPERA2
Description:
The device 'Hewlett Packard LTO Ultrium-2 drive' (SCSI\Sequential&Ven_HP&Prod_Ultrium_2-SCSI&Rev_T61D\5&610f00f&0&000500) disappeared from the system without first being prepared for removal.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00000000
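
To make dumps like the Lsi_sas one above easier to compare between events, the Data words can be pulled apart with a few lines of Python. This is only a sketch: the function names are made up, and the field layout in `split_loginfo` (bus type, originator, code, sub-code) is an assumption based on how LSI MPT Fusion "loginfo" values are usually laid out, not something stated in this thread. That the first word on the 0010 line is the loginfo value is likewise an assumption.

```python
def parse_event_data(dump):
    """Turn 'OFFSET: word word ...' lines into a flat list of 32-bit integers."""
    words = []
    for line in dump.strip().splitlines():
        _offset, _, rest = line.partition(":")
        words.extend(int(w, 16) for w in rest.split())
    return words


def split_loginfo(value):
    """Split a 32-bit value into 4/4/8/16-bit fields.

    ASSUMPTION: this bit layout is the usual one for LSI MPT Fusion loginfo
    values; it is not confirmed anywhere in this thread.
    """
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": (value >> 24) & 0xF,
        "code": (value >> 16) & 0xFF,
        "subcode": value & 0xFFFF,
    }


# Data section of the Lsi_sas Event ID 11 posted above
lsi_event_11 = """
0000: 0010000f 00680001 00000000 c004000b
0010: 31140000 00000000 00000000 00000000
0020: 00000000 00000000 00000000 00000000
0030: 00000000 c004000b
"""

words = parse_event_data(lsi_event_11)
loginfo = words[4]  # first word on the 0010 line (assumed to be the loginfo)
print(hex(loginfo), split_loginfo(loginfo))
```

The sketch does not attempt to translate the code into a message; that needs the vendor's own tables.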

I wasn't able to install the firmware for the HBA again. On the old server the problem was the operating system (64-bit). This time I'm running Windows Server 2003 32-bit, but the application isn't correctly programmed, because it says there's no LSI/SAS device when it is very clear to me that there is. HP should improve the firmware upgrade for the HBA available on their website at
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=es&cc=ar&prodNameId=3263294&prodTypeId=329290&prodSeriesId=3263289&swLang=8&taskId=135&swEnvOID=1005
because it doesn't work.

The idea of a different application trying to access or communicate with the tape drive while I'm doing the backup seems unlikely now. There were no tape-related programs previously installed on the new server.

But why do I get the same error on two different servers (same hba and same tape drive)?
I've thought of three plausible explanations:

FIRST HYPOTHESIS
----------------------------
In both cases I installed the same corrupt driver/firmware upgrade.
***Argument for: I did try to install the firmware upgrade for both the tape drive and the HBA before even trying to run the backup once. It wasn't very clever of me to do this. I can still try with a brand new server, another DL320. Maybe this program spoils the registry by creating a ruddy key value and I just can't fix it.
***Argument against: On the first server, the error started before I installed these particular drivers or firmware upgrades.
That said, the problem originally started with a Windows update, and that update might have contained one of the executables I've also run on the second server.
Besides, I remember the error wasn't exactly the same in the beginning. I always got event ID 11 (the controller error in the HBA), but I also used to get event ID 129 at the same time:

Event Type: Warning
Event Source: Lsi_sas
Event Category: None
Event ID: 129
Date: 12/7/2008
Time: 4:55:01 AM
User: N/A
Computer: OPERA-WEB
Description:
Reset to device, \Device\RaidPort2, was issued.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0010000f 00680001 00000000 80040081
0010: 00000004 00000000 00000000 00000000
0020: 00000000 00000000 00000000 00000000
0030: 00000500 80040081

SECOND HYPOTHESIS
---------------------------------
Though I think it's the least likely of the possible explanations, I should consider the remote possibility of a hardware issue in the HBA or (even more unlikely) in the tape drive or (not so unlikely) in the SAS cable. As a matter of fact, the log does say the tape drive unexpectedly disappeared from the system without first being prepared for removal, although this error comes right after the one reporting a controller error on the port with source Lsi_sas.
***Argument for: It is quite clear that first there is a problem with the HBA, and then the tape drive disappears - because of the problem with the SAS card the server doesn't see the tape drive anymore. I think it is possible that a hardware defect in the SAS card is causing the problem.
***Argument against: The failure seems too regular to be a hardware issue. It takes between 5 and 30 minutes to fail. If there were a hardware problem, I should have managed to complete a few backups once in a while.
Of course, I haven't got an extra SAS card, tape drive or SAS cable to check whether I get the same errors (I would've already tried this if I had). So if I come to the conclusion that there's something wrong with any of these, I should be quite certain which piece of hardware is not working properly before I buy a new one, as they are quite expensive.

THIRD HYPOTHESIS
-----------------------------
Some sort of witchcraft.
***Argument for: Everything seems to go wrong for me lately.
***Argument against: The "argument for" is not really an argument.


The idea of an application sending a request to the tape drive, and this request timing out because the tape drive is busy, seems reasonable only if the interfering application is one of the four files I've installed. Is it possible that a firmware upgrade might do this?
I'll move the SAS card again to a different server, install just the drivers that came on the CD (which worked fine in the beginning), and not do any firmware upgrade.
I'll post the results.

If you can think of any other hypotheses, or if you have any suggestions, I'd be very grateful to hear them.
THANKS A LOT FOR YOUR HELP!!!
Hernan Ruiz
Advisor

Re: Driver issue with Ultrium 448 SAS

I moved the HBA to a different server again (another DL320).
I installed the drivers for the HBA (version 1.28.2.1) and for the tape drive (version 1.0.4.0).
Exactly the same problem.

Event Type: Error
Event Source: Lsi_sas
Event Category: None
Event ID: 11
Date: 1/1/2009
Time: 10:51:35 PM
User: N/A
Computer: MFFSERVER
Description:
The driver detected a controller error on \Device\RaidPort0.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0010000f 00680001 00000000 c004000b
0010: 31140000 00000000 00000000 00000000
0020: 00000000 00000000 00000000 00000000
0030: 00000000 c004000b


Event Type: Error
Event Source: PlugPlayManager
Event Category: None
Event ID: 12
Date: 1/1/2009
Time: 10:51:35 PM
User: N/A
Computer: MFFSERVER
Description:
The device 'Hewlett Packard LTO Ultrium-2 drive' (SCSI\Sequential&Ven_HP&Prod_Ultrium_2-SCSI&Rev_T61D\5&610f00f&0&000500) disappeared from the system without first being prepared for removal.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00000000

I ran the backup again and got the 4 errors below. This has happened before: on the previous servers too, if I tried 3 times in a row, I didn't get exactly the same errors in the Event Viewer each time.

Event Type: Error
Event Source: Lsi_sas
Event Category: None
Event ID: 11
Date: 1/1/2009
Time: 11:22:23 PM
User: N/A
Computer: MFFSERVER
Description:
The driver detected a controller error on \Device\RaidPort0.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0010000f 00680001 00000000 c004000b
0010: 31140000 00000000 00000000 00000000
0020: 00000000 00000000 00000000 00000000
0030: 00000000 c004000b

Event Type: Error
Event Source: hplto
Event Category: None
Event ID: 15
Date: 1/1/2009
Time: 11:22:23 PM
User: N/A
Computer: MFFSERVER
Description:
The device, \Device\Tape0, is not ready for access yet.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0068000f 00bc0001 00000000 c004000f
0010: 00000104 c000009d 00000000 00000000
0020: 00000000 00000000 00064181 00000000
0030: ffffffff 00000000 0a000040 00050000
0040: 120620fe 00000108 00000000 00000960
0050: 00000000 88599cb0 00000000 88621e48
0060: 00000000 00000000 00000001 00000000
0070: 00000000 00000000 00000000 00000000
0080: 00000000 00000000 00000000 00000000

Event Type: Error
Event Source: hplto
Event Category: None
Event ID: 15
Date: 1/1/2009
Time: 11:22:23 PM
User: N/A
Computer: MFFSERVER
Description:
The device, \Device\Tape0, is not ready for access yet.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0068000f 00bc0001 00000000 c004000f
0010: 00000104 c000009d 00000000 00000000
0020: 00000000 00000000 00064181 00000000
0030: ffffffff 00000000 0a000040 00050000
0040: 120620fe 00000108 00000000 00000960
0050: 00000000 885e6e48 00000000 88621e48
0060: 00000000 00000000 00000001 00000000
0070: 00000000 00000000 00000000 00000000
0080: 00000000 00000000 00000000 00000000

Event Type: Error
Event Source: PlugPlayManager
Event Category: None
Event ID: 12
Date: 1/1/2009
Time: 11:22:23 PM
User: N/A
Computer: MFFSERVER
Description:
The device 'Hewlett Packard LTO Ultrium-2 drive' (SCSI\Sequential&Ven_HP&Prod_Ultrium_2-SCSI&Rev_T61D\5&610f00f&0&000500) disappeared from the system without first being prepared for removal.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00000000
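
Since several of these events share one timestamp, it can help to group the pasted records by date and time and see which sources fire together. A rough sketch in Python, assuming the events are pasted as plain text in the same "Key: value" layout used in this thread; the function names are made up for illustration, and the sample is an abbreviated copy of two events from above.

```python
from collections import defaultdict

FIELDS = ("Event Type", "Event Source", "Event ID", "Date", "Time", "Computer")


def parse_events(text):
    """Parse pasted Event Viewer text into a list of dicts, one per event."""
    events = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in FIELDS:
            continue  # skips Description, hex data and URL lines
        if key == "Event Type":
            events.append({})  # each record starts with its Event Type line
        if events:
            events[-1][key] = value.strip()
    return events


def group_by_timestamp(events):
    """Map (date, time) -> list of (source, event id) logged in that second."""
    groups = defaultdict(list)
    for ev in events:
        stamp = (ev.get("Date"), ev.get("Time"))
        groups[stamp].append((ev.get("Event Source"), ev.get("Event ID")))
    return dict(groups)


sample = """Event Type: Error
Event Source: Lsi_sas
Event Category: None
Event ID: 11
Date: 1/1/2009
Time: 10:51:35 PM
Computer: MFFSERVER
Description:
The driver detected a controller error on \\Device\\RaidPort0.
Data:
0000: 0010000f 00680001 00000000 c004000b

Event Type: Error
Event Source: PlugPlayManager
Event Category: None
Event ID: 12
Date: 1/1/2009
Time: 10:51:35 PM
Computer: MFFSERVER
"""

for stamp, entries in group_by_timestamp(parse_events(sample)).items():
    print(stamp, entries)
```

Run over the full set of pasted events, this shows at a glance whether the Lsi_sas controller error and the PlugPlayManager "disappeared" event always land in the same second.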

What can I do????????????????????
Please, help!!!
Is it a hardware problem or a software problem?
Either possibility sounds unlikely to me.
If it's hardware, I'd say it's the HBA. And if it's software, I'd say it's the driver for the HBA, given that I always had to download it from the HP website as the card didn't come with a CD, to my knowledge.
Should I buy a new SAS card? I've got to be really sure that this will solve the problem if I tell my boss we have to buy a new card.
What do you think?
Thanks!
Curtis Ballard
Honored Contributor

Re: Driver issue with Ultrium 448 SAS

It's sounding like there may be a hardware problem somewhere. I'd suspect the cable before the HBA although it could be the HBA. Can you try a different cable?