Operating System - HP-UX

Problem with ioctl function

 
Samy_4
Frequent Advisor

Problem with ioctl function

Hi all
I have an MSL5060 configured on a SAN with Windows and HP-UX systems. The two drives of the library are configured on HP-UX as /dev/rmt/7mn and /dev/rmt/8mn. I'm using Data Protector 5.1 as backup software. When trying to back up, I get an error when unloading the cartridge from the drive. In the log file I see the following:

4/08/05 12:04:54 BMA.14092.0 ["ma/dev/devseq.c /main/dp51/r51_fix/17":503] A.05.10 bPHSS_31964/DPSOL_00115
SeqOp: (/dev/rmt/7mn): ioctl(MTIOCTOP, mt_op=6, mt_count=1) fails: {5}

Any idea how to resolve this problem?

Thanx

16 REPLIES
Stephen Keane
Honored Contributor

Re: Problem with ioctl function

It appears to be trying to rewind and take offline /dev/rmt/7mn, which is a no-rewind device; that might be part of the problem. Error 5 is EIO, an I/O error.

What does

mt -f /dev/rmt/7mn status

give you?
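
The failing log line from the first post can be unpacked mechanically. Below is a minimal Python sketch, not anything Data Protector ships. It assumes the meaning of mt_op=6 given in the reply above (rewind and take offline; the numeric values come from <sys/mtio.h> and differ between operating systems) and the usual POSIX errno numbering. The names parse_ioctl_failure and MT_OPS are hypothetical, chosen for illustration.

```python
import errno
import re

# Meaning of mt_op=6 as stated in this thread for HP-UX. Verify against
# <sys/mtio.h> on the host: other systems number these operations differently.
MT_OPS = {6: "rewind and take offline"}

# The log line quoted in the original post.
LINE = "SeqOp: (/dev/rmt/7mn): ioctl(MTIOCTOP, mt_op=6, mt_count=1) fails: {5}"

PATTERN = re.compile(
    r"\((?P<dev>/dev/[^)]+)\): ioctl\(MTIOCTOP, mt_op=(?P<op>\d+), "
    r"mt_count=(?P<count>\d+)\) fails: \{(?P<err>\d+)\}"
)


def parse_ioctl_failure(line):
    """Return device, operation, count and errno name, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    op = int(m.group("op"))
    err = int(m.group("err"))
    return {
        "device": m.group("dev"),
        "operation": MT_OPS.get(op, "unknown op %d" % op),
        "count": int(m.group("count")),
        # errno.errorcode maps the number to its symbolic name on this platform.
        "errno": errno.errorcode.get(err, "unknown errno %d" % err),
    }


if __name__ == "__main__":
    print(parse_ioctl_failure(LINE))
```

The errno name is looked up on the machine running the script, so it is only meaningful if that machine numbers errors the same way as the HP-UX host.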
Samy_4
Frequent Advisor

Re: Problem with ioctl function

Maybe that's it. I will try to reconfigure the drives as /dev/rmt/7m and /dev/rmt/8m.
Samy_4
Frequent Advisor

Re: Problem with ioctl function

I've changed /dev/rmt/7mn to /dev/rmt/7m but I still got the same error.
Anthony Lennan
Valued Contributor

Re: Problem with ioctl function

Hi Samy,

I have a few questions:
1. Is this a new setup or has it been working fine up until now?
2. Is the tape actually unloading successfully or is it becoming stuck in the drive?
3. Is this error that you're seeing in the actual backup log file or are you seeing it in the debug.log file?
4. Are there any hung BMA processes hanging around on the server with the library attached?
5. Have you tried using the uma tool to load and unload tapes in the drives? Using uma do you get any errors?

/opt/omni/lbin/uma -ioctl /dev/picker

Within uma, try loading a tape into a drive (move S1 D1). This will load a tape from slot1 to drive1. Once the tape is in the drive, use the mt command to take the drive offline:
mt -f /dev/rmt/7m offl

Then use uma to unload the tape from the drive: (move D1 S1). Does this work? Do you see any errors in any of the omniback logs?

Rgds,
Anthony
Samy_4
Frequent Advisor

Re: Problem with ioctl function

It's a new setup. The tape is not unloading successfully and the cartridge is stuck in the drive. This error is in the debug.log of the client (/var/opt/omni/log). The debug.log of the cell server (on Windows 2003) contains the following:


08/04/2005 10:59:19 MMA.2912.3516 ["ma/spt/sctl_NT.c /main/dp51/r51_fix/10":1833] A.05.10 bDPWIN_00099
SCTL_Read: (scsi fd=0) error. [0] (fixedBit=00)


08/04/2005 10:59:19 MMA.2912.3516 ["ma/dev/devseq.c /main/dp51/r51_fix/17":2202] A.05.10 bDPWIN_00099
SeqRead: (Tape0:0:0:1C): read()=-1 fails: {0}
dev->blkSize=65536

08/04/2005 10:59:22 DEVBRA.3764.3780 ["ma/spt/sctl_NT.c /main/dp51/r51_fix/10":4335] A.05.10 bDPWIN_00099
SCTL_ModeSense_NT: SCTL_Inquiry failed: 6

08/04/2005 10:59:22 DEVBRA.3764.3780 ["ma/spt/sctl_NT.c /main/dp51/r51_fix/10":1536] A.05.10 bDPWIN_00099
SCSI_NT_CMD: RETURN: ERROR (GetLastError()=6)

08/04/2005 10:59:22 DEVBRA.3764.3780 ["ma/spt/sctl_NT.c /main/dp51/r51_fix/10":1536] A.05.10 bDPWIN_00099
SCSI_NT_CMD: RETURN: ERROR (GetLastError()=6)

08/04/2005 10:59:22 DEVBRA.3764.3780 ["ma/devtool/devlist_nt.c /main/dp51/r51_fix/2":110] A.05.10 bDPWIN_00099
GetSystemInquiryData: DeviceIOControl failed [error = 6]

08/04/2005 10:59:22 DEVBRA.3764.3780 ["ma/devtool/devlist_nt.c /main/dp51/r51_fix/2":516] A.05.10 bDPWIN_00099
CreatePhysicalDeviceList_NT: GetSystemInquiryData for scsi4:0:0:0 failed


As for the uma utility, I haven't used it yet. Can you explain which commands I have to run? (CLIReference.pdf does not explain very clearly how to use it.)

Thanx
Anthony Lennan
Valued Contributor

Re: Problem with ioctl function

Hi Samy,

Unfortunately as you can see the error messages that get logged in the debug.log file are usually extremely cryptic and meaningless.

If this is a new setup, I guess it's quite possible that the tape got stuck in the drive on the first attempt and all the errors you have seen since are due to the tape still being stuck in the drive. Just a guess.

On the hpux server that has the library attached you should hopefully have the uma binary under /opt/omni/lbin. The uma binary is what data protector uses to control the robotic arm in your library. Fortunately you can use this command interactively outside of dataprotector for troubleshooting.

When you do an ioscan on your server you should hopefully see your robotic arm. It should be claimed by something like autoch or schgr. The device file should be something like /dev/picker or /dev/rac/c?t?d?

You use the uma command as follows:

/opt/omni/lbin/uma -ioctl /dev/picker

This should drop you at the uma prompt>

From here you can use uma commands to view the drives and slots in the library and move the tapes around.
>stat (or status; this should list the drives and slots and show which ones contain tapes)
>move S1 D1 (To move a tape from slot1 to drive1)

Use uma to move the tape stuck in the drive back into an empty slot. Before you do this, make sure that the drive is offline.
Use the normal HP-UX command mt to do this:
# mt -t /dev/rmt/?m offl

Now use uma to move the stuck tape out.
>move D1 S1

Once you get the tape out, try running a backup again and please report back any errors that you see reported by the backup. Ignore the errors in debug.log for now.

Rgds,
Anthony
Samy_4
Frequent Advisor

Re: Problem with ioctl function

OK, I will try those commands. But don't forget that in this Data Protector cell the robotics are controlled by the cell server, which runs Windows 2003 (not HP-UX). Maybe the error is in the communication between two processes on two different operating systems.

Regards
Samy_4
Frequent Advisor

Re: Problem with ioctl function

I've done some operations with the uma utility (moving tapes from slot to drive and from drive to slot). I've also done a backup with tar, and everything went fine. So there's no error when loading, unloading, or writing data to the tape.
Then I ran a first backup with Data Protector and, unfortunately, got the same error.
Here are the backup session details:

[Normal] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:05:01
STARTING Media Agent "HP:Ultrium 1-SCSI_1_srvcent2"

[Normal] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:05:03
Loading medium from slot 9 to device /dev/rmt/7m

[Normal] From: VBDA@srvcent2 "TestSauvegarde - giga.dat" Time: 18/04/2005 12:05:35
STARTING Disk Agent for srvcent2:/tmp "TestSauvegarde - giga.dat".

[Normal] From: VBDA@srvcent2 "TestSauvegarde - giga.dat" Time: 18/04/2005 12:06:18
COMPLETED Disk Agent for srvcent2:/tmp "TestSauvegarde - giga.dat".

[Normal] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:06:31
/dev/rmt/7m
Medium header verification completed, 0 errors found

[Major] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:06:31
[90:135] Cannot eject medium. ([5] I/O error)

[Normal] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:06:31
/dev/rmt/7m
Tape Alert [10]: You cannot eject the cartridge because the tape drive
is in use. Wait until the operation is complete before ejecting the cartridge.

[Major] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:06:32
[90:64] Cannot unload exchanger medium (Details unknown.)

[Normal] From: BMA@srvcent2 "HP:Ultrium 1-SCSI_1_srvcent2" Time: 18/04/2005 12:06:32
ABORTED Media Agent "HP:Ultrium 1-SCSI_1_srvcent2"


The tape is left in the drive. I ran # mt -f /dev/rmt/7m status, which gives:

Drive: HP Ultrium 1-SCSI
Format:
Status: [41114000] BOT online compression immediate-report-mode
File: 0
Block: 0

Then with # mt -f /dev/rmt/7mn offl I got:
offline 1 failed: I/O error

Trying with the uma utility (move D1 S9) had no luck either; I got:

move: Medium removal prevented


Finally, in the system log (/var/adm/syslog/syslog.log) I have the following:

Apr 18 12:11:44 SRVCENT1 EMS [2227]: ------ EMS Event Notification ------ Value: "SERIOUS (4)" for Resource: "/storage/events/tapes/SCSI_tape/1_0_2_0_0.1.7.255.0.0.1" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 145948680 -r /storage/events/tapes/SCSI_tape/1_0_2_0_0.1.7.255.0.0.1 -n 145948677 -a

Executing that command gives:

CURRENT MONITOR DATA:

Event Time..........: Mon Apr 18 12:11:44 2005
Severity............: SERIOUS
Monitor.............: dm_stape
Event #.............: 102476
System..............: SRVCENT1

Summary:
Tape at hardware path 1/0/2/0/0.1.7.255.0.0.1 : Software configuration
error


Description of Error:

The device was unsuccessful in processing the current request message. The
request was not processed.

For furthur isolation of the problem, it may be possible that the
combination of values of Sense Code, Sense Qualifier and Sense Key may be
vendor specific. In that case, please contact the manufacturer of the
device. It may also be possible that the combination of SCSI Sense
Code/Qual/Key was not intended to be used by this type of device.

Probable Cause / Recommended Action:

The error most likely indicates that the device is not fully supported by
the current driver. This may or may not cause a problem in the operation
of the device.

Additional Event Data:
System IP Address...: 192.168.3.11
Event Id............: 0x426395f000000000
Monitor Version.....: B.01.05
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_dm_stape.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
0x426395f000000000
Additional System Data:
System Model Number.............: 9000/800/rp7410
OS Version......................: B.11.11
System Firmware Version.........: 17.8
System Serial Number............: DEH442568S
System Software ID..............: -1928436960
EMS Version.....................: A.04.00
STM Version.....................: A.43.00
System Current Product Number...: A6752A
System Original Product Number..: A6752A
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/scsi.htm#102476

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v



Component Data:
Physical Device Path....: 1/0/2/0/0.1.7.255.0.0.1
Inquiry Vendor ID.......: HP
Inquiry Product ID......: Ultrium 1-SCSI
Firmware Version........: E38W
Serial Number...........: HU73M00950

Product/Device Data:

Logger ID.........: stape
Product Identifier: SCSI Tape
Product Qualifier.: HPUltrium
SCSI Target ID....: 0x00
SCSI LUN..........: 0x01

I/O Log Event Data:

Driver Status Code..................: 0x00000005
Length of Logged Hardware Status....: 26 bytes.
Offset to Logged Manager Information: 32 bytes.
Length of Logged Manager Information: 30 bytes.

Hardware Status:

Raw H/W Status:
0x0000: 00 00 00 02 70 00 05 00 00 00 00 0E 00 00 00 00
0x0010: 53 02 00 00 1C 01 00 00 00 00

SCSI Status...: CHECK CONDITION (0x02)
Indicates that a contingent allegiance condition has occurred. Any
error, exception, or abnormal condition that causes sense data to be
set will produce the CHECK CONDITION status.

SCSI Sense Data:

Undecoded Sense Data:
0x0000: 70 00 05 00 00 00 00 0E 00 00 00 00 53 02 00 00
0x0010: 1C 01 00 00 00 00

SCSI Sense Data Fields:
Error Code : 0x70
Segment Number : 0x00
Bit Fields:
Filemark : 0
End-of-Medium : 0
Incorrect Length Indicator : 0
Sense Key : 0x05
Information Field Valid : FALSE
Information Field : 0x00000000
Additional Sense Length : 14
Command Specific : 0x00000000
Additional Sense Code : 0x53
Additional Sense Qualifier : 0x02
Field Replaceable Unit : 0x00
Sense Key Specific Data Valid : FALSE
Sense Key Specific Data : 0x00 0x1C 0x01

Sense Key 0x05, ILLEGAL REQUEST, indicates that there was an illegal
parameter in the command data block or in the additional parameters
supplied as data for some commands (FORMAT UNIT, SEARCH DATA, etc.).
This sense key may also indicate that an invalid IDENTIFY message was
received.

The combination of Additional Sense Code and Sense Qualifier (0x5302)
indicates: Medium removal prevented.

SCSI Command Data Block: (not present in log record)

Manager-Specific Information:

Raw Manager Data:
0x0000: 0F 00 35 5D 00 00 00 00 00 00 00 02 00 00 00 00
0x0010: 0F 00 01 16 FE 00 00 06 1B 00 00 00 00 00


For the SAN configuration, the drives (for each client) were configured automatically by Data Protector (selecting clients and scanning each one). The robotics are controlled by the cell server (which is on Windows 2003).

Hope this helps.

Thanx
Stephen Keane
Honored Contributor

Re: Problem with ioctl function

Usually, the only thing that stops removable media from being removed is a lock held by some other process. The media can't be removed until the lock is released. The SCSI sense data is showing an illegal request (probably the attempt to unload the media). Could it be that the Windows system is holding a lock on the unit whilst the Unix system is trying to eject it?
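
The sense bytes quoted in the EMS event can be checked by hand against the decoded fields. Below is a minimal Python sketch, assuming the standard SCSI fixed-format sense layout (sense key in the low nibble of byte 2, additional sense code in byte 12, qualifier in byte 13). The lookup tables hold only the two values that appear in this thread, and decode_fixed_sense is a hypothetical helper name.

```python
# "Undecoded Sense Data" copied from the EMS event in this thread.
SENSE_HEX = (
    "70 00 05 00 00 00 00 0E 00 00 00 00 53 02 00 00 "
    "1C 01 00 00 00 00"
)

# Deliberately tiny tables: only the values seen in this thread.
SENSE_KEYS = {0x05: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x53, 0x02): "Medium removal prevented"}


def decode_fixed_sense(hex_string):
    """Decode the key fields of fixed-format SCSI sense data."""
    data = bytes.fromhex(hex_string)
    if len(data) < 14:
        raise ValueError("sense data too short to contain ASC/ASCQ")
    response_code = data[0] & 0x7F  # bit 7 is the 'valid' flag
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = data[2] & 0x0F      # upper bits are filemark/EOM/ILI flags
    asc, ascq = data[12], data[13]
    return {
        "response_code": response_code,
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "not in table"),
        "asc": asc,
        "ascq": ascq,
        "meaning": ASC_ASCQ.get((asc, ascq), "not in table"),
    }


if __name__ == "__main__":
    for field, value in decode_fixed_sense(SENSE_HEX).items():
        print(field, value)
```

The result agrees with the decoded section of the event: sense key 0x05 with code/qualifier 0x53/0x02, i.e. the drive refused the unload because medium removal was prevented.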
Anthony Lennan
Valued Contributor

Re: Problem with ioctl function

Hi Samy,

I agree with Stephen's current thinking that another process may be locking the drive.

In one of my previous posts I mentioned BMA processes. When Omniback/Dataprotector needs to load/unload a tape it spawns a bma process (Backup Media Agent I believe) to perform that task. Once the tape movement has completed the bma process should then end.

Unfortunately I have seen these bma processes hang on a number of occasions in the past (especially on Windows machines for some reason). They can then lock up a drive until they are killed off.

Have a look on all your machines running Dataprotector to see if there are any bma processes hanging around and kill them off.
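
When several hosts need checking, the search can be scripted. Below is a minimal Python sketch, assuming you have already produced a two-column listing of PID and command name on each host (how to get that from ps or a Windows task list varies by platform). The sample listing, the /opt/omni/lbin/bma path and the bma.exe name are invented for illustration, and find_bma_pids is a hypothetical helper name.

```python
import os

# Invented sample listing: one "PID command" pair per line.
SAMPLE = """\
 1402 /opt/omni/lbin/bma
 1510 grep
  812 /opt/omni/lbin/inet
 2210 BMA.exe
"""

# Assumed process names for the Backup Media Agent.
BMA_NAMES = ("bma", "bma.exe")


def find_bma_pids(listing):
    """Return the PIDs whose command basename looks like a bma process."""
    pids = []
    for line in listing.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip headers and malformed lines
        name = os.path.basename(parts[1].strip()).lower()
        if name in BMA_NAMES:
            pids.append(int(parts[0]))
    return pids


if __name__ == "__main__":
    print(find_bma_pids(SAMPLE))
```

The script only reports candidates. Whether a given bma process is really hung (and safe to kill) still has to be judged against the running backup sessions.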

You might also want to run "omnidbutil -free_locked_devs" just to make sure that Dataprotector doesn't have the drive locked in its database either. I think it's very unlikely that this is the problem, but it's good to cover all the possibilities.

The thing that has me scratching my head at the moment is the fact that you can't take the drive offline with the mt command (maybe because another process is preventing it, but maybe not). If Omniback can't take the drive offline then it's definitely not going to be able to unload the tape.

Check for the processes and let us know how you go.

Best regards,
Anthony
Anthony Lennan
Valued Contributor

Re: Problem with ioctl function

Hi Samy,

Just a couple of other things...

- Power cycle your tape drives and then try again.
- Check your patches. I had a quick search for the error messages you're getting above and got a number of hits back, all pointing to media agent patches. Unfortunately none of them were for Dataprotector 5.1, but it's still worth keeping in mind once you've exhausted all other possibilities. As a habit, whenever I install Omniback I always upgrade it immediately with the latest patches. There's normally a truckload of them.

Rgds,
Anthony
Samy_4
Frequent Advisor

Re: Problem with ioctl function

Hi Anthony,
When I get this error, the tape is stuck in the drive. It can't be unloaded with uma, mt, or the control panel of the library. So each time I have to power cycle the library and then use the control panel to move the tape from the drive to the slot. After that, backups still don't work (same error).
As for patches, I've installed the required system patches for Data Protector and the Data Protector patches themselves (for Windows and HP-UX): Core, Media Agent, Cell Manager, Disk Agent. No change.
Anthony Lennan
Valued Contributor

Re: Problem with ioctl function

Hi Samy,

You mentioned earlier that you could use uma successfully before you tried to do any backup, didn't you? But the fact that, after you've tried a backup, you can't even unload the tape manually without power cycling sounds more and more like there's a process hanging around that's got a stubborn hold on the drive.

Have a look for any bma processes and then try again. If there aren't any bma processes, try stopping all Dataprotector processes on every host and then try again.

Is it possible that there's some other backup software or process running on any host on the SAN that somehow might have grabbed hold of the drives? I assume that this issue affects both drives?

Besides the two drives in the library are there any other drives in your environment that you can run test backups to?

Rgds,
Anthony
Anthony Lennan
Valued Contributor

Re: Problem with ioctl function

Hi Samy,

Just another quick thought. From memory I think it's possible to configure a backup so that it doesn't bother doing a header check at the end of the backup. Can you try to configure a test backup so that it doesn't do this check? Another theory I came up with after looking at those patches is that maybe the tape's not rewinding properly at the end of the header check. Just a theory but you never know :)

Rgds,
Anthony
Samy_4
Frequent Advisor

Re: Problem with ioctl function

Finally, the problem was with the Removable Storage Manager (RSM) service. Its database had been altered, so the Windows driver for the robotics was loaded each time. I restored the RSM database from backup and disabled the Windows driver for the robotics. Everything is working now.

Thanx everyone