Operating System - Tru64 Unix

Re: Strange(?) disklabel output

 
Nick Bishop (Kiwi)
Frequent Advisor

Strange(?) disklabel output

I have a Seagate disk that is giving me inconsistent disklabel output, and I would like to know whether this disk can be trusted.

First things first: the system is an Alpha DS10 with two internal disks, a Compaq and a Seagate. We noticed a CPU panic whenever there was heavy I/O on the Seagate disk. The problems went away when we moved all files off that disk and disabled swap on it, and they returned when we ran /usr/field/fsx on that disk.

We've also consistently noted that the disklabel output (attached) says
> cylinders: 25238
but then lists partitions that go to cylinder 88612.

What do you make of that?

Attached text file has output from disklabel, sizer, and uname.
Nick Bishop (Kiwi)
Frequent Advisor

Re: Strange(?) disklabel output

I should also point out that we have now added another disk to the system. We physically removed dsk1 from the system, reseated a couple of jumpers, then added it back to the system along with a Seagate 18.2GB Barracuda.

Therefore we now have three disks in the system:

hwid 47, root disk (Compaq)
hwid 48, Seagate 73GB
hwid 59, New Seagate 18.2GB

The other thing to note is that (despite running zeero and diskx (exerciser) on the Seagate 73GB) we've had no further system panics since we've nuked the disk.

My next move is to create some AdvFs filesystems on the 73GB disk and try using the fsx (exerciser) on it, to see if the panic problems return (while I wait for your thoughts).

Nick.
Nick Bishop (Kiwi)
Frequent Advisor

Re: Strange(?) disklabel output

I should also provide this info ...

$ dupatch -track -type patch_level

Gathering details of relevant patch kits...

You are currently running OS version:
HP Tru64 UNIX V5.1B (Rev. 2650); Wed Dec 24 14:11:53 EST 2008
Patches installed on the system came from following software kits:
------------------------------------------------------------------
ERP 0204700540: T64KIT1001143-V51BB27-ES-20070305 OSF540
ERP 0206701540: T64KIT1001178-V51BB27-E-20070330 OSF540
ERP 0207000540: T64KIT1001176-V51BB27-E-20070328 OSF540
ERP 0207301540: T64KIT1001188-V51BB27-ES-20070404 OSF540
ERP 0207900540: T64KIT1001187-V51BB27-E-20070404 OSF540
ERP 0214400540: T64KIT1001259-V51BB27-E-20070717 OSF540
ERP 0216300540: T64KIT1001279-V51BB27-E-20070817 OSF540
ERP 0222201540: T64KIT1001398-V51BB27-ES-20071207 OSF540
ERP 0225200540: T64KIT1001450-V51BB27-E-20080305 OSF540
ERP 0226700540: T64KIT1001460-V51BB27-ES-20080310 OSF540
ERP 0230100540: T64KIT1001509-V51BB27-E-20080611 OSF540
ERP 0234100540: T64KIT1001551-V51BB27-ES-20081015 OSF540
Patch Kit 6: T64V51BB27AS0006-20061208 OSF540
Steven Schweda
Honored Contributor

Re: Strange(?) disklabel output

I know nothing, but the inconsistency within
the disklabel report would certainly worry
me. At least until I looked around a little.
Around here:

urtx# hwmgr -view device
HWID: Device Name Mfg Model Location
------------------------------------------------------------------------------
3: /dev/dmapi/dmapi
4: /dev/scp_scsi
5: /dev/kevm
57: /dev/random
58: /dev/urandom
91: /dev/disk/dsk4c FUJITSU MAB3091S SUN9.0G bus-1-targ-2-lun-0
94: /dev/disk/cdrom4c SONY DVD RW AW-Q170A bus-2-targ-0-lun-0
95: /dev/disk/dsk5c SEAGATE ST318404LC bus-0-targ-0-lun-0
96: /dev/disk/dsk6c SEAGATE ST318404LC bus-0-targ-1-lun-0

dsk5 has VMS on it, but dsk6 is a Tru64 boot
disk. These things make a little more sense,
but 14384 and 14080 are not exactly equal,
either:

urtx# disklabel -r dsk6
# /dev/rdisk/dsk6c:
type: SCSI
disk: ST318404LC
label:
flags:
bytes/sector: 512
sectors/track: 421
tracks/cylinder: 6
sectors/cylinder: 2526
cylinders: 14384
sectors/unit: 35566480
rpm: 10016
interleave: 1
trackskew: 75
cylinderskew: 85
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0

8 partitions:
#            size     offset  fstype  fsize  bsize  cpg  # ~Cyl values
  a:      1109466          0  AdvFS                      # 0 - 439*
  b:      6339800    1109466  swap                       # 439*- 2949*
  c:     35566480          0  unused      0      0       # 0 - 14080*
  d:            0          0  unused      0      0       # 0 - 0
  e:            0          0  unused      0      0       # 0 - 0
  f:     28117213    7449266  AdvFS                      # 2949*- 14080*
  g:     17586632     393216  unused      0      0       # 155*- 7117*
  h:     17586632   17979848  unused      0      0       # 7117*- 14080*


This one looks better:

urtx# disklabel -r dsk4
# /dev/rdisk/dsk4c:
type: SCSI
disk: MAB3091S SUN9.0
label:
flags:
bytes/sector: 512
sectors/track: 133
tracks/cylinder: 27
sectors/cylinder: 3591
cylinders: 4926
sectors/unit: 17689267
rpm: 7200
interleave: 1
trackskew: 45
cylinderskew: 36
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0

8 partitions:
#            size     offset  fstype  fsize  bsize  cpg  # ~Cyl values
  a:     17689267          0  AdvFS                      # 0 - 4926*
  b:       262144     131072  unused      0      0       # 36*- 109*
  c:     17689267          0  unused      0      0       # 0 - 4926*
  d:            0          0  unused      0      0       # 0 - 0
  e:            0          0  unused      0      0       # 0 - 0
  f:            0          0  unused      0      0       # 0 - 0
  g:      8648025     393216  unused      0      0       # 109*- 2517*
  h:      8648026    9041241  unused      0      0       # 2517*- 4926*

Except that in both cases, the "g" and "h"
partition data seem to have no connection
with reality. dsk4 has one big "a" AdvFS
partition, and dsk6 has "a" (root), "b"
(swap), and "f" (/usr + /var).
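
To make the "g"/"h" mismatch concrete, here is a minimal Python sketch (not anything disklabel itself does; the dictionary and function names are made up for illustration) that checks which entries in the dsk6 listing above share sectors with the partitions actually in use. The sizes and offsets are copied from that listing.

```python
# Partition (size, offset) pairs in 512-byte sectors, copied from the
# "disklabel -r dsk6" listing above. DSK6 is an illustrative name only.
DSK6 = {
    "a": (1109466, 0),
    "b": (6339800, 1109466),
    "f": (28117213, 7449266),
    "g": (17586632, 393216),
    "h": (17586632, 17979848),
}


def overlaps(label, first, second):
    """Return True if the two named partitions share at least one sector."""
    size1, off1 = label[first]
    size2, off2 = label[second]
    # Half-open ranges [off, off + size) intersect when each one starts
    # before the other one ends.
    return off1 < off2 + size2 and off2 < off1 + size1


def conflicts(label, in_use, candidates):
    """Map each candidate partition to the in-use partitions it overlaps."""
    return {c: [u for u in in_use if overlaps(label, c, u)] for c in candidates}


result = conflicts(DSK6, ["a", "b", "f"], ["g", "h"])
print(result)
```

On these numbers "g" overlaps "a", "b", and "f", and "h" overlaps "f", while "a", "b", and "f" do not overlap one another. That is consistent with "g" and "h" being leftover default entries rather than partitions anyone is using.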

On the other hand, with any even semi-modern
SCSI disk, all those sectors/cylinders kinds
of parameters are fictional, so I wouldn't
spend a whole lot of time trying to make
sense of them. diskconfig seems to present a
picture which agrees better with what I
expect to see, but clicking on "Partition
Table..." returns to the land of unreality.
Perhaps those things are all (or mostly) junk
derived from "/etc/disktab", rather than
anything significant associated with the
actual disk. It sure looks that way.
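
As a rough check on where the "~Cyl values" column might come from, the sketch below assumes (an assumption based only on the two listings above, not on documented disklabel behaviour) that each value is a sector number divided by sectors/cylinder, with "*" marking a value that does not fall on a cylinder boundary. The function name is hypothetical.

```python
def approx_cyl(sector, sectors_per_cylinder):
    """Return (whole cylinder number, True if the sector is off a boundary).

    Assumption: this mirrors how the "~Cyl values" column is derived;
    it is inferred from the listings, not taken from disklabel's source.
    """
    cyl, rem = divmod(sector, sectors_per_cylinder)
    return cyl, rem != 0


# dsk6: sectors/cylinder 2526, sectors/unit 35566480, header says 14384 cylinders
print(approx_cyl(35566480, 2526))   # end of "c"
print(approx_cyl(1109466, 2526))    # end of "a"

# dsk4: sectors/cylinder 3591, sectors/unit 17689267, header says 4926 cylinders
print(approx_cyl(17689267, 3591))   # end of "c"

# The header geometry is at least self-consistent in one respect:
# sectors/track * tracks/cylinder equals sectors/cylinder on both disks.
print(421 * 6 == 2526, 133 * 27 == 3591)

# On dsk6 the header cylinder count implies more sectors than the unit has.
print(14384 * 2526, 35566480)
```

Under that assumption the arithmetic reproduces 14080*, 439*, and 4926* from the listings, so the "~Cyl values" appear to be computed from the real sector counts, while the "cylinders:" header field is a separate number that need not agree with them.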

So, I'd guess that the goofy partition table
supplied by disklabel may be safely ignored.
Whether there's a problem with the disk may
remain a mystery, however. Anything
suspicious in the system log files?

For the record, urtx is an XP1000:

urtx# sizer -v

HP Tru64 UNIX V5.1B (Rev. 2650); Fri Mar 20 20:19:48 CDT 2009

urtx# dupatch -track -type patch_level | grep -i 'patch kit '
Patch Kit 5: T64V51BB26AS0005-20050502 OSF540
Patch Kit 6: T64V51BB27AS0006-20061208 OSF540
Patch Kit 7: T64V51BB28AS0007-20090312 OSF540
Nick Bishop (Kiwi)
Frequent Advisor

Re: Strange(?) disklabel output

Hi Steven,

That's another Seagate that gives funny disklabels, hmmm.

Anyway, I've set up 4 partitions (3 domains and a space intended for swap), and I've been running fsx on the 3 domains and diskx on the intended swap, and had no problem. Of course the backup on the machine slows to a crawl, but no other problems.

There's nothing recent in the ASCII kern.log files (/var/adm/syslog.dated/DATE/), but in the binary error log (uerf) output, I see the occasional CONFIGURATION message (like #84 in the attached).

Going back further, when the disk was playing up, there were entries from uerf as attached (we were getting CPU panics).

Perhaps it was something physical (or possibly electrical shorting), because I pulled the disk out, and put it back in-circuit, but it's not physically bolted in at present.

I'll make another post tomorrow.
Steven Schweda
Honored Contributor

Re: Strange(?) disklabel output

> That's another Seagate that gives funny
> disklabels, hmmm.

Knowing nothing, I assume that if the actual
model isn't found in "/etc/disktab", then you
get inappropriate junk from some default
entry in there. As I said, I found
diskconfig much more realistic regarding the
partitioning on these disks.

> Perhaps it was something physical [...]

Many things are possible.
Venkatesh BL
Honored Contributor

Re: Strange(?) disklabel output

Did you try running 'diskconfig' GUI? You could try resizing the partitions to see if disklabel shows correct values after that.
cnb
Honored Contributor

Re: Strange(?) disklabel output

Hi Nick:

FWIW:

The system is/was reporting Machine Checks in the error log. This *usually* indicates a Hardware issue.

There wasn't enough information posted to make a call on which specific hardware is faulty. This could be a CPU, Memory or anything else involved with the transactions at the time of the machine check.

Further investigation into the error log with the appropriate analysis tools is warranted. UERF is usually not the best for extracting specific hardware errors for the DS10. If you have the proper support, see this link:

http://www.compaq.com/support/svctools/webes/index.html


HTH.

Rgds,
Nick Bishop (Kiwi)
Frequent Advisor

Re: Strange(?) disklabel output

Oh, a flood of replies.

We don't have support on the machine, so we don't have privilege to get WEBES.

Given that there have been no crashes since I physically repositioned the disk, I think the original problem was either
1. Shorting between the exposed electronics and the metal shelf, or
2. Crook connection on the SCSI plug (68 pin).

I'll get onto the console and run the 'diskconfig' utility (there still being no data or swap on this disk).

Nick.
Nick Bishop (Kiwi)
Frequent Advisor

Re: Strange(?) disklabel output

Ok, I've been working on a support issue that will take a couple of days more. I'll be back.

In the meantime, I've noticed this machine has indeed crashed again, so I spoke too soon.

Nick.
Nick Bishop (Kiwi)
Frequent Advisor

Re: Strange(?) disklabel output

We've had two crashes on this machine since the disk was connected (on 7 Dec), both while the disk was idle (unlike earlier this year).

I've tried Venkatesh's suggestion for diskconfig. I used it twice: first, to adjust the g-h boundary, then second, to rewrite the partition table totally (back to default a-b-g-h arrangement).

In neither case did this change:
> cylinders: 25238
(extract from disklabel -r output)

I've now disconnected the SCSI plug (leaving the power plugged in), so now I can see whether it's a SCSI issue.

I'll assign some overdue points now.