ProLiant Servers (ML,DL,SL)

ZFS - Solaris 10 doesn't see disks after reboot on HP Pro DL380 G7

travigne
Collector

ZFS - Solaris 10 doesn't see disks after reboot on HP Pro DL380 G7

The current box has three logical drives (the third one was just added), shown in the P410i Array Controller (F8 during POST) as:

  • Logical Drive 1 - 2 SAS 72GB HD, RAID 10 <---- OS disk
  • Logical Drive 2 - 4 SAS 146GB HD, RAID 5 <---- data disk
  • Logical Drive 3 - 2 SAS 146GB HD, RAID 10 <--- new OS disk

And this is what it shows in the OS:

bash-3.2# echo | format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <HP     -LOGICAL VOLUME -5.06 cyl 17841 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,3408@1/pci103c,3245@0/sd@0,0
       1. c0t1d0 <HP-LOGICAL VOLUME-5.70-410.10GB>
          /pci@0,0/pci8086,3408@1/pci103c,3245@0/sd@1,0
       2. c0t2d0 <HP     -LOGICAL VOLUME -5.06 cyl 17841 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,3408@1/pci103c,3245@0/sd@2,0

The current rpool (c0t0d0) that contains the OS is too small at 72GB. I want to grow the rpool by putting in new hard disks (2 x 146GB SAS), doing a zpool attach of c0t2d0 to the rpool, and then detaching c0t0d0. I make sure the new disk is bootable by setting:

eeprom bootpath=/pci@0,0/pci8086,3408@1/pci103c,3245@0/sd@2,0
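
For reference, the whole migration can be sketched like this (a sketch only: it assumes the root pool lives on slice 0 of each logical drive and a standard Solaris 10 x86 GRUB install; device names are the ones from the `format` output above):

```shell
# Attach the new logical drive's slice 0 to the root pool, forming a mirror
zpool attach rpool c0t0d0s0 c0t2d0s0

# Wait for resilvering to complete before touching the old disk
zpool status rpool

# Install the GRUB boot blocks on the new disk (Solaris 10 x86)
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t2d0s0

# Point the firmware at the new device path
eeprom bootpath=/pci@0,0/pci8086,3408@1/pci103c,3245@0/sd@2,0

# Only after a successful test boot from the new LUN:
zpool detach rpool c0t0d0s0
```

Detaching before the resilver finishes, or before the boot blocks are installed, leaves the pool unbootable, so the ordering above matters.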

When I reboot the server and access the controller (F8), I also go to Select Boot Volume and select Logical Volume 3 as the new boot LUN. The server boots up and picks up the new boot device correctly. However, I no longer see the two previous disks there (c0t0d0 and c0t1d0).

bash-3.2# echo | format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t2d0 <HP     -LOGICAL VOLUME -5.06 cyl 17841 alt 2 hd 255 sec 63>
          /pci@0,0/pci8086,3408@1/pci103c,3245@0/sd@2,0

I tried everything, devfsadm and cfgadm, and rebooted many times to check the status of those Logical Drives in the Controller; they are still there with OK status. But I don't know why I no longer see them in the OS. c0t0d0 is the old boot disk that I don't need anymore, but c0t1d0 is the data disk and I need it to show up here. Do you have any clues?
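
For anyone hitting the same symptom, the usual rescan sequence looks roughly like this (a sketch; these are the standard Solaris device-reconfiguration commands, nothing system-specific assumed):

```shell
# Rebuild /dev and /devices, pruning stale links (-C) with verbose output (-v)
devfsadm -Cv

# List attachment points and their occupant status for the controller
cfgadm -al

# Re-enumerate the disks the OS can now see
echo | format
```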

3 REPLIES
parnassus
Honored Contributor

Re: ZFS - Solaris 10 doesn't see disks after reboot on HP Pro DL380 G7

I'm confused: how can an Array (a group of disks) made up of only 2 physical disks - see your "Logical Drive 1" / "Logical Drive 3" descriptions - be presented as a RAID 10 Array (known as 1+0 or "stripe of mirrors"), a combination that requires at least 4 physical disks (a stripe of two mirrored pairs)?

What does zpool status rpool or zpool status report?

"The current rpool (c0t0d0) that contains the OS is too small at 72GB. I want to grow the rpool by putting in new hard disks (2 x 146GB SAS), doing a zpool attach of c0t2d0 to the rpool, and then detaching c0t0d0"

I think you can't do it that way. If rpool is on a single vdev (assuming your first real 72GB physical disk is "c0t0d0", not a logical volume made by the underlying HP Smart Array RAID Controller) and is on ZFS, you can only attach a new vdev (likewise assuming your second real disk is "c0t2d0", not a logical volume) to form a mirrored rpool vdev. "Growing the size" of this newly formed rpool mirror (which would still be 72GB) is a whole other story.
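
A sketch of what that attach-then-grow path looks like (assuming a Solaris 10 release recent enough to have the autoexpand pool property, and the usual s0 root-slice layout):

```shell
# Turn the single-vdev rpool into a two-way mirror
zpool attach rpool c0t0d0s0 c0t2d0s0

# Allow the pool to grow once the smaller side is gone
zpool set autoexpand=on rpool

# After the resilver completes, drop the 72GB side;
# the pool can then expand to the size of the remaining 146GB device
zpool detach rpool c0t0d0s0
```

Without autoexpand (or a manual `zpool online -e`), the pool stays at the size of the smallest device it ever mirrored, which is the "whole other story" above.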

Running ZFS on top of a hardware RAID controller like the HP Smart Array P410i (which manages RAID arrays, hides the physical disks from ZFS and, AFAIK, doesn't support JBOD) is very dangerous and strongly discouraged: ZFS should be used with JBOD controllers so it can manage physical disks directly (ZFS has single-parity, double-parity and even triple-parity RAIDZ levels for however many disks you throw at it); it doesn't need any hardware RAID controller.

travigne
Collector

Re: ZFS - Solaris 10 doesn't see disks after reboot on HP Pro DL380 G7

Well, I think the disk requirement for RAID 10 is at least 2 physical disks, not 4. The P410i controller automatically decides the RAID type depending on how many physical disks we choose.

Here is the current zpool status. It is missing c0t1d0s0 (the data disk), and in rpool c0t0d0s0 has been replaced with c0t2d0s0.

bash-3.2# zpool status
  pool: rpool
 state: ONLINE
 scan: scrub repaired 0 in 0h11m with 0 errors on Fri May 20 11:31:17 2016
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c0t2d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: rpool-app
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool-app   UNAVAIL      0     0     0  insufficient replicas
          c0t1d0s0  UNAVAIL      0     0     0  cannot open

I think your assumption about how rpool is set up is not right. rpool is made up of whatever disks are available in the OS; you can see those disks by issuing the "format" command. c0t0d0, c0t1d0 and c0t2d0 are all logical disks created by the HP Smart Array; they are not physical disks. I'm using hpacucli to manage the arrays, and this is what it shows for Logical Drive 1, for example:

=> ctrl slot=0 ld 1 show

Smart Array P410i in Slot 0 (Embedded)

   array A

      Logical Drive: 1
         Size: 68.3 GB
         Fault Tolerance: RAID 1
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 17562
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Array Accelerator: Enabled
         Unique Identifier: 600508B1001C4B2D1B29DAB8A2DFF843
         Disk Name: /dev/dsk/c0t0d0
         Mount Points: None
         Logical Drive Label: AE51964B500143801657264058DC
         Mirror Group 0:
            physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 72 GB, OK)
         Mirror Group 1:
            physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 72 GB, OK)

Notice that the Disk Name is /dev/dsk/c0t0d0, and that's how the OS shows me the disk.

My problem is that the Controller still sees all the disks connected, but the OS doesn't. How can I get the OS to see them?
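
One way to cross-check the controller side is to dump everything the P410i is presenting (a sketch using the same CLI tool as above; the exact output differs per system):

```shell
# List every logical drive the controller is exposing
hpacucli ctrl slot=0 ld all show

# Full configuration detail, including physical drives and controller settings
hpacucli ctrl slot=0 show config detail
```

Comparing that list against `echo | format` narrows the problem down to either the controller hiding LUNs or the OS failing to enumerate them.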

parnassus
Honored Contributor

Re: ZFS - Solaris 10 doesn't see disks after reboot on HP Pro DL380 G7


travigne wrote:

Well, I think the disk requirement for RAID 10 is at least 2 physical disks, not 4. The P410i controller automatically decides the RAID type depending on how many physical disks we choose.

I don't think so.

A true logical array with RAID level 10, known as RAID 1+0 or RAID 10, is a stripe of mirrors. Therefore you need at least one set of 2 physical drives for a mirror (RAID 1) and at least one more set of 2 to stripe across (RAID 0). So 4 physical drives is the minimum requirement to set up a RAID 10.

You can have a RAID 1+0 with three disks, but that would be either the result of a "degraded" RAID 10 in which one of the mirrored disks faulted, *or* an even less true RAID 1+0 made of a stripe of one mirror with a single extra disk added.

This doesn't matter since it's up to the HP Smart Array to adopt a coherent naming convention.

Here is the current zpool status. It is missing c0t1d0s0 (the data disk), and in rpool c0t0d0s0 has been replaced with c0t2d0s0.

bash-3.2# zpool status
  pool: rpool
 state: ONLINE
 scan: scrub repaired 0 in 0h11m with 0 errors on Fri May 20 11:31:17 2016
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c0t2d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: rpool-app
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool-app   UNAVAIL      0     0     0  insufficient replicas
          c0t1d0s0  UNAVAIL      0     0     0  cannot open

I think your assumption about how rpool is set up is not right. rpool is made up of whatever disks are available in the OS; you can see those disks by issuing the "format" command. c0t0d0, c0t1d0 and c0t2d0 are all logical disks created by the HP Smart Array; they are not physical disks. I'm using hpacucli to manage the arrays, and this is what it shows for Logical Drive 1, for example:

=> ctrl slot=0 ld 1 show

Smart Array P410i in Slot 0 (Embedded)

   array A

      Logical Drive: 1
         Size: 68.3 GB
         Fault Tolerance: RAID 1
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 17562
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Array Accelerator: Enabled
         Unique Identifier: 600508B1001C4B2D1B29DAB8A2DFF843
         Disk Name: /dev/dsk/c0t0d0
         Mount Points: None
         Logical Drive Label: AE51964B500143801657264058DC
         Mirror Group 0:
            physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 72 GB, OK)
         Mirror Group 1:
            physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 72 GB, OK)

Notice that the Disk Name is /dev/dsk/c0t0d0, and that's how the OS shows me the disk.

My problem is that the Controller still sees all the disks connected, but the OS doesn't. How can I get the OS to see them?


Your OS (Solaris) doesn't see all the disks because your hardware controller is hiding them from the OS.

How can Solaris (or any other OS supporting ZFS) manage the connected physical disks through its ZFS capabilities if those disks are hidden behind the hardware controller's logic?

Mmmm... AFAIK a ZFS root pool (your rpool is ZFS) supports only:

  • a single-vdev (one disk) root pool
  • a mirrored vdev (two or more disks) root pool

A root pool (rpool) has different requirements than a storage pool (your rpool-app).

Read it here or just find other sources through Google (there are very good Guides/Wiki Articles about Solaris, Oracle, illumos, OpenIndiana and OmniOS that explain that).

Maybe the HP Smart Array calls a "Drive" what in reality is just a Logical Volume (Array) made of multiple physical disks (striped, mirrored, a stripe of mirrors and so on), and so the HP Smart Array offers that Logical Volume to the OS as if it were a real disk. That principle is correct and expected.

The point here is that you are using ZFS over Logical Volumes created and managed directly by the HP Smart Array RAID controller. That should be avoided (IMHO it's a recipe for disaster): ZFS can manage its own RAID levels (RAIDZ pools) in ways a hardware controller can't, and it can only be rock solid if ZFS has direct access to the physical disks (not to Logical Volumes or fancy logical disks), with nothing at all interposed in the middle. ZFS needs maximum transparency, no opacity admitted (so JBOD is the way to go).

What do iostat -En and zpool status -x report?

Did you then try a zpool clear rpool-app?

The ZFS-8000-3C error means exactly that.
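
Once the missing device reappears to the OS (e.g. after a devfsadm -Cv), the documented ZFS-8000-3C recovery can be sketched as (a sketch; pool and device names are the ones from the zpool status output above):

```shell
# Clear any recorded error state on the pool
zpool clear rpool-app

# Bring the previously unavailable device back online
zpool online rpool-app c0t1d0s0

# Verify the pool returned to ONLINE
zpool status rpool-app
```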