02-06-2024 01:27 PM - last edited on 02-12-2024 12:04 AM by support_s
SAS PEL errors on cage0
Hi all.
We have the following situation:
The error below keeps appearing. Apparently all the ports are up and showing a green light, yet the error persists.
SAS PEL errors on 1:1:1 from pd1(0x5002538A072C4D31),Phy1 to cage0(0x5001438030F5953F),Phy9.
3par cli% showport
N:S:P Mode State ----Node_WWN---- -Port_WWN/HW_Addr- Type Protocol Label Partner FailoverState
0:0:1 target ready 2FF70002AC01E30D 20010002AC01E30D free FC - 1:0:1 none
0:0:2 target ready 2FF70002AC01E30D 20020002AC01E30D free FC - 1:0:2 none
0:1:1 initiator ready 50002ACFF701E30D 50002AC01101E30D disk SAS DP-1 - -
0:1:2 initiator loss_sync 50002ACFF701E30D 50002AC01201E30D free SAS DP-2 - -
0:2:1 target ready 2FF70002AC01E30D 20210002AC01E30D free FC - 1:2:1 none
0:2:2 target ready 2FF70002AC01E30D 20220002AC01E30D free FC - 1:2:2 none
0:2:3 target loss_sync 2FF70002AC01E30D 20230002AC01E30D free FC - 1:2:3 none
0:2:4 target loss_sync 2FF70002AC01E30D 20240002AC01E30D free FC - 1:2:4 none
0:3:1 peer offline - 941882459D31 free IP IP0 - -
1:0:1 target ready 2FF70002AC01E30D 21010002AC01E30D free FC - 0:0:1 none
1:0:2 target ready 2FF70002AC01E30D 21020002AC01E30D free FC - 0:0:2 none
1:1:1 initiator ready 50002ACFF701E30D 50002AC11101E30D disk SAS DP-1 - -
1:1:2 initiator loss_sync 50002ACFF701E30D 50002AC11201E30D free SAS DP-2 - -
1:2:1 target ready 2FF70002AC01E30D 21210002AC01E30D free FC - 0:2:1 none
1:2:2 target ready 2FF70002AC01E30D 21220002AC01E30D free FC - 0:2:2 none
1:2:3 target loss_sync 2FF70002AC01E30D 21230002AC01E30D free FC - 0:2:3 none
1:2:4 target loss_sync 2FF70002AC01E30D 21240002AC01E30D free FC - 0:2:4 none
1:3:1 peer offline - 94188245B471 free IP IP1 - -
-------------------------------------------------------------------------------------------------------
18
3par cli% showpd -c
-------- Normal Chunklets --------- ---- Spare Chunklets -----
- Used - --------- Unused --------- - Used - ---- Unused -----
Id CagePos Type State Total OK Fail Free Uninit Unavail Fail OK Fail Free Uninit Fail
0 0:0:0 SSD normal 7152 14 0 6422 0 0 0 0 0 716 0 0
1 0:1:0 SSD normal 7152 13 0 6423 0 0 0 0 0 716 0 0
2 0:2:0 SSD normal 7152 18 0 6418 0 0 0 0 0 716 0 0
3 0:3:0 SSD normal 7152 17 0 6419 0 0 0 0 0 716 0 0
4 0:4:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
5 0:5:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
6 0:6:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
7 0:7:0 SSD normal 7152 16 0 6421 0 0 0 0 0 715 0 0
8 0:8:0 SSD normal 7152 13 0 6424 0 0 0 0 0 715 0 0
9 0:9:0 SSD normal 7152 13 0 6424 0 0 0 0 0 715 0 0
10 0:10:0 SSD normal 7152 16 0 6421 0 0 0 0 0 715 0 0
11 0:11:0 SSD normal 7152 14 0 6423 0 0 0 0 0 715 0 0
12 0:12:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
13 0:13:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
14 0:14:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
15 0:15:0 SSD normal 7152 18 0 6419 0 0 0 0 0 715 0 0
16 0:16:0 SSD normal 7152 17 0 6420 0 0 0 0 0 715 0 0
17 0:17:0 SSD normal 7152 15 0 6422 0 0 0 0 0 715 0 0
18 0:18:0 SSD normal 7152 14 0 6423 0 0 0 0 0 715 0 0
19 0:19:0 SSD normal 7152 13 0 6424 0 0 0 0 0 715 0 0
--------------------------------------------------------------------------------------------
20 total 143040 319 0 128417 0 0 0 0 0 14304 0 0
ksmmi3r023par02 cli% showversion
Release version 3.3.2 (MU1)
Patches: P04
Component Name Version
CLI Server 3.3.2 (MU1)
CLI Client 3.3.2
System Manager 3.3.2 (MU1)
Kernel 3.3.2 (MU1)
TPD Kernel Code 3.3.2 (MU1)
Does anyone know what could have happened? Any tips?
Cheers!
02-07-2024 09:23 AM
Re: SAS PEL errors on cage0
Hi jorgevisentini,
If the error is appearing for only one PD, the first action is to get that drive replaced. The command below provides the details, including the SAS Phy Error Log (PEL) data, for devices under this port.
cli% showportdev sas -pel 1:1:1
Regards,
Veeyaarvi
I am an HPE Employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
02-07-2024 09:39 AM
Re: SAS PEL errors on cage0
3par02 cli% showportdev sas -pel 1:1:1
ID DevName SASAddr Phy ParentDevHdl DevHdl AttDevHdl Link AttID AttDevName AttSASAddr AttPhy InvDC RunDEC LossDSC PhyRPC
<1:1:1> 50002ACFF701E30D 50002AC11101E30D 0 - 0x01 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 16 0 0 0 0
<1:1:1> 50002ACFF701E30D 50002AC11101E30D 1 - 0x01 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 17 0 0 0 0
<1:1:1> 50002ACFF701E30D 50002AC11101E30D 2 - 0x01 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 18 0 0 0 0
<1:1:1> 50002ACFF701E30D 50002AC11101E30D 3 - 0x01 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 19 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 0 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 1 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 2 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 3 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 4 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 5 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 6 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 7 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 8 0x01 0x09 0x0a 12Gbps pd0 5002538A0727C461 5002538A0727C463 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 9 0x01 0x09 0x0b 12Gbps pd1 5002538A072C4D31 5002538A072C4D33 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 10 0x01 0x09 0x0c 12Gbps pd2 5002538A072BF361 5002538A072BF363 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 11 0x01 0x09 0x0d 12Gbps pd3 5002538A072BF3C1 5002538A072BF3C3 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 12 0x01 0x09 0x0e 12Gbps pd4 5002538A072C4731 5002538A072C4733 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 13 0x01 0x09 0x0f 12Gbps pd5 5002538A0727C671 5002538A0727C673 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 14 0x01 0x09 0x10 12Gbps pd6 5002538A072C4111 5002538A072C4113 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 15 0x01 0x09 0x11 12Gbps pd7 5002538A072C4D41 5002538A072C4D43 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 16 0x01 0x09 0x01 12Gbps <1:1:1> 50002ACFF701E30D 50002AC11101E30D 0 38 39 1 0
exp09 5001438030F5953F 5001438030F5953F 17 0x01 0x09 0x01 12Gbps <1:1:1> 50002ACFF701E30D 50002AC11101E30D 1 39 40 1 0
exp09 5001438030F5953F 5001438030F5953F 18 0x01 0x09 0x01 12Gbps <1:1:1> 50002ACFF701E30D 50002AC11101E30D 2 38 1 1 0
exp09 5001438030F5953F 5001438030F5953F 19 0x01 0x09 0x01 12Gbps <1:1:1> 50002ACFF701E30D 50002AC11101E30D 3 31 32 1 0
exp09 5001438030F5953F 5001438030F5953F 20 0x01 0x09 0x12 12Gbps pd8 5002538A072C4601 5002538A072C4603 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 21 0x01 0x09 0x13 12Gbps pd9 5002538A072BF071 5002538A072BF073 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 22 0x01 0x09 0x14 12Gbps pd10 5002538A072C4C91 5002538A072C4C93 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 23 0x01 0x09 0x15 12Gbps pd11 5002538A072C46A1 5002538A072C46A3 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 24 0x01 0x09 0x16 12Gbps pd12 5002538A072C4261 5002538A072C4263 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 25 0x01 0x09 0x17 12Gbps pd13 5002538A072C4D51 5002538A072C4D53 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 26 0x01 0x09 0x18 12Gbps pd14 5002538A072BF351 5002538A072BF353 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 27 0x01 0x09 0x19 12Gbps pd15 5002538A072BF3A1 5002538A072BF3A3 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 28 0x01 0x09 0x1a 12Gbps pd16 5002538A072C4641 5002538A072C4643 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 29 0x01 0x09 0x1b 12Gbps pd17 5002538A072BF341 5002538A072BF343 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 30 0x01 0x09 0x1c 12Gbps pd18 5002538A072C4761 5002538A072C4763 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 31 0x01 0x09 0x1d 12Gbps pd19 5002538A072C4661 5002538A072C4663 1 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 32 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 33 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 34 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 35 0x01 0x09 - n/a - - - - 0 0 0 0
exp09 5001438030F5953F 5001438030F5953F 36 0x01 0x09 0x1e 12Gbps cage0 50050CC106234D1A 5001438030F5953E 1 - - - -
pd0 5002538A0727C461 5002538A0727C463 1 0x09 0x0a 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 8 2 2 1 1
pd1 5002538A072C4D31 5002538A072C4D33 1 0x09 0x0b 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 9 344244 333880 6 1
pd2 5002538A072BF361 5002538A072BF363 1 0x09 0x0c 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 10 200 200 6 1
pd3 5002538A072BF3C1 5002538A072BF3C3 1 0x09 0x0d 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 11 85 85 6 1
pd4 5002538A072C4731 5002538A072C4733 1 0x09 0x0e 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 12 2 2 1 1
pd5 5002538A0727C671 5002538A0727C673 1 0x09 0x0f 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 13 2 2 1 1
pd6 5002538A072C4111 5002538A072C4113 1 0x09 0x10 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 14 115 115 6 0
pd7 5002538A072C4D41 5002538A072C4D43 1 0x09 0x11 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 15 2 2 6 1
pd8 5002538A072C4601 5002538A072C4603 1 0x09 0x12 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 20 1 1 6 1
pd9 5002538A072BF071 5002538A072BF073 1 0x09 0x13 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 21 60 60 2 1
pd10 5002538A072C4C91 5002538A072C4C93 1 0x09 0x14 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 22 4 4 2 0
pd11 5002538A072C46A1 5002538A072C46A3 1 0x09 0x15 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 23 3 3 2 1
pd12 5002538A072C4261 5002538A072C4263 1 0x09 0x16 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 24 5 5 7 1
pd13 5002538A072C4D51 5002538A072C4D53 1 0x09 0x17 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 25 3 3 7 0
pd14 5002538A072BF351 5002538A072BF353 1 0x09 0x18 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 26 2 2 2 0
pd15 5002538A072BF3A1 5002538A072BF3A3 1 0x09 0x19 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 27 172 172 7 1
pd16 5002538A072C4641 5002538A072C4643 1 0x09 0x1a 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 28 3 3 2 0
pd17 5002538A072BF341 5002538A072BF343 1 0x09 0x1b 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 29 3 3 2 0
pd18 5002538A072C4761 5002538A072C4763 1 0x09 0x1c 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 30 3 3 2 1
pd19 5002538A072C4661 5002538A072C4663 1 0x09 0x1d 0x09 12Gbps exp09 5001438030F5953F 5001438030F5953F 31 5 4 7 1
----------------------------------------------------------------------------------------------------------------------------------------------------------------
61 total
3par02 cli%
Thanks for your help.
It turns out we previously had cage1, but we removed it, leaving only cage0, and reorganized the disks.
There is nothing in production on this array; no LUNs have been created.
02-07-2024 09:56 AM
Re: SAS PEL errors on cage0
Hi Jorgevisentini,
As we can see, the errors are being reported for pd1. The first step is to get it replaced.
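For reference, a quick pre-replacement check could look like this (a sketch only; I'm assuming the -e option of showpd and the -n option of showalert are available at your 3PAR OS level):
showpd -s 1 (detailed state of pd1)
showpd -e 1 (read/write error counters for pd1)
showalert -n (any new alerts raised against the drive or cage)
If the PEL counters in showportdev keep climbing only on the phys attached to pd1, the drive is the most likely culprit.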
Regards,
Veeyaarvi
I am an HPE Employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
02-08-2024 08:09 PM - last edited on 02-08-2024 09:56 PM by Sunitha_Mod
Re: SAS PEL errors on cage0
Thanks, man. Your analysis was spot on.
3par02 cli% showpd -i
Id CagePos State ----Node_WWN---- --MFR-- -----Model------ -Serial- -FW_Rev- Protocol MediaType -----AdmissionTime-----
0 0:0:0 normal 5002538A0727C461 SAMSUNG AREA7680S5xnNTRI 0J201420 3P05 SAS MLC 2023-05-30 18:31:03 -03
1 0:23:0? degraded 5002538A072C4D31 SAMSUNG AREA7680S5xnNTRI 0J202848 3P05 SAS -- 2023-05-30 18:31:03 -03
2 0:2:0 normal 5002538A072BF361 SAMSUNG AREA7680S5xnNTRI 0J202571 3P05 SAS MLC 2023-05-30 18:31:03 -03
3 0:3:0 normal 5002538A072BF3C1 SAMSUNG AREA7680S5xnNTRI 0J202577 3P05 SAS MLC 2023-05-30 18:31:03 -03
4 0:4:0 normal 5002538A072C4731 SAMSUNG AREA7680S5xnNTRI 0J202752 3P05 SAS MLC 2023-05-30 18:31:03 -03
5 0:5:0 normal 5002538A0727C671 SAMSUNG AREA7680S5xnNTRI 0J201453 3P05 SAS MLC 2023-05-30 18:31:03 -03
6 0:6:0 normal 5002538A072C4111 SAMSUNG AREA7680S5xnNTRI 0J202679 3P05 SAS MLC 2023-05-30 18:31:03 -03
7 0:7:0 normal 5002538A072C4D41 SAMSUNG AREA7680S5xnNTRI 0J202849 3P05 SAS MLC 2023-05-30 18:31:03 -03
8 0:8:0 normal 5002538A072C4601 SAMSUNG AREA7680S5xnNTRI 0J202733 3P05 SAS MLC 2023-05-30 18:31:03 -03
9 0:9:0 normal 5002538A072BF071 SAMSUNG AREA7680S5xnNTRI 0J202524 3P05 SAS MLC 2023-05-30 18:31:03 -03
10 0:10:0 normal 5002538A072C4C91 SAMSUNG AREA7680S5xnNTRI 0J202838 3P05 SAS MLC 2023-05-30 18:31:03 -03
11 0:11:0 normal 5002538A072C46A1 SAMSUNG AREA7680S5xnNTRI 0J202743 3P05 SAS MLC 2023-05-30 18:31:03 -03
12 0:12:0 normal 5002538A072C4261 SAMSUNG AREA7680S5xnNTRI 0J202690 3P05 SAS MLC 2023-05-30 18:31:03 -03
13 0:13:0 normal 5002538A072C4D51 SAMSUNG AREA7680S5xnNTRI 0J202850 3P05 SAS MLC 2023-05-30 18:31:03 -03
14 0:14:0 normal 5002538A072BF351 SAMSUNG AREA7680S5xnNTRI 0J202570 3P05 SAS MLC 2023-05-30 18:31:03 -03
15 0:15:0 normal 5002538A072BF3A1 SAMSUNG AREA7680S5xnNTRI 0J202575 3P05 SAS MLC 2023-05-30 18:31:03 -03
16 0:16:0 normal 5002538A072C4641 SAMSUNG AREA7680S5xnNTRI 0J202737 3P05 SAS MLC 2023-05-30 18:31:03 -03
17 0:17:0 normal 5002538A072BF341 SAMSUNG AREA7680S5xnNTRI 0J202569 3P05 SAS MLC 2023-05-30 18:31:03 -03
18 0:18:0 normal 5002538A072C4761 SAMSUNG AREA7680S5xnNTRI 0J202755 3P05 SAS MLC 2023-05-30 18:31:03 -03
19 0:19:0 normal 5002538A072C4661 SAMSUNG AREA7680S5xnNTRI 0J202739 3P05 SAS MLC 2023-05-30 18:31:03 -03
20 0:1:0 normal 5002538A072C4CC1 SAMSUNG AREA7680S5xnNTRI 0J202841 3P05 SAS MLC 2024-02-08 18:23:57 -03
--------------------------------------------------------------------------------------------------------------------------
21 total
3par02 cli%
Do you know how to logically remove this degraded disk? We physically removed it, but it still shows up.
I removed the CPG and I still can't remove it.
3par02 cli% dismisspd 1
Error : Pd id 1 is referenced by chunklet 2:500
3par02 cli% showpdch -from 1
Pdid Chnk LdName LdCh State Usage Media Sp Cl From To
2 500 log1.0 6 normal ld valid N N 1:7141 ---
20 6471 .srdata.usr.1 74 normal ld valid Y N 1:7142 ---
20 6472 .srdata.usr.1 66 normal ld valid Y N 1:7143 ---
20 6473 .srdata.usr.1 54 normal ld valid Y N 1:7144 ---
20 6474 .srdata.usr.1 46 normal ld valid Y N 1:7145 ---
20 6475 .srdata.usr.1 36 normal ld valid Y N 1:7146 ---
20 6476 .srdata.usr.1 10 normal ld valid Y N 1:7149 ---
20 6477 log1.0 30 normal ld valid Y N 1:7139 ---
20 6478 log1.0 22 normal ld valid Y N 1:7140 ---
20 6479 admin.usr.1 0 normal ld valid Y N 1:7151 ---
----------------------------------------------------------------
Total chunklets: 10
3par02 cli%
02-12-2024 12:01 AM
Re: SAS PEL errors on cage0
Hi Jorgevisentini,
The correct procedure would have been to run 'servicemag start' for this PD and, once the 'servicemag start' completed successfully, to physically remove it. After replacing it with a new spare, 'servicemag resume' should be run.
Since the drive was physically removed, I expect the system has already started reconstructing the chunklets from RAID. Could you check the status?
showpd -s 1
showpd -c 1
servicemag status 0 23
The drive should appear as 'failed'; only then can you dismiss it safely.
PS: Do you have a spare drive for the replacement?
Regards,
Veeyaarvi
I am an HPE Employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
02-12-2024 07:45 PM - last edited on 02-12-2024 09:19 PM by Sunitha_Mod
Re: SAS PEL errors on cage0
@veeyarvi Hi,
As this array is not in production yet, we just removed and relocated the disks in it.
I'm thinking about removing all the disks, restarting the array, and recreating the TOC, but I'd still like to learn how to remove a disk from the array, you know?
3par02 cli% showpd -s 1
Id CagePos Type -State-- ---------Detailed_State--------- -SedState-
1 0:23:0? SSD degraded missing,no_valid_ports,servicing unknown
--------------------------------------------------------------------
1 total
3par02 cli%
3par02 cli%
3par02 cli% showpd -c 1
------- Normal Chunklets -------- ---- Spare Chunklets ----
- Used - -------- Unused -------- - Used - ---- Unused ----
Id CagePos Type State Total OK Fail Free Uninit Unavail Fail OK Fail Free Uninit Fail
1 0:23:0? SSD degraded 7152 0 0 6423 729 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------
1 total 7152 0 0 6423 729 0 0 0 0 0 0 0
3par02 cli%
3par02 cli%
3par02 cli% servicemag start 0 23
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
servicemag start 0 23
... servicing disks in mag: 0 23
... normal disks:
... not normal disks: WWN [5002538A072C4D31] Id [ 1]
The servicemag start operation will continue in the background.
3par02 cli%
3par02 cli%
3par02 cli% servicemag status 0 23
A servicemag start command failed on this magazine.
The command completed at Tue Feb 13 00:40:33 2024.
The command started at Tue Feb 13 00:40:33 2024
Failed -- -- Failed. Please run servicemag status -d for more detail
3par02 cli%
3par02 cli%
3par02 cli% servicemag status -d
Cage 0, magazine 23:
A servicemag start command failed on this magazine.
The command completed at Tue Feb 13 00:40:33 2024.
The command started at Tue Feb 13 00:40:33 2024
The output of the servicemag start was:
Failed --
Unable to run servicemag Start command. Servicemag is either active or has failed for this cage and magazine.
Please rectify any error conditions and issue "servicemag unmark" before retrying.
servicemag start 0 23 -- Failed
3par02 cli%
3par02 cli%
3par02 cli% servicemag unmark 0 23
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
servicemag unmark 0:23
3par02 cli%
3par02 cli%
3par02 cli% servicemag status 0 23
A servicemag start command failed on this magazine.
The command completed at Tue Feb 13 00:40:33 2024.
The command started at Tue Feb 13 00:40:33 2024
Failed -- -- Failed. Please run servicemag status -d for more detail
3par02 cli%
02-13-2024 12:31 AM
Re: SAS PEL errors on cage0
Hi Jorgevisentini,
There are mainly two scenarios: the drive is failed by the system, or a proactive drive replacement is initiated by the user.
If a storage drive fails, the system automatically runs the servicemag command in the background. The servicemag command illuminates the blue drive LED to indicate a fault and identify the drive to replace. Storage drives are replaced for various reasons, not necessarily as the result of a failure; in that case, the displayed output may not show errors. Once the drive is replaced with a new spare, servicemag resume is also kicked off automatically.
If the replacement is a proactive one prior to a failure, enter 'servicemag start -pdid <pdID>' to initiate the removal of data from the drive. The system will store the removed data on spare chunklets. Once the servicemag start completes, replace the drive with a new spare and run 'servicemag resume' on that magazine.
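As a rough outline, the proactive sequence would look like the following (a sketch only; exact arguments, especially for servicemag resume, may vary slightly by 3PAR OS version):
servicemag start -pdid 1 (vacate pd1's data onto spare chunklets)
servicemag status -d (wait until the start reports Succeeded)
physically swap the drive in cage 0, magazine 23
servicemag resume 0 23 (admit the replacement and relocate the data back)
servicemag status -d (confirm the resume completed)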
Regards,
Veeyaarvi
I am an HPE Employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
02-13-2024 11:22 AM - last edited on 02-13-2024 09:29 PM by Sunitha_Mod
Re: SAS PEL errors on cage0
@veeyarvi I understood what you meant.
But do you understand that the degraded disk is no longer in the array? We took it out without first moving the chunklets.
Well... I managed to run servicemag start -pdid 1.
Now I don't want to put in another disk; I have already reallocated the disks in the array.
I just want to logically remove the disk (1 0:23:0? SSD degraded) from the array, understand?
3par02 cli% servicemag start -pdid 1
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
Pd 1 does not have a current cage position.
Its last known position was at cage 0 mag 23 pos 0
Using that to run servicemag on cage 0 mag 23
servicemag start -pdid 1
... servicing disks in mag: 0 23
... normal disks:
... not normal disks: WWN [5002538A072C4D31] Id [ 1]
The servicemag start operation will continue in the background.
3par02 cli%
3par02 cli%
3par02 cli% servicemag status
Cage 0, magazine 23:
The magazine was successfully brought offline by a servicemag start command.
The command completed at Tue Feb 13 16:06:31 2024.
The command started at Tue Feb 13 16:06:21 2024
servicemag start -pdid 1 -- Succeeded
3par02 cli%
3par02 cli%
3par02 cli% servicemag status
Cage 0, magazine 23:
A servicemag start command failed on this magazine.
The command completed at Tue Feb 13 16:06:55 2024.
The command started at Tue Feb 13 16:06:55 2024
Failed -- -- Failed. Please run servicemag status -d for more detail
3par02 cli%
3par02 cli%
3par02 cli% servicemag status -d
Cage 0, magazine 23:
A servicemag start command failed on this magazine.
The command completed at Tue Feb 13 16:06:55 2024.
The command started at Tue Feb 13 16:06:55 2024
The output of the servicemag start was:
Failed --
Unable to run servicemag Start command. Servicemag is either active or has failed for this cage and magazine.
Please rectify any error conditions and issue "servicemag unmark" before retrying.
servicemag start -pdid 1 -- Failed
3par02 cli%
3par02 cli% servicemag unmark 0 23
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
servicemag unmark 0:23
3par02 cli% servicemag start -pdid 1
Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
Pd 1 does not have a current cage position.
Its last known position was at cage 0 mag 23 pos 0
Using that to run servicemag on cage 0 mag 23
servicemag start -pdid 1
... servicing disks in mag: 0 23
... normal disks:
... not normal disks: WWN [5002538A072C4D31] Id [ 1]
The servicemag start operation will continue in the background.
3par02 cli%
3par02 cli%
3par02 cli% servicemag status -d
Cage 0, magazine 23:
The magazine is being brought offline due to a servicemag start.
The last status update was at Tue Feb 13 16:14:53 2024.
The command started at Tue Feb 13 16:14:52 2024
Unable to provide a relocation estimate
The cumulative output so far is:
servicemag start -pdid 1
... servicing disks in mag: 0 23
... normal disks:
... not normal disks: WWN [5002538A072C4D31] Id [ 1]
... relocating chunklets to spare space...
3par02 cli%
3par02 cli% servicemag status -d
Cage 0, magazine 23:
The magazine was successfully brought offline by a servicemag start command.
The command completed at Tue Feb 13 16:15:03 2024.
The command started at Tue Feb 13 16:14:52 2024
The output of the servicemag start was:
servicemag start -pdid 1
... servicing disks in mag: 0 23
... normal disks:
... not normal disks: WWN [5002538A072C4D31] Id [ 1]
... relocating chunklets to spare space...
... skipping bypass on this type of cage
servicemag start -pdid 1 -- Succeeded
3par02 cli%
3par02 cli%
3par02 cli% showpd -c -degraded
------- Normal Chunklets -------- ---- Spare Chunklets ----
- Used - -------- Unused -------- - Used - ---- Unused ----
Id CagePos Type State Total OK Fail Free Uninit Unavail Fail OK Fail Free Uninit Fail
1 0:23:0? SSD degraded 7152 0 0 6423 729 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------
1 total 7152 0 0 6423 729 0 0 0 0 0 0 0
3par02 cli%
02-14-2024 08:34 AM
Re: SAS PEL errors on cage0
Hi Jorgevisentini,
The disk is in degraded status because servicemag is still active on it. (You can check the drive state with 'showpd -s 1' for confirmation.)
Try 'dismisspd 1' and see whether it successfully dismisses the PD. If all chunklets have already been vacated and nothing references the PD, this will succeed.
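In other words, something along these lines (assuming the PD really has no remaining chunklet references):
showpdch -from 1 (should return no chunklets originating from pd1)
dismisspd 1
showpd -i (pd1 should no longer be listed)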
Regards,
Veeyaarvi
I am an HPE Employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]