Disk Enclosures

Re: finding which disk inside the array is bad?

 
SOLVED
Kumar Deuja_1
Advisor

Thank you so much for the reply. I have attached the errors from yesterday's syslog. Please let me know what you think of them. It seems to be complaining about the first physical volume of vg02, which is the array.

Any help would be appreciated.

Thanks again.
Eugeny Brychkov
Honored Contributor

I see you daisy-chained the 12h's controllers. Your problems occur because even when the links are switched, the alternate link dies too and the data to be written is lost. This should not be a problem with the 12h itself. If you wish to solve (or at least try to solve) the problems here, please:
1. Replace all SCSI cables. Make sure the daisy-chain cable between X and Y is not shorter than 1 m;
2. Replace the terminator, and make sure the replacement is also HVD (C2905A, A1658-63013 or A1658-62024);
3. Patch your system (see my reply above).
And then we will see.
Eugeny
Kumar Deuja_1
Advisor

Eugeny,

Did you mean that the daisy-chain cable shouldn't be shorter than 1 meter? What I have right now is certainly shorter than 1 meter, and I have been using it for all these years. I just patched the box with the May 2003 QPack and am going to install the HWE bundle next. When I rebooted the machine after installing the patch, I got an error for inode 999, went through the fix process, and continued starting the machine. I am going to remove hard drive A6, which happens to be the 50G drive, and rebuild the array. It was a regular Seagate drive put inside an HP array hard drive case. I can't find the cable C2905A, but I have a C2981A instead. I will also change the terminators later on.

Thanks.
Eugeny Brychkov
Honored Contributor

C2981A is a 0.5 meter SCSI cable...
Anyway, you need to act systematically to find the defective component (in software or hardware). Write an action plan and replace components one by one. This way you can find the bad component. There's also another way: replace everything at once and see if that fixes it. If it does, put the old components back one at a time and see if the problem appears again. As soon as it appears, the last component you reinstalled is the bad one.
These are just basic troubleshooting hints...
About cable length: I've heard something about it, but I believe it may cause SCSI events, not PV powerfails.
Eugeny
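
The two approaches described above can be sketched roughly as follows. This is only an illustration in Python, not a tool from this thread: the function names, the slot names and the `problem_occurs` check are all hypothetical, and in practice "checking" means running the system under load for a while and watching syslog. The first function also assumes that only one component is bad.

```python
from typing import Callable, Dict, List, Optional

# A setup maps each component slot to the part currently installed in it.
Setup = Dict[str, str]


def find_bad_by_replacement(
    slots: List[str],
    old_parts: Setup,
    new_parts: Setup,
    problem_occurs: Callable[[Setup], bool],
) -> Optional[str]:
    """Swap in new parts one slot at a time, keeping earlier swaps in place.

    Returns the slot whose replacement makes the problem go away, or None
    if the problem is still there after every slot has been replaced.
    Assumes a single bad component.
    """
    setup = dict(old_parts)
    for slot in slots:
        setup[slot] = new_parts[slot]
        if not problem_occurs(setup):
            return slot
    return None


def find_bad_by_reintroduction(
    slots: List[str],
    old_parts: Setup,
    new_parts: Setup,
    problem_occurs: Callable[[Setup], bool],
) -> Optional[str]:
    """Start with everything replaced, then put old parts back one by one.

    Returns the slot whose old part makes the problem reappear, or None if
    the all-new setup still fails or the problem never comes back.
    """
    setup = dict(new_parts)
    if problem_occurs(setup):
        return None  # replacing everything did not help
    for slot in slots:
        setup[slot] = old_parts[slot]
        if problem_occurs(setup):
            return slot  # the last reinstalled part is the suspect
    return None


# Demo with a simulated fault: pretend the old terminator is the bad part.
slots = ["host cable", "daisy-chain cable", "terminator"]
old = {s: "old " + s for s in slots}
new = {s: "new " + s for s in slots}


def problem_occurs(setup: Setup) -> bool:
    return setup["terminator"] == "old terminator"


print(find_bad_by_replacement(slots, old, new, problem_occurs))
print(find_bad_by_reintroduction(slots, old, new, problem_occurs))
```

In the simulated case both functions point at the terminator. On real hardware the second approach costs more parts up front, but it quickly confirms whether the fault is in the replaceable components at all.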
Kumar Deuja_1
Advisor

Thanks for the reply. Basically, this is what I am doing right now. I have installed both patches (QPK and HWE). I removed disk A6; the array is now rebuilding and is also complaining about losing redundancy. I am replacing the daisy-chain cable with the C2981A I have, since I don't have another one at the moment. Actually, I believe the C2981A came with the array. This is something I inherited from someone, and not being too savvy with HP-UX, I am figuring out ways to find and fix this problem. I will send you an update. You can e-mail me at kumardeuja@yahoo.com if you have other suggestions. I greatly appreciate it.

Kumar Deuja_1
Advisor

Eugeny,

So far, I haven't seen any errors. Basically, I removed disk A6 and replaced the cable that connects the array to the K class. The system has been up for about 20 hours and looks good. Thanks for all the help and advice; I really appreciate it. I will let you know if anything new comes up.

Thanks again.

-K
Kumar Deuja_1
Advisor

Well, here is what happened, and it is driving me crazy. As soon as I try to back up that particular mount point, I get this PV failed error, but when I unmount it and run a full scan, it doesn't show any errors. Please see the attached message. Any thoughts?


Thanks again.
Eugeny Brychkov
Honored Contributor

If your server has one more HVD adapter, try splitting the controllers between them, i.e. connect X to SCSI controller 1 and Y to SCSI controller 2. This will require VG reconfiguration (the alternate links to the PV will change), but in this case the PV will be switching not just between the 12h controllers but between SCSI HBAs, and we will see whether the host claims both SCSI HBAs as 'power failed'.
Eugeny
Kumar Deuja_1
Advisor

Eugeny,

I don't have another HVD adapter on that server. But after I replaced the terminator and all the cables, it seems to be working fine. It's still a mystery to me. Thanks for all the advice.

-K