va7400 disk failed, leading to IO
08-16-2007 02:53 AM
I've got a VA7400 with a failing disk. Before the disk failed, armdsp reported:
Redundancy Group:_____________________1
Total Disks:________________________15
Total Physical Size:________________500.679 GB
Allocated to Regular LUNs:__________167.05 GB
Allocated as Business Copies:_______0 bytes
Used as Active Hot Spare:___________66.757 GB
Used for Redundancy:________________187.035 GB
Unallocated (Available for LUNs):___79.835 GB
Redundancy Group:_____________________2
Total Disks:________________________14
Total Physical Size:________________467.3 GB
Allocated to Regular LUNs:__________217.519 GB
Allocated as Business Copies:_______0 bytes
Used as Active Hot Spare:___________66.757 GB
Used for Redundancy:________________167.527 GB
Unallocated (Available for LUNs):___15.496 GB
After the disk (M/D10) failed, armdsp reported:
Redundancy Group:_____________________1
Total Disks:________________________15
Total Physical Size:________________500.679 GB
Allocated to Regular LUNs:__________167.05 GB
Allocated as Business Copies:_______0 bytes
Used as Active Hot Spare:___________0 bytes
Used for Redundancy:________________253.792 GB
Unallocated (Available for LUNs):___79.835 GB
Redundancy Group:_____________________2
Total Disks:________________________13
Total Physical Size:________________433.921 GB
Allocated to Regular LUNs:__________217.519 GB
Allocated as Business Copies:_______0 bytes
Used as Active Hot Spare:___________0 bytes
Used for Redundancy:________________216.402 GB
Unallocated (Available for LUNs):___0 bytes
Then the Sybase database's IO failed, and I saw the following messages in syslog.log:
Aug 15 09:13:09 timsa vmunix: SCSI: Read error -- dev: b 31 0x0b0400, errno: 126, resid: 2048,
Aug 15 09:13:09 timsa vmunix: blkno: 37842266, sectno: 75684532, offset: 95774720, bcount: 2048.
Aug 15 09:13:09 timsa vmunix: blkno: 37842290, sectno: 75684580, offset: 95799296, bcount: 2048.
Aug 15 09:13:09 timsa vmunix: LVM: Performed a switch for Lun ID = 0 (pv = 0x0000000048a9e000), from raw device 0x1f0b0400 (with priority: 0, and current flags: 0x40) to raw device 0x1f0a0400 (with priority: 1, and current flags: 0x0).
Aug 15 09:13:09 timsa vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Aug 15 09:13:09 timsa vmunix:
Aug 15 09:13:09 timsa above message repeats 2 times
Aug 15 09:13:09 timsa vmunix: LVM: vg[3]: pvnum=1 (dev_t=0x1f0a0400) is POWERFAILED
Aug 15 09:13:09 timsa vmunix: SCSI: Read error -- dev: b 31 0x0b0400, errno: 126, resid: 2048,
Aug 15 09:13:14 timsa above message repeats 2 times
Aug 15 09:13:14 timsa vmunix: LVM: Recovered Path (device 0x1f0b0400) to PV 1 in VG 3.
Aug 15 09:13:39 timsa vmunix: LVM: Performed a switch for Lun ID = 0 (pv = 0x0000000048a9e000), from raw device 0x1f0a0400 (with priority: 1, and current flags: 0xc0) to raw device 0x1f0b0400 (with priority: 0, and current flags: 0x80).
Aug 15 09:13:44 timsa vmunix: LVM: vg[3]: pvnum=1 (dev_t=0x1f0b0400) is POWERFAILED
Aug 15 09:14:12 timsa vmunix:
Aug 15 09:13:09 timsa vmunix: LVM: vg[3]: pvnum=1 (dev_t=0x1f0a0400) is POWERFAILED
Aug 15 09:14:12 timsa vmunix: SCSI: Read error -- dev: b 31 0x0b0400, errno: 126, resid: 2048,
Aug 15 09:14:12 timsa vmunix: blkno: 37842264, sectno: 75684528, offset: 95772672, bcount: 2048.
Aug 15 09:16:25 timsa vmunix: LVM: Recovered Path (device 0x1f0b0400) to PV 1 in VG 3.
Aug 15 09:16:25 timsa vmunix: LVM: Recovered Path (device 0x1f0a0400) to PV 1 in VG 3.
I didn't think a failed disk in a VA7400 would cause IO errors, but I notice there was a hot-spare problem on that VA7400. Could it have caused the IO problem? Before the failure there was 66 GB of hot spare in RG 1; afterwards there was no hot spare in RG 1, and redundancy changed from 187 GB to 253 GB. Why? Would a failed disk in RG 2 affect RG 1? I cannot figure out the relationship between them. Could you offer me some hints?
Thanks a lot.
08-16-2007 02:44 PM
Re: va7400 disk failed, leading to IO
Bad block relocation should be set to "NONE" -- on array LUNs the VA handles bad-block relocation itself, so LVM's own relocation should be disabled:
lvchange -r N /dev/vgXX/lvolYY
-denver
08-16-2007 03:30 PM
Re: va7400 disk failed, leading to IO
A failed disk won't by itself cause an IO error, but the error you saw in syslog, 'SCSI: Read error -- dev: b 31 0x0b0400, errno: 126, resid: 2048', was due to an I/O timeout from the VA7400. When you have a failed disk, especially if the free disk space is low, there will be some performance impact because of the rebuilding process.
08-22-2007 06:37 PM
Re: va7400 disk failed, leading to IO
I got a chance to work on a VA7400 two months ago; one failed disk caused the Unix server to completely lose sight of all of the LUNs in the ioscan results.
The minute we pulled out the disk, the VA started rebuilding and the LUNs became visible to the Unix server again.
I never thought a single disk could cause so many failures on a VA with 3 JBODs, a total of 50 disks.
I suggest you remove that disk from the VA and see if it resolves your issue.
Tal
08-23-2007 12:01 AM
Re: va7400 disk failed, leading to IO
HP also recommends that you leave the space of two disks unallocated for optimal performance. Is pvtimeout set to 60 seconds on the hosts, or is it still at the (incorrect) default of 30? If it is still the default, it can be raised with pvchange -t 60 on each physical volume path.
08-23-2007 02:08 AM
Solution
You are quite right, a failure in RG2 should not affect RG1.
On the issue you raised about RG1: all the space that was in Active Hot Spare was placed into Used for Redundancy.
As for the actual fault in RG2, if you work through the numbers the array is displaying everything correctly. Let me run you through it:
1. A disk fails.
2. The array takes all the space in Unallocated and Active Hot Spare and assigns it to the disk-failure (rebuild) process.
3. Add up the reclaimed space: 66.757 + 15.496 = 82.253 GB.
Then subtract the failed disk: 82.253 - 33.379 (the drop in RG2's Total Physical Size) = 48.874 GB.
Then 167.527 (redundancy before the failure) + 48.874 = 216.402 GB, the redundancy after the failure (to within rounding).
So, during a rebuild the array places all space from Unallocated and Active Hot Spare into the Redundancy category.
Once the rebuild and any other balancing or leveling finishes, it restores what it can to the Active Hot Spare and Unallocated groups.
When any disk fails, no matter which RG it is in, the array groups these values together as described when displaying the output of armdsp.
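The arithmetic can be checked with a short sketch. The values are taken from the armdsp output earlier in the thread; the failed disk's size is inferred from the drop in RG2's Total Physical Size, which is an assumption on my part:

```python
# Values from the armdsp output before/after the failure (GB).
hot_spare = 66.757             # Active Hot Spare in RG2 before the failure
unallocated = 15.496           # Unallocated in RG2 before the failure
failed_disk = 467.3 - 433.921  # drop in Total Physical Size ~= failed disk size
redundancy_before = 167.527    # Used for Redundancy before the failure

# During rebuild the array folds hot-spare and unallocated space into
# Redundancy, minus the capacity lost with the failed disk.
redundancy_after = redundancy_before + hot_spare + unallocated - failed_disk
print(f"{redundancy_after:.3f} GB")  # close to the 216.402 GB armdsp reported
```

The small residual difference from armdsp's 216.402 GB comes from armdsp rounding each field independently.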
Hope this helps explain it.
Regards
Owen