Operating System - Linux
05-17-2011 08:39 PM
I/O error Disk and timeout issue
Hi,
I have 5 x BL460c G7 servers with an EVA6400, and a VC fabric with 2 x 8/40 Brocade switches.
Each VC has 2 x 8 Gb links to each switch, and all 4 ports on each controller are connected as per the standard.
OS: RHEL 5u6
I get the error below whenever there is heavy I/O. Sometimes (about one try in 10) even a simple multipath -ll or pvs hangs for a while and then throws the error below in the console/messages file. Does anyone have any idea what has gone wrong?
Jan 19 03:15:54 testkernel: INFO: task mpath_prio_alua:23084 blocked for more than 120 seconds.
Jan 19 03:15:54 testkernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 19 03:15:54 testkernel: mpath_prio_al D ffffffff80153806 0 23084 8051 (NOTLB)
Jan 19 03:15:54 testkernel: ffff810f6b2f9a28 0000000000000086 ffff81080a6a2080 ffff81080bdbe4f8
Jan 19 03:15:54 testkernel: ffff81080bdbe000 0000000000000001 ffff810828250820 ffff81080b82e7a0
Jan 19 03:15:54 testkernel: 00005afd5ca08298 0000000000002cd3 ffff810828250a08 0000000f0a0b52c0
Jan 19 03:15:54 testkernel: Call Trace:
Jan 19 03:15:54 testkernel: [] wait_for_completion+0x79/0xa2
Jan 19 03:15:54 testkernel: [] default_wake_function+0x0/0xe
Jan 19 03:15:54 testkernel: [] blk_execute_rq_nowait+0x7e/0x92
Jan 19 03:15:54 testkernel: [] blk_execute_rq+0x98/0xc0
Jan 19 03:15:54 testkernel: [] sg_io+0x258/0x356
Jan 19 03:15:54 testkernel: [] scsi_cmd_ioctl+0x1d2/0x3b5
Jan 19 03:15:54 testkernel: [] avc_has_perm+0x46/0x58
Jan 19 03:15:54 testkernel: [] do_lookup+0x65/0x1e6
Jan 19 03:15:54 testkernel: [] :sd_mod:sd_ioctl+0x93/0xc2
Jan 19 03:15:54 testkernel: [] blkdev_driver_ioctl+0x5d/0x72
Jan 19 03:15:54 testkernel: [] blkdev_ioctl+0x63c/0x697
Jan 19 03:15:54 testkernel: [] avc_has_perm+0x46/0x58
Jan 19 03:15:54 testkernel: [] inode_has_perm+0x56/0x63
Jan 19 03:15:54 testkernel: [] blkdev_open+0x0/0x4f
Jan 19 03:15:54 testkernel: [] blkdev_open+0x23/0x4f
Jan 19 03:15:54 testkernel: [] __dentry_open+0x101/0x1dc
Jan 19 03:15:54 testkernel: [] block_ioctl+0x1b/0x1f
Jan 19 03:15:54 testkernel: [] do_ioctl+0x21/0x6b
Jan 19 03:15:54 testkernel: [] vfs_ioctl+0x457/0x4b9
Jan 19 03:15:54 testkernel: [] sys_ioctl+0x59/0x78
Jan 19 03:15:54 testkernel: [] tracesys+0xd5/0xe0
I would appreciate your response on this.
Thanks
1 REPLY
05-24-2011 02:33 AM
Re: I/O error Disk and timeout issue
The "mpath_prio_alua" process has been stuck waiting for a response from the storage system for more than 120 seconds, which triggered the hung-task detector in the kernel.
The detector will automatically produce a kernel-space call trace of the possibly-hung process. This does not necessarily mean anything is "wrong" within your system: it's just trying to help you figure out what the possibly-hung process is doing.
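To see which tasks the detector has reported and how often, something like the following might help. It is only a sketch: the function name summarize_hung_tasks is made up for illustration, the default log path /var/log/messages is an assumption, and the pattern is based solely on the "INFO: task ... blocked for more than ... seconds" lines quoted in the original post.

```shell
#!/bin/bash
# Count hung-task reports per task name in a syslog-style file.
# Matches lines such as:
#   ... INFO: task mpath_prio_alua:23084 blocked for more than 120 seconds.
# Output columns: <number of reports> <task name> <timeout in seconds>
summarize_hung_tasks() {
    # $1: log file to scan (the default path is an assumption)
    local logfile="${1:-/var/log/messages}"
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2/p' "$logfile" |
        sort | uniq -c
}
```

Calling summarize_hung_tasks with no argument scans the default file. If only mpath_prio_alua shows up, the problem is probably limited to the ALUA queries rather than to I/O in general.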
"mpath_prio_alua" is the tool used by the multipath utilities to query the active/passive state of the storage LUNs using the ALUA standard.
If running "mpath_prio_alua" takes a long time in your environment, you might want to find out why and fix it.
One possible way to get more information would be to run /sbin/mpath_prio_alua manually for each LUN using the -v option. If some particular LUNs make the command hang for a longer time than others, then check the storage-side configuration of those LUNs.
Run "man mpath_prio_alua" for details and examples.
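The per-LUN check described above could be scripted roughly as follows. This is a sketch under assumptions: the function name time_prio_callout and the PRIO_CMD override are hypothetical and exist only to make the loop reusable, and the device names you pass in are whatever path devices your system actually has.

```shell
#!/bin/bash
# Run the priority tool against each given device and report how long each
# run took, so that slow paths stand out.
time_prio_callout() {
    # PRIO_CMD: command to run per device. The default follows the advice
    # above; the override variable is an assumption made for this sketch.
    local cmd="${PRIO_CMD:-/sbin/mpath_prio_alua -v}"
    local dev start end rc
    for dev in "$@"; do
        start=$(date +%s)
        $cmd "$dev"          # intentionally unquoted so "-v" is split off
        rc=$?
        end=$(date +%s)
        echo "$dev rc=$rc elapsed=$((end - start))s"
    done
}
```

For example, time_prio_callout /dev/sda /dev/sdb (placeholder device names) prints one summary line per device after the tool's own verbose output. Note that if a call blocks on the storage, the loop blocks with it, so the last device printed before the stall is the one to look at.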
If "mpath_prio_alua" takes a long time simply because your system sees a very large number of LUNs, you might want to find out if it's necessary for your system to see them all. If your system sees an excessive number of LUNs, you might want to limit them by storage-side WWN masking and/or by fabric zoning.
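A quick way to get a feel for how many devices the host sees is to count the sd* entries under /sys/block. The sketch below assumes that layout; the function name count_sd_devices is made up, and the directory argument exists only so the function can be tried against a test directory.

```shell
#!/bin/bash
# Count the sd* block devices the kernel currently exposes.
# With multipath, each LUN appears once per path, so this is a path count,
# not a LUN count. Any local disks named sd* are counted as well.
count_sd_devices() {
    # $1: sysfs block directory (parameterised only for testing)
    local dir="${1:-/sys/block}"
    local n=0 entry
    for entry in "$dir"/sd*; do
        # When nothing matches, the glob stays literal and -e is false.
        if [ -e "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Dividing the result by the number of paths per LUN gives a rough LUN count, which can then be compared with what the host actually needs to see.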
MK