EXT3-fs error (device dm-6)
08-06-2010 04:00 AM
Just replaced the bad disk yesterday and I am still getting errors. I see the following errors in dmesg, and one of the filesystems is still mounted read-only. Any ideas?
sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current: sense key: Hardware Error
Add. Sense: Internal target failure
Info fld=0x0
end_request: I/O error, dev sda, sector 235753109
Buffer I/O error on device dm-6, logical block 2704297
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704298
lost page write due to I/O error on dm-6
sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current: sense key: Hardware Error
Add. Sense: Internal target failure
Info fld=0x0
end_request: I/O error, dev sda, sector 235752965
Buffer I/O error on device dm-6, logical block 2704279
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704280
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704281
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704282
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704283
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704284
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704285
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 2704286
lost page write due to I/O error on dm-6
Aborting journal on device dm-6.
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
ext3_abort called.
EXT3-fs error (device dm-6): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
SCSI device sda: 859525120 512-byte hdwr sectors (440077 MB)
sda: Write Protect is off
sda: Mode Sense: 06 00 10 00
SCSI device sda: drive cache: write back w/ FUA
08-09-2010 06:34 AM
Re: EXT3-fs error (device dm-6)
How can I tell which disk has failed from the O/S side? I have no hardware monitoring tools.
08-09-2010 08:42 AM
Solution
> How can I tell which disk has failed from the O/S side? I have no hardware monitoring tools.
Your question would be much easier to answer if you had given more information about your set-up:
- name and version of Linux distribution
- system manufacturer and model
- RAID hardware model (if applicable)
First, let's try to find the persistent device-mapper device name that corresponds to /dev/dm-6:
ls -l /dev/dm-6 /dev/mapper/* /dev/md*
The device that has the same major and minor device numbers as /dev/dm-6 is the device you're looking for.
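For example, a quick way to match them up (a small sketch, assuming the usual device-mapper major number 253; the name dm-6 implies minor 6):
ls -l /dev/dm-6                       # note the "major, minor" pair, e.g. "253, 6"
ls -l /dev/mapper/ | grep '253, *6 '  # the mapper name with the same pair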
The next step would be to find out what /dev/dm-6 does and which hardware-level devices are associated with it. From the error messages, I assume /dev/sda is one of them; but are there others?
Possibly useful commands:
dmsetup table        # how each device-mapper device is laid out
dmsetup ls --tree    # dm devices and the block devices underneath them
cat /proc/mdstat     # Linux software RAID (md) status
pvs                  # LVM physical volumes and the space used on them
True hardware RAID usually hides the actual physical disks: the only way to get information about the state of the disks is to ask the driver. Usually some RAID-manufacturer-specific diagnostic program is required to get the full report, but basic information may be available in the /proc filesystem. Look into /proc/scsi/
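A generic starting point, regardless of the driver (these files exist on any 2.6 kernel with SCSI devices):
ls /proc/scsi/        # shows which low-level SCSI/RAID drivers are loaded
cat /proc/scsi/scsi   # lists every SCSI device the kernel knows about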
For example, if it's an HP SmartArray hardware RAID controlled by the "cciss" driver module, then "cat /proc/driver/cciss/cciss0" would display basic information about the first SmartArray controller on the system (controller 0).
If you had the "hpacucli" (HP Array Configuration Utility CLI) tool installed, the command "hpacucli controller all show config detail" would produce a more verbose report about the SmartArray controllers, including the state, model and serial numbers of all physical disks attached to them.
There's also the "Array Diagnostic Utility" which can produce an even more verbose report.
If you don't have any RAID diagnostic programs installed and cannot install them, the only way to identify a failed disk might be to look at the disk diagnostic LEDs on the server's front panel (if the RAID controller has such LEDs).
MK
08-10-2010 06:04 AM
Re: EXT3-fs error (device dm-6)
Wow, thank you so much for the response.
Basic environment info:
RHEL 5.1 with kernel 2.6.18-53.el5
Sun Fire X4150, and the RAID is done at the BIOS level.
Four 146 GB disks in RAID 5.
From the BIOS, I can see all four disk drives with a solid (good) status. And also in the /proc/scsi/scsi file, I can see all four disks, as listed below:
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: Sun Model: sys_root Rev: V1.0
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 01 Id: 00 Lun: 00
Vendor: SEAGATE Model: ST914602SSUN146G Rev: 0603
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 01 Id: 01 Lun: 00
Vendor: SEAGATE Model: ST914602SSUN146G Rev: 0603
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 01 Id: 02 Lun: 00
Vendor: SEAGATE Model: ST914602SSUN146G Rev: 0603
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 01 Id: 03 Lun: 00
Vendor: SEAGATE Model: ST914602SSUN146G Rev: 0603
Type: Direct-Access ANSI SCSI revision: 05
And here are the results of the commands you asked me to try:
[root@mtstalpd-rac3 sg]# ls -l /dev/dm-6 /dev/mapper/* /dev/md*
ls: /dev/dm-6: No such file or directory
crw------- 1 root root 10, 63 Aug 9 09:01 /dev/mapper/control
brw-rw---- 1 root disk 253, 0 Aug 9 13:01 /dev/mapper/VolGroup00-LogVol00
brw-rw---- 1 root disk 253, 9 Aug 9 09:01 /dev/mapper/VolGroup00-LogVol01
brw-rw---- 1 root disk 253, 4 Aug 9 13:01 /dev/mapper/VolGroup00-LogVol02
brw-rw---- 1 root disk 253, 2 Aug 9 13:01 /dev/mapper/VolGroup00-LogVol03
brw-rw---- 1 root disk 253, 3 Aug 9 13:01 /dev/mapper/VolGroup00-LogVol04
brw-rw---- 1 root disk 253, 1 Aug 9 13:01 /dev/mapper/VolGroup00-LogVol05
brw-rw---- 1 root disk 253, 7 Aug 9 13:12 /dev/mapper/VolGroup00-oracleadminlv
brw-rw---- 1 root disk 253, 6 Aug 9 13:12 /dev/mapper/VolGroup00-oraclelv
brw-rw---- 1 root disk 253, 5 Aug 9 13:01 /dev/mapper/VolGroup00-standby
brw-rw---- 1 root disk 253, 8 Aug 9 13:01 /dev/mapper/VolGroup00-swaplv
brw-r----- 1 root disk 9, 0 Aug 9 13:01 /dev/md0
[root@mtstalpd-rac3 sg]# dmsetup table
VolGroup00-standby: 0 134217728 linear 8:2 79692160
VolGroup00-LogVol05: 0 8388608 linear 8:2 16777600
VolGroup00-oraclelv: 0 33554432 linear 8:2 213909888
VolGroup00-oraclelv: 33554432 4194304 linear 8:2 515899776
VolGroup00-LogVol04: 0 16777216 linear 8:2 33554816
VolGroup00-LogVol03: 0 8388608 linear 8:2 25166208
VolGroup00-LogVol02: 0 29360128 linear 8:2 50332032
VolGroup00-LogVol01: 0 134217728 linear 8:2 381682048
VolGroup00-oracleadminlv: 0 33554432 linear 8:2 247464320
VolGroup00-LogVol00: 0 16777216 linear 8:2 384
VolGroup00-swaplv: 0 100663296 linear 8:2 281018752
[root@mtstalpd-rac3 sg]# dmsetup ls --tree
VolGroup00-standby (253:5)
 └─ (8:2)
VolGroup00-LogVol05 (253:1)
 └─ (8:2)
VolGroup00-oraclelv (253:6)
 └─ (8:2)
VolGroup00-LogVol04 (253:3)
 └─ (8:2)
VolGroup00-LogVol03 (253:2)
 └─ (8:2)
VolGroup00-LogVol02 (253:4)
 └─ (8:2)
VolGroup00-LogVol01 (253:9)
 └─ (8:2)
VolGroup00-oracleadminlv (253:7)
 └─ (8:2)
VolGroup00-LogVol00 (253:0)
 └─ (8:2)
VolGroup00-swaplv (253:8)
 └─ (8:2)
[root@mtstalpd-rac3 sg]# cat /proc/mdstat
Personalities :
unused devices:
[root@mtstalpd-rac3 sg]# pvs
PV VG Fmt Attr PSize PFree
/dev/sda2 VolGroup00 lvm2 a- 409.72G 161.72G
[root@mtstalpd-rac3 sg]#
I didn't get any more information out of these. So is there a possibility that the disk controller is bad?
Thank you so much for your valuable time; I really appreciate it.
08-10-2010 10:28 AM
Re: EXT3-fs error (device dm-6)
The "dmesg" command lists the kernel message buffer. Old messages will only be removed from the buffer when overwritten by newer messages. The size of the message buffer used to be about 16 KB, but it may have been increased in newer kernels. When you run "dmesg", you get everything that's in the buffer - whether the messages are new or old.
If you want to clear the message buffer (to make it easier to see which messages are new), run "dmesg -c".
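A small sketch of how to use that for troubleshooting (note that "dmesg -c" needs root, and it clears the buffer for everyone on the system):
dmesg > /tmp/dmesg.before   # save a copy of the current buffer
dmesg -c > /dev/null        # clear the buffer
# reproduce the I/O activity, then:
dmesg                       # anything printed now was logged after the clear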
The last four messages seem to indicate a state change of some sort on /dev/sda. If that's the point where you hot-swapped the bad disk, it might have caused these messages.
If no new "Buffer I/O error" messages appear after the lines:
SCSI device sda: 859525120 512-byte hdwr sectors (440077 MB)
sda: Write Protect is off
sda: Mode Sense: 06 00 10 00
SCSI device sda: drive cache: write back w/ FUA
then your RAID5 set is probably OK now.
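One way to spot-check this is to read back one of the failing sectors reported earlier (sector 235753109 is taken from your dmesg output; this only reads, it does not write):
dd if=/dev/sda of=/dev/null bs=512 skip=235753109 count=1
If the RAID set has recovered, dd should complete without a new I/O error appearing in dmesg.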
In RHEL 5.1, the dm-* devices no longer exist in /dev, but the kernel error messages still refer to them. No matter:
>ext3_abort called.
>EXT3-fs error (device dm-6): ext3_journal_start_sb: Detected aborted journal
>Remounting filesystem read-only
These messages indicate that an error was detected at the filesystem level, and the filesystem was switched to read-only mode to protect the data. You can try to switch it back to read-write mode with:
mount -o remount,rw /dev/VolGroup00/oraclelv
but usually the system will refuse this command until the filesystem has been checked first.
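You can check the current mount state at any point; for example (oraclelv being the logical volume from your listing):
grep oraclelv /proc/mounts   # the options field starts with "ro" or "rw"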
To check the filesystem, you must stop the applications using it (i.e. Oracle) and unmount it:
umount /dev/VolGroup00/oraclelv
fsck -C 0 /dev/VolGroup00/oraclelv   # -C 0 displays a progress indicator
If the filesystem check finds no errors (or can fix all the errors it finds), you can mount the filesystem again and resume using it:
mount /dev/VolGroup00/oraclelv
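If you want to double-check afterwards, tune2fs can show whether ext3 still considers the filesystem clean (a sketch, assuming e2fsprogs is installed, as it is on any RHEL system):
tune2fs -l /dev/VolGroup00/oraclelv | grep -i state   # should say "clean"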
MK
08-10-2010 11:05 AM
Re: EXT3-fs error (device dm-6)
Thank you again for your kind and detailed response.
Yes, you are right, it was the /oracle filesystem that is having the issue. I ran fsck on /oracle twice yesterday, but whenever Oracle starts, the read-only mount comes back. So I am guessing the disk controller itself must be bad.
Again, after I saw your response, I did fsck:
umount /dev/VolGroup00/oraclelv
fsck -C 0 /dev/VolGroup00/oraclelv
(fixed one journal)
mount /dev/VolGroup00/oraclelv
Now I am able to touch files, but I haven't started Oracle yet, as the DBA is not here.
We are still getting I/O errors even after the following lines:
SCSI device sda: 859525120 512-byte hdwr sectors (440077 MB)
sda: Write Protect is off
sda: Mode Sense: 06 00 10 00
SCSI device sda: drive cache: write back w/ FUA
So do you think the RAID 5 was corrupted?
When I called Sun, they thought it was a bad disk. But like I said, I can see all four disks, so I am guessing it is something to do with the controller.
Again, thank you so much for teaching me and explaining. So nice of you.
01-25-2012 03:52 AM
Re: EXT3-fs error (device dm-6)
Hi there,
It should be much easier to identify the device causing the issue by checking the Array Diagnostic Utility (ADU) logs.
Example symptom:
lost page write due to I/O error on dm-6
Buffer I/O error on device dm-6, logical block 8201
lost page write due to I/O error on dm-6
REISERFS: abort (device dm-6): Journal write error in flush_commit_list
REISERFS: Aborting journal for filesystem on dm-6
Matching error in the ADU report:
Smart Array P400 in Embedded Slot : Storage Enclosure at Port 1I : Box 1 : Drive Cage on Port 1I : Physical Drive 1I:1:4 : Monitor and Performance Parameter Control
Bus Faults 8452 (0x2104)
Hot Plug Count 0x2104
Track Rewrite Errors 0x2902
Write Errors After Remap 0x2102
Background Firmware Revision 0x0848
Media Failures 0x2102
Hardware Errors 0x2102
Aborted Command Failures 0x2102
Spin Up Failures 0x2102
Bad Target Count 8450 (0x2102)
Predictive Failure Errors 0x2104
The hard drive on port 1I:1:4 is the root cause of the bus faults/timeouts. Usually a firmware upgrade of the hard drive itself solves the problem; if not, a classic hardware replacement will fix it for good.