05-31-2013 07:10 AM - edited 05-31-2013 09:20 AM
Constant File System Corruption
Hello,
I have a DL380 G4 with 4 GB of RAM and six 146 GB Seagate Cheetah 10k drives in a RAID 5 configuration on the integrated Smart Array 6i controller, running Ubuntu 12.04 (x86_64).
For the past week or so I've been getting EXT4-fs errors on this setup. I switched to ext3 and am still getting the same kind of errors.
The latest one was:
[ 10.644162] EXT3-fs error (device dm-0): ext3_add_entry: bad entry in directory #3162531: rec_len % 4 != 0 - offset=48, inode=1774361275, rec_len=28257, name_len=45
[ 10.658877] Aborting journal on device dm-0.
[ 10.664328] EXT3-fs (dm-0): error: remounting filesystem read-only
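For readers unfamiliar with this message: the kernel sanity-checks every directory entry it reads, and "rec_len % 4 != 0" means the record length field failed the 4-byte alignment check. The sketch below is loosely modeled on those checks and is not the kernel code itself. The function names are illustrative and the 4096-byte block size is an assumption, since the post does not state it.

```python
def ext3_dir_rec_len(name_len):
    """Smallest valid record length for a name: 8-byte header plus the
    name, rounded up to a multiple of 4."""
    return (name_len + 8 + 3) & ~3


def check_dir_entry(rec_len, name_len, offset, block_size=4096):
    """Return an error string for an implausible directory entry, else None.

    block_size=4096 is an assumed value, not taken from the post.
    """
    if rec_len < ext3_dir_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < ext3_dir_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset + rec_len > block_size:
        return "directory entry across blocks"
    return None


# Values copied from the log line above.
print(check_dir_entry(rec_len=28257, name_len=45, offset=48))
```

With the logged values, 28257 is not a multiple of 4, and it is also far larger than any directory block, so the entry looks like overwritten or misread data rather than a slightly damaged record.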
I've also been getting a lot of segfault errors that go away when I reboot and eventually come back.
Thinking it was a hardware issue, I ran S.M.A.R.T. self-tests (both long and short) on all of my drives, and every drive passed both. I've also run memtest86 for 3-4 passes, and those passed as well.
I'm wondering if anyone here has more diagnostic checks I can run to narrow this down further, as I'm starting to run out of ideas.
Another interesting thing: when I create the array in ACU through SmartStart, it shows the logical drive as max 683.6 GB, but when installing any OS, the drive volume shows as 740 GB.
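One possible explanation for part of the size mismatch is binary versus decimal units. The rough calculation below assumes RAID 5 gives the capacity of all drives minus one, and uses the per-drive "User Capacity" from the smartctl output in this post. It ignores any space the controller reserves for metadata, so treat the result as an estimate.

```python
DRIVE_BYTES = 146_815_733_760  # "User Capacity" reported by smartctl per drive
N_DRIVES = 6


def raid5_usable_bytes(drive_bytes, n_drives):
    """RAID 5 spends one drive's worth of space on parity."""
    return drive_bytes * (n_drives - 1)


usable = raid5_usable_bytes(DRIVE_BYTES, N_DRIVES)
print(f"{usable / 10**9:.1f} GB  (decimal, 10^9 bytes)")  # 734.1
print(f"{usable / 2**30:.1f} GiB (binary, 2^30 bytes)")   # 683.7
```

The binary figure is within rounding of the 683.6 that ACU shows, which suggests ACU is counting in units of 2^30 bytes. The decimal figure of about 734 is closer to the 740 seen in the installer but does not match it exactly, so this arithmetic accounts for most of the gap, not all of it.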
Smartctl results:
root@server:~# for i in {0..5}; do echo "Drive $i"; smartctl -a -d cciss,$i /dev/cciss/c0d0; echo ""; done | less
Drive 0
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: FUJITSU
Product: MAP3147NC
Revision: 5608
User Capacity: 146,815,733,760 bytes [146 GB]
Logical block size: 512 bytes
Serial number: UQ07P4A04LEP
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Fri May 31 11:17:03 2013 CDT
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Current Drive Temperature: 33 C
Drive Trip Temperature: 65 C
Manufactured in week 43 of year 2004
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 74
Elements in grown defect list: 0
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:           0       21        0         0          0      43065.555           0
write:          0        1        0         0          0       7240.451           0
verify:         0        0        0         0          0       4902.749           0
Non-medium error count: 389
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 27076 - [- - -]
# 2 Background long Completed - 27055 - [- - -]
# 3 Background long Completed - 26927 - [- - -]
# 4 Background short Completed - 26925 - [- - -]
# 5 Background short Completed - 26866 - [- - -]
# 6 Background long Completed - 1 - [- - -]
Long (extended) Self Test duration: 2621 seconds [43.7 minutes]
Drive 1
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: SEAGATE
Product: ST3146707LC
Revision: D701
User Capacity: 146,815,733,760 bytes [146 GB]
Logical block size: 512 bytes
Serial number: 3KS1L65Y
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Fri May 31 11:17:04 2013 CDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 35 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 1454822628
Blocks received from initiator = 1223409221
Blocks read from cache and sent to initiator = 3338647048
Number of read and write commands whose size <= segment size = 240360360
Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 54537.98
number of minutes until next internal SMART test = 116
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:     5439890        0        0   5439890    5439890      13456.683           0
write:          0        0        0         0          0       1475.446           0
verify: 476108967        0        0 476108967  476108967    1620106.637           0
Non-medium error count: 12010
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 54499 - [- - -]
# 2 Background long Completed - 54478 - [- - -]
# 3 Background long Completed - 54456 - [- - -]
# 4 Background short Completed - 54398 - [- - -]
# 5 Background long Completed - 1 - [- - -]
# 6 Background short Completed - 0 - [- - -]
Long (extended) Self Test duration: 2726 seconds [45.4 minutes]
Drive 2
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: SEAGATE
Product: ST3146807LC
Revision: DS09
User Capacity: 146,815,733,760 bytes [146 GB]
Logical block size: 512 bytes
Serial number: 3HY9RP06
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Fri May 31 11:17:05 2013 CDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 35 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 80496220
Blocks received from initiator = 2757574639
Blocks read from cache and sent to initiator = 2711663054
Number of read and write commands whose size <= segment size = 1857921065
Number of read and write commands whose size > segment size = 5794
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 58643.68
number of minutes until next internal SMART test = 116
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   120844078        0        0 120844078  120844078     400280.082           0
write:          0        0        0         0          0      26180.766           0
Non-medium error count: 117823
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 58609 - [- - -]
# 2 Background long Completed - 58588 - [- - -]
# 3 Background long Completed - 58465 - [- - -]
# 4 Background short Completed - 58461 - [- - -]
# 5 Background short Completed - 58437 - [- - -]
# 6 Background short Completed - 58403 - [- - -]
# 7 Background long Completed - 11 - [- - -]
# 8 Background short Completed - 10 - [- - -]
Long (extended) Self Test duration: 3072 seconds [51.2 minutes]
Drive 3
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: SEAGATE
Product: ST3146807LC
Revision: DS09
User Capacity: 146,815,733,760 bytes [146 GB]
Logical block size: 512 bytes
Serial number: 3HY9TFFW
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Fri May 31 11:17:05 2013 CDT
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Current Drive Temperature: 34 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 1862183402
Blocks received from initiator = 144326800
Blocks read from cache and sent to initiator = 3321322531
Number of read and write commands whose size <= segment size = 294180989
Number of read and write commands whose size > segment size = 5771
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 60764.37
number of minutes until next internal SMART test = 116
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    21679267        0        0  21679267   21679267      35978.974           0
write:          0        0        0         0          0       3770.736           0
verify: 428907230        0        0 428907230  428907498     859813.569           0
Non-medium error count: 187083
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 60729 - [- - -]
# 2 Background long Completed - 60709 - [- - -]
# 3 Background long Completed - 60586 - [- - -]
# 4 Background short Completed - 60582 - [- - -]
# 5 Background short Completed - 60524 - [- - -]
# 6 Background long Completed - 10 - [- - -]
# 7 Background short Completed - 9 - [- - -]
Long (extended) Self Test duration: 3072 seconds [51.2 minutes]
Drive 4
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: SEAGATE
Product: ST3146807LC
Revision: DS09
User Capacity: 146,815,733,760 bytes [146 GB]
Logical block size: 512 bytes
Serial number: 3HY9RMXC
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Fri May 31 11:17:06 2013 CDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 34 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 229252668
Blocks received from initiator = 3109780939
Blocks read from cache and sent to initiator = 2739697770
Number of read and write commands whose size <= segment size = 1881841609
Number of read and write commands whose size > segment size = 5963
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 58648.93
number of minutes until next internal SMART test = 116
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    64453726        0        0  64453726   64453726     400782.861           0
write:          0        0        2         2         60      26503.243           0
Non-medium error count: 70442
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background long Completed - 58593 - [- - -]
# 2 Background long Completed - 58470 - [- - -]
# 3 Background short Completed - 58441 - [- - -]
# 4 Background long Completed - 11 - [- - -]
# 5 Background short Completed - 10 - [- - -]
Long (extended) Self Test duration: 3072 seconds [51.2 minutes]
Drive 5
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: SEAGATE
Product: ST3146807LC
Revision: DS09
User Capacity: 146,815,733,760 bytes [146 GB]
Logical block size: 512 bytes
Serial number: 3HY9QV0L
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Fri May 31 11:17:07 2013 CDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 34 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 84690809
Blocks received from initiator = 2885140073
Blocks read from cache and sent to initiator = 2700374170
Number of read and write commands whose size <= segment size = 1865990889
Number of read and write commands whose size > segment size = 6920
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 58644.02
number of minutes until next internal SMART test = 116
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    99023039        0        0  99023039   99023039     401045.516           0
write:          0        0        0         0          0      26229.572           0
Non-medium error count: 95944
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background long Completed - 58588 - [- - -]
# 2 Background long Completed - 58465 - [- - -]
# 3 Background short Completed - 58461 - [- - -]
# 4 Background long Completed - 11 - [- - -]
# 5 Background short Completed - 10 - [- - -]
Long (extended) Self Test duration: 3072 seconds [51.2 minutes]
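Six drives' worth of output is hard to compare by eye, so here is a minimal sketch that condenses a report like the one above into one line per drive. It assumes the loop's output was saved to a text file rather than piped to less, and that each drive's section starts with the "Drive N" line echoed by the loop. The function name and the SAMPLE text are illustrative only.

```python
import re


def parse_smartctl_report(text):
    """Split concatenated `smartctl -a` output on 'Drive N' marker lines
    and collect a few fields per drive."""
    drives = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        marker = re.fullmatch(r"Drive (\d+)", line)
        if marker:
            current = {"drive": int(marker.group(1)), "uncorrected": 0}
            drives.append(current)
            continue
        if current is None:
            continue
        for key, label in (("vendor", "Vendor:"),
                           ("product", "Product:"),
                           ("serial", "Serial number:")):
            if line.startswith(label):
                current[key] = line[len(label):].strip()
        m = re.match(r"Non-medium error count:\s*(\d+)", line)
        if m:
            current["non_medium"] = int(m.group(1))
        # Last column of each read/write/verify row is "total uncorrected errors".
        m = re.match(r"(read|write|verify):\s+(.+)", line)
        if m:
            current["uncorrected"] += int(m.group(2).split()[-1])
    return drives


# Tiny illustrative excerpt in the same shape as the report above.
SAMPLE = """Drive 0
Vendor: FUJITSU
Product: MAP3147NC
Serial number: UQ07P4A04LEP
read: 0 21 0 0 0 43065.555 0
write: 0 1 0 0 0 7240.451 0
Non-medium error count: 389
Drive 1
Vendor: SEAGATE
Vendor (Seagate) cache information
Product: ST3146707LC
read: 5439890 0 0 5439890 5439890 13456.683 0
Non-medium error count: 12010
"""

for d in parse_smartctl_report(SAMPLE):
    print(d["drive"], d["vendor"], d["product"],
          "non-medium:", d["non_medium"], "uncorrected:", d["uncorrected"])
```

Listing the non-medium error counts side by side makes it easier to see which drives or bus positions stand out. Interpreting those counts is a separate question that the script does not try to answer.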