My server has suddenly started crashing and producing errors such as these:
Aug 19 12:59:09 sme kernel: [<e0889c72>] ext3_write_inode+0x22/0x3f [ext3]
Aug 19 12:59:09 sme kernel: [write_inode+48/55] write_inode+0x30/0x37
Aug 19 12:59:09 sme kernel: [<c0177b56>] write_inode+0x30/0x37
Aug 19 12:59:09 sme kernel: [__sync_single_inode+112/443] __sync_single_inode+0x70/0x1bb
Aug 19 12:59:09 sme kernel: [<c0177bcd>] __sync_single_inode+0x70/0x1bb
Aug 19 12:59:09 sme kernel: [sync_sb_inodes+423/628] sync_sb_inodes+0x1a7/0x274
Aug 19 12:59:09 sme kernel: [<c0177f79>] sync_sb_inodes+0x1a7/0x274
Aug 19 12:59:09 sme kernel: [writeback_inodes+145/222] writeback_inodes+0x91/0xde
Aug 19 12:59:09 sme kernel: [<c01780d7>] writeback_inodes+0x91/0xde
Aug 19 12:59:09 sme kernel: [balance_dirty_pages+124/284] balance_dirty_pages+0x7c/0x11c
Aug 19 12:59:09 sme kernel: [<c01451b8>] balance_dirty_pages+0x7c/0x11c
Aug 19 12:59:09 sme kernel: [<e0887f7d>] ext3_ordered_commit_write+0xb6/0xc5 [ext3]
My guess is that this is a hardware fault, but does anyone have an idea where it is likely to be? The hard disks, or the (motherboard) controller?
I have two 320G disks in an active array (no errors reported there), each the master drive on its own IDE channel, 512M RAM, a 1GHz Athlon (perhaps underpowered now) and an aging motherboard. The RAM has tested out okay, but I have not yet done a low-level test of the hard drives. I don't know what those errors mean, but I do know they started appearing in the last few days, and the server has been crashing: processes suddenly stop, and a hard reset is the only way to get out of it.
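In case it helps anyone reading the trace: each frame seems to be logged twice, once with a decimal offset in brackets (write_inode+48/55) and once with the raw address, and both carry the same hex offset (0x30/0x37 is 48/55). Below is a rough Python sketch that reduces the log to its distinct frames. The regular expression and the parse_frames name are my own assumptions based only on the lines pasted above, not on any official log format.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE" at the end of a line, with an optional
# trailing "[module]". Pattern is an assumption based on the pasted log.
FRAME = re.compile(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?\s*$")

def parse_frames(lines):
    """Return distinct (symbol, offset, size, module) frames in log order."""
    frames = []
    for line in lines:
        m = FRAME.search(line)
        if not m:
            continue  # not a call-trace line
        frame = (m.group(1), int(m.group(2), 16), int(m.group(3), 16), m.group(4))
        # The log repeats each frame; keep consecutive duplicates only once.
        if not frames or frames[-1] != frame:
            frames.append(frame)
    return frames

sample = [
    "Aug 19 12:59:09 sme kernel: [<e0889c72>] ext3_write_inode+0x22/0x3f [ext3]",
    "Aug 19 12:59:09 sme kernel: [write_inode+48/55] write_inode+0x30/0x37",
    "Aug 19 12:59:09 sme kernel: [<c0177b56>] write_inode+0x30/0x37",
]

for symbol, offset, size, module in parse_frames(sample):
    # Offsets are printed in decimal to match the bracketed form in the log.
    print(f"{symbol}+{offset}/{size} ({module or 'kernel'})")
```

Read this way, every frame in the excerpt is in the ext3 and dirty-page writeback path, though the excerpt does not include the line that says what actually triggered the trace.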
-- Jason