Koozali.org: home of the SME Server

[SOLVED] Disk errors and high IO wait time

chris burnat
« on: March 16, 2009, 10:03:00 AM »
I just noticed that one of my servers is showing unusually high IO wait time: the average over a week is 6.5%, with peaks of 65%. Looking at Sysmon, it is obvious that the IO wait time has increased significantly over the past 3-4 weeks. High IO percentages alternate with low values every hour or so. The messages log shows:

Mar 16 19:32:57 mx1 squid[4375]: WARNING: Disk space over limit: 176200 KB > 102400 KB
Mar 16 19:32:58 mx1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 16 19:32:58 mx1 kernel: ata1.00: (irq_stat 0x40000001)
Mar 16 19:32:58 mx1 kernel: ata1.00: cmd 25/00:6c:e1:44:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 55296 in
Mar 16 19:32:58 mx1 kernel:          res 51/40:00:2c:45:1c/00:00:1d:00:00/e0 Emask 0x9 (media error)
Mar 16 19:32:58 mx1 kernel: ata1.00: configured for UDMA/133
Mar 16 19:32:58 mx1 kernel: SCSI error : <0 0 0 0> return code = 0x8000002
Mar 16 19:32:58 mx1 kernel: Info fld=0x4000000 (nonstd), Invalid sda: sense = 72 11
Mar 16 19:32:58 mx1 kernel: end_request: I/O error, dev sda, sector 488391905
Mar 16 19:32:58 mx1 kernel: Buffer I/O error on device sda2, logical block 122045765
Mar 16 19:32:58 mx1 kernel: ata1: EH complete

This system is currently at 7.3 level, RAID1, 2 x 250GB SATA. I was about to upgrade to 7.4 when I noticed this issue; response over ssh is very slow at times. The server is at a remote site and hard to access. I suspect a hardware issue (sda?) but am not sure about the "Disk space over limit" warning.
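
For anyone wanting to check how often a disk is throwing these errors, here is a minimal sketch that tallies the `end_request: I/O error` lines per device. The function name `tally_io_errors` is made up for illustration, and the regular expression only assumes the kernel message format shown in the log excerpt above; other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: end_request: I/O error, dev sda, sector 488391905"
# (format taken from the log excerpt in this thread; an assumption elsewhere)
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def tally_io_errors(lines):
    """Return (error count per device, sorted list of bad sectors per device)."""
    counts = Counter()
    sectors = {}
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            dev, sector = match.group(1), int(match.group(2))
            counts[dev] += 1
            sectors.setdefault(dev, set()).add(sector)
    return counts, {dev: sorted(found) for dev, found in sectors.items()}


if __name__ == "__main__":
    # Two lines copied from the log above; in practice you would pass an
    # open log file (for example the system messages log) instead.
    sample = [
        "Mar 16 19:32:58 mx1 kernel: end_request: I/O error, dev sda, sector 488391905",
        "Mar 16 19:32:58 mx1 kernel: ata1: EH complete",
    ]
    counts, sectors = tally_io_errors(sample)
    for dev, n in counts.items():
        print(dev, n, sectors[dev])
```

Errors that keep landing on one device, as here with sda, point to that drive rather than to the controller or the RAID layer, though this is only a rough indicator.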

Any assistance would be very appreciated, thanks.

« Last Edit: March 16, 2009, 10:45:33 AM by chris burnat »
- chris
If it does not work out of the box, please fill in a Bug Report @ Bugzilla (http://bugs.contribs.org)  - check: http://wiki.contribs.org/Bugzilla_Help .  Thanks.

chris burnat
Re: Disk errors and high IO wait time
« Reply #1 on: March 16, 2009, 10:28:07 AM »
I rebooted the server at around 19:30.
I have just noticed an email in the admin mailbox:
This is an automatically generated mail message from mdadm running on mx1.tvoceania.com.
A DegradedArray event has been detected on md device /dev/md2.

[root@mx1 ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb2[1]
      244091520 blocks [2/1] [_U]
     
md1 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]
     
unused devices: <none>
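
The `[2/1] [_U]` on md2 is what marks the array as degraded: two members expected, one active, and the first slot empty. A minimal sketch of picking that out automatically follows; the function name `degraded_arrays` is hypothetical, and the parsing assumes only the mdstat layout shown above.

```python
import re

# Sample copied from the /proc/mdstat output in this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 sdb2[1]
      244091520 blocks [2/1] [_U]

md1 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {md_name: (expected, active, slot_map)} for arrays missing a member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line looks like: "244091520 blocks [2/1] [_U]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = (expected, active, status.group(3))
            current = None
    return degraded


if __name__ == "__main__":
    # On a live system you would read the text from /proc/mdstat instead.
    for name, (expected, active, slots) in degraded_arrays(SAMPLE_MDSTAT).items():
        print(f"{name}: {active} of {expected} members active, slots {slots}")
```

On the output above this reports only md2; md1 still shows both sda1 and sdb1 as active.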

IO wait has been at 0.6% for the past hour, and the server is responding as it should: fast.
I need a new 250GB SATA drive... But I am not sure about "WARNING: Disk space over limit: 176200 KB > 102400 KB".
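
To put the squid numbers in perspective: 102400 KB is exactly 100 MB, which would be consistent with a 100 MB `cache_dir` setting (an assumption, since the squid.conf was not posted). A minimal sketch that converts the figures in the warning; `parse_over_limit` is a made-up helper name.

```python
import re

# Format taken from the squid warning quoted in this thread.
WARNING = re.compile(r"Disk space over limit: (\d+) KB > (\d+) KB")


def parse_over_limit(line):
    """Return cache usage, limit (both in MB) and percent over the limit, or None."""
    match = WARNING.search(line)
    if not match:
        return None
    used_kb, limit_kb = int(match.group(1)), int(match.group(2))
    return {
        "used_mb": used_kb / 1024,
        "limit_mb": limit_kb / 1024,
        "over_pct": 100 * (used_kb - limit_kb) / limit_kb,
    }


if __name__ == "__main__":
    line = "Mar 16 19:32:57 mx1 squid[4375]: WARNING: Disk space over limit: 176200 KB > 102400 KB"
    info = parse_over_limit(line)
    print(f"{info['used_mb']:.2f} MB used, limit {info['limit_mb']:.0f} MB, "
          f"{info['over_pct']:.1f}% over")
```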
- chris

Stefano
Re: Disk errors and high IO wait time
« Reply #2 on: March 16, 2009, 10:31:48 AM »
Hi Chris

About the squid error: take a look here or search Google for "squid: WARNING: Disk space over limit:"

I suspect your sda disk is broken.. :-)

Ciao
stefano

chris burnat
Re: Disk errors and high IO wait time
« Reply #3 on: March 16, 2009, 10:44:57 AM »
Thanks Stefano.
Log noise. Solved. 
- chris