Koozali.org: home of the SME Server

Help interpreting logs, possible errors etc. following SCSI disc upgrade

Offline gbentley

  • *****
  • 482
  • +0/-0
  • Forum Lurker
    • Earth
Can anyone give me a hand interpreting this lot, following the replacement / upgrade of a SCSI disc?

From the admin manager panel at the console ;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Current RAID status:

Personalities : [raid1]
md2 : active raid1 sda2[0] sdb2[1]
      71577536 blocks [2/2] [UU]
md1 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]
unused devices: <none>

devices: $VAR1 = {
          '/dev/md2' => {
                          'PreferredMinor' => '2',
                          'RaidLevel' => 'raid1',
                          'State' => 'active',
                          'DeviceSize' => '71577536',
                          '1' => '       1       8       18        1      active sync   /dev/sdb2
',
                          'SpareDevices' => '0',
                          '0' => '       0       8        2        0      active sync   /dev/sda2
',
                          'RaidDevices' => '2',
                          'FailedDevices' => '0',
                          'UpdateTime' => 'Mon May  9 18:48:36 2011',
                          'ArraySize' => '71577536',
                          'UUID' => '167d52ff:b2e6714a:b8a80cbe:379c4756',
                          'CreationTime' => 'Mon Apr 10 13:05:35 2006',
                          'WorkingDevices' => '2',
                          'Persistence' => 'Superblock is persistent',
                          'UsedDisks' => [
                                           'sda',
                                           'sdb'
                                         ],
                          'Version' => '00.90.01',
                          'TotalDevices' => '2',
                          'Events' => '0.46268161',
                          'ActiveDevices' => '2'
                        },
          '/dev/md1' => {
                          'PreferredMinor' => '1',
                          'RaidLevel' => 'raid1',
                          'State' => 'clean',
                          'DeviceSize' => '104320',
                          '1' => '       1       8       17        1      active sync   /dev/sdb1
',
                          'SpareDevices' => '0',
                          '0' => '       0       8        1        0      active sync   /dev/sda1
',
                          'RaidDevices' => '2',
                          'FailedDevices' => '0',
                          'UpdateTime' => 'Mon May  9 18:25:18 2011',
                          'ArraySize' => '104320',
                          'UUID' => '69808efe:17a0f8b4:d600945d:3b9e8e2e',
                          'CreationTime' => 'Mon Apr 10 13:05:35 2006',
                          'WorkingDevices' => '2',
                          'Persistence' => 'Superblock is persistent',
                          'UsedDisks' => [
                                           'sda',
                                           'sdb'
                                         ],
                          'Version' => '00.90.01',
                          'TotalDevices' => '2',
                          'Events' => '0.11701',
                          'ActiveDevices' => '2'
                        }
        };

used_disks: $VAR1 = {
          'sda' => 2,
          'sdb' => 2
        };

unclean: /dev/md2 => active
recovering:

 +-------Disk redundancy status as of Monday May  9, 2011 18:48:34----------+on
 ¦ Current RAID status:                                                     ¦
 ¦                                                                          ¦
 ¦ Personalities : [raid1]                                                  ¦
 ¦ md2 : active raid1 sda2[0] sdb2[1]                                       ¦
 ¦       71577536 blocks [2/2] [UU]                                         ¦
 ¦ md1 : active raid1 sda1[0] sdb1[1]                                       ¦
 ¦       104320 blocks [2/2] [UU]                                           ¦
 ¦ unused devices: <none>                                                   ¦
 ¦                                                                          ¦
 ¦                                                                          ¦
 ¦ Only some of the RAID devices are unclean.                               ¦
 ¦                                                                          ¦
 ¦ Manual intervention may be required.                                     ¦
 ¦                                                                          ¦
 ¦                                                                          ¦

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


From CLI ;

[root@server ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda2[0] sdb2[1]
      71577536 blocks [2/2] [UU]

md1 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
[root@server ~]#


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
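A rough sketch of how the `/proc/mdstat` text above could be checked by script rather than by eye. The function names `parse_mdstat` and `degraded` are made up for this illustration, and the sample text is simply the output pasted above. Note that `[2/2] [UU]` only says both members are present in each mirror; it says nothing about read errors on either disc.

```python
import re

# Sample text copied from the `cat /proc/mdstat` output in the post.
MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 sda2[0] sdb2[1]
      71577536 blocks [2/2] [UU]

md1 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return a dict keyed by array name with members and mirror counts."""
    arrays = {}
    current = None
    for line in text.splitlines():
        head = re.match(r"^(md\d+) : (\w+) (\w+) (.+)$", line)
        if head:
            current = head.group(1)
            arrays[current] = {
                "state": head.group(2),
                "level": head.group(3),
                # Member names look like "sda2[0]"; keep only the device part.
                "members": re.findall(r"(\w+)\[\d+\]", head.group(4)),
            }
            continue
        # Status line, e.g. "71577536 blocks [2/2] [UU]".
        status = re.search(r"\[(\d+)/(\d+)\] \[([U_]+)\]", line)
        if status and current:
            arrays[current].update(
                expected=int(status.group(1)),
                active=int(status.group(2)),
                flags=status.group(3),
            )
    return arrays


def degraded(arrays):
    """List arrays that are missing a member ("_" in the flags)."""
    return sorted(
        name for name, a in arrays.items()
        if a["active"] < a["expected"] or "_" in a["flags"]
    )


arrays = parse_mdstat(MDSTAT)
for name in sorted(arrays):
    print(name, arrays[name]["members"], arrays[name]["flags"])
print("degraded arrays:", degraded(arrays))
```

On a live box the same functions could be fed `open("/proc/mdstat").read()` instead of the pasted sample.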

From /var/log/dmesg

scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

(scsi0:A:0): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
(scsi0:A:3): 160.000MB/s transfers (80.000MHz DT, 16bit)
(scsi0:A:15): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
  Vendor: SEAGATE   Model: ST373207LW        Rev: D703
  Type:   Direct-Access                      ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled.  Depth 4
SCSI device sda: 143374650 512-byte hdwr sectors (73408 MB)
SCSI device sda: drive cache: write through
SCSI device sda: 143374650 512-byte hdwr sectors (73408 MB)
SCSI device sda: drive cache: write through
 sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
  Vendor: HP        Model: C7438A            Rev: ZP5A
  Type:   Sequential-Access                  ANSI SCSI revision: 03
  Vendor: FUJITSU   Model: MBA3300NP         Rev: 0102
  Type:   Direct-Access                      ANSI SCSI revision: 03
scsi0:A:15:0: Tagged Queuing enabled.  Depth 4
SCSI device sdb: 585937500 512-byte hdwr sectors (300000 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 585937500 512-byte hdwr sectors (300000 MB)
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2
Attached scsi disk sdb at scsi0, channel 0, id 15, lun 0
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00ac7
ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 185
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0xFE00 ctl 0xFE12 bmdma 0xFEA0 irq 185
ata2: SATA max UDMA/133 cmd 0xFE20 ctl 0xFE32 bmdma 0xFEA8 irq 185
scsi2 : ata_piix
scsi3 : ata_piix
device-mapper: 4.5.5-ioctl (2006-12-01) initialised: dm-devel@redhat.com
md: raid1 personality registered as nr 3
md: md1 stopped.
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 08 8b 8f 4d 00 00 80 00
Info fld=0x88b8f55, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 143363925
Buffer I/O error on device sda2, logical block 143155080

Last line repeated 30-odd times with different block numbers.

md: bind<sdb1>
md: bind<sda1>
raid1: raid set md1 active with 2 out of 2 mirrors
md: md2 stopped.
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 08 8b 8f 4d 00 00 80 00
Info fld=0x88b8f55, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 143363925
Buffer I/O error on device sda2, logical block 143155080

Last line repeated 30-odd times with different block numbers.

scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 08 8b 8f 4d 00 00 80 00
Info fld=0x88b8f55, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 143363925
Buffer I/O error on device sda2, logical block 143155080

Last line repeated 30-odd times with different block numbers.

md: bind<sdb2>
md: bind<sda2>
raid1: raid set md2 active with 2 out of 2 mirrors

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
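A small sketch for decoding the failing read in the log above. It assumes the kernel printed the nine bytes that follow the READ(10) opcode (flags, 4-byte big-endian LBA, group number, 2-byte transfer length, control), which is the standard READ(10) layout; the helper name `decode_read10_tail` is made up for this example. The values are copied from the dmesg lines.

```python
# Values copied from the dmesg lines in the post.
CDB_TAIL = "00 08 8b 8f 4d 00 00 80 00"  # bytes printed after "Read (10)"
INFO_FIELD = 0x88b8f55                    # "Info fld=0x88b8f55"
REPORTED_SECTOR = 143363925               # "end_request: I/O error, dev sda, sector ..."


def decode_read10_tail(tail):
    """Return (start LBA, sector count) from the bytes after the opcode.

    Assumed layout: flags, LBA (4 bytes, big-endian), group number,
    transfer length (2 bytes, big-endian), control.
    """
    raw = bytes.fromhex(tail)
    if len(raw) != 9:
        raise ValueError("expected 9 bytes after the READ(10) opcode")
    lba = int.from_bytes(raw[1:5], "big")
    length = int.from_bytes(raw[6:8], "big")
    return lba, length


lba, length = decode_read10_tail(CDB_TAIL)
print("read starts at sector", lba, "and covers", length, "sectors")
print("info field points at sector", INFO_FIELD)
print("failing sector is", INFO_FIELD - lba, "sectors into the request")
print("info field matches end_request line:", INFO_FIELD == REPORTED_SECTOR)
```

Read this way, the same 128-sector request is failing at the same sector each time the md arrays are assembled, rather than errors being scattered over the disc.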

Have I just discovered, while replacing the older, smaller discs, that the one
I left in to rebuild from already has problems?





"If you don't know what you want, you end up with a lot you don't."

Offline gbentley

> SCSI device sda: 143374650 512-byte hdwr sectors (73408 MB)

Is the number in that line (143374650, shown in red in my original post) the total number of blocks on the disc sda? If so, this line ;

> Buffer I/O error on device sda2, logical block 143155080

Seems to suggest the disc has some problems near the end - this info seems pertinent ;

http://lists.us.dell.com/pipermail/linux-poweredge/2007-January/028991.html

About 30 block locations are reported with Buffer I/O error messages. Subtracting
the first reported block from the total blocks [sectors?] gives 219570. Hmm, not sure
how to interpret that.
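A worked version of that subtraction, as a sketch only. It assumes the `end_request` sector and the `Buffer I/O error` logical block describe the same failing location, and that the logical block is counted in 512-byte units from the start of sda2 rather than from the start of the disc. Under those assumptions 219570 mixes a whole-disc total with a partition-relative block number, so it is not the distance from the end of the disc.

```python
# Figures copied from the log lines quoted in this thread.
TOTAL_SECTORS = 143374650      # "SCSI device sda: 143374650 512-byte hdwr sectors"
FAILED_SECTOR = 143363925      # "end_request: I/O error, dev sda, sector 143363925"
FAILED_BLOCK_SDA2 = 143155080  # "Buffer I/O error on device sda2, logical block 143155080"
SECTOR_BYTES = 512

# The subtraction from the post: disc total minus a partition-relative block.
naive_gap = TOTAL_SECTORS - FAILED_BLOCK_SDA2

# If both error lines describe the same spot, their difference is the
# offset at which sda2 starts on the disc (an inference, not a logged value).
implied_sda2_start = FAILED_SECTOR - FAILED_BLOCK_SDA2

# Distance from the failing sector to the end of the disc, using the
# disc-relative sector number.
sectors_from_end = TOTAL_SECTORS - FAILED_SECTOR

print("naive gap:", naive_gap)                    # 219570
print("implied sda2 start:", implied_sda2_start)  # 208845
print("sectors from end of disc:", sectors_from_end)  # 10725
print("bytes from end of disc:", sectors_from_end * SECTOR_BYTES)
```

The implied start of 208845 equals 13 x 16065, which would be a cylinder boundary under the common 255-head, 63-sector geometry, so the assumption looks plausible; `fdisk -lu /dev/sda` would confirm the real partition start. On these numbers the first failing sector sits roughly 5 MB before the end of the disc.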

I have done fairly quick checks on the sizes of the data files in the ibays from a Winstation
[but not yet any disc error checks with smartctl etc.] and have copied the data over to a
backup device - there's about 55GB of data on the server.

As I am going to replace the disc anyway, is it worth doing extensive tests?

Or is there a quick way to test data integrity [i.e. to check only the areas of the disc that
actually hold data]?

Any help / tips appreciated, thanks!

Edit: just ran # smartctl -d scsi -H /dev/sda

SMART Health Status: OK

Now doing long test  - will report back later :)

« Last Edit: May 10, 2011, 09:25:54 AM by gbentley »

Offline gbentley

# smartctl -d scsi -a /dev/sda

smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: SEAGATE  ST373207LW       Version: D703
Serial number: 3KT42CY1
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Tue May 10 08:44:17 2011 BST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature:     37 C
Drive Trip Temperature:        68 C
Vendor (Seagate) cache information
  Blocks sent to initiator = 1304467954
  Blocks received from initiator = 1336503664
  Blocks read from cache and sent to initiator = 3897447114
  Number of read and write commands whose size <= segment size = 903394977
  Number of read and write commands whose size > segment size = 1

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               EEC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   188285484        0         0  188285484   188380624      86913.338         670
write:         0        0         0         0          0       3166.838           0
verify:        0        0         0         0          0          0.000           0

Non-medium error count:        3

Error Events logging not supported

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed, segment failed   - 43319                   - [0x4 0x3e 0x3]
# 2  Background short  Completed, segment failed   - 43319                   - [0x4 0x3e 0x3]
# 3  Background short  Completed, segment failed   - 43319                   - [-   -    -]
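A sketch for pulling the key figures out of the smartctl output above. The column names and the helpers `parse_error_counters` and `describe_sense` are invented for this example; the two small lookup tables hold only the codes that appear in this thread, with meanings as I read them from the SCSI sense key and ASC/ASCQ tables, so treat them as an assumption rather than a full decoder.

```python
# Rows copied from the "Error counter log" section of the smartctl output.
ERROR_LOG = """\
read:   188285484        0         0  188285484   188380624      86913.338         670
write:         0        0         0         0          0       3166.838           0
verify:        0        0         0         0          0          0.000           0
"""

# Hypothetical column names, in the order smartctl prints them.
COLUMNS = (
    "ecc_fast", "ecc_delayed", "rereads_rewrites", "total_corrected",
    "algorithm_invocations", "gigabytes_processed", "total_uncorrected",
)

# Only the codes seen in this thread; not a complete table.
SENSE_KEYS = {0x3: "Medium Error", 0x4: "Hardware Error"}
ASC_ASCQ = {
    (0x16, 0x00): "Data synchronization mark error",
    (0x3E, 0x03): "Logical unit failed self-test",
}


def parse_error_counters(text):
    """Map each row name (read/write/verify) to a dict of its counters."""
    rows = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) != len(COLUMNS):
            continue
        values = [float(f) if "." in f else int(f) for f in fields]
        rows[name.strip()] = dict(zip(COLUMNS, values))
    return rows


def describe_sense(sk, asc, ascq):
    """Turn an [SK ASC ASQ] triple from the self-test log into text."""
    key = SENSE_KEYS.get(sk, "unknown sense key 0x%x" % sk)
    detail = ASC_ASCQ.get((asc, ascq), "unlisted ASC/ASCQ 0x%02x/0x%02x" % (asc, ascq))
    return "%s: %s" % (key, detail)


counters = parse_error_counters(ERROR_LOG)
for name, row in counters.items():
    print(name, "uncorrected errors:", row["total_uncorrected"])
print(describe_sense(0x4, 0x3E, 0x3))
```

The point of interest is the read row's 670 uncorrected errors alongside a "SMART Health Status: OK" summary, plus the failed self-tests, which together line up with the medium errors logged in dmesg.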