Koozali.org: home of the SME Server
Contribs.org Forums => General Discussion => Topic started by: gbentley on May 09, 2011, 08:08:27 PM
-
Can anyone give me a hand interpreting this lot, after replacing / upgrading a SCSI disc?
From the admin manager panel at the console ;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Current RAID status:
Personalities : [raid1]
md2 : active raid1 sda2[0] sdb2[1]
71577536 blocks [2/2] [UU]
md1 : active raid1 sda1[0] sdb1[1]
104320 blocks [2/2] [UU]
unused devices: <none>
devices: $VAR1 = {
'/dev/md2' => {
'PreferredMinor' => '2',
'RaidLevel' => 'raid1',
'State' => 'active',
'DeviceSize' => '71577536',
'1' => ' 1 8 18 1 active sync /dev/sdb2
',
'SpareDevices' => '0',
'0' => ' 0 8 2 0 active sync /dev/sda2
',
'RaidDevices' => '2',
'FailedDevices' => '0',
'UpdateTime' => 'Mon May 9 18:48:36 2011',
'ArraySize' => '71577536',
'UUID' => '167d52ff:b2e6714a:b8a80cbe:379c4756',
'CreationTime' => 'Mon Apr 10 13:05:35 2006',
'WorkingDevices' => '2',
'Persistence' => 'Superblock is persistent',
'UsedDisks' => [
'sda',
'sdb'
],
'Version' => '00.90.01',
'TotalDevices' => '2',
'Events' => '0.46268161',
'ActiveDevices' => '2'
},
'/dev/md1' => {
'PreferredMinor' => '1',
'RaidLevel' => 'raid1',
'State' => 'clean',
'DeviceSize' => '104320',
'1' => ' 1 8 17 1 active sync /dev/sdb1
',
'SpareDevices' => '0',
'0' => ' 0 8 1 0 active sync /dev/sda1
',
'RaidDevices' => '2',
'FailedDevices' => '0',
'UpdateTime' => 'Mon May 9 18:25:18 2011',
'ArraySize' => '104320',
'UUID' => '69808efe:17a0f8b4:d600945d:3b9e8e2e',
'CreationTime' => 'Mon Apr 10 13:05:35 2006',
'WorkingDevices' => '2',
'Persistence' => 'Superblock is persistent',
'UsedDisks' => [
'sda',
'sdb'
],
'Version' => '00.90.01',
'TotalDevices' => '2',
'Events' => '0.11701',
'ActiveDevices' => '2'
}
};
used_disks: $VAR1 = {
'sda' => 2,
'sdb' => 2
};
unclean: /dev/md2 => active
recovering:
+-------Disk redundancy status as of Monday May 9, 2011 18:48:34----------+
¦ Current RAID status: ¦
¦ ¦
¦ Personalities : [raid1] ¦
¦ md2 : active raid1 sda2[0] sdb2[1] ¦
¦ 71577536 blocks [2/2] [UU] ¦
¦ md1 : active raid1 sda1[0] sdb1[1] ¦
¦ 104320 blocks [2/2] [UU] ¦
¦ unused devices: <none> ¦
¦ ¦
¦ ¦
¦ Only some of the RAID devices are unclean. ¦
¦ ¦
¦ Manual intervention may be required. ¦
¦ ¦
¦ ¦
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From CLI ;
[root@server ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda2[0] sdb2[1]
71577536 blocks [2/2] [UU]
md1 : active raid1 sda1[0] sdb1[1]
104320 blocks [2/2] [UU]
unused devices: <none>
[root@server ~]#
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From /var/log/dmesg
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
(scsi0:A:0): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
(scsi0:A:3): 160.000MB/s transfers (80.000MHz DT, 16bit)
(scsi0:A:15): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
Vendor: SEAGATE Model: ST373207LW Rev: D703
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled. Depth 4
SCSI device sda: 143374650 512-byte hdwr sectors (73408 MB)
SCSI device sda: drive cache: write through
SCSI device sda: 143374650 512-byte hdwr sectors (73408 MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Vendor: HP Model: C7438A Rev: ZP5A
Type: Sequential-Access ANSI SCSI revision: 03
Vendor: FUJITSU Model: MBA3300NP Rev: 0102
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:15:0: Tagged Queuing enabled. Depth 4
SCSI device sdb: 585937500 512-byte hdwr sectors (300000 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 585937500 512-byte hdwr sectors (300000 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2
Attached scsi disk sdb at scsi0, channel 0, id 15, lun 0
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00ac7
ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 185
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0xFE00 ctl 0xFE12 bmdma 0xFEA0 irq 185
ata2: SATA max UDMA/133 cmd 0xFE20 ctl 0xFE32 bmdma 0xFEA8 irq 185
scsi2 : ata_piix
scsi3 : ata_piix
device-mapper: 4.5.5-ioctl (2006-12-01) initialised: dm-devel@redhat.com
md: raid1 personality registered as nr 3
md: md1 stopped.
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 08 8b 8f 4d 00 00 80 00
Info fld=0x88b8f55, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 143363925
Buffer I/O error on device sda2, logical block 143155080
last line repeated 30-odd times with different block numbers
md: bind<sdb1>
md: bind<sda1>
raid1: raid set md1 active with 2 out of 2 mirrors
md: md2 stopped.
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 08 8b 8f 4d 00 00 80 00
Info fld=0x88b8f55, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 143363925
Buffer I/O error on device sda2, logical block 143155080
last line repeated 30-odd times with different block numbers
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 08 8b 8f 4d 00 00 80 00
Info fld=0x88b8f55, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 143363925
Buffer I/O error on device sda2, logical block 143155080
last line repeated 30-odd times with different block numbers
md: bind<sdb2>
md: bind<sda2>
raid1: raid set md2 active with 2 out of 2 mirrors
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Have I now found, just as I am replacing the older, smaller discs, that
the one I left in to rebuild from already has problems?
-
> SCSI device sda: 143374650 512-byte hdwr sectors (73408 MB)
Is the number 143374650 in that line the total number of blocks on the disc sda? If so, this line;
> Buffer I/O error on device sda2, logical block 143155080
Seems to suggest the disc has some problems near the end - this info seems pertinent ;
http://lists.us.dell.com/pipermail/linux-poweredge/2007-January/028991.html
About 30 block locations are reported with Buffer I/O error messages. Subtracting
the first reported block from the total blocks [sectors?] gives 219570. Hmm, not sure
how to interpret that.
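The subtraction above can be sketched in a few lines of Python, using only figures copied from the kernel log. Two things here are assumptions rather than anything the log states outright: that the "sector" in the end_request line is counted from the start of the whole disc, and that the "logical block" in the Buffer I/O line is counted in 512-byte units from the start of sda2. The helper name decode_read10_tail is made up for this illustration.

```python
# All constants are copied from the dmesg output quoted in this post.
SECTOR_BYTES = 512
TOTAL_SECTORS = 143374650         # "SCSI device sda: 143374650 512-byte hdwr sectors"
FAILED_SECTOR = 143363925         # "end_request: I/O error, dev sda, sector 143363925"
FAILED_BLOCK_IN_SDA2 = 143155080  # "Buffer I/O error on device sda2, logical block ..."


def decode_read10_tail(tail_hex):
    """Decode the nine bytes the kernel prints after "CDB: Read (10)".

    Layout assumed from the READ(10) command: one flags byte, a 4-byte
    big-endian start LBA, one group byte, a 2-byte big-endian block count,
    and one control byte.
    """
    tail = bytes.fromhex(tail_hex.replace(" ", ""))
    if len(tail) != 9:
        raise ValueError("expected the 9 bytes that follow the opcode")
    start_lba = int.from_bytes(tail[1:5], "big")
    block_count = int.from_bytes(tail[6:8], "big")
    return start_lba, block_count


start_lba, block_count = decode_read10_tail("00 08 8b 8f 4d 00 00 80 00")
info_field = int("88b8f55", 16)   # "Info fld=0x88b8f55"

# Distance of the failing sector from the end of the disc.
sectors_from_end = TOTAL_SECTORS - FAILED_SECTOR
mib_from_end = sectors_from_end * SECTOR_BYTES / (1024 * 1024)

# If the logical block is partition-relative (assumption), this is where
# sda2 would have to start on the disc.
implied_sda2_start = FAILED_SECTOR - FAILED_BLOCK_IN_SDA2

print("read request:", start_lba, "for", block_count, "blocks")
print("info field matches failed sector:", info_field == FAILED_SECTOR)
print("sectors from end of disc:", sectors_from_end, "(%.1f MiB)" % mib_from_end)
print("implied start of sda2:", implied_sda2_start)
```

Under those assumptions the failing sector sits 10725 sectors (about 5.2 MiB) from the end of the disc, and 219570 is simply that distance plus the 208845 sectors that would precede sda2, so the two numbers in the log would be describing the same spot measured from different origins.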
I have done fairly quick checks on the sizes of data files available on ibays from a Winstation
[but not as yet any disc error checks with smartctl etc] and have copied the data over to a
backup device - there's about 55GB of data on the server.
As I am going to replace the disc anyway, is it worth doing extensive tests?
Or is there any quick way to test data integrity [i.e. check only the areas of the disc that
actually hold data]?
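One way to touch only the areas that hold data is simply to read every file from start to finish and note any that fail. The sketch below is a rough illustration, not a tested procedure for this server: it assumes a Python 3 interpreter is available, the function name read_all_files is made up, and the directory to scan is whatever is passed on the command line. Two caveats apply: on a mirrored array a read may be served by either disc, and recently read files may come from memory rather than the disc, so a clean pass does not prove sda is healthy.

```python
import os
import sys


def read_all_files(root, chunk_size=1024 * 1024):
    """Read every regular file under root from start to finish.

    Returns (files_read_ok, failures) where failures is a list of
    (path, error message) for files that raised an OS-level error,
    which is how an unreadable disc block would normally surface.
    """
    files_ok = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, device nodes and the like.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while handle.read(chunk_size):
                        pass
                files_ok += 1
            except OSError as exc:
                failures.append((path, str(exc)))
    return files_ok, failures


if __name__ == "__main__" and len(sys.argv) > 1:
    ok, bad = read_all_files(sys.argv[1])
    print("files read without error:", ok)
    for path, message in bad:
        print("FAILED:", path, message)
```

It would be run with the directory holding the ibays as its only argument. Any file listed as FAILED is one worth comparing against the backup copy.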
Any help / tips appreciated, thanks!
Edit: just ran # smartctl -d scsi -H /dev/sda
SMART Health Status: OK
Now doing long test - will report back later :)
-
# smartctl -d scsi -a /dev/sda
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: SEAGATE ST373207LW Version: D703
Serial number: 3KT42CY1
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Tue May 10 08:44:17 2011 BST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Current Drive Temperature: 37 C
Drive Trip Temperature: 68 C
Vendor (Seagate) cache information
Blocks sent to initiator = 1304467954
Blocks received from initiator = 1336503664
Blocks read from cache and sent to initiator = 3897447114
Number of read and write commands whose size <= segment size = 903394977
Number of read and write commands whose size > segment size = 1
Error counter log:
          Errors Corrected by           Total   Correction     Gigabytes    Total
              EEC          rereads/    errors   algorithm      processed    uncorrected
          fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   188285484        0         0  188285484   188380624      86913.338         670
write:          0        0         0          0           0       3166.838           0
verify:         0        0         0          0           0          0.000           0
Non-medium error count: 3
Error Events logging not supported
SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background long   Completed, segment failed     -      43319          -       [0x4 0x3e 0x3]
# 2  Background short  Completed, segment failed     -      43319          -       [0x4 0x3e 0x3]
# 3  Background short  Completed, segment failed     -      43319          -       [-   -    -]
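The bracketed triple in the self-test log is a SCSI sense key, additional sense code and qualifier, the same kind of code the kernel prints in words in the dmesg output ("sense key Medium Error", "Data synchronization mark error"). A minimal sketch of turning the numbers into text is below. The lookup tables are deliberately tiny and cover only codes relevant to this thread, as I understand the standard SCSI assignments; the function name decode_sense is made up for the illustration.

```python
# Partial tables only: a full list lives in the SCSI standards.
SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
    0xB: "Aborted Command",
}

ASC_ASCQ = {
    (0x16, 0x00): "Data synchronization mark error",
    (0x3E, 0x03): "Logical unit failed self-test",
}


def decode_sense(text):
    """Decode a "[SK ASC ASQ]" triple as printed by smartctl, e.g. "[0x4 0x3e 0x3]"."""
    sense_key, asc, ascq = (int(token, 16) for token in text.strip("[] ").split())
    return (
        SENSE_KEYS.get(sense_key, "unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "not in this table"),
    )


print(decode_sense("[0x4 0x3e 0x3]"))
```

For the long and short tests above this prints ('Hardware Error', 'Logical unit failed self-test'), which, together with the 670 uncorrected read errors in the error counter log, points the same way as the Medium Error lines in dmesg despite the "SMART Health Status: OK" summary.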