Koozali.org: home of the SME Server

hddtemp, raid 5 (4 disks), sme 7.1u3

bhamail
« on: May 24, 2007, 12:36:16 PM »
I'm setting up a new server (v7.1 update 3) and I can't get hddtemp to work with my drives.
Code:
# hddtemp /dev/sda1
/dev/sda1: ATA WDC WD4000YR-01P: S.M.A.R.T. not available


What's odd is that I originally installed SME with just one of these drives, and hddtemp worked fine then (there was no "not available" in the output).

Does hddtemp work with drives in a raid configuration?

Thanks,
Dan

MJB

hddtemp and kernel incompatibilities
« Reply #1 on: May 24, 2007, 03:56:13 PM »
There is an incompatibility between hddtemp and kernels newer than 2.6.19-rc5-mm2:

http://bugzilla.kernel.org/show_bug.cgi?id=7581

The fix requires patching and rebuilding the kernel, which I'm not sure I'd want to try on SME Server. You might want to try an alternative program such as smartctl instead:

http://smartmontools.sourceforge.net/

Mike

bhamail
« Reply #2 on: May 30, 2007, 05:24:24 AM »
Hi MJB,

Thanks for the reply. I don't think my stock kernel is new enough for this to explain the problem:
Code:
# uname -r
2.6.9-55.ELsmp
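
For what it's worth, the comparison can be checked numerically rather than by eye, since 2.6.9 and 2.6.19 are easy to misread. This is only a rough sketch: version_lt is a made-up helper name, it compares just the leading x.y.z part and ignores suffixes such as "-55.ELsmp" or "-rc5-mm2", and it knows nothing about patches a vendor may have backported into an older kernel.

```shell
#!/bin/sh
# version_lt A B: succeed when dotted version A is strictly older than B.
# Hypothetical helper; everything after the first "-" is ignored.
version_lt() {
    awk -v a="${1%%-*}" -v b="${2%%-*}" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            # Missing components count as 0; "+ 0" forces numeric comparison.
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1   # equal versions are not "older"
    }'
}
```

Usage would be something like: version_lt "$(uname -r)" 2.6.19-rc5-mm2 && echo "kernel predates the reported incompatibility"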


Any other thoughts? By the way, smartctl is able to read the hdd temp:
Code:
# smartctl -d ata -a /dev/sda
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD4000YR-01PLB0
Serial Number:    WD-WMAMY1624489
Firmware Version: 01.06A01
User Capacity:    400,088,457,216 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  Not recognized. Minor revision code: 0x1d
Local Time is:    Tue May 29 23:22:10 2007 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (10530) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 152) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   225   224   021    Pre-fail  Always       -       5775
  4 Start_Stop_Count        0x0032   100   100   040    Old_age   Always       -       117
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3061
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   253   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       98
194 Temperature_Celsius     0x0022   119   103   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
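
As a stopgap, the temperature can be pulled out of that attribute table with a small filter. This is a minimal sketch, assuming the drive reports its temperature as attribute 194 with the reading in the RAW_VALUE column (the tenth field), as in the listing above; smart_temp is just an illustrative name, and other drives may use a different attribute or raw format.

```shell
#!/bin/sh
# smart_temp: read smartctl attribute output on stdin and print the
# raw value of attribute 194 (Temperature_Celsius on this drive).
# Assumes the column layout shown above, where RAW_VALUE is field 10.
smart_temp() {
    awk '$1 == 194 { print $10; exit }'
}
```

Usage would be something like: smartctl -d ata -A /dev/sda | smart_temp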


Ideally, I'd like to use hddtemp, if only to have the hdd temp readings show up nicely in smeadmin.

Any other suggestions?

Thanks,
Dan

bhamail
« Reply #3 on: May 30, 2007, 06:01:02 AM »
Just found some more info: Forcing the [type] in the hddtemp command seems to help matters:
Code:
# hddtemp SATA:/dev/sda
/dev/sda: Success

It's a little odd that forcing SATA doesn't show more, since these are SATA drives. Odder still:
Code:
# hddtemp PATA:/dev/sda
WARNING: Drive /dev/sda doesn't appear in the database of supported drives
WARNING: But using a common value, it reports something.
WARNING: Note that the temperature shown could be wrong.
WARNING: See --help, --debug and --drivebase options.
WARNING: And don't forget you can add your drive to hddtemp.db
/dev/sda: unknown:  33°C or °F
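
Since the result depends on which type prefix is forced, one way to find a working one is to try each prefix in turn and keep the first that yields a number. The following is a rough sketch, not anything shipped with hddtemp or sme7admin: parse_hddtemp and probe_type are made-up helper names, and the pattern assumes the reading is printed as digits followed by a degree sign and "C", as in the output above.

```shell
#!/bin/sh
# parse_hddtemp: read hddtemp output on stdin and print the Celsius value.
# Prints nothing for lines without a reading, such as "/dev/sda: Success".
# LC_ALL=C makes sed treat the degree sign as plain bytes, whatever its encoding.
parse_hddtemp() {
    LC_ALL=C sed -n 's/.*:[[:space:]]*\([0-9][0-9]*\)[^0-9]*C.*/\1/p'
}

# probe_type DISK: try each hddtemp type prefix and print "TYPE TEMP"
# for the first one that returns a temperature. Hypothetical helper.
probe_type() {
    disk=$1
    for type in SATA PATA SCSI; do
        temp=$(hddtemp "$type:$disk" 2>/dev/null | parse_hddtemp)
        if [ -n "$temp" ]; then
            echo "$type $temp"
            return 0
        fi
    done
    return 1
}
```

Usage would be something like: probe_type /dev/sda (run as root, for each member disk of the array).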

Does it make sense that forcing PATA works when SATA doesn't?
My next question should perhaps be directed at the sme7admin folks: what's the best way to force the type in the sme7admin config for hddtemp?

Thanks again,
Dan