Koozali.org: home of the SME Server
Obsolete Releases => SME 8.x Contribs => Topic started by: newburns on November 24, 2012, 06:58:46 PM
-
I finally got PHPSysInfo to work the way I needed it to. It does everything except send email alerts. I will try to list my steps as best as I can remember.
1. Install PHPSysInfo according to the contribs wiki: http://wiki.contribs.org/Phpsysinfo
2. Download the latest PHPSysInfo from SourceForge and untar it into /opt/, overwriting the existing phpsysinfo directory.
3. Edit the config file. Be sure to set the sensor program to 'LMSensors' and hddtemp to 'tcp' (the names are case-sensitive). The relevant sections of mine look like this:
/**
* Plugins that should be included in xml and output (!!!plugin names are case-sensitive!!!)
* List of plugins should look like "plugin,plugin,plugin". See /plugins directory
* - define('PSI_PLUGINS', 'MDStatus,PS'); // list of plugins
* - define('PSI_PLUGINS', false); //no plugins
* included plugins:
* - MDStatus - show the raid status and whats currently going on
* - PS - show a process tree of all running processes
* - PSStatus - show a graphical representation if a process is running or not
* - Quotas - show a table with all quotas that are active and there current state
* - SMART - show S.M.A.R.T. information from drives that support it
* - BAT - show battery state on a laptop
* - ipmi - show IPMI status
* - UpdateNotifier - show update notifications (only for Ubuntu server)
* - SNMPPInfo - show printers info via SNMP
*/
define('PSI_PLUGINS', 'MDStatus,PSStatus,Quotas,SMART');
/**
* Define the motherboard monitoring program (!!!names are case-sensitive!!!)
* We support the following programs so far
* - LMSensors http://www.lm-sensors.org/
* - Healthd http://healthd.thehousleys.net/
* - HWSensors http://www.openbsd.org/
* - MBMon http://www.nt.phys.kyushu-u.ac.jp/shimizu/download/download.html
* - MBM5 http://mbm.livewiredev.com/
* - Coretemp
* - IPMI http://openipmi.sourceforge.net/
* - K8Temp http://hur.st/k8temp/
* Example: If you want to use lmsensors : define('PSI_SENSOR_PROGRAM', 'LMSensors');
*/
define('PSI_SENSOR_PROGRAM', 'LMSensors');
/**
* Define how to access the monitor program
* Available methods for the above list are in the following list
* default method 'command' should be fine for everybody
* !!! tcp connections are only made local and on the default port !!!
* - LMSensors command, file
* - Healthd command
* - HWSensors command
* - MBMon command, tcp
* - MBM5 file
* - Coretemp command
* - IPMI command
* - K8Temp command
*/
define('PSI_SENSOR_ACCESS', 'command');
/**
* Hddtemp program
* If the hddtemp program is available we can read the temperature, if hdd is smart capable
* !!ATTENTION!! hddtemp might be a security issue
* - define('PSI_HDD_TEMP', 'tcp'); // read data from hddtemp deamon (localhost:7634)
* - define('PSI_HDD_TEMP', 'command'); // read data from hddtemp programm (must be set suid)
*/
define('PSI_HDD_TEMP', 'tcp');
4. I had to run the following commands, as described here: http://forums.contribs.org/index.php/topic,22018.msg87606.html#msg87606
ln -s /etc/rc.d/init.d/e-smith-service /etc/rc7.d/S86lm_sensors
/sbin/e-smith/config set lm_sensors service status enabled
5. Run sensors-detect
Because of step 4, it should not end with an error stating "lm_sensors is not a recognized service".
Follow whatever the program tells you to do as far as editing files; if it asks for nothing, keep answering 'yes'.
6. I had issues with the version of hddtemp when running all six drives in daemon mode, so I updated hddtemp from the EPEL repo. I'm not sure whether it is also in another repo; as most know, I usually yum install from all repos. With the version of hddtemp already on SME, I was only able to get two drives to work.
hddtemp x86_64 0.3-0.16.beta15.el5 epel 47 k
The current version of hddtemp should work, and all you must do is start it in daemon mode for your drives. My command covers 6 hard drives; yours will differ based on your drives:
- /dev/sd[x] for SATA
- /dev/hd[x] for PATA
hddtemp -d /dev/sd[abcdef]
telnet localhost 7634
The telnet session should return values showing your current hard drive temperatures.
7. Now you must edit the SMART plugin config
nano /opt/phpsysinfo/plugins/SMART/SMART.config.php
Add your hard drives to the config:
/**
* Smartctl devices to monitor
* If the smartctl support is enabled, those disks information will be displayed
* - define('PSI_PLUGIN_SMART_DEVICES', '/dev/hda,/dev/hdb'); // Will display those two disks informations
*/
define('PSI_PLUGIN_SMART_DEVICES', '/dev/sda, /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde, /dev/sdf');
8. Go to "http://domain.tld/phpsysinfo", enter your admin credentials, and you should see a fully stocked phpsysinfo page with a lot of useful values. It does much of what Sysmon and SME8admin do, but with the options to monitor more hard drives, exclude mounted filesystems, and monitor a UPS!
Let me know if you have issues, and I will try to assist as best I can. :grin:
Screenshot: http://i1284.photobucket.com/albums/a580/newburns/Systeminformationmtrosemtrosemediatk7495216133.png
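Steps 6 and 7 both need the same list of drives, so it can help to generate the hddtemp command and the SMART define line from one place. This is only a minimal sketch: DRIVE_LETTERS and the function names are my own assumptions for illustration, and it only prints the lines rather than changing anything.

```shell
#!/bin/sh
# Sketch: build the drive list once and reuse it for hddtemp and the SMART plugin.
# DRIVE_LETTERS is an assumption; replace it with the letters of your own drives.
DRIVE_LETTERS="a b c d e f"

# Print /dev/sdX entries joined by the separator given as the first argument.
device_list() {
    sep=$1
    out=""
    for letter in $DRIVE_LETTERS; do
        if [ -z "$out" ]; then
            out="/dev/sd$letter"
        else
            out="$out$sep/dev/sd$letter"
        fi
    done
    printf '%s\n' "$out"
}

# Print the define() line to paste into SMART.config.php (step 7).
smart_define_line() {
    printf "define('PSI_PLUGIN_SMART_DEVICES', '%s');\n" "$(device_list ', ')"
}

# Print (not run) the matching hddtemp daemon command (step 6).
hddtemp_command() {
    printf 'hddtemp -d %s\n' "$(device_list ' ')"
}

smart_define_line
hddtemp_command
```

For PATA drives, the /dev/sd prefix inside device_list would need to become /dev/hd.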
-
This works really well with the PSIAndroid app for Android phones. It gives you a snapshot of any SME server within seconds. Good for temperature monitoring, especially after an email notification from another monitoring service.
Quick story:
I have a portable air conditioner in a closet to cool my three production servers. Sometimes the power will pulse and cause the a/c to shut off.
After an email from apcupsd about the UPS losing power and coming back on within seconds, I was able to monitor the temps from this app while I was out of town.
Normally I would just use the newly acquired SME8admin, but it only monitors two temps, not all of them, and CPU temps do not work. This way, you can manually monitor the temps and voltages and intervene when you know there is a problem, using something like ConnectBot for Android. All done from a mobile phone!
-
Nice....I just may have to try upgrading :-)
-
How do you upgrade hddtemp?
I tried "yum install hddtemp" and it said "nothing to do", then I tried "yum upgrade hddtemp" and it said "no packages marked for update".
By the way, I already have hddtemp-0.3-0.16.beta15.el5.i386 installed via the sme8admin contrib (I think that was the one that installed it).
-
yum --enablerepo=epel install hddtemp
The one I am using came from the EPEL repository.
-
Thank you....trying it now
Edit: Looks like I have the latest version, as EPEL says I have the current version. Hmmm.
-
So you are good to go for the rest. Let me know how the rest of the walkthrough goes. Very stoked, as this is my first one.
-
I've got it going for the most part... just not getting SMART working.
When I do the hddtemp -d /dev/sda and telnet portion, it says "|/dev/sda|3ware Logical Disk 0|NA|*|Connection closed by foreign host." So I guess my 3ware RAID card doesn't support SMART, even though I see SMART is enabled in the BIOS when booting up the server... (by the way, this is not the server in my sig but my test/dev box)
-
Oh yea. There was an issue with permissions. You can either chmod 777 your smartctl or change your smartctl script.
http://sourceforge.net/projects/phpsysinfo/forums/forum/10/topic/3399227
-
I'll go check out the link
Thanks
-
Quote from the previous post: "Oh yea. There was an issue with permissions. You can either chmod 777 your smartctl or change your smartctl script."
777 is E V I L and should always be avoided.
smartctl supports 3ware; just read smartctl's man page and use Google.
-
I didn't need to change permissions, as the "stock" 755 works OK.
With my 3ware SATA RAID card I had to use:
smartctl --all -d 3ware,0 /dev/twe0
which returned the following info (of course, changing it to -d 3ware,X with X from 0 through 5 lets me see all 6 drives):
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.18-308.20.1.el5] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Western Digital RE Serial ATA
Device Model: WDC WD2500SD-01KCB0
Serial Number: WD-WCAL72178558
Firmware Version: 08.02D08
User Capacity: 250,059,350,016 bytes [250 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Mon Dec 3 07:23:06 2012 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 7293) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 92) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0007 132 131 021 Pre-fail Always - 5900
4 Start_Stop_Count 0x0032 100 100 040 Old_age Always - 55
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 200 200 051 Pre-fail Always - 0
9 Power_On_Hours 0x0032 031 031 000 Old_age Always - 50386
10 Spin_Retry_Count 0x0013 100 253 051 Pre-fail Always - 0
11 Calibration_Retry_Count 0x0013 100 253 051 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 53
194 Temperature_Celsius 0x0022 118 102 000 Old_age Always - 32
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0012 200 200 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x000a 200 253 000 Old_age Always - 4
200 Multi_Zone_Error_Rate 0x0009 200 200 051 Pre-fail Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I still have problems getting hddtemp to work so more digging :)
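The per-port query above (-d 3ware,X with X from 0 through 5) can be wrapped in a small loop. This is a rough sketch only: the function names are made up for illustration, the controller device and port range are taken from this post, and it prints the commands as a dry run instead of executing them.

```shell
#!/bin/sh
# Sketch: list the smartctl command for every drive behind the 3ware card.
# CONTROLLER and the port range 0-5 come from the post above; adjust for your card.
CONTROLLER=/dev/twe0

# Print the smartctl command for one port number (first argument).
smartctl_command() {
    printf 'smartctl --all -d 3ware,%s %s\n' "$1" "$CONTROLLER"
}

# Print the commands for a range of ports: first port, then last port.
all_port_commands() {
    port=$1
    while [ "$port" -le "$2" ]; do
        smartctl_command "$port"
        port=$((port + 1))
    done
}

# Dry run: show the six commands without touching the hardware.
all_port_commands 0 5
```

Piping the output into sh (as root) would actually run the queries, one drive after another.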
-
This is what I get when running hddtemp:
[root@sme-2 ~]# hddtemp -d /dev/sda
[root@sme-2 ~]# telnet localhost 7634
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
|/dev/sda|3ware Logical Disk 0|NA|*|Connection closed by foreign host.
[root@sme-2 ~]#
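The reply from the hddtemp daemon is a run of pipe-separated records (device, model, temperature, unit), and the NA in the temperature field is what shows that no reading came back for the 3ware logical disk. A small sketch for picking that apart, assuming the record layout seen in the output above; parse_hddtemp is a hypothetical helper name, not part of hddtemp.

```shell
#!/bin/sh
# Sketch: split a hddtemp daemon reply into "device temperature status" lines.
# Assumes each record looks like |device|model|temperature|unit| as shown above.
parse_hddtemp() {
    awk -F'|' '{
        # Records start at field 2 and repeat every 5 fields.
        for (i = 2; i + 3 <= NF; i += 5) {
            status = ($(i + 2) ~ /^[0-9]+$/) ? "ok" : "no-reading"
            print $i, $(i + 2), status
        }
    }'
}

# Example with the reply from this post (no daemon needed for the demo).
printf '%s\n' '|/dev/sda|3ware Logical Disk 0|NA|*|' | parse_hddtemp
```

On a live box the reply could be fed in from the daemon on localhost port 7634 with any TCP client you have installed, in place of the printf line.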
-
Are your drives in the hddtemp.db?
There's a simple way to add them using fdisk or smartctl; I just can't find it right now.
http://forums.contribs.org/index.php?topic=37138.0
-
Also, I have temperatures shown in my temperatures category as well as the SMART information. Do you show temps in SMART?
-
Yes I show temps in Smart...... I will take a look at the hddtemp.db shortly.
Thanks for helping a newbie :)
[root@sme-2 ~]# smartctl -a -d 3ware,0 /dev/twe0 | grep -i temp
194 Temperature_Celsius 0x0022 120 103 000 Old_age Always - 30
Edit - Yes my drives are listed in the hddtemp.db
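Since SMART does report the temperature here, the raw value of attribute 194 can be pulled out of the smartctl output directly instead of relying on hddtemp. A minimal sketch, assuming the ten-column attribute layout shown above; the function names and the 45 degree limit are illustrative assumptions, not recommendations.

```shell
#!/bin/sh
# Sketch: read the raw value (10th column) of SMART attribute 194 from stdin.
smart_temperature() {
    awk '$1 == 194 && $2 == "Temperature_Celsius" { print $10 }'
}

# Succeed when the reading (first argument) is at or below the limit (second).
temperature_ok() {
    [ "$1" -le "$2" ]
}

# Example using the attribute line from this post; 45 is an arbitrary limit.
reading=$(printf '%s\n' '194 Temperature_Celsius 0x0022 120 103 000 Old_age Always - 30' | smart_temperature)
if temperature_ok "$reading" 45; then
    echo "temperature $reading C is within the limit"
else
    echo "temperature $reading C is above the limit"
fi
```

On the real server the input would come from smartctl -a -d 3ware,0 /dev/twe0 piped into smart_temperature.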
-
Sorry to bump this old thread, but it deals with what I'm fighting with now. I retired the server I was working with earlier in this thread, and my dev box is now a PowerEdge 2950.
I have to use the following string from the command line to pull information from my Dell PERC6/i RAID controller (changing X to the drive wanted, of course):
smartctl -a /dev/sda -d sat+megaraid,0X
So what do I put in the drive monitoring section, as it normally just says
/dev/sda, /dev/sdb, /dev/sdc
which doesn't work for me..
nor does
(sat+megaraid.00)/dev/sda,
or (megaraid.00)/dev/sda,
or (megaraid.0)/dev/sda,