Koozali.org: home of the SME Server

NUT and APC SmartUPS - comminucations problems with serial port - FIXED

Offline fcrossen

  • 2
  • +0/-0
    • SeeIT Consult
Hi all,

I know that this has been posted before in various formats, but I have symptoms and a fix, so hence the new thread...

The Dell Poweredge server is connected using an APC serial cable and has been running flawlessly for about 9 months now.

SYMPTOM:

Shortly after a YUM update (Dec 02 11:45:58 Updated: e-smith-nutUPS.noarch 2.0.0-1.el4.sme) my APC SmartUPS decided to give problems, popping on and off line until it exhausted it's battery.

The messages log gives no indications of anything untoward pre and post reboot.

Original config:

config show nut

nut=service
    Device=/dev/ttyS0
    MasterPass=<snip>
    Model=apcsmart
    SlavePass=<snip>
    Type=7
    status=enabled

Odd things is that there is nothing untoward in the log files until:

Dec  8 01:43:01 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 01:43:01 sylvie wall[10673]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 01:43:36 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 01:43:36 sylvie wall[10681]: wall: user nut broadcasted 1 lines (33 chars)
Dec  8 03:49:10 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 03:49:10 sylvie wall[11802]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 04:31:56 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 04:31:56 sylvie wall[7905]: wall: user nut broadcasted 1 lines (33 chars)
Dec  8 04:34:36 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 04:34:36 sylvie wall[7922]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 04:34:46 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 04:34:46 sylvie wall[7930]: wall: user nut broadcasted 1 lines (33 chars)
Dec  8 04:35:21 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 04:35:21 sylvie wall[7949]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 04:35:31 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 04:35:31 sylvie wall[7957]: wall: user nut broadcasted 1 lines (33 chars)
Dec  8 04:36:06 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 04:36:06 sylvie wall[7965]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 04:39:41 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 04:39:41 sylvie wall[7973]: wall: user nut broadcasted 1 lines (33 chars)
Dec  8 04:39:51 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 04:39:51 sylvie wall[7981]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 04:40:21 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 04:40:21 sylvie wall[7992]: wall: user nut broadcasted 1 lines (33 chars)
Dec  8 04:40:26 sylvie upsmon[3959]: UPS UPS@localhost on battery
Dec  8 04:40:26 sylvie wall[8000]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 04:40:56 sylvie upsmon[3959]: UPS UPS@localhost on line power
Dec  8 04:40:56 sylvie wall[8008]: wall: user nut broadcasted 1 lines (33 chars)
<snip>


This goes on until the UPS is exhausted.

Dec  8 08:25:27 sylvie upsmon[3959]: UPS UPS@localhost battery is low
Dec  8 08:25:27 sylvie upsd[3954]: Client upsmaster@127.0.0.1 set FSD on UPS [UPS]
Dec  8 08:25:27 sylvie wall[13403]: wall: user nut broadcasted 1 lines (34 chars)
Dec  8 08:25:27 sylvie upsmon[3959]: Executing automatic power-fail shutdown
Dec  8 08:25:27 sylvie wall[13409]: wall: user nut broadcasted 2 lines (43 chars)
Dec  8 08:25:27 sylvie upsmon[3959]: Auto logout and shutdown proceeding
Dec  8 08:25:27 sylvie wall[13412]: wall: user nut broadcasted 1 lines (37 chars)

I checked the hardware by plugging several desktops into the UPS and all went well. However each time I plugged in the server AND attached the serial cable, the UPS would click on and off line as shown above.

After restart:

Dec  8 10:32:08 sylvie apcsmart[3765]: Startup successful
Dec  8 10:32:08 sylvie upsdrvctl: Network UPS Tools - UPS driver controller 2.2.0-
Dec  8 10:32:08 sylvie nut: upsdrvctl startup succeeded
Dec  8 10:32:08 sylvie upsd[3768]: listening on 0.0.0.0 port 3493
Dec  8 10:32:08 sylvie upsd: listening on 0.0.0.0 port 3493
Dec  8 10:32:08 sylvie upsd[3768]: Connected to UPS [UPS]: apcsmart-UPS
Dec  8 10:32:08 sylvie upsd: Connected to UPS [UPS]: apcsmart-UPS
Dec  8 10:32:08 sylvie upsd[3769]: Startup successful
Dec  8 10:32:08 sylvie nut: upsd startup succeeded
Dec  8 10:32:08 sylvie upsmon: UPS: UPS@localhost (master) (power value 1)
Dec  8 10:32:08 sylvie upsmon: Using power down flag file /etc/killpower
Dec  8 10:32:08 sylvie upsmon:
Dec  8 10:32:08 sylvie upsmon[3773]: Startup successful
Dec  8 10:32:08 sylvie upsd[3769]: Connection from 127.0.0.1
Dec  8 10:32:08 sylvie upsd[3769]: Client upsmaster@127.0.0.1 logged into UPS [UPS]
Dec  8 10:32:08 sylvie nut: upsmon startup succeeded
<snip>
Dec  8 10:32:13 sylvie upsmon[3774]: UPS UPS@localhost battery is low
Dec  8 10:32:14 sylvie wall[4370]: wall: user nut broadcasted 1 lines (34 chars)
<snip>
Dec  8 10:43:44 sylvie upsmon[3774]: UPS UPS@localhost on battery
Dec  8 10:43:44 sylvie wall[5077]: wall: user nut broadcasted 1 lines (30 chars)
Dec  8 10:48:14 sylvie upsmon[3774]: UPS UPS@localhost battery is low
Dec  8 10:48:14 sylvie upsd[3769]: Client upsmaster@127.0.0.1 set FSD on UPS [UPS]
Dec  8 10:48:14 sylvie upsmon[3774]: Executing automatic power-fail shutdown
Dec  8 10:48:14 sylvie wall[5232]: wall: user nut broadcasted 1 lines (34 chars)
Dec  8 10:48:14 sylvie wall[5233]: wall: user nut broadcasted 2 lines (43 chars)
Dec  8 10:48:14 sylvie upsmon[3774]: Auto logout and shutdown proceeding
Dec  8 10:48:14 sylvie wall[5238]: wall: user nut broadcasted 1 lines (37 chars)
Dec  8 10:48:19 sylvie upsd[3769]: Host 127.0.0.1 disconnected (read failure)
<snip>


After this I started manually starting and stopping nut to see what the story was. However at that stage I was getting:

3083:Dec  8 10:50:51 sylvie upsdrvctl: Network UPS Tools - UPS driver controller 2.2.0-
3084:Dec  8 10:50:51 sylvie nut: upsdrvctl startup succeeded
3085:Dec  8 10:50:51 sylvie upsd[3756]: listening on 0.0.0.0 port 3493
3086:Dec  8 10:50:51 sylvie upsd: listening on 0.0.0.0 port 3493
3087:Dec  8 10:50:51 sylvie upsd[3756]: Connected to UPS [UPS]: apcsmart-UPS
3088:Dec  8 10:50:51 sylvie upsd: Connected to UPS [UPS]: apcsmart-UPS
3089:Dec  8 10:50:51 sylvie upsd[3757]: Startup successful
3090:Dec  8 10:50:51 sylvie nut: upsd startup succeeded
3091:Dec  8 10:50:51 sylvie upsmon: UPS: UPS@localhost (master) (power value 1)
3092:Dec  8 10:50:51 sylvie upsmon: Using power down flag file /etc/killpower
3093:Dec  8 10:50:51 sylvie upsmon:
3094:Dec  8 10:50:51 sylvie upsmon[3761]: Startup successful
3095:Dec  8 10:50:51 sylvie upsd[3757]: Connection from 127.0.0.1
3096:Dec  8 10:50:51 sylvie upsd[3757]: Client upsmaster@127.0.0.1 logged into UPS [UPS]
3097:Dec  8 10:50:51 sylvie nut: upsmon startup succeeded
3139:Dec  8 10:50:56 sylvie upsmon[3762]: UPS UPS@localhost on battery
3194:Dec  8 10:51:07 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3196:Dec  8 10:51:09 sylvie upsd[3757]: UPS [UPS] data is no longer stale
3197:Dec  8 10:51:11 sylvie upsmon[3762]: UPS UPS@localhost on line power
3201:Dec  8 10:51:19 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3203:Dec  8 10:51:21 sylvie upsd[3757]: UPS [UPS] data is no longer stale
3207:Dec  8 10:51:31 sylvie upsmon[3762]: UPS UPS@localhost on battery
3220:Dec  8 10:51:41 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3222:Dec  8 10:51:44 sylvie upsd[3757]: UPS [UPS] data is no longer stale
3231:Dec  8 10:51:46 sylvie upsmon[3762]: UPS UPS@localhost on line power
3233:Dec  8 10:51:56 sylvie upsmon[3762]: UPS UPS@localhost on battery
3236:Dec  8 10:51:58 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3238:Dec  8 10:52:00 sylvie upsd[3757]: UPS [UPS] data is no longer stale
3239:Dec  8 10:52:01 sylvie upsmon[3762]: UPS UPS@localhost on line power
3241:Dec  8 10:52:11 sylvie upsmon[3762]: UPS UPS@localhost on battery
3244:Dec  8 10:55:24 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3245:Dec  8 10:55:27 sylvie upsmon[3762]: Poll UPS [UPS@localhost] failed - Data stale
3246:Dec  8 10:55:27 sylvie upsmon[3762]: Communications with UPS UPS@localhost lost
3248:Dec  8 10:55:32 sylvie upsmon[3762]: Poll UPS [UPS@localhost] failed - Data stale
3249:Dec  8 10:55:37 sylvie upsmon[3762]: Poll UPS [UPS@localhost] failed - Data stale
3251:Dec  8 10:55:40 sylvie upsd[3757]: UPS [UPS] data is no longer stale
3252:Dec  8 10:55:42 sylvie upsmon[3762]: Communications with UPS UPS@localhost established
3253:Dec  8 10:55:42 sylvie upsmon[3762]: UPS UPS@localhost on line power
3258:Dec  8 11:13:37 sylvie upsmon[3762]: UPS UPS@localhost on battery
3261:Dec  8 11:14:49 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3263:Dec  8 11:14:51 sylvie upsd[3757]: UPS [UPS] data is no longer stale
3264:Dec  8 11:14:52 sylvie upsmon[3762]: UPS UPS@localhost on line power
3266:Dec  8 11:15:02 sylvie upsmon[3762]: UPS UPS@localhost on battery
3269:Dec  8 11:15:49 sylvie upsd[3757]: Data for UPS [UPS] is stale - check driver
3270:Dec  8 11:15:52 sylvie upsmon[3762]: Poll UPS [UPS@localhost] failed - Data stale
3271:Dec  8 11:15:52 sylvie upsmon[3762]: Communications with UPS UPS@localhost lost
3273:Dec  8 11:15:57 sylvie upsmon[3762]: Poll UPS [UPS@localhost] failed - Data stale
3275:Dec  8 11:16:07 sylvie upsd[3757]: Client upsmaster@127.0.0.1 set FSD on UPS [UPS]
3276:Dec  8 11:16:07 sylvie upsmon[3762]: Executing automatic power-fail shutdown
3278:Dec  8 11:16:07 sylvie upsmon[3762]: Auto logout and shutdown proceeding
3280:Dec  8 11:16:12 sylvie upsd[3757]: Host 127.0.0.1 disconnected (read failure)
3299:Dec  8 11:16:14 sylvie ups: upsmon shutdown failed
3300:Dec  8 11:16:14 sylvie upsd[3757]: Signal 15: exiting


Every time the UPS went to battery, the OVERLOAD light would light and the device would pop on and off line. If I used the UPS without the serial cable the UPS was happy.

(I did check the hardware - plugged in several desktops and they were fine. Of course I had no cables of monitoring software for the UPS. I let it alone with no load to allow it to charge up, but the minute I plugged it into the server and reconnected the Serial cable all hell broke loose.)

In the end I tried the USB connectivity:

config show nut:

nut=service
    MasterPass=<snip>
    Model=usbhid-ups
    SlavePass=<snip>
    status=enabled


I started up and all was well:

Dec 10 13:39:26 sylvie upsdrvctl: Network UPS Tools: 0.28 USB communication driver 0.28 - core 0.30 (2.2.0-)
Dec 10 13:39:26 sylvie upsdrvctl:
Dec 10 13:39:26 sylvie upsdrvctl: Detected a UPS: American Power Conversion/Smart-UPS 1500 FW:653.13.I USB FW:7.3
Dec 10 13:39:26 sylvie upsdrvctl: Using subdriver: APC/CyberPower HID 0.9
Dec 10 13:39:27 sylvie usbhid-ups[19817]: Startup successful
Dec 10 13:39:27 sylvie upsdrvctl: Network UPS Tools - UPS driver controller 2.2.0-
Dec 10 13:39:27 sylvie nut: upsdrvctl startup succeeded
Dec 10 13:39:27 sylvie upsd[19820]: listening on 0.0.0.0 port 3493
Dec 10 13:39:27 sylvie upsd[19820]: Connected to UPS [UPS]: usbhid-ups-UPS
Dec 10 13:39:27 sylvie upsd: listening on 0.0.0.0 port 3493
Dec 10 13:39:27 sylvie upsd[19821]: Startup successful
Dec 10 13:39:27 sylvie upsd: Connected to UPS [UPS]: usbhid-ups-UPS
Dec 10 13:39:27 sylvie nut: upsd startup succeeded
Dec 10 13:39:27 sylvie upsmon[19825]: Startup successful
Dec 10 13:39:27 sylvie upsmon: UPS: UPS@localhost (master) (power value 1)
Dec 10 13:39:27 sylvie upsmon: Using power down flag file /etc/killpower
Dec 10 13:39:27 sylvie upsmon:
Dec 10 13:39:27 sylvie upsd[19821]: Connection from 127.0.0.1
Dec 10 13:39:27 sylvie upsd[19821]: Client upsmaster@127.0.0.1 logged into UPS [UPS]
Dec 10 13:39:27 sylvie nut: upsmon startup succeeded

It's been an hour now and all is well - the above messages were the last ups/nut related log entries (apart from the ones I generated by examining the ups).

Sorry about the lengthy post, but this was really annoying me and there isn't anything there that matches my issue. Hopefully my experiences will help someone.

Fran.
--
Francis Crossen
PHP/MySQL/Web programming and consultancy to order

Offline mdo

  • *
  • 355
  • +0/-0
Re: NUT and APC SmartUPS - comminucations problems with serial port - FIXED
« Reply #1 on: December 10, 2008, 06:52:47 PM »
Good to see that it is working through USB but I would expect an APC SmartUPS 1500 to work through serial cable as well (unless there is a problem with the serial cable or the connector).

Quote
nut=service
    Device=/dev/ttyS0
    MasterPass=<snip>
    Model=apcsmart
    SlavePass=<snip>
    Type=7
    status=enabled

Type=7 looks strange to me because it is not a standard setting under SME (and we never used that for any APC SmartUPS). You might have added that parameter manually at some stage?

I would remove the 'Type=7' parameter and try it again via serial.
Michael
...

Offline fcrossen

  • 2
  • +0/-0
    • SeeIT Consult
Re: NUT and APC SmartUPS - comminucations problems with serial port - FIXED
« Reply #2 on: December 11, 2008, 02:59:38 PM »
Thanks for the reply Michael. I don't know where the type=7 came from, but I do know I didn't pull it out of mid air! My nut experiences to date were as the HOWTO suggests: 'It just works'.

Following your advice:

'# service nut stop':

Dec 11 13:27:52 sylvie upsmon[19826]: Signal 15: exiting
Dec 11 13:27:52 sylvie upsd[19821]: Client upsmaster@127.0.0.1 logged out
Dec 11 13:27:52 sylvie nut: upsmon shutdown succeeded
Dec 11 13:27:52 sylvie upsd[19821]: Signal 15: exiting
Dec 11 13:27:52 sylvie nut: upsd shutdown succeeded
Dec 11 13:27:52 sylvie usbhid-ups[19817]: Signal 15: exiting


Changed parameters to serial connection:

'# config show nut':

nut=service
    Device=/dev/ttyS0
    MasterPass=<snip>
    Model=apcsmart
    SlavePass=<snip>
    status=enabled


'# service nut start' (I also changed the cabling, hence the USB disconnect)

Dec 11 13:31:47 sylvie kernel: usb 1-7.2: USB disconnect, address 3
Dec 11 13:33:07 sylvie apcsmart[16934]: Startup successful
Dec 11 13:33:07 sylvie upsdrvctl: Network UPS Tools - UPS driver controller 2.2.0-
Dec 11 13:33:07 sylvie nut: upsdrvctl startup succeeded
Dec 11 13:33:07 sylvie upsd: listening on 0.0.0.0 port 3493
Dec 11 13:33:07 sylvie upsd[16937]: listening on 0.0.0.0 port 3493
Dec 11 13:33:07 sylvie upsd[16937]: Connected to UPS [UPS]: apcsmart-UPS
Dec 11 13:33:07 sylvie upsd: Connected to UPS [UPS]: apcsmart-UPS
Dec 11 13:33:07 sylvie upsd[16938]: Startup successful
Dec 11 13:33:07 sylvie nut: upsd startup succeeded
Dec 11 13:33:07 sylvie upsmon: UPS: UPS@localhost (master) (power value 1)
Dec 11 13:33:07 sylvie upsmon[16942]: Startup successful
Dec 11 13:33:07 sylvie upsd[16938]: Connection from 127.0.0.1
Dec 11 13:33:07 sylvie upsd[16938]: Client upsmaster@127.0.0.1 logged into UPS [UPS]
Dec 11 13:33:07 sylvie upsmon: Using power down flag file /etc/killpower
Dec 11 13:33:07 sylvie upsmon:
Dec 11 13:33:07 sylvie nut: upsmon startup succeeded


All of 5 seconds of grace before:

Dec 11 13:33:12 sylvie upsmon[16943]: UPS UPS@localhost on battery
Dec 11 13:33:12 sylvie wall[16947]: wall: user nut broadcasted 1 lines (30 chars)
Dec 11 13:33:27 sylvie upsmon[16943]: UPS UPS@localhost on line power
Dec 11 13:33:27 sylvie wall[16962]: wall: user nut broadcasted 1 lines (33 chars)
Dec 11 13:33:32 sylvie upsmon[16943]: UPS UPS@localhost on battery
Dec 11 13:33:32 sylvie wall[16970]: wall: user nut broadcasted 1 lines (30 chars)
Dec 11 13:33:53 sylvie upsmon[16943]: UPS UPS@localhost on line power
Dec 11 13:33:53 sylvie wall[17015]: wall: user nut broadcasted 1 lines (33 chars)


Ouch - '# service nut stop' - change cabling again:

Dec 11 13:33:58 sylvie upsmon[16943]: Signal 15: exiting
Dec 11 13:33:58 sylvie upsd[16938]: Client upsmaster@127.0.0.1 logged out
Dec 11 13:33:58 sylvie nut: upsmon shutdown succeeded
Dec 11 13:33:58 sylvie upsd[16938]: Signal 15: exiting
Dec 11 13:33:58 sylvie nut: upsd shutdown succeeded
Dec 11 13:33:58 sylvie apcsmart[16934]: Signal 15: exiting
Dec 11 13:36:12 sylvie kernel: usb 1-7.2: new low speed USB device using address 4
Dec 11 13:36:13 sylvie kernel: hiddev96: USB HID v1.10 Device [American Power Conversion Smart-UPS 1500 FW:653.13.I USB FW:7.3] on usb-0000:00:1d.7-7.2


Changed parameters back to usb connection:

'# config show nut':

nut=service
    MasterPass=<snip>
    Model=usbhid-ups
    SlavePass=<snip>
    status=enabled


'# service nut start':

Dec 11 13:38:49 sylvie upsdrvctl: Network UPS Tools: 0.28 USB communication driver 0.28 - core 0.30 (2.2.0-)
Dec 11 13:38:49 sylvie upsdrvctl:
Dec 11 13:38:49 sylvie upsdrvctl: Detected a UPS: American Power Conversion/Smart-UPS 1500 FW:653.13.I USB FW:7.3
Dec 11 13:38:49 sylvie upsdrvctl: Using subdriver: APC/CyberPower HID 0.9
Dec 11 13:38:51 sylvie usbhid-ups[17299]: Startup successful
Dec 11 13:38:51 sylvie upsdrvctl: Network UPS Tools - UPS driver controller 2.2.0-
Dec 11 13:38:51 sylvie nut: upsdrvctl startup succeeded
Dec 11 13:38:51 sylvie upsd[17302]: listening on 0.0.0.0 port 3493
Dec 11 13:38:51 sylvie upsd: listening on 0.0.0.0 port 3493
Dec 11 13:38:51 sylvie upsd[17302]: Connected to UPS [UPS]: usbhid-ups-UPS
Dec 11 13:38:51 sylvie upsd[17303]: Startup successful
Dec 11 13:38:51 sylvie upsd: Connected to UPS [UPS]: usbhid-ups-UPS
Dec 11 13:38:51 sylvie nut: upsd startup succeeded
Dec 11 13:38:51 sylvie upsmon[17307]: Startup successful
Dec 11 13:38:51 sylvie upsmon: UPS: UPS@localhost (master) (power value 1)
Dec 11 13:38:51 sylvie upsd[17303]: Connection from 127.0.0.1
Dec 11 13:38:51 sylvie upsmon: Using power down flag file /etc/killpower
Dec 11 13:38:51 sylvie upsmon:
Dec 11 13:38:51 sylvie upsd[17303]: Client upsmaster@127.0.0.1 logged into UPS [UPS]
Dec 11 13:38:51 sylvie nut: upsmon startup succeeded


Back to normality!

The serial cable is the same one I've been using up until now, so I doubt it is at fault. Of course I have no continuity tester for 9-pin APC cables so...

Regards,
Francis.
--
Francis Crossen
PHP/MySQL/Web programming and consultancy to order

Offline mdo

  • *
  • 355
  • +0/-0
Re: NUT and APC SmartUPS - comminucations problems with serial port - FIXED
« Reply #3 on: December 11, 2008, 10:35:18 PM »
The only other things that I can think of:

- Make sure that the cable is the proper type: I think it has to be a '940-00024C'
- Check for cable 'damage' - We have had twice a situation (non APC UPS, different brand but serial cable) where the Nut-UPS communication did fail (after it had worked for months) and in these scenarios, the serial cable was squeezed (the UPS had been moved a bit and after that it pushed the cable). After the cable was stretched (and later replaced), nut worked perfectly fine again.
- If all fails, stay with USB interface. We had that once that the serial connection created ongoing issues and swapped over to USB. We blamed the UPS serial port for that (which might be wrong).
Michael
...