Koozali.org: home of the SME Server

Obsolete Releases => SME Server 9.x => Topic started by: calisun on November 24, 2015, 11:19:09 PM

Title: Server 9_64 died after one week online
Post by: calisun on November 24, 2015, 11:19:09 PM
My SME server died last month (after several years of faithful service) at a co-lo facility.
So I got a PC set up real quick to act as a temporary server until I got a new (used) server ready for service.
The temporary PC was only supposed to work as a mail server and to host a simple HTML page with a 503 error message.

The PC:
AMD Dual core E-350 processor
Brand new 750GB SATA hard drive
8GB RAM
New SME 9_64 install without any contribs.

Right after install, for about a week, I was able to receive mail and connect to server-manager remotely.
One week later, I was able to connect to the server, but it would not accept any passwords (known to be good) for either mail or server-manager.
So I asked an employee at co-lo facility to reboot the PC for me, after that everything was down.

Yesterday I retrieved the PC from co-lo facility and I tried to look at log files to see what happened, but I am unable to. SME server will not boot, I just get a constant string of error messages (see attached pics of boot screen).

So where do I start figuring out what happened?
 

Title: Re: Server 9_64 died after one week online
Post by: Stefano on November 24, 2015, 11:24:29 PM
your HD is gone..

restore from backup
Title: Re: Server 9_64 died after one week online
Post by: calisun on November 24, 2015, 11:28:55 PM
Quote
your HD is gone..

This is a brand new drive, I just got it two weeks ago specifically to install SME Server for temporary server.

Quote
restore from backup

Nothing to restore, this was used as a temporary server only
Title: Re: Server 9_64 died after one week online
Post by: Stefano on November 24, 2015, 11:30:43 PM
Quote
This is a brand new drive, I just got it two weeks ago specifically to install SME Server for temporary server.

it doesn't matter, unfortunately

Quote
Nothing to restore, this was used as a temporary server only

well.. I guess that your users are using pop3.. otherwise, all emails are gone (and that's bad)
Title: Re: Server 9_64 died after one week online
Post by: DanB35 on November 25, 2015, 12:41:47 PM
Concur, it looks very much like the hard drive has died, though I guess it's possible it's the controller or the cable.  Infant mortality is not unheard of--did you do any sort of testing of the drive before you installed SME on it?  Badblocks and a long SMART self-test are two good ways to exercise the drive and see if there are any issues out of the gate.

Can you connect that drive to another computer, ideally using a different cable?  If so, see if you can get the SMART data from it--under *nix, you'd use the command 'smartctl -a /dev/whatever'
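
For anyone who wants to script that check, here is a minimal Python sketch, not a definitive tool. It assumes smartmontools is installed and that the drive reports the usual ATA attribute table in its 'smartctl -a' output. The device path, the function names, and the choice of three attributes to watch are illustrative assumptions, not anything from SME Server itself.

```python
import subprocess

# Attribute names as printed by smartctl; picking these three is an
# assumption of this sketch (they commonly accompany failing media).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def read_report(device):
    """Run 'smartctl -a <device>' and return its text output.

    smartctl uses a bitmask exit status, so a non-zero code is not
    treated as an error here. Usually needs root.
    """
    result = subprocess.run(["smartctl", "-a", device],
                            capture_output=True, text=True)
    return result.stdout


def parse_smart_attributes(report):
    """Return {attribute_name: raw_value} from the SMART attribute table."""
    attrs = {}
    in_table = False
    for line in report.splitlines():
        fields = line.split()
        if fields[:2] == ["ID#", "ATTRIBUTE_NAME"]:
            in_table = True  # header row: data rows follow
            continue
        if in_table:
            if len(fields) < 10 or not fields[0].isdigit():
                break  # blank line or next section ends the table
            attrs[fields[1]] = fields[9]  # first token of RAW_VALUE
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is a non-zero integer."""
    return {name: int(raw) for name, raw in attrs.items()
            if name in WATCHED and raw.isdigit() and int(raw) > 0}


if __name__ == "__main__":
    # "/dev/sdX" is a placeholder; substitute the real device node.
    print(suspicious(parse_smart_attributes(read_report("/dev/sdX"))))
```

An empty result only means those three counters are zero; it does not rule out a bad cable or controller, and it is no substitute for a long self-test ('smartctl -t long') or a badblocks pass.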
Title: Re: Server 9_64 died after one week online
Post by: JohnG on November 27, 2015, 04:34:56 PM
Quote
This is a brand new drive, I just got it two weeks ago specifically to install SME Server for temporary server.

Being brand new is a reason for it to fail, not a reason for why it shouldn't.
Title: Re: Server 9_64 died after one week online
Post by: janet on November 28, 2015, 01:57:57 AM
calisun

"Brand new 750GB SATA hard drive"

Out of interest, what is the brand, model, etc. of the hard disk?
Full details please, for the record & so I don't buy one in future.
Title: Re: Server 9_64 died after one week online
Post by: DanB35 on November 28, 2015, 02:17:57 AM
Quote
full details please for the record & so I don't buy one in future.

That really sounds like an overreaction--infant mortality can happen with anything.  No matter what make or model you choose, if you look hard enough, you can find someone who's had one DOA, or fail very soon (I once bought three WD Red 3 TB drives, and two of them were DOA--incorrectly reporting their capacity as 2 TB.  The replacements, and the third drive, have been fine 24x7 for the last 18 months).  Thorough burn-in and testing, redundancy, and backups are all important if you value your data.
Title: Re: Server 9_64 died after one week online
Post by: TerryF on November 28, 2015, 02:50:16 AM
With you janet, some are better than others, or should that be some are worse than others..

Having said that, I've only ever had issues with Seagates, due to a bug in the firmware. I still have two of the drives, with updated firmware installed, but I don't trust them; they're just used as test drives.
Title: Re: Server 9_64 died after one week online
Post by: guest22 on November 28, 2015, 03:11:32 AM
sounds like "Russian Roulette" to me... ;-)
Title: Re: Server 9_64 died after one week online
Post by: janet on November 28, 2015, 03:34:09 AM
DanB35

I agree with everything you said.

My comment was brief.
I was really just gathering anecdotal evidence (or I should really say anecdotal stories), to go into my/the collective consciousness of good/bad hard disks, for future reference.

If I have gathered enough bad stories about something, then that will contribute more strongly to any decision I might make to avoid or use those products.

Yes I always like to burn any new drive in for a while before putting it into serious service.

Presently WD Blacks are working for me.
Title: Re: Server 9_64 died after one week online
Post by: calisun on January 28, 2016, 05:27:23 AM
So it looks like the drive was not bad after all. After last comments, I just agreed that the drive was dead and left it at that. Yesterday I decided to play around with Android X86. I used my desktop and put in that drive to test. The drive partitioned and formatted without any issues and now Android X86 is running on it great, no slowdown, no hangs, no system error messages.

Unfortunately all SME server data is destroyed, so I can't go back and try to figure out what happened. But the good news, the drive is working fine.
Title: Re: Server 9_64 died after one week online
Post by: brianr on January 28, 2016, 11:10:52 AM
Quote
So it looks like the drive was not bad after all. After last comments, I just agreed that the drive was dead and left it at that. Yesterday I decided to play around with Android X86. I used my desktop and put in that drive to test. The drive partitioned and formatted without any issues and now Android X86 is running on it great, no slowdown, no hangs, no system error messages.

Unfortunately all SME server data is destroyed, so I can't go back and try to figure out what happened. But the good news, the drive is working fine.

I would NOT use that drive for anything critical. Just because it works now does not mean it won't fail similarly in the future.

I personally would have run the manufacturer's diagnostics on it in its "broken" state and, if something was reported, sent it back for a replacement.