Koozali.org: home of the SME Server
Obsolete Releases => SME Server 8.x => Topic started by: hungree on November 17, 2014, 02:04:05 AM
-
Good day!
I'm new to SME Server and I need your help.
The scenario is below.
We have SME Server running kernel 2.6.18-398.el5.
- Updated packages with yum through the server-manager. (see list below)
- After 3 days, diagnosed that the RAID setup needed resyncing. Done.
- After the weekend, I was unable to access the SME Server.
- Did a hard restart (held the power button for 10 seconds).
- While booting, SME Server gets stuck at "Starting ldap: " with the message "waiting for slapd to startup".
I tried searching, and all I found was a discussion regarding the bug.
I really need your help.
===================================================
=== yum reports available updates:=========================
===================================================
at.x86_64 3.1.8-84.el5_11.1 updates
automake.noarch 1.9.6-3.el5 base
bash.x86_64 3.2-33.el5_11.4 updates
centos-release.x86_64 10:5-11.el5.centos base
centos-release-notes.x86_64 5.11-0 base
device-mapper-multipath.x86_64 0.4.7-63.el5 base
dmidecode.x86_64 1:2.12-1.el5 base
e2fsprogs.x86_64 1.39-37.el5 base
e2fsprogs-libs.x86_64 1.39-37.el5 base
glibc.x86_64 2.5-123 base
glibc-common.x86_64 2.5-123 base
httpd.x86_64 2.2.3-91.el5.centos base
hwdata.noarch 0.213.30-1.el5 base
kernel.x86_64 2.6.18-398.el5 base
kpartx.x86_64 0.4.7-63.el5 base
krb5-libs.x86_64 1.6.1-80.el5_11 updates
libgcc.x86_64 4.1.2-55.el5 base
libstdc++.x86_64 4.1.2-55.el5 base
libvolume_id.x86_64 095-14.32.el5 base
lvm2.x86_64 2.02.88-13.el5 base
microcode_ctl.x86_64 2:1.17-7.el5 base
mkinitrd.x86_64 5.1.19.6-82.el5 base
mod_ssl.x86_64 1:2.2.3-91.el5.centos base
nash.x86_64 5.1.19.6-82.el5 base
net-snmp.x86_64 1:5.3.2.2-25.el5_11 updates
net-snmp-libs.x86_64 1:5.3.2.2-25.el5_11 updates
net-snmp-utils.x86_64 1:5.3.2.2-25.el5_11 updates
nscd.x86_64 2.5-123 base
nss.x86_64 3.16.1-4.el5_11 updates
openssl.x86_64 0.9.8e-28.el5.sme smeupdates
perl.x86_64 4:5.8.8-43.el5_11 updates
perl-suidperl.x86_64 4:5.8.8-43.el5_11 updates
php.x86_64 5.3.3-15.el5.sme smeupdates
php-cli.x86_64 5.3.3-15.el5.sme smeupdates
php-common.x86_64 5.3.3-15.el5.sme smeupdates
php-devel.x86_64 5.3.3-15.el5.sme smeupdates
php-gd.x86_64 5.3.3-15.el5.sme smeupdates
php-imap.x86_64 5.3.3-15.el5.sme smeupdates
php-ldap.x86_64 5.3.3-15.el5.sme smeupdates
php-mbstring.x86_64 5.3.3-15.el5.sme smeupdates
php-mysql.x86_64 5.3.3-15.el5.sme smeupdates
php-pdo.x86_64 5.3.3-15.el5.sme smeupdates
php-xml.x86_64 5.3.3-15.el5.sme smeupdates
procmail.x86_64 3.22-17.1.2.el5_10 updates
samba3x.x86_64 3.6.23-6.el5 base
samba3x-client.x86_64 3.6.23-6.el5 base
samba3x-common.x86_64 3.6.23-6.el5 base
samba3x-winbind.x86_64 3.6.23-6.el5 base
shadow-utils.x86_64 2:4.0.17-23.el5 base
squid.x86_64 7:2.6.STABLE21-7.el5_10 updates
tzdata.x86_64 2014i-1.el5 updates
udev.x86_64 095-14.32.el5 base
-
Hi and welcome,
What exactly is your problem now? The "waiting for slapd to startup" message is normal; not being able to access SME Server is not.
Please explain how you are trying to access it.
-
Hi RequestedDeletion,
I see. My problem is that the boot does not continue past the messages below. I have been waiting about 3 hours and nothing changes. We cannot access our files.
Before this happened I had received a notification from smartd that one of my disks has a CurrentPendingSector error. Is this connected to my problem?
Our server is set up as RAID1.
Starting ldap : [Ok]
waiting for slapd to startup
waiting for slapd to startup
waiting for slapd to startup
-
It all sounds a lot like an issue at a lower level: hardware.
1. Why was the RAID out of sync within 3 days?
2. Why could you not access the server, so that a power button reset was required?
3. Why have you been waiting for 3 hours when it should take about 2 minutes to reach the login prompt?
I would reboot in rescue mode or 'single' mode, take a look at the logs and start diagnosing your hardware.
-
Before this happened I had received a notification from smartd that one of my disks has a CurrentPendingSector error.
What is the full message that smartd gave?
-
RequestedDeletion,
1. Resyncing was already done last Friday.
2. I can, but I am clueless about what to do. My senior has just resigned and did not tell me anything about this server.
3. I'm clueless about what to do.
I already booted in rescue mode and ran cat /proc/mdstat, with these results:
sh-3.2# cat /proc/mdstat
md2: active raid1 sdb2[0] sda2[1]
1953407552 blocks [2/2] [UU]
[>..............................] resync = 0.9% (19150336/1953407552) finish=188.9min speed=170609K/sec
md1: active raid1 sdb1[0] sda1[1]
104320 blocks [2/2] [UU]
unused devices: <none>
The smartd message is:
SUBJECT:SMART error (CurrentPendingSector) detected on host: precision-fs
This email was generated by the smartd daemon running on:
host name: precision-fs
DNS domain: precision.lcl
NIS domain: (none)
The following warning/error was logged by the smartd daemon:
Device: /dev/sdb [SAT], 11 Currently unreadable (pending) sectors
For details see host's SYSLOG.
You can also use the smartctl utility for further investigation.
Another email message will be sent in 1 days if the problem persists
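For reading output like the /proc/mdstat listing above, a small helper can condense each array to one line. This is only a sketch: the function name mdstat_summary is made up for illustration, and it assumes a POSIX shell with awk available. It treats "[UU]" as healthy and any "_" in the status flags as a missing member.

```shell
#!/bin/sh
# Summarize RAID state from /proc/mdstat-style text read on stdin.
# Prints one line per array: "<name> <ok|DEGRADED>[ syncing <percent>]".
# mdstat_summary is a hypothetical helper name, not an existing tool.
mdstat_summary() {
    awk '
        # Array header line, e.g. "md2 : active raid1 sdb2[0] sda2[1]"
        /^md[0-9]+ *:/ { name = $1; sub(/:$/, "", name) }
        # Status line, e.g. "104320 blocks [2/2] [UU]"
        /blocks/ && match($0, /\[[U_]+\]/) {
            flags = substr($0, RSTART, RLENGTH)
            state[name] = (flags ~ /_/) ? "DEGRADED" : "ok"
            order[++n] = name
        }
        # Progress line, e.g. "[>...] resync = 0.9% (...)"
        /resync|recovery/ && match($0, /[0-9.]+%/) {
            prog[name] = substr($0, RSTART, RLENGTH)
        }
        END {
            for (i = 1; i <= n; i++) {
                a = order[i]
                printf "%s %s%s\n", a, state[a], ((a in prog) ? " syncing " prog[a] : "")
            }
        }
    '
}
```

On a running system it could be called as `mdstat_summary < /proc/mdstat`. For the disk itself, the smartd mail already points to smartctl; `smartctl -a /dev/sdb` prints the full SMART report for that drive, including the Current_Pending_Sector attribute.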
-
1. Resyncing was already done last Friday.
Yes, I understand that. I wonder why you had to resync in the first place.
2. My senior has just resigned and did not tell me anything about this server.
That's the biggest "error"...
3. I'm clueless about what to do.
Boot in 'single' mode and check the logs.
I already booted in rescue mode and ran cat /proc/mdstat
So something is wrong: "Device: /dev/sdb [SAT], 11 Currently unreadable (pending) sectors"
-
Hwfang,
I did a resync because there was a status in the server-manager saying something like "server needs to restart if you reconfigure the settings".
Boot in 'single' mode and check the logs.
I'm already in rescue mode, if that is the same as single mode. I'm sorry, but what is the command for checking the logs?
So something is wrong: "Device: /dev/sdb [SAT], 11 Currently unreadable (pending) sectors"
Is it OK if I remove this disk and replace it with a new one? Can I use the server while only one hard disk is attached?
-
For single mode you need to add the word "single" to the boot parameters. This can be done when you boot from the HD and GRUB presents its options. See the instructions below the main menu.
Yes, you can swap the HD.
-
Please see: http://wiki.contribs.org/Raid#Replacing_and_Upgrading_Hard_Drive_after_HD_fail
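Before pulling a disk that is still listed in an array, its partitions are commonly marked failed and removed with mdadm. The sketch below only prints the commands for review and runs nothing. It uses generic Linux md commands and is not taken from the linked wiki page. The function name plan_disk_removal is hypothetical, and it assumes a POSIX shell with awk.

```shell
#!/bin/sh
# Print (do NOT run) the mdadm commands that would detach every partition
# of one disk from the arrays listed in /proc/mdstat-style text on stdin.
# plan_disk_removal is a hypothetical helper name used for illustration.
# Usage: plan_disk_removal sdb < /proc/mdstat
plan_disk_removal() {
    disk=$1
    awk -v disk="$disk" '
        # Array header line, e.g. "md2 : active raid1 sdb2[0] sda2[1]"
        /^md[0-9]+ *:/ {
            name = $1; sub(/:$/, "", name)
            for (i = 2; i <= NF; i++) {
                part = $i
                sub(/\[.*$/, "", part)      # "sdb2[0]" -> "sdb2"
                if (part ~ ("^" disk "[0-9]+$")) {
                    printf "mdadm --manage /dev/%s --fail /dev/%s\n", name, part
                    printf "mdadm --manage /dev/%s --remove /dev/%s\n", name, part
                }
            }
        }
    '
}
```

Review the printed lines against your own mdstat output before running any of them as root. Getting the disk name wrong would detach the healthy drive instead.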
-
RequestedDeletion,
I'm sorry, but I cannot tell where to add "single" in the following:
root (hd0,0)
kernel /vmlinuz-2.6.18-398.el5 ro root=/dev/main/root
initrd /initrd-2.6.18-398.el5.img
-
root (hd0,0)
kernel /vmlinuz-2.6.18-398.el5 ro root=/dev/main/root
initrd /initrd-2.6.18-398.el5.img
Add "single" to the end of the kernel line:
kernel /vmlinuz-2.6.18-398.el5 ro root=/dev/main/root single
Then press Enter, and press 'b' to boot.
-
Hwfang,
Sir, which log should I view, and what is the command?
-
All logs are in /var/log
You might want to start with examining /var/log/messages
"less /var/log/messages" to browse the messages log using arrow keys, 'q' to quit
"grep error /var/log/messages"
"grep raid /var/log/messages"
etc.
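The separate grep calls above can be folded into one case-insensitive search. A minimal sketch, assuming a POSIX shell and a grep that supports -E; the function name scan_log is made up for illustration:

```shell
#!/bin/sh
# Print lines of a log file that mention any of the given keywords,
# case-insensitively, prefixed with their line numbers.
# scan_log is a hypothetical helper name. Usage: scan_log FILE KEYWORD...
scan_log() {
    file=$1
    shift
    # Join the keywords into one alternation: "error|raid|..."
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%|}
    grep -inE "$pattern" "$file"
}
```

For example, `scan_log /var/log/messages error raid smartd | less` would page through every line mentioning any of those three words. Each keyword is treated as a regular expression, so plain words are the safest input.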
-
Hwfang,
After booting into single mode, the system detected filesystem errors.
Checking filesystems
EXT3-fs error (device dm-0): ext3_lookup: unlinked inode 113476142 in dir #113475859
EXT3-fs error (device dm-0): ext3_lookup: unlinked inode 113476142 in dir #113475859
/dev/main/root contains a file system with errors, check forced.
/dev/main/root: | | 0.1%
-
Hwfang,
It is now running temporarily after I removed the affected hard drive.