Koozali.org: home of the SME Server
Obsolete Releases => SME Server 8.x => Topic started by: larieu on July 20, 2011, 12:25:54 AM
-
Long story short:
On an unattended server there were many power losses (at first the UPS batteries carried it through, but after several long outages the batteries failed, and after that several more power losses occurred at about 5 minutes).
I found the server in a state where it fails at boot with a kernel panic.
I used the install CD and went into recovery mode.
There the partition is mounted read-only (ro).
The first attempt to run fsck gave me the error "no /etc/fstab".
I copied /mnt/sysimage/etc/fstab to /etc/fstab.
After that fsck worked.
After the reboot I get an error about /sbin/init.
Back in recovery mode, this time /mnt/sysimage is mounted rw.
It seems that /sbin/init is OK and has the right ownership and permissions.
Any further steps (workarounds) will be appreciated.
Thanks in advance.
-
More data.
The exact error is as follows:
no fstab.sys, mounting internal defaults
Switching to new root and running init
unmounting old /dev
unmounting old /proc
unmounting old /sys
exec of init (/sbin/init) failed!!!: No such file or directory
Kernel panic - not syncing: Attempted to kill init!
I noticed that, compared with a working server, the mtab is different.
Mine:
cat /mnt/sysimage/etc/mtab
/dev/mapper/main-root / ext3 rw,usrquota,grpquota 0 0
/dev/md1 /boot ext3 rw 0 0
sysfs /sys sysfs rw 0 0
proc /proc proc rw 0 0
Compared with the other one:
cat /etc/mtab
/dev/mapper/main-root / ext3 rw,usrquota,grpquota 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
/dev/md1 /boot ext3 rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
Also, trying to chroot into /mnt/sysimage gives an error:
chroot: cannot run command '/bin/sh' : No such file or directory
-
More data.
I also found that the /mnt/sysimage/etc/sysconfig folder is missing.
Much of its content is under /mnt/sysimage/lost+found, with names preceded by # and a number.
I wonder if doing an "upgrade" with the boot CD would solve this
(boot from the CD and, instead of going into linux rescue, just type sme).
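Before choosing between repair and reinstall, it can help to list which key paths are really missing under the mounted system. A minimal sketch, assuming the damaged root is mounted at /mnt/sysimage as above; the function name check_root_paths is hypothetical and the path list only contains the files mentioned in this thread:

```shell
# Hypothetical helper: report which expected paths are missing under a mounted root.
# Usage: check_root_paths /mnt/sysimage
check_root_paths() {
    root=$1
    missing=0
    # Paths taken from this thread; extend the list as needed.
    for p in /sbin/init /bin/sh /etc/fstab /etc/sysconfig; do
        if [ ! -e "$root$p" ]; then
            echo "MISSING: $p"
            missing=$((missing + 1))
        fi
    done
    echo "$missing path(s) missing under $root"
    # Exit status is 0 only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

Note that exec can also fail with "No such file or directory" when the file itself exists but its dynamic loader or shared libraries are missing, which this check does not cover.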
-
IMHO you should reinstall and restore. You suffered major filesystem corruption; I would not try to recover.
my 2c
-
Thanks Stefano.
At the moment I am trying to do something like a "manual affa".
I have started to back up over the network:
/home
/opt
/etc/passwd
Does anyone know what else I need to back up? (What does an affa job look like?)
Does anyone know how to see which contribs I had on the old install?
After this I want to reinstall, put the data back, and do a "manual rise" (copy the data over the new install).
Any "guide" is welcome. I had a sleepless night and it is quite hard to find the right information on the wiki.
Thanks
-
Found some info here:
http://wiki.contribs.org/Backup_server_config
Does anyone know if the USB disk can be FAT32, or must it be ext3?
-
IMHO you should reinstall and restore. you suffered a major filesystem corruption, I would not try to recover..
I agree. Restore from backup.
-
larieu
See http://wiki.contribs.org/USBDisks
-
:-x
Unfortunately the backup server was in the same area and is in worse shape than this one (the HDD does not boot at all and cannot be read).
I have manually copied from the production server:
/home
/etc/e-smith
/opt
/var/log
/root
/etc/group
/etc/gshadow
/etc/passwd
/etc/samba/secrets.tdb
/etc/samba/smbpasswd
/etc/shadow
/etc/smbpasswd
/etc/ssh
/etc/sudoers
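A manual copy like this can also be done as a single archive. A minimal sketch, assuming GNU tar is available in the rescue environment and the damaged root is mounted at /mnt/sysimage; the function name backup_paths and the archive location are hypothetical:

```shell
# Hypothetical helper: archive selected paths from a mounted root,
# skipping any that are missing.
# Usage: backup_paths /mnt/sysimage /mnt/backup/rescue.tar.gz
backup_paths() {
    root=$1
    archive=$2
    list=$(mktemp) || return 1
    # Paths as listed in this post, written relative to the root.
    for p in home etc/e-smith opt var/log root etc/group etc/gshadow \
             etc/passwd etc/samba/secrets.tdb etc/samba/smbpasswd \
             etc/shadow etc/smbpasswd etc/ssh etc/sudoers; do
        if [ -e "$root/$p" ]; then
            echo "$p" >> "$list"
        else
            echo "skipping missing path: /$p" >&2
        fi
    done
    # -C makes the stored names relative to the root; -T reads the name list.
    tar -czf "$archive" -C "$root" -T "$list"
    status=$?
    rm -f "$list"
    return $status
}
```

Skipping missing paths matters on a corrupted filesystem, where one absent file would otherwise make tar report an error for the whole run.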
I hope I'll see the contribs in /var/log (under yum),
and I also hope I can recover all the data from this bunch of files.
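To pull the package names out of a copied yum log, something like the following may help. This is only a sketch: it assumes the log is /var/log/yum.log and that its lines look like "Jul 20 00:25:54 Installed: name-version-release.arch"; the function name list_yum_installed is hypothetical.

```shell
# Hypothetical helper: list package names recorded as installed in a yum log.
# Usage: list_yum_installed /mnt/sysimage/var/log/yum.log
list_yum_installed() {
    # Assumed line format: "Mon DD HH:MM:SS Installed: name-version-release.arch"
    awk '$4 == "Installed:" { print $5 }' "$1" | sort -u
}
```

The output lists everything yum installed, not only contribs, so the contribs still have to be picked out by eye.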
-
It seems that it almost worked:
installed fresh
updated
installed the contribs I remembered
then
copied all the data into a temporary folder
issued "signal-event pre-restore"
put all the data in the correct locations
issued "signal-event post-upgrade"
rebooted
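The order of the steps above can be sketched as a small script. It is a sketch only: it assumes the temporary folder mirrors the final absolute layout (e.g. STAGING/home, STAGING/etc/e-smith), and by default it just prints the commands instead of running them. The names restore_sequence, run_step and DRY_RUN are hypothetical.

```shell
# Print a command (default) or execute it when DRY_RUN=0.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Sketch of the restore order described above.
# Usage: restore_sequence /path/to/temporary/folder
restore_sequence() {
    staging=$1
    run_step signal-event pre-restore
    # Assumption: the staging folder mirrors the target layout under /.
    run_step cp -a "$staging/." /
    run_step signal-event post-upgrade
    run_step reboot
}
```

Running it once in the default dry-run mode shows exactly what would be executed before anything is overwritten.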
I had a minor problem with eth0/eth1, solved from the console (I have 4 Ethernet ports on that motherboard and they were wrongly allocated by default).
Then I got something in /var/log/messages about getgrnam and squid. I found on the forum that it could be caused by the permissions on /etc/group; chmod 644 solved it.
Apparently I have only one problem left:
I cannot receive mail from the internet on port 25 (no mail from external servers, e.g. Yahoo, Google, can reach my users).
If I send mail from an internal or external location with a correctly set up Thunderbird, it works (SMTPS authentication seems to be OK).
If I go to /webmail and try to send mail I get this error:
Failed to connect to localhost:25 [SMTP: Failed to connect socket: Connection refused (code: -1, response: )]
This error also appears in /var/log/messages:
mail HORDE[11357]: [imp] Failed to connect to localhost:25 [SMTP: Failed to connect socket: Connection refused (code: -1, response: )] [pid 11357 on line 93 of "/home/httpd/html/horde/imp/lib/UI/Compose.php"]
Apart from this no other errors occur,
and /var/log/qpsmtpd/current is empty.
Any help will be appreciated.
-
It seems you don't have a working qpsmtpd service. You need to find out why not. Here are some steps to get you started:
rpm -q qpsmtpd smeserver-qpsmtpd
rpm -V qpsmtpd smeserver-qpsmtpd
sv status /service/qpsmtpd
config show qpsmtpd
-
Thanks Charlie.
I figured it out:
some custom templates came from additional contribs which were missing from the "fresh install".
Installing those contribs solved the problem.
Apparently qpsmtpd was started, but ....
Now another issue.
I have several users who have problems accessing some mail folders, reading or writing email, or writing email from a mobile phone (I think all are related to the same issue).
For example,
one user ("problem.user")
cannot see anything in the inbox, but can see everything in the subfolders.
Apparently no mail arrives for him,
but if I go into
/home/e-smith/files/users/problem.user/Maildir/cur
I can see all his mail.
All the files there also have the appropriate ownership and permissions (read/write by that user).
Any suggestions?
For the inbox the case is clear.
For those with problems sending from Thunderbird, the Sent folder shows the same behavior.
For those with problems sending from mobile, Sent Messages shows the same behavior.
For those with problems sending from webmail, send-mail ....
Unfortunately I'm not able to catch any error (or I'm looking at something unrelated).
Which log should I look in for a related error?
Thanks
-
Check ownership and access permission for all the directories back to his home directory.
-
Thanks Charlie.
It seems that the folders should have 755 (I didn't expect this): group and others must be granted search permission.
I also found something strange:
if you need to inject recovered mail (or mail from a backup), it is better to put it in "new" instead of "cur".
I saw odd behavior in Thunderbird when I injected it directly into cur.
And to be clear, for anyone who gets here: ALL MAIL files should have 600 (read/write for the user only).
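Injecting recovered messages into "new" can be sketched as below. It assumes the recovered files sit in one flat folder and that names copied from a "cur" folder may end in a Maildir flag suffix such as ":2,S", which messages in "new" do not carry. The function name inject_mail is hypothetical, and ownership still has to be set to the user afterwards.

```shell
# Hypothetical helper: copy recovered message files into a Maildir's "new" folder.
# Usage: inject_mail /path/to/recovered /home/e-smith/files/users/USERNAME/Maildir
inject_mail() {
    src=$1
    maildir=$2
    mkdir -p "$maildir/new" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        # Drop a flag suffix like ":2,S" left over from a "cur" folder.
        name=${name%%:2,*}
        cp -p "$f" "$maildir/new/$name" || return 1
        # Mail files are readable and writable by the owner only.
        chmod 600 "$maildir/new/$name" || return 1
    done
}
```

The mail server then moves the messages from "new" to "cur" itself when the client next opens the folder.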
For those who need to fix ownership and permissions:
go into the users' root folder and issue these commands
find username -type d -exec chown username:username {} \;
find username -type f -exec chown username:username {} \;
find username -type d -exec chmod 755 {} \;
find username -type f -exec chmod 600 {} \;
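The four commands can be wrapped into one function so they are easy to repeat per user. A minimal sketch: the name fix_user_perms is hypothetical, the owner argument is optional so that the chown step is skipped when it is not given, and `chown -R` stands in for the two find/chown lines.

```shell
# Hypothetical helper: apply the modes described above to one user's tree.
# Usage (from the users' root folder, as root): fix_user_perms username username
#   $1 = directory to fix, $2 = owner (optional; chown is skipped if empty)
fix_user_perms() {
    dir=$1
    owner=$2
    if [ -n "$owner" ]; then
        # One recursive chown replaces the separate directory and file passes.
        chown -R "$owner:$owner" "$dir" || return 1
    fi
    # Directories need search permission for group and others: 755.
    find "$dir" -type d -exec chmod 755 {} + || return 1
    # Files are readable and writable by the owner only: 600.
    find "$dir" -type f -exec chmod 600 {} +
}
```

Keep in mind that this sets 600 on every file in the tree, not only on mail files.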
Continuing this "story":
is there a "signal-event" that rebuilds the default folder structure when it no longer exists for some users (lost in the power incident)?
To be more precise: some users (around 10) had mail problems because the "default" /cur, /new and /tmp folders were missing; some also had issues because .Sent or .Trash were missing ....
I remember seeing a script somewhere on the wiki that rebuilds ownership and permissions on /home/e-smith/files/users, but I can't find it now, and I can't remember whether that script also solves this issue.
-
Is there a "signal-event" that rebuilds the default folder structure when it no longer exists for some users (lost in the power incident)?
No, there is no such thing.
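The missing folders can be recreated by hand, though. A minimal sketch, assuming the standard Maildir layout (cur, new, tmp) and only the subfolders named in this thread; the function name ensure_maildir is hypothetical. Folders it creates belong to whoever runs it, so ownership and modes still need to be set for the user afterwards.

```shell
# Hypothetical helper: recreate missing Maildir folders without touching existing ones.
# Usage: ensure_maildir /home/e-smith/files/users/USERNAME/Maildir
ensure_maildir() {
    maildir=$1
    # The inbox itself plus the subfolders named in this thread.
    for base in "$maildir" "$maildir/.Sent" "$maildir/.Trash"; do
        # mkdir -p leaves existing folders and their contents alone.
        mkdir -p "$base/cur" "$base/new" "$base/tmp" || return 1
    done
}
```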
-
I have now realized that during the recovery I forgot about
/var/lib/mysql
I must stress that I didn't find the mediawiki mysql database under /home/e-smith (with the other ones from a "standard" SME install).
I think the backup section of the wiki should have something about this "recovery case"
and what should be done in the recovery/backup process.
I also think the wiki should have a section on backing up and recovering each contrib.
How can I help, or get help, with this?
-
larieu
I have now realized that during the recovery I forgot about
/var/lib/mysql
I must stress that I didn't find the mediawiki mysql database under /home/e-smith (with the other ones from a "standard" SME install).
I think the backup section of the wiki should have something about this "recovery case"
and what should be done in the recovery/backup process.
You should have followed all the steps here
http://wiki.contribs.org/Backup_server_config#Manually_transferring_configuration_information
which includes the important command
signal-event pre-backup
This dumps all the mysql databases into a file which is included in the backup (or copied if using the list of standard folders), and which is then used during the restore process to recreate all the mysql databases.
I also think the wiki should have a section on backing up and recovering each contrib.
How can I help, or get help, with this?
That is covered here
http://wiki.contribs.org/Backup_server_config#Backup_and_Restore_concepts.2C_issues_and_other_information
where it says
The sme backup & restore concept is that only data (users & ibays) and configuration (including all mysql dbs and custom templates) is backed up & restored. All contribs (ie installed as rpms) are not backed up and normal templates (which are installed by the system or add on contribs) are not backed up.
A restore is done to a fresh installation of sme server, and then contribs are reinstalled. As configuration data is restored, you should not need to setup those reinstalled contribs again.
If you use any add on contrib for backup purposes eg Affa or Dar2 or whatever, then after doing a fresh reinstall of the sme OS, the first thing you do is reinstall the backup contrib and do some basic configuration manually. Then you would restore using that contrib, then you can reinstall all other contribs etc.
-
A restore is done to a fresh installation of sme server, and then contribs are reinstalled.
If you store all the contrib rpms in an i-bay, and install them from there, then it is very simple to do a restore and then restore all contribs.
-
larieu
If you store all the contrib rpms in an i-bay, and install them from there, then it is very simple to do a restore and then restore all contribs.
The advice of Charlie is also in the wiki
http://wiki.contribs.org/Backup_server_config#Backup_and_Restore_concepts.2C_issues_and_other_information
where it says
It's good admin practise to create an ibay specifically to house a copy of every rpm you install, so you can easily reinstall those from your backed up and restored copies, although with yum repos being available now, it's not so necessary to do this. It does provide a reminder of everything you have installed though, in case you forget. You should also do this if you wish to retain specific versions of rpms and source rpms, and also include copies of dependency rpms for good measure too.
-
Good point Charlie, makes too much sense now. Not :???: anymore
-
Mary
Yes, that instruction exists:
signal-event pre-backup
but in my case I wasn't able to use it. It wasn't a functional server; I only had access to the HDD to recover data.
And yes, I have used those pages
http://wiki.contribs.org/Backup_server_config#Backup_and_Restore_concepts.2C_issues_and_other_information
http://wiki.contribs.org/Backup_server_config#Manually_transferring_configuration_information
to collect all the data that might be needed after the backup.
If a note were placed in the wiki next to "signal-event pre-backup" stressing that this command performs the following actions ....., that would be excellent.
Or at least a link to another wiki page where each signal-event command is explained.
CharlieBrady
As usual your concepts are masterly; unfortunately I didn't use this before this problem ...
I'll take it into account from now on.
I will try to find a guide around here on "how to do it".
-
Started with a fully Yummed SME 8.0
Faced a similar situation today after attempting to follow instructions in:
http://mirror.canada.pialasse.com/contribs/dmay/smeserver/6.x/howto/smeserver-iso-howto.html
and then doing a dist-upgrade and reboot.
The fstab and mtab seemed to have been overwritten, and the SELinux policy was missing as well.
The swap partition also got locked.
The yum command that did the damage was:
yum install \
anaconda \
anaconda-runtime \
httpd-devel \
autoconf \
automake \
cpp \
gcc \
gcc-c++ \
gettext \
glibc-devel \
glibc-kernheaders \
kernel-devel \
libacl \
libacl-devel \
libattr \
libstdc++ \
libtool \
m4 \
mkisofs \
ncurses-devel \
pam-devel \
Changes under advice from:
http://wiki.centos.org/HowTos/I_need_the_Kernel_Source