Koozali.org: home of the SME Server

Loosing internet connection after post-upgrade of server

beast
Re: Loosing internet connection after post-upgrade of server
« Reply #60 on: August 07, 2021, 12:39:39 PM »
Quote
You are completely missing the point here.

Once you restore, your settings are borked because something is wrong on v9, and hence in your backup, which means the v10 system is unable to run the DNS services correctly. Read my comments above, and read your logs.

Nothing will fix that.

You set a manual DNS which bodges its way round the real issue, which is borked users & DNS services, but doesn't fix it.

Some solutions:

1. Fix your v9 before backup (still not seen logs from there)
2. Copy data manually and re-create users
3. Restore v9 from known good backup, then migrate

Note the templates & newrpms we want are from v9, not v10, so we can try and see how and what you broke on v9.

That is where the real issue lies. All the issues with v10 are due to broken data from v9.

We are unlikely to be able to fix your v10 currently.

SME9 logs are here: https://drive.google.com/file/d/1dntyKV2X4zvqB2EPgeDPBee6HDxFT3fI/view?usp=sharing

Sorry if I have misunderstood things, but in that case the instructions are not clear enough for me.

I still think that investigating the problem on the SME10 is a valid approach, since it may give a hint about the root cause and help others avoid the same problem. But I am also fine with going the other way around and looking for errors on the old server.

ReetP
Re: Loosing internet connection after post-upgrade of server
« Reply #61 on: August 07, 2021, 03:13:43 PM »
Quote
I still think that investigating the problem on the SME10 is a valid approach, since it may give a hint about the root cause and help others avoid the same problem. But I am also fine with going the other way around and looking for errors on the old server.

I've already said the likely root cause, but again:

From your logs on v10 we can already see the damage is done by a bad backup from your v9 box. Broken dns is just a symptom of that damage. The restore does not appear to complete correctly and certain vital bits are broken.

This is likely due to something breaking during install on v9. Owncloud, which does not show in your list but which we know you installed, is a candidate. I also suspect something else (samba or some other LDAP change) that we can't see and you have not mentioned.

You installed something, it broke, you uninstalled it, and thought it was OK. But it wasn't.

============

As a result you aren't going to be able to repair your broken v10 in a hurry.

I'll take a look at your v9 logs and see if there is anything obvious.
...
1. Read the Manual
2. Read the Wiki
3. Don't ask for support on Unsupported versions of software
4. I have a job, wife, and kids and do this in my spare time. If you want something fixed, please help.

Bugs are easier than you think: http://wiki.contribs.org/Bugzilla_Help

If you love SME and don't want to lose it, join in: http://wiki.contribs.org/Koozali_Foundation

beast
Re: Loosing internet connection after post-upgrade of server
« Reply #62 on: August 07, 2021, 04:15:37 PM »
This is what my documentation says I have installed at some point (some of these are not installed):

Code:
Denyhost http://wiki.contribs.org/Denyhosts
Awstats http://wiki.contribs.org/AWStats
WBL https://wiki.contribs.org/Email_Whitelist-Blacklist_Control
Webhosting https://wiki.koozali.org/Webhosting
Phpmyadmin https://wiki.contribs.org/PHPMyAdmin
ModDeflate https://wiki.koozali.org/Mod_Deflate
Owncloud http://wiki.contribs.org/OwnCloud
Smart monitor http://wiki.contribs.org/Monitor_Disk_Health
User manager http://wiki.contribs.org/UserManager
Newer versions of programs http://wiki.contribs.org/Software_collections
Newer PHP version http://wiki.contribs.org/PHP_Software_Collections
remoteuseraccess http://wiki.contribs.org/Remoteuseraccess
Rkhunter http://wiki.contribs.org/Rkhunter
Letsencrypt https://wiki.contribs.org/index.php?title=Letsencrypt
SRP https://wiki.contribs.org/Email#DKIM_Setup_-_qpsmtpd_version_.3E.3D_0.96
PHP intl (needed by owncloud) yum install php73-php-intl
PHP apcu (needed by owncloud) yum --enablerepo=remi install php73-php-pecl-apcu
Git https://wiki.contribs.org/Git
Git flow yum --enablerepo=epel install gitflow
DKIM + DMARC + SPF https://wiki.contribs.org/Email#DKIM_Setup_-_qpsmtpd_version_.3E.3D_0.96
UPS https://wiki.contribs.org/Uninterruptable_Power_Supply#tab=SME_9
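
Since some of the items in a list like this may no longer be installed, a cross-check against the RPM database can help. The following is only a rough sketch, not an official SME tool: `not_installed` is a made-up helper name, and `documented.txt` and `installed.txt` are hypothetical files holding one package name per line. The actual package name for each contrib would have to be looked up on its wiki page.

```shell
#!/bin/bash
# Print every package named in a "documented" list that does not appear
# in an "installed" list. Both inputs are plain text files with one
# package name per line.
not_installed() {
    documented=$1
    installed=$2
    # -v invert the match, -x match whole lines only,
    # -F treat patterns as fixed strings, -f read patterns from a file
    grep -vxFf "$installed" "$documented"
}

# Example (file names are hypothetical):
#   rpm -qa --qf '%{NAME}\n' > installed.txt
#   not_installed documented.txt installed.txt
```

Anything it prints is documented but not currently installed, which may be a hint about what was removed along the way.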

ReetP
Re: Loosing internet connection after post-upgrade of server
« Reply #63 on: August 07, 2021, 06:12:52 PM »
OK, I just had a look through your logs and what a tangled web you have woven.

The problem with logs is they can find you out. Where was this mentioned? Not anywhere in your lists.

Quote
beastserver yum[5367]: Erased: smeserver-affa

I'd like you to open these two logs and then tell us what you have really been up to, because it isn't anything like you have outlined here so far.

Code:
cat messages.20210711114349
cat messages.20210711114350
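
Rather than reading the whole files, the package history can be pulled out with a filter. This is a minimal sketch that assumes the log lines follow the same `yum[PID]: Erased: package` pattern as the quoted line; `yum_events` is a hypothetical helper name, not an SME command.

```shell
#!/bin/bash
# List yum install/update/erase events found in syslog-style message logs.
yum_events() {
    # Keep only the lines written by yum, then strip the timestamp/host prefix.
    grep -hE 'yum\[[0-9]+\]: (Installed|Updated|Erased):' "$@" |
        sed -E 's/^.*yum\[[0-9]+\]: //'
}

# Example: yum_events messages.20210711114349 messages.20210711114350
```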

I'm pretty sure you have seen a mountain of errors already but are hoping that you can just resolve them on v10.

Your problems started long before you ever got to v10. You are not going to resolve them on v10 either.

My little guess.

You borked your v9, probably with owncloud, though I think you might have had wordpress on there at some point? Maybe that was the issue and you got hacked, so you decided to restore it from an affa server. But unfortunately that backup was also borked.

So you decided that you would run a standard backup and then install that on v10 and then fix it from there.

That's my rough guess - might not be quite right, but not far away.

Sorry to sound a bit cynical but you could have saved us 5 pages of running around and comments if you had told us the whole situation from the start.

My suggestion is that you create a new v10, re-create all your users and groups, copy your basic data and email directories to the new server, and update the file ownership (users and groups) accordingly.

Jean-Philippe Pialasse (aka Unnilennium)
Re: Loosing internet connection after post-upgrade of server
« Reply #64 on: August 07, 2021, 11:28:28 PM »
Install a fresh SME10.
Use the lazy admin tool to migrate users, domains and hosts.

I am not sure shadow migration will work, as your shadow password files are borked. You might need to redefine the passwords for your users.

I would suggest not migrating the config db and network db.

Then copy your ibays/users data using rsync.
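
For the rsync step, a cautious approach is to print the commands first and review them before running anything. The sketch below assumes root SSH access to the old server and that the data lives under /home/e-smith/files/ on both machines; `plan_rsync` and the host name are hypothetical.

```shell
#!/bin/bash
# Print (but do not run) rsync commands that pull ibay and user data
# from the old server, so they can be reviewed first.
# Assumptions: root SSH access to the old server, and data stored under
# /home/e-smith/files/ on both machines.
plan_rsync() {
    old_host=$1
    for dir in ibays users; do
        # -a keeps permissions and timestamps. Owners are mapped by name,
        # so the users and groups should already exist on the new server.
        echo "rsync -av root@${old_host}:/home/e-smith/files/${dir}/ /home/e-smith/files/${dir}/"
    done
}

# Example (host name is hypothetical): plan_rsync old-sme9.example.lan
```

Adding `--dry-run` to a printed command before running it shows what would be transferred without changing anything.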

beast
Re: Loosing internet connection after post-upgrade of server
« Reply #65 on: August 08, 2021, 08:17:41 AM »
I'm a little in shock here. Out of the blue, you accuse me of a hidden agenda to entice others to help me. In the best faith, I have reported on my problems. I have a perfectly working SME9 server as far as I know. With the time I have spent on this issue, I might as well have moved it all manually to another Linux distribution. The only problem I have had is that the server overheated a few times due to faulty ventilation, which made it necessary to replace hard drives. This, together with the lack of updates for SME9, made me decide to upgrade and use SSDs in the new system.

The reason I have "Erased: smeserver-affa" is that in the past I have used Affa to move to new hardware when upgrading to a new version. I keep 2 identical servers so as to have as little downtime as possible.

The 2 requested files are here:
https://drive.google.com/file/d/1cr8b3KmmomxwPh85s8XbctY4lytMredI/view?usp=sharing
https://drive.google.com/file/d/1V3XArH8PeV-BW8Mj74awqDcsvw45_zp3/view?usp=sharing
« Last Edit: August 08, 2021, 08:35:13 AM by beast »

ReetP
Re: Loosing internet connection after post-upgrade of server
« Reply #66 on: August 08, 2021, 11:52:36 AM »
Quote
I'm a little in shock here. Out of the blue, you accuse me of a hidden agenda to entice others to help me. In the best faith, I have reported on my problems. I have a perfectly working SME9 server as far as I know.

I'm just trying to discover what happened so we can try to fix an issue.

I've just said you haven't told us the whole story, which is true.

Because of that we've wasted a lot of time we could have spent on other things.

This is a classic:

https://xyproblem.info/

If we had known all this at the start (which is the reason I asked several times for more info, including pointing to my gist on giving proper information) we could have told you almost immediately that your original server was broken.

You must have had an idea yourself - a brief look in the logs would have told you things were not right. I'm sure you must have seen stuff when messing with owncloud and your drives.

I'm still not sure you have told us the entire chain of events.

I don't actually care about what happened beyond the fact that the information is important for us to diagnose the issue. It isn't to poke fun at you. It is to save us all time.

I could go back and document some of it - broken drives, affa restores, etc etc. - but it's pointless as we now know the issue, and solution.

This was never just a simple backup/restore to 10, or an issue with dns.

Yes, you could manually transfer to another linux box (you are going to have to do a manual transfer regardless), but if we had known all the info at the start we could have told you that almost as soon as you opened this thread. That would have saved the most time.

I hope your migration goes well.

beast
Re: Loosing internet connection after post-upgrade of server
« Reply #67 on: August 08, 2021, 12:12:32 PM »
Quote
I'm just trying to discover what happened so we can try to fix an issue.

I've just said you haven't told us the whole story, which is true.

Because of that we've wasted a lot of time we could have spent on other things.

I have a working SME9 system as far as I know. I do not look in the logs unless I have a problem. And even if I did, I do not have the knowledge to know where to look, or to judge whether something is serious when the server shows no problems. I am no Linux expert in any way, just a daily user. I did not know what to tell you to begin with, as I did not know that I had problems with my SME9 installation. How do you expect me to tell the whole story when I did not know the whole story at that point in time? I am really sorry that we have all wasted time on this problem. I am just trying to help make this software great, as I think it is a good concept.

beast
Re: Loosing internet connection after post-upgrade of server
« Reply #68 on: August 08, 2021, 12:17:06 PM »
Quote
Install a fresh SME10.
Use the lazy admin tool to migrate users, domains and hosts.

I am not sure shadow migration will work, as your shadow password files are borked. You might need to redefine the passwords for your users.

I would suggest not migrating the config db and network db.

Then copy your ibays/users data using rsync.

Thank you. I hope it is OK to ask here again if I run into trouble during the process?

TerryF
Re: Loosing internet connection after post-upgrade of server
« Reply #69 on: August 08, 2021, 03:05:54 PM »
Quote
Thank you. I hope it is OK to ask here again if I run into trouble during the process?

Mate, we've all been there at one time or another... take JP's advice, start anew and ditch the sme9 backup, it's tainted to the nth degree
--
qui scribit bis legit

Jean-Philippe Pialasse (aka Unnilennium)
Re: Loosing internet connection after post-upgrade of server
« Reply #70 on: August 08, 2021, 04:35:20 PM »
Please do not hesitate.
Please just open a new thread with a reference to this one, so that it will be easier later to find the question and its answer rather than a long succession of issues.

bunkobugsy
Re: Loosing internet connection after post-upgrade of server
« Reply #71 on: August 09, 2021, 10:40:25 PM »
Your problems could be related to a Realtek NIC using the buggy r8169 driver; you could try installing kmod-r8168, which overrides it.
If you see r8169 in dmesg or messages after a boot, do this:

wget https://elrepo.org/linux/elrepo/el7/x86_64/RPMS/kmod-r8168-8.049.02-1.el7_9.elrepo.x86_64.rpm
yum localinstall kmod-r8168-8.049.02-1.el7_9.elrepo.x86_64.rpm
signal-event reboot
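
To check for the driver before installing anything, something like the following could be used. It is a minimal sketch: `uses_r8169` is a hypothetical helper that only looks for the driver name in a kernel log, and it makes no claim about whether the driver is actually the cause of a given problem.

```shell
#!/bin/bash
# Report (via exit status) whether a kernel log mentions the r8169 driver.
# Pass a log file as $1, or pipe dmesg output on stdin.
uses_r8169() {
    # -q: quiet, -w: match "r8169" as a whole word only
    grep -qw 'r8169' "${1:--}"
}

# Examples:
#   dmesg | uses_r8169 && echo "r8169 driver is in use"
#   uses_r8169 /var/log/messages && echo "r8169 seen in messages"
```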

ReetP
Re: Loosing internet connection after post-upgrade of server
« Reply #72 on: August 09, 2021, 10:57:46 PM »
Quote
Your problems could be related to a Realtek NIC using the buggy r8169 driver,

In this particular instance, unfortunately not :lol:

TerryF
Re: Loosing internet connection after post-upgrade of server
« Reply #73 on: August 11, 2021, 02:02:40 AM »
worth investigating

message.log
Aug  5 08:46:02 beastserver kernel: md: bind<sdb1>
Aug  5 08:46:02 beastserver kernel: r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
Aug  5 08:46:02 beastserver kernel: r8169 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
Aug  5 08:46:02 beastserver kernel: r8169 0000:04:00.0: eth0: RTL8168f/8111f at 0xffffc900068a8000, 08:60:6e:e5:3c:35, XID 08000800 IRQ 35
Aug  5 08:46:02 beastserver kernel: r8169 0000:04:00.0: eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]

and
Aug  5 08:46:04 beastserver kernel: r8169 0000:04:00.0: firmware: requesting rtl_nic/rtl8168f-1.fw
Aug  5 08:46:04 beastserver kernel: r8169 0000:04:00.0: eth1: link down
Aug  5 08:46:04 beastserver kernel: r8169 0000:04:00.0: eth1: link down
Aug  5 08:46:04 beastserver kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Aug  5 08:46:06 beastserver kernel: r8169 0000:04:00.0: eth1: link up
« Last Edit: August 11, 2021, 02:04:14 AM by TerryF »
--
qui scribit bis legit

sfhwan
Re: Loosing internet connection after post-upgrade of server
« Reply #74 on: January 13, 2023, 06:46:01 PM »
Just curious. Is the problem reported in this thread solved already?

I encountered similar issues when migrating 3 sme9 servers to sme10 last year using Migratehelper (2 of them have this issue while the 3rd one is unaffected). The affected sme10 servers run fine until a reboot (due to a software update or other reasons), as described in this thread.

It is puzzling that the internet connection is so drastically affected by alternately clearing and filling the DNS entry in the admin console. At one point, I thought the problem could be related to the dnscache template and also the resolv.conf file, but I could not dig deeper due to limited knowledge in this area.

I still have to bring the internet connection back after each reboot by filling or clearing the DNS entry in the admin console, which is obviously not desirable when working remotely.
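
One way to narrow this down might be to record what resolv.conf contains before and after toggling the DNS entry. The sketch below is only a diagnostic aid under that assumption: `list_nameservers` is a hypothetical helper that prints the configured nameservers, and it does not touch the dnscache template or fix anything.

```shell
#!/bin/bash
# Print the nameserver addresses configured in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example: compare the state before and after changing the DNS entry.
#   list_nameservers > before.txt
#   (change the DNS entry in the admin console)
#   list_nameservers > after.txt
#   diff before.txt after.txt
```

If the two outputs differ, that at least shows which nameserver the working state relies on.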