Koozali.org: home of the SME Server
Obsolete Releases => SME 8.x Contribs => Topic started by: Jáder on October 10, 2013, 11:55:58 PM
-
I have a Zabbix server running the latest (2.0.9) package from FWS (Hi Daniel).
Right now I have to keep it shut down (although I need it running) because it spikes my load average (LA) and my Linux server becomes very slow.
Watch this:
[root@leopardo ~]# uptime
18:32:08 up 2 days, 4:11, 1 user, load average: 5.73, 5.28, 5.29
[root@leopardo ~]# /etc/init.d/zabbix-server stop
Shutting down zabbix server: [ OK ]
[root@leopardo ~]# uptime
18:32:28 up 2 days, 4:11, 1 user, load average: 5.58, 5.28, 5.28
[root@leopardo ~]# uptime
18:32:35 up 2 days, 4:11, 1 user, load average: 5.53, 5.27, 5.28
[root@leopardo ~]# uptime
18:33:23 up 2 days, 4:12, 1 user, load average: 4.05, 4.94, 5.17
[root@leopardo ~]# uptime
18:37:19 up 2 days, 4:16, 1 user, load average: 0.25, 2.37, 4.06
Using top, htop and vmstat I can see the server is disk-bound... top shows mysql doing lots of I/O.
So I have either a MySQL tuning problem or a Zabbix misconfiguration problem.
From Zabbix console I can see:
Parameter | Value | Details
Zabbix server is running | Yes | localhost:10051
Number of hosts (monitored/not monitored/templates) | 76 | 14 / 7 / 55
Number of items (monitored/disabled/not supported) | 1497 | 798 / 51 / 648
Number of triggers (enabled/disabled)[problem/unknown/ok] | 470 | 438 / 32 [6 / 0 / 432]
Number of users (online) | 9 | 1
Required server performance, new values per second | 10.08 | -
PHP option memory_limit | 50M | Minimum required PHP memory limit is 128M (configuration option "memory_limit")
The error on the last line must be a Zabbix problem, because:
[root@leopardo ~]# grep memory_limit /etc/php.ini
memory_limit = 250M
[root@leopardo ~]#
Can anyone (Daniel?) help point me to the next step in debugging this problem?
Regards
Jáder
-
Hi Jáder.
There's obviously something wrong with your setup. Either you're very low on RAM, or your disks are very slow. To give you an idea, I'm at 100 items/sec, with an average load of only 0.3 (and my Zabbix server is running as a KVM box, on top of a qcow2 image, which is itself stored on a replicated GlusterFS pool, so really, nothing optimized for performance).
First thing you should try: convert your database to InnoDB. It helped me lower the load, as InnoDB locking is finer-grained (row-level rather than table-level). To do so, stop your Zabbix server and run:
db configuration setprop mysqld InnoDB enabled
expand-template /etc/my.cnf
sv t /service/mysqld
sleep 5
for T in $(mysql zabbixserverdb -e 'show tables' | grep -v Tables_in_zabbixserverdb); do mysql zabbixserverdb -e "alter table $T ENGINE=InnoDB";done
(Take a backup of your database beforehand, just in case.)
There are a lot more optimizations you could try, but for such a low-activity server (only ~10 items/sec) it really shouldn't be necessary. Just tell me if switching to InnoDB helped.
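As a hedged sketch of the conversion loop above: the helper below only prints the ALTER statements instead of running them, so the list can be reviewed (and the database backed up) first. The function name gen_alter_statements and the file name convert.sql are made up for this illustration. -N and -B are the mysql client's --skip-column-names and --batch options, which make the grep -v on the header line unnecessary.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zabbix or SME Server): reads table names
# on stdin, one per line, and prints one ALTER TABLE statement per table.
gen_alter_statements() {
    while IFS= read -r table; do
        # Skip empty lines so a blank line does not yield a broken statement.
        [ -n "$table" ] || continue
        printf 'ALTER TABLE %s ENGINE=InnoDB;\n' "$table"
    done
}

# Example use (commented out; needs a running MySQL and the right database name):
#   mysql -N -B zabbixserverdb -e 'SHOW TABLES' | gen_alter_statements > convert.sql
#   mysql zabbixserverdb < convert.sql
```

Writing the statements to a file first also makes it easy to convert the big history tables one at a time.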
Regards, Daniel
-
Hi Daniel
Thanks for your reply.
I'm in the process of converting the Zabbix tables to InnoDB... it has been running for a couple of hours now.
I've discovered my Zabbix MySQL database is big... the dump was a 1 GB SQL file!
I suspect this problem started after an upgrade of Zabbix.
I have 8 GB of RAM on the server... but it is still running 32-bit SME8 with a PAE kernel.
The hardware is an HP N40L with 2x 1 TB SATA drives (Seagate, 64 MB cache).
I tested the HDD speed with hdparm and it reports what looks like a normal figure (>100).
Let's wait for the conversion to finish... more in a few hours.
Regards
Jáder
-
I'm in the process of converting the Zabbix tables to InnoDB... it has been running for a couple of hours now.
I've discovered my Zabbix MySQL database is big... the dump was a 1 GB SQL file!
Quite strange, it shouldn't take that long for only 1 GB (yes, Zabbix eats disk space, my dump is ~12 GB). You probably have slow disk access.
I suspect this problem started after an upgrade of Zabbix.
Any errors when running the upgrade script? One thing that could explain such bad performance is that the indexes weren't created correctly.
I have 8 GB of RAM on the server... but it is still running 32-bit SME8 with a PAE kernel.
8 GB should be more than enough. x86_64 is better of course, but for such a low-traffic Zabbix install it shouldn't make any difference.
Once the DB is converted, check if perf is OK. If it's still not fine, here's the my.cnf I use on my Zabbix server (it's not a SME Server but a CentOS 6.4 x86_64):
[mysqld]
pid-file=/var/run/mysqld/mysqld.pid
basedir=/usr
datadir=/var/lib/mysql
innodb_data_home_dir = /var/lib/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/
innodb_buffer_pool_size = 1500M
innodb_additional_mem_pool_size = 256M
innodb_log_file_size = 500M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 0
innodb_lock_wait_timeout = 180
innodb_file_per_table
innodb_flush_method=O_DIRECT
max_allowed_packet=128M
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
user=mysql
skip-name-resolve
default-storage-engine = MyISAM
max_connections = 300
table_cache = 2048
sort_buffer_size = 512k
join_buffer_size = 512k
read_buffer_size = 512k
read_rnd_buffer_size=256k
myisam_sort_buffer_size = 64M
skip-external-locking
key_buffer_size = 256M
thread_stack = 256K
thread_cache_size = 4
tmp_table_size=256M
max_heap_table_size=512M
query_cache_limit = 1M
query_cache_size = 256M
query_cache_type = 1
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
[isamchk]
key_buffer = 16M
-
Daniel
I suspect an index problem could be the cause. How can I verify that?
Here is my conversion timing:
[root@leopardo ~]# ps gaux|grep "mysql zabbixdb"
root 1079 0.0 0.0 4032 704 pts/1 S+ 09:05 0:00 grep --color=auto mysql zabbixdb
root 25648 0.0 0.0 8416 1788 pts/0 S+ 05:34 0:00 mysql zabbixdb -e alter table history ENGINE=InnoDB
[root@leopardo ~]# ps gaux|grep "mysql zabbixdb"
root 1351 0.0 0.0 8416 1784 pts/0 S+ 09:13 0:00 mysql zabbixdb -e alter table history_str ENGINE=InnoDB
root 1527 0.0 0.0 4032 704 pts/1 S+ 09:17 0:00 grep --color=auto mysql zabbixdb
[root@leopardo ~]# ps gaux|grep "mysql zabbixdb"
root 1789 0.0 0.0 8416 1784 pts/0 S+ 09:21 0:00 mysql zabbixdb -e alter table history_uint ENGINE=InnoDB
root 3458 0.0 0.0 4036 804 pts/1 S+ 10:03 0:00 grep --color=auto mysql zabbixdb
[root@leopardo ~]# uptime
10:03:25 up 2 days, 19:42, 2 users, load average: 1.81, 1.77, 1.73
[root@leopardo ~]# ps gaux|grep "mysql zabbixdb"
root 1789 0.0 0.0 8416 1784 pts/0 S+ 09:21 0:00 mysql zabbixdb -e alter table history_uint ENGINE=InnoDB
root 3859 0.0 0.0 4036 808 pts/1 S+ 10:13 0:00 grep --color=auto mysql zabbixdb
[root@leopardo ~]# ps gaux|grep "mysql zabbixdb"
root 5107 0.0 0.0 4032 716 pts/1 S+ 10:40 0:00 grep --color=auto mysql zabbixdb
So I started Zabbix again and:
[root@leopardo ~]# uptime
10:40:44 up 2 days, 20:19, 2 users, load average: 0.41, 1.22, 1.53
[root@leopardo ~]# /etc/init.d/zabbix-server start
Starting zabbix server: [ OK ]
[root@leopardo ~]# uptime
11:05:14 up 2 days, 20:44, 2 users, load average: 2.53, 2.33, 2.14
So performance is still a problem!
Here is my my.cnf:
#
# * The SMEServer doesn't have any need for initscript-only
# options, so we don't use mysql_server and mysql.server
# sections.
#
###########################################################
[mysqld]
pid-file=/var/run/mysqld/mysqld.pid
basedir=/usr
datadir=/var/lib/mysql
innodb_data_home_dir = /var/lib/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/
innodb_log_arch_dir = /var/lib/mysql/
innodb_buffer_pool_size = 16M
innodb_additional_mem_pool_size = 2M
innodb_log_file_size = 5M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
socket=/var/lib/mysql/mysql.sock
# networking is enabled
log-error=/var/log/mysqld.log
max_allowed_packet=16M
user=mysql
[mysqld_safe]
-
So, InnoDB already cut your load by nearly a half, not that bad ;-)
You should try adding a custom template for /etc/my.cnf and increasing innodb_buffer_pool_size (you can use 2000M) and innodb_additional_mem_pool_size (256M). You should also add the innodb_flush_method=O_DIRECT directive (which prevents InnoDB data from being cached twice, in the host page cache and in MySQL's internal cache). There are other parameters which could help, but those are the ones that will probably have the biggest impact on performance.
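A minimal sketch of what such a custom template fragment could look like. The function name, the fragment name 016innodbZabbix and the exact directory are assumptions for illustration: the directory follows the usual SME Server templates-custom convention, and the fragment name has to sort so that it lands inside the [mysqld] section of the expanded file, which depends on the stock fragment names on your system.

```shell
#!/bin/sh
# Hypothetical helper: writes the InnoDB settings suggested above into a
# template fragment file. Directory and fragment name are arguments because
# the right values depend on the local template layout (assumption).
write_innodb_fragment() {
    dir=$1
    name=$2
    mkdir -p "$dir" || return 1
    cat > "$dir/$name" <<'EOF'
innodb_buffer_pool_size = 2000M
innodb_additional_mem_pool_size = 256M
innodb_flush_method=O_DIRECT
EOF
}

# Example use (commented out; path and fragment name are assumptions):
#   write_innodb_fragment /etc/e-smith/templates-custom/etc/my.cnf 016innodbZabbix
#   expand-template /etc/my.cnf
#   sv t /service/mysqld
```

On a 32-bit system a single mysqld process cannot address much more than about 3 GB, so very large buffer pool values may keep InnoDB from starting. It is worth checking /var/log/mysqld.log after the restart.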
Regards, Daniel
-
Daniel,
I've created a custom my.cnf with those suggested parameters.
It appears to have changed nothing in the LA... still close to 2.
I forgot to tell you: this Zabbix server has several proxies sending data to it, and it has been down for months. Maybe this load is normal because it's receiving old data!
And what about the Zabbix indexes? How can I be sure they were all created during those multiple updates/upgrades?
Where would I find errors about missing indexes?
I've looked at /var/log/zabbix/zabbix_server.log but there's nothing useful there.
Regards
Jáder
-
Daniel,
I've created a custom my.cnf with those suggested parameters.
It appears to have changed nothing in the LA... still close to 2.
Well, not sure, you can try the other parameters from my sample config to see if it changes anything. Still, even without any optimization, you shouldn't have such a high load with the small set of data your Zabbix server receives. You probably have something wrong. Maybe the disks are working in legacy IDE mode instead of AHCI?
I forgot to tell you: this Zabbix server has several proxies sending data to it, and it has been down for months. Maybe this load is normal because it's receiving old data!
Shouldn't be an issue. The default config for the proxy is to keep only 24 hours of data if it can't contact the server.
And what about the Zabbix indexes? How can I be sure they were all created during those multiple updates/upgrades?
I'm not sure how to check that, but did you see any error while running the script to upgrade from 1.8 to 2.0? (The database script /usr/share/doc/zabbix-server-2.0.9/dbpatches/2.0/mysql/upgrade has to be run manually to upgrade from 1.8 to 2.0, see the online doc on zabbix.com.)
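One possible way to inspect the indexes, offered as a sketch rather than an official Zabbix procedure: MySQL 5.0 and later expose index metadata in information_schema.STATISTICS, so the indexes that actually exist can be listed and compared by hand against the schema shipped with the installed Zabbix version. The helper name index_query and the output file name are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints a query that lists every index in one database
# together with its columns (in index order), using information_schema.
index_query() {
    db=$1
    printf "SELECT TABLE_NAME, INDEX_NAME, GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS index_columns FROM information_schema.STATISTICS WHERE TABLE_SCHEMA = '%s' GROUP BY TABLE_NAME, INDEX_NAME ORDER BY TABLE_NAME, INDEX_NAME;\n" "$db"
}

# Example use (commented out; needs a running MySQL):
#   mysql -e "$(index_query zabbixdb)" > indexes_found.txt
```

A table that shows up in SHOW TABLES but has no rows in this listing has no indexes at all, which would be worth checking first for the large history tables.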
-
I'll try to mimic your settings later.
For now I have tested the HDD speed and it appears to be OK:
/dev/md1:
Timing cached reads: 5424 MB in 2.00 seconds = 2714.22 MB/sec
Timing buffered disk reads: 100 MB in 1.07 seconds = 93.58 MB/sec
[root@leopardo my.cnf]# hdparm -tT /dev/md1
/dev/md1:
Timing cached reads: 5608 MB in 2.00 seconds = 2803.39 MB/sec
Timing buffered disk reads: 100 MB in 1.19 seconds = 84.36 MB/sec
I'll try to find out whether I missed an upgrade script or got an error I didn't notice.
-
OK... fresh new info.
I'm sure the problem is that the server is I/O-bound. When the Zabbix server is running, I see over 45% iowait (as shown in the Sysmon graph).
I'm not sure why Zabbix is doing that with so few new values per second (<10).
I think I'll search the Zabbix support lists... unless you have any other tips, Daniel.
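For readers who want to reproduce the iowait figure without a graphing tool, here is a small sketch that computes it from two samples of the aggregate "cpu" line in /proc/stat. The function name iowait_pct is made up for this illustration. It assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal) and ignores any later fields.

```shell
#!/bin/sh
# Hypothetical helper: $1 and $2 are two "cpu ..." lines from /proc/stat,
# taken some seconds apart. Prints the share of time spent in iowait
# between the two samples, as a percentage with one decimal.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
    {
        total = 0
        # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
        for (i = 2; i <= NF && i <= 9; i++) total += $i
        io = $6
    }
    NR == 1 { t0 = total; io0 = io }
    NR == 2 {
        if (total > t0) printf "%.1f\n", 100 * (io - io0) / (total - t0)
        else print "0.0"
    }'
}

# Example use (commented out; reads the live counters five seconds apart):
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   iowait_pct "$a" "$b"
```

Running it once with the Zabbix server stopped and once with it started gives a direct before/after comparison.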
Regards
Jáder
-
I've run the upgrade scripts for 1.8 to 2.0 and 2.0 to 2.0.9, but now my web interface is gone! :(
[root@leopardo ~]# rpm -qa|grep zabbix
smeserver-zabbix-server-0.1-12.el4.sme
zabbix-2.0.9-1.el5.fws
zabbix-web-2.0.9-1.el5.fws
zabbix-server-2.0.9-1.el5.fws
zabbix-server-mysql-2.0.9-1.el5.fws
zabbix-web-mysql-2.0.9-1.el5.fws
I also changed innodb_buffer_pool_size from 2000M to 3000M.
The I/O load appears to be gone... but without the web interface it's useless anyway.
-
I've run the upgrade scripts for 1.8 to 2.0 and 2.0 to 2.0.9
How did you run this? Also, there's an upgrade script for 1.8 -> 2.0 but not for 2.0 -> 2.0.9, so I don't know what you tried. You should have done something like this (as root):
cd /usr/share/doc/zabbix-server-2.0.9/dbpatches/2.0/mysql/
./upgrade zabbixdb
but now my web interface is gone! :(
What do you mean by "gone"?
If you have damaged your database, you will probably have to restore your last backup and try the upgrade again.
-
How did you run this? Also, there's an upgrade script for 1.8 -> 2.0 but not for 2.0 -> 2.0.9, so I don't know what you tried. You should have done something like this (as root):
cd /usr/share/doc/zabbix-server-2.0.9/dbpatches/2.0/mysql/
./upgrade zabbixdb
I've done this... but it generates an error:
[root@leopardo my.cnf]# cd /usr/share/doc/zabbix-server-2.0.9/dbpatches/2.0/mysql/
[root@leopardo mysql]# ./upgrade zabbixdb
WARNING: backup your database before performing upgrade
This is an UNSUPPORTED Zabbix upgrade script from 1.8 to 2.0 for MySQL
It does the following things:
1. Updates indexes that might require changes;
2. Patches the database from 1.8 schema to 2.0 schema;
3. Adds 'Disabled' and 'Debug' usergroup if any missing;
4. Checks for hosts not belonging to any group and adds them to one if any found.
Usage: pass required MySQL parameters to this script (like database, user, password etc).
Continue ? (y/n) y
Patching the database
ERROR 1033 (HY000) at line 1: Incorrect information in file: './zabbixdb/acknowledges.frm'
Failed to patch Zabbix database. Restore from backup
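One common cause of ERROR 1033 on a table that was converted to InnoDB is that the InnoDB engine itself is disabled or failed to start, for example because the configured buffer pool or log file size could not be honoured. Whether that is what happened here is not confirmed by the thread. The sketch below is one way to check; the function name innodb_support is made up for this illustration, and it parses the tab-separated output of SHOW ENGINES, where the second column is the support status.

```shell
#!/bin/sh
# Hypothetical helper: reads the tab-separated output of
#   mysql -N -B -e 'SHOW ENGINES'
# on stdin and prints the Support column for InnoDB (for example YES,
# DEFAULT, NO or DISABLED), or MISSING if InnoDB is not listed at all.
innodb_support() {
    awk -F '\t' '
        $1 == "InnoDB" { print $2; found = 1 }
        END { if (!found) print "MISSING" }'
}

# Example use (commented out; needs a running MySQL):
#   mysql -N -B -e 'SHOW ENGINES' | innodb_support
```

Anything other than YES or DEFAULT suggests looking at /var/log/mysqld.log for InnoDB startup errors before restoring or patching the database again.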
But if I start it, I can see the Zabbix server running, with a low LA:
[root@leopardo my.cnf]# uptime
11:55:29 up 1:58, 2 users, load average: 0.14, 0.13, 0.18
[root@leopardo my.cnf]# ps gaux|grep zabbix
zabbix 5891 0.0 0.0 47164 2512 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5914 0.0 0.0 47164 2080 ? S 10:33 0:01 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5915 0.0 0.0 47164 1136 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5916 0.0 0.0 49704 2284 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5918 0.0 0.0 49704 2212 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5919 0.0 0.0 49704 2212 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5920 0.0 0.0 49704 2276 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5942 0.0 0.0 49704 2276 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5944 0.0 0.0 49736 2984 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5948 0.0 0.0 47252 1288 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5949 0.0 0.0 47276 2004 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5951 0.0 0.0 47252 1288 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5954 0.0 0.0 47276 2024 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5955 0.0 0.0 47276 2000 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5959 0.0 0.0 47548 1368 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5961 0.0 0.0 47164 1152 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5963 0.0 0.0 47164 1244 ? S 10:33 0:01 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5965 0.0 0.0 47164 1176 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5967 0.0 0.0 47172 1180 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5969 0.0 0.0 49608 2212 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5970 0.0 0.0 47300 1548 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5984 0.0 0.0 47164 1168 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5987 0.0 0.0 47168 1152 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
zabbix 5990 0.0 0.0 47164 960 ? S 10:33 0:00 zabbix_server_mysql -c /etc/zabbix/zabbix_server.conf
root 13796 0.0 0.0 4036 804 pts/1 R+ 13:01 0:00 grep --color=auto zabbix
[root@leopardo my.cnf]# uptime
13:01:59 up 3:04, 2 users, load average: 0.10, 0.12, 0.14
[root@leopardo my.cnf]#
What do you mean by "gone"?
If you have damaged your database, you will probably have to restore your last backup and try the upgrade again.
I mean the web interface returns a 404 error.
I think I've had this error before, something about expanding templates...
EDIT: I just dropped the zabbixdb database and restored it from backup. mysqlcheck now says OK.
But still no web interface so far! :(