Troubleshooting a sudden zabbix-server outage

References

https://blog.51cto.com/seekerwolf/2357220
https://www.centos.bz/2017/08/zabbix-zbx_mem_malloc-out-of-memory/

Environment

CentOS 7.5.1804
Zabbix 3.4.15

Symptom
The Zabbix web frontend shows "Zabbix server is not running". On the Zabbix host, systemctl status zabbix-server reports that the service is down.

Investigation
1. Tried restarting zabbix-server; it failed to start.
2. Checked the log /var/log/zabbix/zabbix_server.log.

The first errors in the log were:

zabbix_server [7988]: cannot open log: cannot create semaphore set: [28] No space left on device
zabbix_server [8008]: cannot open log: cannot create semaphore set: [28] No space left on device
zabbix_server [8022]: cannot open log: cannot create semaphore set: [28] No space left on device

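The error is raised while creating a semaphore set, so "No space left on device" here means a kernel semaphore limit has been reached, not that a disk is full. Following the references above, the fix is to raise the kernel.sem parameter. The defaults are: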

[root@zabbix zabbix]# cat /proc/sys/kernel/sem
250     32000   32      128

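The four values mean the following:

250       SEMMSL    max semaphores per array     maximum number of semaphores in one semaphore set
32000     SEMMNS    max semaphores system wide   maximum number of semaphores across all sets
32        SEMOPM    max ops per semop call       maximum number of operations in one semop() call
128       SEMMNI    max number of arrays         maximum number of semaphore sets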
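
Since the four fields are easy to mix up, here is a minimal sketch that labels them. The function name describe_sem is hypothetical (it is not part of any tool); it takes the raw value as an argument so it can be tried on any string.

```shell
# Label the four whitespace-separated fields of kernel.sem.
# describe_sem is a hypothetical helper name used only for this example.
describe_sem() {
    # Intentional word splitting: the fields are separated by spaces or tabs.
    set -- $1
    echo "SEMMSL=$1"
    echo "SEMMNS=$2"
    echo "SEMOPM=$3"
    echo "SEMMNI=$4"
}

# Usage on a live system:
#   describe_sem "$(cat /proc/sys/kernel/sem)"
```

On the host above, SEMMNI=128 is the limit on the number of semaphore sets that can exist at once.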

Double each value and apply the change:

[root@zabbix zabbix]# echo 'kernel.sem = 500 64000 64 256' >> /etc/sysctl.conf
[root@zabbix zabbix]# sysctl -p
kernel.sem = 500 64000 64 256
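
Appending with >> adds a new line every time the command is run, so repeating it leaves duplicate kernel.sem entries in /etc/sysctl.conf. A minimal sketch of an idempotent alternative follows; set_sysctl_line is a hypothetical helper, and it assumes the usual "key = value" layout with at most leading blanks before the key.

```shell
# Set "key = value" in a sysctl-style file. An existing uncommented entry
# for the key is replaced; otherwise the entry is appended at the end.
# set_sysctl_line is a hypothetical helper name used only for this example.
set_sysctl_line() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    [ -f "$file" ] || : > "$file"
    awk -v k="$key" -v v="$value" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            # Literal prefix match on the key, then optional blanks and "=".
            if (index(line, k) == 1 && substr(line, length(k) + 1) ~ /^[ \t]*=/) {
                if (!found) print k " = " v   # keep a single entry
                found = 1
                next
            }
            print
        }
        END { if (!found) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (as root):
#   set_sysctl_line /etc/sysctl.conf kernel.sem "500 64000 64 256" && sysctl -p
```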

Finally, remove the semaphore arrays left behind by zabbix:

[root@zabbix zabbix]# ipcs -s|wc -l
143
[root@zabbix zabbix]# ipcs -s | grep zabbix | awk '{print $2}' | xargs -n 1 ipcrm -s
[root@zabbix zabbix]# ipcs -s|wc -l
10
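
Note that ipcs -s | wc -l also counts the header lines, and grep zabbix matches the word anywhere on a line. As a rough sketch, the arrays can instead be counted per owner by matching the owner column only. count_sem_by_owner is a hypothetical helper; it assumes the usual ipcs -s layout where data rows start with a hexadecimal key and the owner is the third column.

```shell
# Count semaphore arrays per owner from `ipcs -s` output read on stdin.
# Data rows are recognised by a key field that starts with "0x";
# header and blank lines are skipped.
# count_sem_by_owner is a hypothetical helper name used only for this example.
count_sem_by_owner() {
    awk '$1 ~ /^0x/ { n[$3]++ } END { for (o in n) print o, n[o] }' | sort
}

# Usage on a live system:
#   ipcs -s | count_sem_by_owner
```

Comparing the zabbix count with SEMMNI (the fourth field of kernel.sem) shows how close the host is to the limit.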

3. The second start attempt failed with a different error:

8599:20200401:140301.791 using configuration file: /etc/zabbix/zabbix_server.conf
8599:20200401:140301.798 current database version (mandatory/optional): 03040000/03040007
8599:20200401:140301.799 required mandatory version: 03040000
8599:20200401:140302.063 __mem_malloc: skipped 0 asked 120 skip_min 4294967295 skip_max 0
8599:20200401:140302.063 [file:dbconfig.c,line:90] zbx_mem_malloc(): out of memory (requested 120 bytes)
8599:20200401:140302.063 [file:dbconfig.c,line:90] zbx_mem_malloc(): please increase CacheSize configuration parameter

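The error shows that zabbix ran out of memory while allocating from its configuration cache. This is not a shortage of system RAM: the server itself has enough memory. Fortunately the log says to increase CacheSize. The parameter is set in the configuration file /etc/zabbix/zabbix_server.conf; the default is 8M, and I raised it to 1024M. On the third attempt zabbix-server finally started normally.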

[root@zabbix zabbix]# grep CacheSize /etc/zabbix/zabbix_server.conf 
### Option: VMwareCacheSize
# VMwareCacheSize=8M
### Option: CacheSize
CacheSize=1024M
### Option: HistoryCacheSize
# HistoryCacheSize=16M
### Option: HistoryIndexCacheSize
# HistoryIndexCacheSize=4M
### Option: TrendCacheSize
# TrendCacheSize=4M
### Option: ValueCacheSize
# ValueCacheSize=8M
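
The grep above matches every *CacheSize option, including commented-out defaults. A minimal sketch for printing only the active value of one parameter is shown here; active_param is a hypothetical helper, and it assumes the setting is written as Name=value at the start of a line, as in the listing above.

```shell
# Print the uncommented value(s) of one parameter in a "Name=value" config file.
#   $1 = config file, $2 = parameter name
# Commented lines such as "# CacheSize=8M" and similarly named options such as
# "ValueCacheSize" do not match, because the text before "=" must equal the name.
# active_param is a hypothetical helper name used only for this example.
active_param() {
    awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2) }' "$1"
}

# Usage:
#   active_param /etc/zabbix/zabbix_server.conf CacheSize
```

An empty result means the parameter is not set explicitly in that file.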