1.被动健康检查
Nginx自带有健康检查模块:ngx_http_upstream_module,可以做到基本的健康检查,配置如下:
upstream cluster{ server 172.16.0.23:80 max_fails=1 fail_timeout=10s; server 172.16.0.24:80 max_fails=1 fail_timeout=10s; # max_fails=1和fail_timeout=10s 表示在单位周期为10s钟内,中达到1次连接失败,那么接将把节点标记为不可用,并等待下一个周期(同样时常为fail_timeout)再一次去请求,判断是否连接是否成功。 # fail_timeout为10s,max_fails为1次。 }server { listen 80; server_name xxxxxxx.com; location / { proxy_pass http://cluster; }}
Nginx只有当有访问时后,才发起对后端节点探测。如果本次请求中,节点正好出现故障,Nginx依然将请求转交给故障的节点,然后再转交给健康的节点处理。所以不会影响到这次请求的正常进行。但是会影响效率,因为多了一次转发,而且自带模块无法做到预警。
2.主动健康检查(需使用第三方模块)
主动地健康检查,nignx定时主动地去ping后端的服务列表,当发现某服务出现异常时,把该服务从健康列表中移除,当发现某服务恢复时,又能够将该服务加回健康列表中。淘宝有一个开源的实现nginx_upstream_check_module模块
官网:http { upstream cluster1 { # simple round-robin server 192.168.0.1:80; server 192.168.0.2:80; check interval=3000 rise=2 fall=5 timeout=1000 type=http; check_http_send "HEAD / HTTP/1.0\r\n\r\n"; check_http_expect_alive http_2xx http_3xx; } upstream cluster2 { # simple round-robin server 192.168.0.3:80; server 192.168.0.4:80; check interval=3000 rise=2 fall=5 timeout=1000 type=http; check_keepalive_requests 100; check_http_send "HEAD / HTTP/1.1\r\nConnection: keep-alive\r\n\r\n"; check_http_expect_alive http_2xx http_3xx; } server { listen 80; location /1 { proxy_pass http://cluster1; } location /2 { proxy_pass http://cluster2; } location /status { check_status; access_log off; allow SOME.IP.ADD.RESS; deny all; } }}
3.集成第三方模块部署
3.1、下载nginx_upstream_check_module模块
#进入nginx安装目录 cd /usr/local/nginx #下载nginx_upstream_check_module模块 wget https://codeload.github.com/yaoweibin/nginx_upstream_check_module/zip/master #wget https://github.com/yaoweibin/nginx_upstream_check_module/archive/master.zip #解压 unzip mastercd nginx-1.12 # 进入nginx的源码目录# -p0,是“当前路径” -p1,是“上一级路径”patch -p1 < ../nginx_upstream_check_module-master/check_1.11.5+.patch#nginx -V 可以查看原有配置 输出 ./configure --prefix=/usr/local/nginx#增加upstream_check模块./configure --prefix=/usr/local/nginx --add-module=../nginx_upstream_check_module-mastermake/usr/local/nginx/sbin/nginx -t # 检查下是否有问题注意 check版本和Nginx版本要求有限制 1.12以上版本的nginx,补丁为check_1.11.5+.patch 具体参考github https://github.com/yaoweibin/nginx_upstream_check_module
3.2.修改配置文件,让nginx_upstream_check_module模块生效
http { upstream cluster1 { # simple round-robin server 192.168.0.1:80; server 192.168.0.2:80; check interval=3000 rise=2 fall=5 timeout=1000 type=http; check_http_send "HEAD / HTTP/1.0\r\n\r\n"; check_http_expect_alive http_2xx http_3xx; } upstream cluster2 { # simple round-robin server 192.168.0.3:80; server 192.168.0.4:80; check interval=3000 rise=2 fall=5 timeout=1000 type=http; check_keepalive_requests 100; check_http_send "HEAD / HTTP/1.1\r\nConnection: keep-alive\r\n\r\n"; check_http_expect_alive http_2xx http_3xx; } server { listen 80; location /1 { proxy_pass http://cluster1; } location /2 { proxy_pass http://cluster2; } location /status { check_status; access_log off; allow SOME.IP.ADD.RESS; deny all; } }}
3.3重载nginx
访问http://nginx/nstatus
人为把其中的一个节点关掉刷新http://nginx/nstatus
udp反向代理时健康检查的问题,另一位大神在上面nginx_upstream_check_module的基础上作了修改,实现了在第4层的代理tcp和udp时的健康检查。