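I. Introduction to keepalived
keepalived is software that works much like layer 3, 4 & 5 switching, that is, what we usually call layer-3, layer-4, and layer-5 switching. Its job is to monitor the state of the web servers: if a web server crashes or malfunctions, keepalived detects it and removes the faulty server from the pool; once the server is working normally again, keepalived automatically adds it back to the pool. All of this happens automatically with no manual intervention; the only manual work is repairing the faulty web server.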
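How it works
Layers 3, 4 & 5 correspond to the IP layer, the TCP layer, and the application layer of the TCP/IP stack. The principle at each layer is as follows:
Layer 3: when keepalived works at layer 3, it periodically sends an ICMP packet (the same thing the ping program does) to each server in the pool. If a server's IP address does not respond, keepalived reports that server as failed and removes it from the pool. A typical example is a server that was shut down improperly. The layer-3 check uses the reachability of the server's IP address as the criterion for whether the server is healthy.
Layer 4: once you understand layer 3, layer 4 is easy. It uses the state of a TCP port to decide whether the server is healthy. For example, a web server normally listens on port 80; if keepalived finds that port 80 is not open, it removes that server from the pool.
Layer 5: layer 5 works at the application layer. It is somewhat more complex than layers 3 and 4 and uses a little more network bandwidth. keepalived checks whether the server program is running correctly according to the user's settings; if the result does not match those settings, keepalived removes the server from the pool.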
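The three check levels can be sketched in shell. This only illustrates the idea and is not how keepalived is implemented internally. The function names (`check_l3`, `check_l4`, `check_l5`, `healthy_servers`) are hypothetical, and the sketch assumes `ping`, `curl`, and `timeout` are installed and that bash has `/dev/tcp` support.

```shell
#!/bin/bash
# Layer 3: is the address reachable? One ICMP echo, 1-second timeout.
# The port argument ($2) is accepted but ignored.
check_l3() { ping -c 1 -W 1 "$1" >/dev/null 2>&1; }

# Layer 4: does the TCP port accept connections? Uses bash's /dev/tcp.
check_l4() { timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; }

# Layer 5: does the application answer as configured (here: HTTP 200 on /)?
check_l5() {
    local code
    code=$(curl -s -o /dev/null -m 3 -w '%{http_code}' "http://$1:$2/") || return 1
    [ "$code" = "200" ]
}

# Print only the servers for which the given check succeeds.
# Usage: healthy_servers CHECK_CMD PORT SERVER...
healthy_servers() {
    local check=$1 port=$2 server
    shift 2
    for server in "$@"; do
        if "$check" "$server" "$port"; then
            echo "$server"
        fi
    done
}
```

For example, `healthy_servers check_l4 80 172.16.2.1 172.16.2.16` would print only the addresses whose port 80 accepts connections, which is the pool that remains after failed servers are removed.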
II. Lab steps
1. Use node1 as the management node: set up mutual SSH trust between node1 and node2, synchronize the clocks, and then install keepalived.

    [root@node1 ~]# ansible all -m yum -a 'name=keepalived state=present'
    [root@node1 keepalived]# rpm -qc keepalived
    /etc/keepalived/keepalived.conf     # the main configuration file
    /etc/sysconfig/keepalived
2. On node1, modify the configuration file as follows:

    global_defs {
       notification_email {
          root@localhost                 # mail recipient; several can be listed
       }
       notification_email_from kaadmin@localhost   # sender address; can be arbitrary
       smtp_server 127.0.0.1             # address of the mail server
       smtp_connect_timeout 30           # connection timeout
       router_id LVS_DEVEL
    }
    vrrp_instance VI_1 {                 # each vrrp_instance defines one virtual router
       state MASTER                      # initial state of this node
       interface eth0
       virtual_router_id 51              # virtual router ID; must not exceed 255
       priority 100                      # initial priority
       advert_int 1                      # advertisement interval in seconds
       authentication {                  # authentication
          auth_type PASS
          auth_pass 1111                 # password
       }
       virtual_ipaddress {               # the virtual IP (VIP)
          172.16.2.8
       }
    }
3. Copy the configuration file to node2, then change the initial state and the priority there:

    [root@node1 keepalived]# scp keepalived.conf node2:/etc/keepalived/
    [root@node2 ~]# cd /etc/keepalived/
    [root@node2 keepalived]# ls
    keepalived.conf
    [root@node2 keepalived]# vim keepalived.conf
    vrrp_instance VI_1 {
       state BACKUP                      # initial state
       interface eth0
       virtual_router_id 51
       priority 99                       # must be lower than the master's priority
       advert_int 1
       authentication {
          auth_type PASS
          auth_pass 1111
       }
       virtual_ipaddress {
          172.16.2.8
       }
    }
Start the service on node1:

    [root@node1 ~]# service keepalived start

Then check the IP addresses:

    [root@node1 ~]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.8/32 scope global eth0
        inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
        inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 2e:79:b3:b2:3e:31 brd ff:ff:ff:ff:ff:ff
4. Now stop keepalived on node1:

    [root@node1 keepalived]# service keepalived stop
    Stopping keepalived: [ OK ]

Verify that node2 has taken over the virtual_ipaddress:

    [root@node2 ~]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:74:c7:7b brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.8/32 scope global eth0
        inet6 fe80::20c:29ff:fe74:c77b/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 0a:b1:ef:7b:93:18 brd ff:ff:ff:ff:ff:ff

Verified: the VIP 172.16.2.8 is now on node2.
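The failover just observed follows from the VRRP election rule: among the nodes still sending advertisements for a virtual router, the one with the highest priority holds the VIP. Below is a minimal sketch of that rule. `elect_master` is a hypothetical helper, and ties, which real VRRP breaks by comparing IP addresses, are ignored.

```shell
#!/bin/bash
# Usage: elect_master NAME:PRIORITY...
# Prints the name of the live node with the highest priority.
elect_master() {
    local entry name prio best_name="" best_prio=-1
    for entry in "$@"; do
        name=${entry%%:*}
        prio=${entry##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master node1:100 node2:99   # both nodes alive: prints node1
elect_master node2:99             # keepalived stopped on node1: prints node2
```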
You can also define an external check in the configuration file with vrrp_script, and track it inside vrrp_instance with track_script, so that the result of the script can trigger a failover.
To test this, modify /etc/keepalived/keepalived.conf as follows:

    global_defs {
       notification_email {
          root@localhost
       }
       notification_email_from kaadmin@localhost
       smtp_server 127.0.0.1
       smtp_connect_timeout 30
       router_id LVS_DEVEL
    }
    vrrp_script chk_maintainace {        # the check is named chk_maintainace
       script "[[ -e /etc/keepalived/down ]] && exit 1 || exit 0"   # a script path or an inline command
       interval 1                        # run the check every second
       weight -2                         # lower the priority by 2 when the check fails
    }
    vrrp_instance VI_1 {
       state MASTER
       interface eth0
       virtual_router_id 51
       priority 100
       advert_int 1
       authentication {
          auth_type PASS
          auth_pass 1111
       }
       virtual_ipaddress {
          172.16.2.8
       }
       track_script {                    # track the external check defined above
          chk_maintainace
       }
    }

Create the down file on node1:

    [root@node1 keepalived]# touch down
    [root@node1 keepalived]# ls
    down keepalived.conf keepalived.conf.bak
Make the same configuration change on node2, but do not create the down file there. Then restart the service on both nodes:

    [root@node1 keepalived]# ansible all -m shell -a 'service keepalived restart'
    node2.magedu.com | success | rc=0 >>
    Stopping keepalived: [FAILED]
    Starting keepalived: [ OK ]
    node1.magedu.com | success | rc=0 >>
    Stopping keepalived: [ OK ]
    Starting keepalived: [ OK ]
Check the result on node2. Because the down file exists on node1, node1's priority drops from 100 to 98, which is below node2's 99, so node2 now holds the VIP:

    [root@node2 keepalived]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:74:c7:7b brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.8/32 scope global eth0
        inet6 fe80::20c:29ff:fe74:c77b/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 0a:b1:ef:7b:93:18 brd ff:ff:ff:ff:ff:ff
Now delete the down file under /etc/keepalived/ on node1 and check again:

    [root@node1 keepalived]# ls
    down keepalived.conf keepalived.conf.bak
    [root@node1 keepalived]# rm down
    rm: remove regular empty file
    [root@node1 keepalived]# ls
    keepalived.conf keepalived.conf.bak
    [root@node1 keepalived]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.8/32 scope global eth0
        inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
        inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 0a:bd:4f:a9:ed:67 brd ff:ff:ff:ff:ff:ff

Verified: the VIP 172.16.2.8 is back on node1.
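The down-file test works because of simple priority arithmetic: with a negative `weight`, the instance's priority is lowered by that amount for as long as the tracked script exits non-zero. Below is a small sketch of that calculation. `effective_priority` is a hypothetical helper that models only the negative-weight case used in this article.

```shell
#!/bin/bash
# Usage: effective_priority BASE WEIGHT SCRIPT_EXIT_CODE
# Models a negative weight only: the priority drops while the script fails.
effective_priority() {
    local base=$1 weight=$2 rc=$3
    if [ "$rc" -ne 0 ] && [ "$weight" -lt 0 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

# down file present on node1: the check exits 1, so 100 - 2 = 98 < 99
effective_priority 100 -2 1
# down file removed: the check exits 0, node1 is back at 100 > 99
effective_priority 100 -2 0
```

The first call prints 98 and the second prints 100, which matches the VIP moving to node2 (priority 99) and then back to node1.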
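III. Four features in detail
1. How do you send notifications on state transitions?
2. How do you configure ipvs?
3. How do you make a specific service highly available?
4. How do you implement a master/master model based on multiple virtual routers?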
1. To send notifications on state transitions, you need to define notification scripts. They can be defined either in a vrrp_sync_group { } block or in a vrrp_instance { } block.
The man page (man keepalived.conf) shows two ways to define notification scripts.
The first way:

    # to MASTER transition
    notify_master /path/to_master.sh
    # to BACKUP transition
    notify_backup /path/to_backup.sh
    # FAULT transition
    notify_fault "/path/fault.sh VG_1"

The second way:

    # arguments
    # $1 = "GROUP"|"INSTANCE"
    # $2 = name of group or instance
    # $3 = target state of transition
    #      ("MASTER"|"BACKUP"|"FAULT")
    notify /path/notify.sh
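The second form passes three arguments to a single script. Below is a minimal sketch of such a handler, assuming we only want to log the transition. The function name `describe_transition` and the log path are hypothetical and do not come from the man page.

```shell
#!/bin/bash
# $1 = "GROUP" or "INSTANCE", $2 = name, $3 = target state
describe_transition() {
    local type=$1 name=$2 state=$3
    case "$state" in
        MASTER|BACKUP|FAULT)
            echo "$type $name is now $state"
            ;;
        *)
            echo "unknown state: $state" >&2
            return 1
            ;;
    esac
}

# When keepalived runs this file as the notify script it supplies the three
# arguments; the log path below is a placeholder.
if [ "$#" -eq 3 ]; then
    describe_transition "$@" >> /tmp/keepalived-notify.log
fi
```

A single generic script like this avoids maintaining three nearly identical files, at the cost of having to branch on the state inside the script.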
For example, a notification script for the transition to MASTER:

    #!/bin/bash
    #
    vip=172.16.2.8
    contact='root@localhost'
    # address of eth0 on this node
    thisip=$(ifconfig eth0 | awk '/inet addr:/{print $2}' | awk -F: '{print $2}')

    notify() {
        mailbody="vrrp transition, $vip floated to $thisip."
        subject="$thisip is to be $vip master"
        echo "$mailbody" | mail -s "$subject" "$contact"
    }
    notify

Scripts for the other state transitions are similar.
Below is a simple example, notify.sh, that handles notification for all state transitions in one script:

    #!/bin/bash
    # Author: MageEdu <linuxedu@foxmail.com>
    # description: An example of notify script
    #
    vip=172.16.2.8
    contact='root@localhost'

    notify() {
        mailsubject="$(hostname) to be $1: $vip floating"
        mailbody="$(date '+%F %H:%M:%S'): vrrp transition, $(hostname) changed to be $1"
        echo "$mailbody" | mail -s "$mailsubject" "$contact"
    }

    case "$1" in
        master)
            notify master
            exit 0
        ;;
        backup)
            notify backup
            exit 0
        ;;
        fault)
            notify fault
            exit 0
        ;;
        *)
            echo "Usage: $(basename "$0") {master|backup|fault}"
            exit 1
        ;;
    esac
Test it:

    [root@node1 keepalived]# ./notify.sh backup
    [root@node1 keepalived]# mail
    Heirloom Mail version 12.4 7/29/08. Type ? for help.
    "/var/spool/mail/root": 6 messages 1 new 6 unread
     U  1 centos@stu2.magedu.c  Sat Aug 17 09:34  17/644   "*** SECURITY"
     U  2 Cron Daemon           Tue Aug 27 00:01  22/747   "Cron <root@s"
     U  3 Cron Daemon           Fri Aug 30 00:01  22/747   "Cron <root@s"
     U  4 Mail Delivery System  Fri Aug 30 17:42  91/2751  "Undelivered "
     U  5 Cron Daemon           Tue Sep  3 00:01  22/747   "Cron <root@s"
    >N  6 root                  Thu Sep 26 21:19  18/700   "node1.magedu"
    & 6
    Message 6:
    From root@node1.magedu.com Thu Sep 26 21:19:32 2013
    Return-Path: <root@node1.magedu.com>
    X-Original-To: root@localhost
    Delivered-To: root@localhost.magedu.com
    Date: Thu, 26 Sep 2013 21:19:32 +0800
    To: root@localhost.magedu.com
    Subject: node1.magedu.com to be backup: 172.16.2.8 floating
    User-Agent: Heirloom mailx 12.4 7/29/08
    Content-Type: text/plain; charset=us-ascii
    From: root@node1.magedu.com (root)
    Status: R

    2013-09-26 21:19:32: vrrp transition, node1.magedu.com changed to be backup

    & quit
    Held 6 messages in /var/spool/mail/root
    You have mail in /var/spool/mail/root
Passing master, backup, or fault as the argument works in each case.
Call the script from the configuration file keepalived.conf:

    vrrp_instance VI_1 {
       state MASTER
       interface eth0
       virtual_router_id 51
       priority 100
       advert_int 1
       authentication {
          auth_type PASS
          auth_pass 1111
       }
       virtual_ipaddress {
          172.16.2.8
       }
       track_script {
          chk_maintainace
       }
       notify_master "/etc/keepalived/notify.sh master"
       notify_backup "/etc/keepalived/notify.sh backup"
       notify_fault "/etc/keepalived/notify.sh fault"
    }
Give node2 the same configuration, then test:

    [root@node1 keepalived]# ls
    down keepalived.conf keepalived.conf.bak notify.sh
    [root@node1 keepalived]# rm -f down
    [root@node1 keepalived]# mail
    >N 18 root  Thu Sep 26 21:57  18/700  "node1.magedu.com to be master: 172.16.2.8 floating"

(Only one message from the list is shown here.)
All transitions were verified successfully.
2. How do you configure ipvs?

    virtual_server 172.16.2.8 80 {
        delay_loop 6
        lb_algo rr
        lb_kind NAT
        nat_mask 255.255.0.0
        persistence_timeout 0
        protocol TCP
        #
        real_server 172.16.2.1 80 {
            weight 1
            HTTP_GET {
                url {
                    path /
                    status_code 200
                }
                connect_timeout 3
                nb_get_retry 3
                delay_before_retry 3
            }
        }
        real_server 172.16.2.16 80 {
            weight 1
            HTTP_GET {
                url {
                    path /
                    status_code 200
                }
                connect_timeout 3
                nb_get_retry 3
                delay_before_retry 3
            }
        }
    }
Make the same change on node2 and start the httpd service. keepalived generates the ipvs rules automatically; check them with ipvsadm:

    [root@node1 keepalived]# ipvsadm -L -n
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  172.16.2.8:80 rr
      -> 172.16.2.1:80                Local   1      0          0
      -> 172.16.2.16:80               Masq    1      0          0
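The rules keepalived generated above correspond roughly to what you would otherwise add by hand with ipvsadm. The sketch below only prints the commands and does not run them. The helper `print_ipvs_rules` is hypothetical, while `-A`, `-a`, `-t`, `-s`, `-r`, `-m`, and `-w` are standard ipvsadm options. Rules added by hand this way have no health checking, which is the part keepalived adds.

```shell
#!/bin/bash
# Usage: print_ipvs_rules VIP PORT SCHEDULER REAL_SERVER...
# Prints (does not run) ipvsadm commands for a NAT-mode TCP virtual service.
print_ipvs_rules() {
    local vip=$1 port=$2 sched=$3 rs
    shift 3
    # -A adds the virtual service, -s selects the scheduler
    echo "ipvsadm -A -t $vip:$port -s $sched"
    for rs in "$@"; do
        # -a adds a real server, -m selects masquerading (NAT), -w sets the weight
        echo "ipvsadm -a -t $vip:$port -r $rs:$port -m -w 1"
    done
}

print_ipvs_rules 172.16.2.8 80 rr 172.16.2.1 172.16.2.16
```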
3. How do you make a specific service highly available? nginx is used as the example.
Install nginx on both nodes:

    [root@node1 ~]# ansible all -m yum -a 'name=nginx state=present'

Start the nginx service; note that httpd must be stopped first:

    [root@node1 ~]# ansible all -m shell -a 'service nginx start'
    node2.magedu.com | success | rc=0 >>
    Starting nginx: [ OK ]
    node1.magedu.com | success | rc=0 >>
    Starting nginx: [ OK ]
On both node1 and node2, modify the notify.sh script under /etc/keepalived/ so that it also starts or stops nginx:

    #!/bin/bash
    # Author: MageEdu <linuxedu@foxmail.com>
    # description: An example of notify script
    #
    vip=172.16.2.8
    contact='root@localhost'

    notify() {
        mailsubject="$(hostname) to be $1: $vip floating"
        mailbody="$(date '+%F %H:%M:%S'): vrrp transition, $(hostname) changed to be $1"
        echo "$mailbody" | mail -s "$mailsubject" "$contact"
    }

    case "$1" in
        master)
            notify master
            /etc/rc.d/init.d/nginx start
            exit 0
        ;;
        backup)
            notify backup
            /etc/rc.d/init.d/nginx stop
            exit 0
        ;;
        fault)
            notify fault
            /etc/rc.d/init.d/nginx stop
            exit 0
        ;;
        *)
            echo "Usage: $(basename "$0") {master|backup|fault}"
            exit 1
        ;;
    esac
Then start the keepalived service; port 80 is now listening on node1:

    [root@node1 keepalived]# ss -tanl | grep :80
    LISTEN 0 128 *:80 *:*

Next, create the down file under /etc/keepalived/ and see whether the nginx service moves to node2:

    [root@node1 keepalived]# ls
    keepalived.conf keepalived.conf.bak notify.sh
    [root@node1 keepalived]# touch down
    [root@node1 keepalived]# ss -tanl | grep :80
    [root@node1 keepalived]#

Check on node2:

    [root@node2 keepalived]# ss -tanl | grep :80
    LISTEN 0 128 *:80 *:*

Verified: nginx is now highly available.
Summary: making a specific service highly available has two key points.
First: provide a script that monitors the service.
Second: track that script in the VRRP instance.
Modify the configuration file keepalived.conf accordingly:

    vrrp_script chk_nginx {
       script "killall -0 nginx"
       interval 1
       weight -2
    }
    vrrp_instance VI_1 {
       state MASTER
       interface eth0
       virtual_router_id 51
       priority 100
       advert_int 1
       authentication {
          auth_type PASS
          auth_pass 1111
       }
       virtual_ipaddress {
          172.16.2.8
       }
       track_script {
          chk_maintainace
          chk_nginx
       }
       # the notify_master / notify_backup / notify_fault lines stay as before
    }
Make the same change on node2.
Test:

    [root@node2 keepalived]# killall nginx
    You have new mail in /var/spool/mail/root
    [root@node2 keepalived]# ss -tanl | grep :80
    [root@node2 keepalived]#

On node1:

    [root@node1 keepalived]# ss -tanl | grep :80
    LISTEN 0 128 *:80 *:*

Verified: the service moved to node1.
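The `chk_nginx` check relies on signal 0: `killall -0 nginx` delivers nothing to the process and only reports, through its exit status, whether a matching process exists. The sketch below shows the same idea for a single PID using the shell's built-in `kill`. `proc_alive` is a hypothetical name, and a background `sleep` stands in for nginx.

```shell
#!/bin/bash
# Exit status 0 if a process with this PID exists and we may signal it.
proc_alive() { kill -0 "$1" 2>/dev/null; }

# Start a stand-in process and confirm the check succeeds.
sleep 30 &
pid=$!
proc_alive "$pid" && echo "running"

# Terminate it, reap it, and confirm the check now fails.
kill "$pid"
wait "$pid" 2>/dev/null || true
proc_alive "$pid" || echo "gone"
```

Because the check costs almost nothing, running it every second (`interval 1`) is reasonable; it only detects a missing process, though, not one that is running but no longer serving requests.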
4. How do you implement a master/master model based on multiple virtual routers?
A dual-master model requires two vrrp_instance blocks. Modify node1's configuration file as follows:

    vrrp_instance VI_1 {
       state MASTER
       interface eth0
       virtual_router_id 51
       priority 100
       advert_int 1
       authentication {
          auth_type PASS
          auth_pass 1111
       }
       virtual_ipaddress {
          172.16.2.8
       }
       track_script {
          chk_maintainace
          chk_nginx
       }
       notify_master "/etc/keepalived/notify.sh master"
       notify_backup "/etc/keepalived/notify.sh backup"
       notify_fault "/etc/keepalived/notify.sh fault"
    }
    vrrp_instance VI_2 {
       state BACKUP
       interface eth0
       virtual_router_id 55
       priority 99
       advert_int 1
       authentication {
          auth_type PASS
          auth_pass 2111
       }
       virtual_ipaddress {
          172.16.2.18
       }
       track_script {
          chk_maintainace
          chk_nginx
       }
       notify_master "/etc/keepalived/notify.sh master"
       notify_backup "/etc/keepalived/notify.sh backup"
       notify_fault "/etc/keepalived/notify.sh fault"
    }
Make the corresponding change on node2 with the roles reversed (BACKUP for VI_1, and MASTER with the higher priority for VI_2), restart keepalived, and test:

    [root@node1 keepalived]# service nginx status
    nginx (pid 28688) is running...
    [root@node1 keepalived]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.8/32 scope global eth0
        inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
        inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 6a:7a:4f:e0:c1:8a brd ff:ff:ff:ff:ff:ff
    You have new mail in /var/spool/mail/root
On node2:

    [root@node2 keepalived]# service nginx start
    Starting nginx: [ OK ]
    [root@node2 keepalived]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:74:c7:7b brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.18/32 scope global eth0
        inet6 fe80::20c:29ff:fe74:c77b/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 3a:4e:e8:4c:57:04 brd ff:ff:ff:ff:ff:ff

Now stop keepalived on node2 and check on node1 whether the address has moved:

    [root@node1 keepalived]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
        inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
        inet 172.16.2.8/32 scope global eth0
        inet 172.16.2.18/32 scope global eth0
        inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
        inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
           valid_lft forever preferred_lft forever
    3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether 6a:7a:4f:e0:c1:8a brd ff:ff:ff:ff:ff:ff
    You have new mail in /var/spool/mail/root

Verified: node1 now holds both 172.16.2.8 and 172.16.2.18.
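In the dual-master layout each node is MASTER for one virtual router and BACKUP for the other, so node2's two instances mirror node1's. The sketch below prints a `vrrp_instance` block from parameters. `emit_vrrp_instance` is a hypothetical helper, and the node2 values are inferred from the output above because the article does not show node2's file. The track_script and notify lines are left out for brevity.

```shell
#!/bin/bash
# Usage: emit_vrrp_instance NAME STATE VRID PRIORITY AUTH_PASS VIP
emit_vrrp_instance() {
    cat <<EOF
vrrp_instance $1 {
    state $2
    interface eth0
    virtual_router_id $3
    priority $4
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass $5
    }
    virtual_ipaddress {
        $6
    }
}
EOF
}

# node2 (assumed): BACKUP for VI_1, MASTER for VI_2, the reverse of node1
emit_vrrp_instance VI_1 BACKUP 51 99  1111 172.16.2.8
emit_vrrp_instance VI_2 MASTER 55 100 2111 172.16.2.18
```

The points that must match between the two nodes are the `virtual_router_id`, the `auth_pass`, and the VIP of each instance; only `state` and `priority` differ.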
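Summary: the above is the procedure I followed. Corrections and suggestions are very welcome!
This article comes from the "时光的印记" blog; please be sure to retain this source: http://lanlian.blog.51cto.com/6790106/1303195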