References:

http://patrick-tang.blogspot.com/2012/06/redis-keepalived-failover-system.html

http://deidara.blog.51cto.com/400447/302402

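Background:

At the time of writing, Redis has no official HA solution comparable to MySQL Proxy or Oracle RAC.

The author of Redis has a project called Redis Sentinel (http://redis.io/topics/sentinel), which is planned to provide three features: monitoring, notification, and automatic failover. It looks very promising.

Unfortunately, it is unlikely to be finished in the short term.

So automatic failover when a node fails is still a problem we have to solve ourselves.

Searching online, I found suggestions to use either HAProxy or Keepalived. For pure failover without load balancing, Keepalived is the lighter choice: it only moves a virtual IP between hosts via VRRP and does not sit in the request path the way a proxy such as HAProxy does. So I decided to go with Keepalived.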

Environment:

Master: 10.6.1.143

Slave: 10.6.1.144

Virtual IP Address (VIP): 10.6.1.200

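Design:

When both Master and Slave are healthy, Master serves requests and Slave stands by.

When Master goes down and Slave is healthy, Slave takes over service and turns off replication.

When Master comes back, it first syncs data from Slave, then turns off replication and resumes the master role. Meanwhile, Slave waits for Master to finish syncing and then returns to the slave role.

The cycle then repeats.

Note that this requires persistence to be enabled on both Master and Slave. Otherwise, during an automatic switchover, the node without persistence can come back empty and wipe the other node's data when that node syncs from it, losing everything.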
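
Since the whole design depends on persistence being on, it may be worth checking both nodes before going further. The following is only a sketch: the function names persistence_enabled and check_node are my own, and it assumes redis-cli lives at /opt/redis/bin/redis-cli as in the rest of this article. It treats a non-empty "save" setting (RDB snapshots) or "appendonly yes" (AOF) as persistence being enabled.

```shell
#!/bin/bash
# Sketch: report whether a Redis node has any persistence enabled.
# REDISCLI defaults to the path used in this article; override it if yours differs.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# persistence_enabled SAVE_VALUE APPENDONLY_VALUE (hypothetical helper)
# Succeeds when RDB snapshots (non-empty "save") or AOF ("appendonly yes") is on.
persistence_enabled() {
  local save="$1" aof="$2"
  [ -n "$save" ] || [ "$aof" = "yes" ]
}

# check_node HOST (hypothetical helper): query a live node and print the verdict.
check_node() {
  local host="$1" save aof
  # CONFIG GET prints the parameter name on line 1 and its value on line 2.
  save=$("$REDISCLI" -h "$host" CONFIG GET save | sed -n '2p' | tr -d '\r')
  aof=$("$REDISCLI" -h "$host" CONFIG GET appendonly | sed -n '2p' | tr -d '\r')
  if persistence_enabled "$save" "$aof"; then
    echo "$host: persistence enabled"
  else
    echo "$host: persistence DISABLED"
  fi
}

# Usage (requires running Redis nodes):
#   check_node 10.6.1.143
#   check_node 10.6.1.144
```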

The concrete steps are as follows.

Install Keepalived on both Master and Slave:

$ sudo apt-get install keepalived

Edit /etc/hosts on both Master and Slave:

$ sudo vim /etc/hosts

127.0.0.1   localhost
10.6.1.143  redis
10.6.1.144  redis-slave
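
To confirm the entries were picked up, something like the following could be used. It is a small sketch with a helper name of my own (lookup_host) that simply reads a hosts-format file; on a real system, `getent hosts redis` gives the resolver's view instead.

```shell
#!/bin/bash
# lookup_host NAME [HOSTS_FILE] (hypothetical helper)
# Print the IP address mapped to NAME in a hosts-format file.
lookup_host() {
  local name="$1" file="${2:-/etc/hosts}"
  # Skip comment lines, then scan the hostname columns for an exact match.
  awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }' "$file"
}

# Usage:
#   lookup_host redis
#   lookup_host redis-slave
```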

The keepalived package does not ship a default configuration file, so we need to create one by hand.

First, create the following configuration file on Master:

$ sudo vim /etc/keepalived/keepalived.conf

vrrp_script chk_redis {
        script "/etc/keepalived/scripts/redis_check.sh"   # health-check script
        interval 2                                        # run the check every 2 seconds
}
vrrp_instance VI_1 {
        state MASTER                            # initial state: MASTER
        interface eth0                          # network interface used for VRRP
        virtual_router_id 51
        priority 101                            # priority (the higher value wins)
        authentication {
                auth_type PASS                  # simple password authentication
                auth_pass redis                 # password
        }
        track_script {
                chk_redis                       # run the chk_redis check defined above
        }
        virtual_ipaddress {
                10.6.1.200                      # VIP
        }
        notify_master /etc/keepalived/scripts/redis_master.sh
        notify_backup /etc/keepalived/scripts/redis_backup.sh
        notify_fault  /etc/keepalived/scripts/redis_fault.sh
        notify_stop   /etc/keepalived/scripts/redis_stop.sh
}

Then create the following configuration file on Slave:

$ sudo vim /etc/keepalived/keepalived.conf

vrrp_script chk_redis {
        script "/etc/keepalived/scripts/redis_check.sh"   # health-check script
        interval 2                                        # run the check every 2 seconds
}
vrrp_instance VI_1 {
        state BACKUP                            # initial state: BACKUP
        interface eth0                          # network interface used for VRRP
        virtual_router_id 51
        priority 100                            # lower than the MASTER's priority
        authentication {
                auth_type PASS
                auth_pass redis                 # same password as on the MASTER
        }
        track_script {
                chk_redis                       # run the chk_redis check defined above
        }
        virtual_ipaddress {
                10.6.1.200                      # VIP
        }
        notify_master /etc/keepalived/scripts/redis_master.sh
        notify_backup /etc/keepalived/scripts/redis_backup.sh
        notify_fault  /etc/keepalived/scripts/redis_fault.sh
        notify_stop   /etc/keepalived/scripts/redis_stop.sh
}
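
The two configuration files differ only in the state and priority lines, so one way to keep them from drifting apart is to generate both from a single template. This is just a sketch; the function name render_keepalived_conf is my own, and all values are the ones used in this article.

```shell
#!/bin/bash
# render_keepalived_conf STATE PRIORITY (hypothetical helper)
# Print a keepalived.conf for this article's setup with the given state and priority.
render_keepalived_conf() {
  local state="$1" priority="$2"
  cat <<EOF
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh"
    interval 2
}
vrrp_instance VI_1 {
    state $state
    interface eth0
    virtual_router_id 51
    priority $priority
    authentication {
        auth_type PASS
        auth_pass redis
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.6.1.200
    }
    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_backup.sh
    notify_fault  /etc/keepalived/scripts/redis_fault.sh
    notify_stop   /etc/keepalived/scripts/redis_stop.sh
}
EOF
}

# Usage:
#   render_keepalived_conf MASTER 101 | sudo tee /etc/keepalived/keepalived.conf   # on Master
#   render_keepalived_conf BACKUP 100 | sudo tee /etc/keepalived/keepalived.conf   # on Slave
```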

Create the Redis health-check script on both Master and Slave:

$ sudo mkdir /etc/keepalived/scripts

$ sudo vim /etc/keepalived/scripts/redis_check.sh

#!/bin/bash

# Ask the local Redis for a PING; a healthy server answers PONG.
ALIVE=$(/opt/redis/bin/redis-cli PING)
if [ "$ALIVE" == "PONG" ]; then
  echo "$ALIVE"
  exit 0
else
  echo "$ALIVE"
  exit 1
fi
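
The check above waits for as long as redis-cli does. If a hung Redis is a concern, one option is to bound the check with the coreutils `timeout` command. This is a sketch of that variation, not what I ran in this setup; the function name redis_alive and the 1-second default are my own choices.

```shell
#!/bin/bash
# REDISCLI defaults to the path used in this article; override it if yours differs.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# redis_alive [SECONDS] (hypothetical helper)
# Succeed only if PING answers PONG within the time limit (default: 1 second).
redis_alive() {
  local limit="${1:-1}" reply
  # timeout kills redis-cli if it has not finished after $limit seconds.
  reply=$(timeout "$limit" "$REDISCLI" PING 2>/dev/null | tr -d '\r')
  [ "$reply" = "PONG" ]
}

# Usage as the body of redis_check.sh:
#   redis_alive 1 && exit 0 || exit 1
```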

Next, write the key scripts that drive the failover:

notify_master /etc/keepalived/scripts/redis_master.sh

notify_backup /etc/keepalived/scripts/redis_backup.sh

notify_fault /etc/keepalived/scripts/redis_fault.sh

notify_stop /etc/keepalived/scripts/redis_stop.sh

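Keepalived calls them on state transitions:

On entering the Master state, it calls notify_master.

On entering the Backup state, it calls notify_backup.

On detecting a problem and entering the Fault state, it calls notify_fault.

When the Keepalived process terminates, it calls notify_stop.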
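
To make the mapping concrete: Keepalived also offers a generic `notify` option, whose script is, as far as I know, invoked with the type ("GROUP" or "INSTANCE"), the name, and the target state as its first three arguments. The sketch below shows how such a single entry point could dispatch to the four scripts used in this article. I did not use this variant here, and the function name handle_transition is my own.

```shell
#!/bin/bash
# handle_transition TYPE NAME STATE (hypothetical helper)
# Print the script that corresponds to the new VRRP state.
handle_transition() {
  local type="$1" name="$2" state="$3"
  local dir="/etc/keepalived/scripts"
  case "$state" in
    MASTER) echo "$dir/redis_master.sh" ;;
    BACKUP) echo "$dir/redis_backup.sh" ;;
    FAULT)  echo "$dir/redis_fault.sh" ;;
    *)      echo "unknown state for $type $name: $state" >&2; return 1 ;;
  esac
}

# Usage inside a generic notify script:
#   script=$(handle_transition "$1" "$2" "$3") && "$script"
```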

First, create the notify_master and notify_backup scripts on the Redis Master:

$ sudo vim /etc/keepalived/scripts/redis_master.sh

#!/bin/bash

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1

echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.6.1.144 6379 >> $LOGFILE  2>&1
sleep 10 # wait 10 seconds for the data sync to finish before leaving replication

echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1

$ sudo vim /etc/keepalived/scripts/redis_backup.sh

#!/bin/bash

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1

sleep 15 # wait 15 seconds for the peer to finish syncing from us before switching roles
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.6.1.144 6379 >> $LOGFILE  2>&1
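
The fixed sleep in redis_master.sh is a guess at how long the sync takes; with a large dataset, 10 seconds may not be enough. A possible refinement, which I have not tested in this setup, is to poll INFO until the node reports master_link_status:up before running SLAVEOF NO ONE. The function name wait_for_sync and the 30-second default are my own.

```shell
#!/bin/bash
# REDISCLI defaults to the path used in this article; override it if yours differs.
REDISCLI="${REDISCLI:-/opt/redis/bin/redis-cli}"

# wait_for_sync [MAX_SECONDS] (hypothetical helper)
# Poll the local node once per second until its replication link is up.
# Returns 0 when the link is up, 1 if MAX_SECONDS attempts pass first.
wait_for_sync() {
  local max="${1:-30}" i=0
  while [ "$i" -lt "$max" ]; do
    if "$REDISCLI" INFO replication | grep -q '^master_link_status:up'; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Usage inside redis_master.sh, replacing "sleep 10":
#   $REDISCLI SLAVEOF 10.6.1.144 6379 >> $LOGFILE 2>&1
#   wait_for_sync 30 || echo "sync did not finish in time" >> $LOGFILE
#   $REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
```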

Next, create the notify_master and notify_backup scripts on the Redis Slave:

$ sudo vim /etc/keepalived/scripts/redis_master.sh

#!/bin/bash

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1

echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.6.1.143 6379 >> $LOGFILE  2>&1
sleep 10 # wait 10 seconds for the data sync to finish before leaving replication

echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1

$ sudo vim /etc/keepalived/scripts/redis_backup.sh

#!/bin/bash

REDISCLI="/opt/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1

sleep 15 # wait 15 seconds for the peer to finish syncing from us before switching roles
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.6.1.143 6379 >> $LOGFILE  2>&1
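
The Master and Slave versions of these scripts differ only in the peer address passed to SLAVEOF. If maintaining two copies is a nuisance, the peer could be derived from the local address instead. This is a sketch; peer_of, NODE_A, NODE_B, and LOCAL_IP are names I made up for illustration.

```shell
#!/bin/bash
# The two nodes from this article.
NODE_A="10.6.1.143"
NODE_B="10.6.1.144"

# peer_of LOCAL_IP (hypothetical helper): print the other node's address.
peer_of() {
  case "$1" in
    "$NODE_A") echo "$NODE_B" ;;
    "$NODE_B") echo "$NODE_A" ;;
    *) echo "unknown node: $1" >&2; return 1 ;;
  esac
}

# Usage in a shared redis_master.sh or redis_backup.sh, where LOCAL_IP is a
# hypothetical per-host setting (for example, read from a small local file):
#   PEER=$(peer_of "$LOCAL_IP") && $REDISCLI SLAVEOF "$PEER" 6379
```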

Then create the following identical scripts on both Master and Slave:

$ sudo vim /etc/keepalived/scripts/redis_fault.sh

#!/bin/bash

LOGFILE=/var/log/keepalived-redis-state.log

echo "[fault]" >> $LOGFILE
date >> $LOGFILE

$ sudo vim /etc/keepalived/scripts/redis_stop.sh

#!/bin/bash

LOGFILE=/var/log/keepalived-redis-state.log

echo "[stop]" >> $LOGFILE
date >> $LOGFILE
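
All four notify scripts start with the same two logging lines. As a sketch, they could share a small helper like the one below; log_state is a name of my own, and the log path is the one used throughout this article.

```shell
#!/bin/bash
LOGFILE="${LOGFILE:-/var/log/keepalived-redis-state.log}"

# log_state STATE [MESSAGE] (hypothetical helper)
# Append the state tag, a timestamp, and an optional message to the log.
log_state() {
  local state="$1" message="${2:-}"
  {
    echo "[$state]"
    date
    if [ -n "$message" ]; then echo "$message"; fi
  } >> "$LOGFILE"
}

# Usage:
#   log_state fault
#   log_state master "Being master...."
```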

Make all the scripts executable:

$ sudo chmod +x /etc/keepalived/scripts/*.sh

With the scripts in place, test the setup as follows:

1. Start Redis on Master

$ sudo /etc/init.d/redis start

2. Start Redis on Slave

$ sudo /etc/init.d/redis start

3. Start Keepalived on Master

$ sudo /etc/init.d/keepalived start

4. Start Keepalived on Slave

$ sudo /etc/init.d/keepalived start

5. Try connecting to Redis through the VIP:

$ redis-cli -h 10.6.1.200 INFO

The connection succeeds, and the Slave has connected as well.

role:master

slave0:10.6.1.144,6379,online
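
Reading the full INFO output by eye gets tedious when repeating the test, so the role line can be extracted directly. A minimal sketch, with redis_role as a helper name of my own and redis-cli assumed to be on the PATH as in the commands above:

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-redis-cli}"

# redis_role HOST (hypothetical helper)
# Print "master" or "slave" as reported in the node's INFO output.
redis_role() {
  # INFO lines end in CRLF, so strip the carriage returns before matching.
  "$REDISCLI" -h "$1" INFO replication | tr -d '\r' | sed -n 's/^role://p'
}

# Usage:
#   redis_role 10.6.1.200
#   redis_role 10.6.1.144
```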

6. Try inserting some data:

$ redis-cli -h 10.6.1.200 SET Hello Redis

OK

Read the data through the VIP

$ redis-cli -h 10.6.1.200 GET Hello

"Redis"

Read the data from Master

$ redis-cli -h 10.6.1.143 GET Hello

"Redis"

Read the data from Slave

$ redis-cli -h 10.6.1.144 GET Hello

"Redis"

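The three reads above can also be scripted, which is handy when repeating the check after each failover. This is a rough sketch; key_consistent is a helper name of my own, and redis-cli is assumed to be on the PATH.

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-redis-cli}"

# key_consistent KEY HOST... (hypothetical helper)
# Succeed when every listed host returns the same value for KEY.
key_consistent() {
  local key="$1" host value first="" seen=0
  shift
  for host in "$@"; do
    value=$("$REDISCLI" -h "$host" GET "$key")
    if [ "$seen" -eq 0 ]; then
      first="$value"
      seen=1
    elif [ "$value" != "$first" ]; then
      echo "mismatch on $host: '$value' != '$first'" >&2
      return 1
    fi
  done
  return 0
}

# Usage:
#   key_consistent Hello 10.6.1.200 10.6.1.143 10.6.1.144 && echo "all nodes agree"
```
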
Now simulate a failure.

Kill the Redis process on Master:

$ sudo killall -9 redis-server

Check the Keepalived state log on Master:

$ tail -f /var/log/keepalived-redis-state.log

[fault]

Thu Sep 27 08:29:01 CST 2012

Meanwhile, the log on Slave shows:

$ tail -f /var/log/keepalived-redis-state.log

[master]

Fri Sep 28 14:14:09 CST 2012

Being master....

Run SLAVEOF cmd ...

OK

Run SLAVEOF NO ONE cmd ...

OK

We can see that Slave has taken over service and now acts as master.

$ redis-cli -h 10.6.1.200 INFO

$ redis-cli -h 10.6.1.144 INFO

role:master
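
Besides the Redis role, it can be useful to confirm which machine actually holds the VIP at this point. The sketch below filters the output of `ip -o -4 addr show`; holds_vip is a helper name of my own, and eth0 is the interface from the Keepalived configuration in this article.

```shell
#!/bin/bash
VIP="10.6.1.200"

# holds_vip (hypothetical helper)
# Read `ip -o -4 addr show` output on stdin; succeed if the VIP is listed.
holds_vip() {
  # -F matches the address literally, so the dots are not treated as wildcards.
  grep -qF "inet $VIP/"
}

# Usage on either node:
#   ip -o -4 addr show dev eth0 | holds_vip && echo "this node holds the VIP"
```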

Now bring the Redis process on Master back:

$ sudo /etc/init.d/redis start

Check the Keepalived state log on Master:

$ tail -f /var/log/keepalived-redis-state.log

[master]

Thu Sep 27 08:31:33 CST 2012

Being master....

Run SLAVEOF cmd ...

OK

Run SLAVEOF NO ONE cmd ...

OK

Meanwhile, the log on Slave shows:

$ tail -f /var/log/keepalived-redis-state.log

[backup]

Fri Sep 28 14:16:37 CST 2012

Being slave....

Run SLAVEOF cmd ...

OK

As we can see, the original Master has resumed the master role. Both failover and automatic recovery worked.