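During automated installation and deployment of a ZCache server cluster, key-based SSH authorization is set up between the users on each host. The scripts described here build on that authorization to perform basic cluster management operations.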

redis-cli

This is the ZCache client tool, used to connect to a ZCache server. Run redis-cli -h to see the full list of options.

[email protected][/usr/local/redis/bin]#./redis-cli -h
redis-cli 3.2.10-2.1.0

Usage: redis-cli [OPTIONS] [cmd [arg [arg ...]]]
  -h <hostname>      Server hostname (default: 127.0.0.1).
  -p <port>          Server port (default: 6379).
  -s <socket>        Server socket (overrides hostname and port).
  -a <password>      Password to use when connecting to the server.
  -r <repeat>        Execute specified command N times.
  -i <interval>      When -r is used, waits <interval> seconds per command.
                     It is possible to specify sub-second times like -i 0.1.
  -n <db>            Database number.
  -x                 Read last argument from STDIN.
  -d <delimiter>     Multi-bulk delimiter in for raw formatting (default: \n).
  -c                 Enable cluster mode (follow -ASK and -MOVED redirections).
  --raw              Use raw formatting for replies (default when STDOUT is
                     not a tty).
  --no-raw           Force formatted output even when STDOUT is not a tty.
  --csv              Output in CSV format.
  --stat             Print rolling stats about server: mem, clients, ...
  --latency          Enter a special mode continuously sampling latency.
  --latency-history  Like --latency but tracking latency changes over time.
                     Default time interval is 15 sec. Change it using -i.
  --latency-dist     Shows latency as a spectrum, requires xterm 256 colors.
                     Default time interval is 1 sec. Change it using -i.
  --lru-test <keys>  Simulate a cache workload with an 80-20 distribution.
  --slave            Simulate a slave showing commands received from the master.
  --rdb <filename>   Transfer an RDB dump from remote server to local file.
  --pipe             Transfer raw Redis protocol from stdin to server.
  --pipe-timeout <n> In --pipe mode, abort with error if after sending all data.
                     no reply is received within <n> seconds.
                     Default timeout: 30. Use 0 to wait forever.
  --bigkeys          Sample Redis keys looking for big keys.
  --scan             List all keys using the SCAN command.
  --pattern <pat>    Useful with --scan to specify a SCAN pattern.
  --intrinsic-latency <sec> Run a test to measure intrinsic system latency.
                     The test will run for the specified amount of seconds.
  --eval <file>      Send an EVAL command using the Lua script at <file>.
  --ldb              Used with --eval enable the Redis Lua debugger.
  --ldb-sync-mode    Like --ldb but uses the synchronous Lua debugger, in
                     this mode the server is blocked and script changes are
                     are not rolled back from the server memory.
  --help             Output this help and exit.
  --version          Output version and exit.

Examples:
  cat /etc/passwd | redis-cli -x set mypasswd
  redis-cli get mypasswd
  redis-cli -r 100 lpush mylist x
  redis-cli -r 100 -i 1 info | grep used_memory_human:
  redis-cli --eval myscript.lua key1 key2 , arg1 arg2 arg3
  redis-cli --scan --pattern '*:12345*'

  (Note: when using --eval the comma separates KEYS[] from ARGV[] items)

When no command is given, redis-cli starts in interactive mode.
Type "help" in interactive mode for information on available commands
and settings.

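The tool can be used in two ways: with a command included in the arguments, or without one, in which case it runs as an interactive terminal.

With a command in the arguments
redis-cli exits as soon as the command has been executed.


Example: run the cluster info command to check the cluster state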

[email protected][/usr/local/redis/bin]#redis-cli -h 10.45.82.64 -p 7670 cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:3
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:1
cluster_stats_messages_sent:22970
cluster_stats_messages_received:22970



Run the cluster nodes command to view the cluster node information
[email protected][/usr/local/redis/bin]#redis-cli -h 10.45.82.64 -p 7670 cluster nodes
9fe944c229221823e4f50f3ce94f76c2c98d890b 10.45.82.64:7672 master - 0 1509429776801 3 connected 10923-16383
e169a287a75fee34b2739e11d64901d6b918b105 10.45.82.64:7671 master - 0 1509429777804 2 connected 5461-10922
259c81ba95cd00621855ffa6857152ef6d200426 10.45.82.64:7670 myself,master - 0 0 1 connected 0-5460




Reference node listing from the documentation (includes slave nodes)
579deae82267f8e1ef22603ad946c1b91a812be0 10.45.61.25:7380 myself,master - 0 0 10 connected 10928-16383
cabdb7d74b50f5752e8cf2bf5dc33fc99794285e 10.45.61.28:7381 slave 579deae82267f8e1ef22603ad946c1b91a812be0 0 1421196621852 10 connected
5c1a57c791ebe6b86a3ee250ad0c344852590f92 10.45.61.29:7380 master - 0 1421196619848 3 connected 6-5460
447d270fa311afa7bdcb4059ed00abc66f608938 10.45.61.28:7380 master - 0 1421196621852 8 connected 0-5 5461-10927
0519d80798dfea983033cc345cd583252cc75622 10.45.61.25:7381 slave 5c1a57c791ebe6b86a3ee250ad0c344852590f92 0 1421196620348 4 connected
f6db896b323339d96094f433f1138882d4c121fb 10.45.61.29:7381 slave 447d270fa311afa7bdcb4059ed00abc66f608938 0 1421196620850 8 connected

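Each line of the cluster nodes output contains the following information:

Node ID: for example 259c81ba95cd00621855ffa6857152ef6d200426.

ip:port: the node's IP address and port, for example 10.45.82.64:7670.

flags: the node's role (for example master, slave, myself) and its state (for example fail).

If the node is a slave, the field after flags is the node ID of its master. For example, the master of 10.45.61.29:7381 has the node ID 447d270fa311afa7bdcb4059ed00abc66f608938. For a master this field is "-".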

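The time (a Unix timestamp in milliseconds) at which the currently pending PING was sent to the node; 0 means no PING is awaiting a reply.

The time (a Unix timestamp in milliseconds) at which the node last returned a PONG reply.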

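The node's configuration epoch.

The state of the network link to the node: for example connected.

The slots the node currently holds: for example, 10.45.61.25:7380 currently holds hash slots 10928 through 16383.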
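
The field layout above can be turned into a small parser. The following is a minimal sketch in Python, not part of the ZCache tooling: the names ClusterNode and parse_node_line are made up for illustration, and the code assumes the whitespace-separated layout shown in the sample output.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ClusterNode:
    """One line of `cluster nodes` output (hypothetical helper type)."""
    node_id: str
    addr: str                    # "ip:port"
    flags: List[str]             # e.g. ["myself", "master"]
    master_id: Optional[str]     # None when the column is "-"
    ping_sent: int
    pong_recv: int
    config_epoch: int
    link_state: str
    slots: List[Tuple[int, int]] = field(default_factory=list)

    def slot_count(self) -> int:
        # Ranges are inclusive on both ends, so 0-5 covers 6 slots.
        return sum(end - start + 1 for start, end in self.slots)


def parse_node_line(line: str) -> ClusterNode:
    parts = line.split()
    slots = []
    for token in parts[8:]:
        if token.startswith("["):
            continue  # skip markers for slots being migrated or imported
        start, _, end = token.partition("-")
        slots.append((int(start), int(end or start)))
    return ClusterNode(
        node_id=parts[0],
        addr=parts[1],
        flags=parts[2].split(","),
        master_id=None if parts[3] == "-" else parts[3],
        ping_sent=int(parts[4]),
        pong_recv=int(parts[5]),
        config_epoch=int(parts[6]),
        link_state=parts[7],
        slots=slots,
    )


line = ("447d270fa311afa7bdcb4059ed00abc66f608938 10.45.61.28:7380 "
        "master - 0 1421196621852 8 connected 0-5 5461-10927")
node = parse_node_line(line)
print(node.addr, node.flags, node.slot_count())
```

Counting slots with inclusive ranges means that three masters covering 0-5460, 5461-10922 and 10923-16383 add up to exactly 16384 slots.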

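Without a command in the arguments
redis-cli enters interactive mode, in which the operator can run commands one after another until leaving with the exit command.


Example: run the set and get commands in interactive mode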
[email protected][/usr/local/redis/bin]#redis-cli -h 10.45.82.64 -p 7670 -c           
10.45.82.64:7670> set keyxh valueXh
-> Redirected to slot [14963] located at 10.45.82.64:7672
OK
10.45.82.64:7672> get keyxh
"valueXh"
10.45.82.64:7672> exit
[email protected][/usr/local/redis/bin]#
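
The redirect to slot 14963 in the session above follows from how Redis Cluster maps keys to slots: CRC16 (XMODEM variant) of the key, modulo 16384, with only the hash-tag content hashed when the key contains a non-empty {...} section. The sketch below reproduces that calculation in Python. The function names are illustrative, and it assumes ZCache keeps the standard Redis Cluster key hashing, which the transcript is consistent with.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot of a key, honouring the {hash tag} rule."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(key_slot("keyxh"))  # 14963, the slot reported in the redirect above
```

Slot 14963 falls in the range 10923-16383, which is why the -c option sent the client to 10.45.82.64:7672.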

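Collecting cluster runtime information: rc-getinfo.sh

Log in as the cache user that was created, that is, the user configured in the installation parameters when the ZCache server was set up:

# System user that owns the cluster and the redis program; created automatically if it does not exist.
redis_cluster_user = cache
# Password of the system user that owns the cluster and the redis program; set when the user is created automatically.
redis_cluster_user_password = cache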
localhost:/usr/local/redis/bin $ ./rc-getinfo.sh
OK

====================================================================================================
                total        used        free
------------------------------------------------
[10.45.82.64]
    Mem:      7994292      567496     5898040
    Swap:     2097148           0     2097148
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Server node[10.45.82.64:7670]
Pid       :      43214, Role     :       master, Uptime    :      4.43(h), Version    : 3.2.10-2.1.0
ClientNum :          1, AllKeys  :            0, ExpireKeys:            0, AvgTtl     :            0
ServerRss :     9792kB, SysMem   :  7994292(kB), SysMemUse :   567496(kB), SysMemFree :  5898040(kB)
ServerSwap:      0(kB), SysSwap  :  2097148(kB), SysSwapUse:        0(kB), SysSwapFree:  2097148(kB)
BgsaveTime:         -1, AofEnable:            0, AofRewTime:           -1, WriteStatus:           ok
AllCommand:       1505, EvictKeys:            0, HitKeys   :            0, MissKeys   :            0
##############################################SLOWLOGS##############################################
ID                      Time Duration(us)    Command
-------------------------------------------------------------------------------------
#############################################CONFIG DIFF############################################
Parameter                           Default                        Value                         
---------------------------------------------------------------------------------------------------
auto-aof-rewrite-percentage         "200"                          "0"                           
lua-time-limit                      "5000"                         "1000"                        
node-fail-delay                     "300"                          "1000"                        
repl-diskless-sync-delay            "5"                            "3"                           
dir                                 "./"                           "/home/redis/data/7670"       
client-output-buffer-limit          "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60" "normal 0 0 0 slave 0 0 0 pubsub 33554432 8388608 60"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Server node[10.45.82.64:7671]
Pid       :      43279, Role     :       master, Uptime    :      4.43(h), Version    : 3.2.10-2.1.0
ClientNum :          1, AllKeys  :            0, ExpireKeys:            0, AvgTtl     :            0
ServerRss :     9796kB, SysMem   :  7994292(kB), SysMemUse :   567496(kB), SysMemFree :  5898040(kB)
ServerSwap:      0(kB), SysSwap  :  2097148(kB), SysSwapUse:        0(kB), SysSwapFree:  2097148(kB)
BgsaveTime:         -1, AofEnable:            0, AofRewTime:           -1, WriteStatus:           ok
AllCommand:       2991, EvictKeys:            0, HitKeys   :            0, MissKeys   :            0
##############################################SLOWLOGS##############################################
ID                      Time Duration(us)    Command
-------------------------------------------------------------------------------------
#############################################CONFIG DIFF############################################
Parameter                           Default                        Value                         
---------------------------------------------------------------------------------------------------
auto-aof-rewrite-percentage         "200"                          "0"                           
lua-time-limit                      "5000"                         "1000"                        
node-fail-delay                     "300"                          "1000"                        
repl-diskless-sync-delay            "5"                            "3"                           
dir                                 "./"                           "/home/redis/data/7671"       
client-output-buffer-limit          "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60" "normal 0 0 0 slave 0 0 0 pubsub 33554432 8388608 60"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Server node[10.45.82.64:7672]
Pid       :      43344, Role     :       master, Uptime    :      4.43(h), Version    : 3.2.10-2.1.0
ClientNum :          1, AllKeys  :            1, ExpireKeys:            0, AvgTtl     :            0
ServerRss :     9804kB, SysMem   :  7994292(kB), SysMemUse :   567496(kB), SysMemFree :  5898040(kB)
ServerSwap:      0(kB), SysSwap  :  2097148(kB), SysSwapUse:        0(kB), SysSwapFree:  2097148(kB)
BgsaveTime:         -1, AofEnable:            0, AofRewTime:           -1, WriteStatus:           ok
AllCommand:       2992, EvictKeys:            0, HitKeys   :            1, MissKeys   :            0
##############################################SLOWLOGS##############################################
ID                      Time Duration(us)    Command
-------------------------------------------------------------------------------------
#############################################CONFIG DIFF############################################
Parameter                           Default                        Value                         
---------------------------------------------------------------------------------------------------
auto-aof-rewrite-percentage         "200"                          "0"                           
lua-time-limit                      "5000"                         "1000"                        
node-fail-delay                     "300"                          "1000"                        
repl-diskless-sync-delay            "5"                            "3"                           
dir                                 "./"                           "/home/redis/data/7672"       
client-output-buffer-limit          "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60" "normal 0 0 0 slave 0 0 0 pubsub 33554432 8388608 60"

###########################################COMMANDS STATS###########################################
Command                             Calls                 Usec     UsecPerCall
-----------------------------------------------------------------------------------
cluster                              3028               645780          213.27 (!)
auth                                    4                   14            3.50
ping                                    6                   15            2.50
set                                     1                   30           30.00
command                                 1                  965          965.00 (!)
get                                     1                    8            8.00
info                                 4489                97136           21.64
****************************************************************************************************

****************************************************************************************************
* SERVER                      ROLE STATUS       KEYS  CLIENTS   USEMEM(kB)    OPS NET(kBps)  SLOTS *
* ------------------------------------------------------------------------------------------------ *
* 10.45.82.64:7672          master     ok          1        1         9804      0         0   5460 *
*  |                                                                                               *
* 10.45.82.64:7671          master     ok          0        1         9796      0         0   5461 *
*  |                                                                                               *
* 10.45.82.64:7670          master     ok          0        1         9792      1      0.06   5460 *
*  |                                                                                               *
****************************************************************************************************

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Please check the warning followed:                                                               !
! 0: There are too more masters on 10.45.82.64!                                                    !
! 1: Master[10.45.82.64:7672] has no slave ,or on the same host!                                   !
! 2: Master[10.45.82.64:7671] has no slave ,or on the same host!                                   !
! 3: Master[10.45.82.64:7670] has no slave ,or on the same host!                                   !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

All statistic information of ZCache is in ZCache-cache.tar.gz !!!

Starting the cluster

# Start from a cluster host: log in on any host in the cluster as the system user that owns the cluster
localhost:/usr/local/redis/bin $ ./rc-start.sh
2017-11-07 15:12:13  *  Begin to start redis cluster on user [  cache  ]
2017-11-07 15:12:13  *   Start node [ 10.45.82.64:7672 ] by [ redis-server redis-7672.conf ]...
2017-11-07 15:12:13  *   Start node [ 10.45.82.64:7671 ] by [ redis-server redis-7671.conf ]...
2017-11-07 15:12:13  *   Start node [ 10.45.82.64:7670 ] by [ redis-server redis-7670.conf ]...
2017-11-07 15:12:14  *   [ 10.45.82.64:7672 ] start ok .
2017-11-07 15:12:14  *   [ 10.45.82.64:7671 ] start ok .
2017-11-07 15:12:14  *   [ 10.45.82.64:7670 ] start ok .
 -  100%    Time elapsed: 1 (s)
2017-11-07 15:12:14  *  ...................................Start done


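The script uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes and hosts in the cluster, and then logs in to each host over key-based SSH to start the nodes.
rc-start.sh checks each node while it starts and returns only after startup has completed. If any node fails to start, the script prints that node's startup log and asks for manual intervention.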
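
As a rough illustration of the flow described above, the sketch below groups node addresses by host and builds one SSH command per node. It is not the rc-start.sh implementation: the function names are hypothetical, the remote command string is copied from the start log above, and the working directory and PATH on the remote host are assumptions left unresolved.

```python
from collections import defaultdict
from typing import Dict, List


def group_nodes_by_host(nodes: List[str]) -> Dict[str, List[int]]:
    """Map each host to the ports of the cluster nodes it runs."""
    hosts = defaultdict(list)
    for node in nodes:
        host, _, port = node.rpartition(":")
        hosts[host].append(int(port))
    return dict(hosts)


def build_start_commands(nodes: List[str]) -> List[List[str]]:
    """Build argv lists for starting every node over key-based SSH."""
    commands = []
    for host, ports in sorted(group_nodes_by_host(nodes).items()):
        for port in sorted(ports):
            # Command text taken from the log; where it runs is an assumption.
            remote = f"redis-server redis-{port}.conf"
            # BatchMode=yes makes ssh fail instead of prompting for a
            # password, so only key-based authorization is used.
            commands.append(["ssh", "-o", "BatchMode=yes", host, remote])
    return commands


for argv in build_start_commands(
        ["10.45.82.64:7672", "10.45.82.64:7671", "10.45.82.64:7670"]):
    print(" ".join(argv))
```

The commands are only printed here; a real script would also have to wait for each node to come up and collect its log on failure, as described above.
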
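# Start from a non-cluster host: the rcmanage.py script in the ZCache cluster installation package can start the ZCache server cluster remotely from a host outside the cluster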
[email protected][/home/rcinstall]#./rcmanage.py -a start -n 10.45.82.64 -u cache -p cache
Redis Cluster Installation Tool 

Begining start for redis cluster by [email protected]... 

Finish start: 2017-11-07 15:24:01  *  ...................................Start done


rcmanage.py logs in to the specified cluster host over SSH with password authentication and calls the rc-start.sh script to start the ZCache server cluster.

Stopping the cluster

# Stop from a cluster host: log in on any host in the cluster as the system user that owns the cluster
localhost:/usr/local/redis/bin $ ./rc-stop.sh
2017-11-07 15:32:20  *  Begin to stop redis cluster on user [  cache  ]
2017-11-07 15:32:20  *   Stop automan
2017-11-07 15:32:20  *   Shutdown redis node [ 10.45.82.64:7670 ]
2017-11-07 15:32:20  *   Shutdown redis node [ 10.45.82.64:7671 ]
2017-11-07 15:32:20  *   Shutdown redis node [ 10.45.82.64:7672 ]
2017-11-07 15:32:20  *  ...................................Stop done


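The script uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes in the cluster, and then stops each node through a remote service call.
# Stop from a non-cluster host: the rcmanage.py script in the ZCache cluster installation package can stop the ZCache server cluster remotely from a host outside the cluster.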
[email protected][/home/rcinstall]#./rcmanage.py -a stop -n 10.45.82.64 -u cache -p cache
Redis Cluster Installation Tool 

Begining stop for redis cluster by [email protected]... 

Finish stop: 2017-11-07 15:33:51  *  ...................................Stop done


rcmanage.py logs in to the specified cluster host over SSH with password authentication and calls the rc-stop.sh script to stop the ZCache server cluster.

Uninstalling the cluster

# Uninstall from a cluster host: log in on any host in the cluster as the system user that owns the cluster.
#rc-destroy.sh


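The script uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes and hosts in the cluster, and then logs in to each host over key-based SSH to uninstall the nodes.
# Uninstall from a non-cluster host: the rcmanage.py script in the ZCache cluster installation package can uninstall the ZCache server cluster remotely from a host outside the cluster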
./rcmanage.py -a destroy -n 10.45.82.64 -u cache -p cache

rcmanage.py logs in to the specified cluster host over SSH with password authentication and calls the rc-destroy.sh script to uninstall the ZCache server cluster.

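Cleaning up expired keys

Expired keys are cleaned up with the rc-cleanexpire.sh script, which is installed into $REDIS_HOME/bin on every host during cluster installation (REDIS_HOME defaults to /usr/local/redis).

To clean up expired keys, log in on any host in the cluster as the system user that owns the cluster and run rc-cleanexpire.sh with no arguments. The script uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes and hosts in the cluster, and then logs in to each host over key-based SSH to clean up the expired keys.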

localhost:/usr/local/redis/bin $ ./rc-cleanexpire.sh
2017-11-08 10:57:19  *  [10.45.6.24:7670]: Start to clean expired keys...
2017-11-08 10:57:19  *  [10.45.6.24:7671]: Start to clean expired keys...
2017-11-08 10:57:19  *  [10.45.6.24:7672]: Start to clean expired keys...
2017-11-08 10:57:19  *  [10.45.82.64:7670]: Start to clean expired keys...
2017-11-08 10:57:19  *  [10.45.82.64:7671]: Start to clean expired keys...
2017-11-08 10:57:19  *  [10.45.82.64:7672]: Start to clean expired keys...
2017-11-08 10:57:19  *  [10.45.82.64:7680]: Start to clean expired keys...
2017-11-08 10:57:20  *  [10.45.82.64:7681]: Start to clean expired keys...
2017-11-08 10:57:20  *  [10.45.82.64:7682]: Start to clean expired keys...
2017-11-08 10:57:20  *  [10.45.82.64:7690]: Start to clean expired keys...
2017-11-08 10:57:20  *  [10.45.82.64:7692]: Start to clean expired keys...
<10.45.6.24:7670     > | [##################################################]100%
<10.45.6.24:7671     > | [##################################################]100%
<10.45.6.24:7672     > | [##################################################]100%
<10.45.82.64:7670    > | [##################################################]100%
<10.45.82.64:7671    > | [##################################################]100%
<10.45.82.64:7672    > | [##################################################]100%
<10.45.82.64:7680    > | [##################################################]100%
<10.45.82.64:7681    > | [##################################################]100%
<10.45.82.64:7682    > | [##################################################]100%
<10.45.82.64:7690    > | [##################################################]100%
<10.45.82.64:7692    > | [##################################################]100%
2017-11-08 10:57:19  *  [10.45.6.24:7670]: Clean expired keys: 0
2017-11-08 10:57:19  *  [10.45.6.24:7672]: Clean expired keys: 0
2017-11-08 10:57:19  *  [10.45.6.24:7671]: Clean expired keys: 0
2017-11-08 10:57:20  *  [10.45.82.64:7670]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7671]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7672]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7680]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7681]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7682]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7690]: Clean expired keys: 0
2017-11-08 10:57:21  *  [10.45.82.64:7692]: Clean expired keys: 0

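Changing node parameters online

Node runtime parameters are changed online with the rc-setconfig.sh script, which is installed into $REDIS_HOME/bin on every host during cluster installation (REDIS_HOME defaults to /usr/local/redis).

To change runtime parameters online, log in on any host in the cluster as the system user that owns the cluster and run rc-setconfig.sh. The script uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes and hosts in the cluster, and then connects to the server nodes and changes their runtime parameters with the config command.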

localhost:/usr/local/redis/bin $ ./rc-setconfig.sh
Usage: rc-setconfig.sh [-m|-s|-n <IP:PORT>] [-wh] confname [confvalue]
   # -m, -s and -n are mutually exclusive; if none of them is set, the runtime parameters of all nodes are modified
   -m
       only modify masters config [if not set -m&-s&-n,will modify all nodes]
   -s
       only modify slaves config [if not set -m&-s&-n,will modify all nodes]
   -n IP:PORT
       only modify the node[IP:PORT] config
   -w
       write the config file after modify config online
   -h
       show usage

Examples:
   1. set all server nodes online: loglevel=warning :
      $ rc-setconfig.sh loglevel warning
   2. set all server nodes online & write to config file: loglevel=notice :
      $ rc-setconfig.sh -w loglevel notice
   3. set master nodes online: loglevel=notice :
      $ rc-setconfig.sh -m loglevel notice
   4. set node<10.45.43.200:7370> online: loglevel=notice :
      $ rc-setconfig.sh -n 10.45.43.200:7370 loglevel notice
   5. get loglevel on all server nodes:
      $ rc-setconfig.sh loglevel

When no confvalue is given,rc-setconfig.sh just show the config.



localhost:/usr/local/redis/bin $ ./rc-setconfig.sh -m -w loglevel debug
2017-11-08 11:03:04  *  Process config on master nodes ...
2017-11-08 11:03:04  *  [10.45.6.24:7670]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.6.24:7671]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.6.24:7672]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7670]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7671]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7672]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7680]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7681]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7682]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7690]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  [10.45.82.64:7692]: CONFIG SET loglevel debug : [write config file: OK]
2017-11-08 11:03:04  *  Get the value of <loglevel> :
================================================================================
 SERVER                | loglevel
--------------------------------------------------------------------------------
 10.45.6.24:7670       | "debug"
 10.45.6.24:7671       | "debug"
 10.45.6.24:7672       | "debug"
 10.45.82.64:7670      | "debug"
 10.45.82.64:7671      | "debug"
 10.45.82.64:7672      | "debug"
 10.45.82.64:7680      | "debug"
 10.45.82.64:7681      | "debug"
 10.45.82.64:7682      | "debug"
 10.45.82.64:7690      | "debug"
 10.45.82.64:7692      | "debug"
================================================================================
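
The option handling described in the usage text can be sketched as follows. This is an illustration rather than the script's actual logic: the function names and the (address, role) tuple are hypothetical, and mapping -w to CONFIG REWRITE is an assumption based on the "write config file" message in the output. CONFIG GET, CONFIG SET and CONFIG REWRITE themselves are standard Redis commands.

```python
from typing import List, Optional, Tuple

Node = Tuple[str, str]  # ("ip:port", "master" or "slave")


def select_targets(nodes: List[Node], masters_only: bool = False,
                   slaves_only: bool = False,
                   node: Optional[str] = None) -> List[str]:
    """Pick target addresses; -m, -s and -n are mutually exclusive."""
    chosen = sum([masters_only, slaves_only, node is not None])
    if chosen > 1:
        raise ValueError("-m, -s and -n are mutually exclusive")
    if masters_only:
        return [addr for addr, role in nodes if role == "master"]
    if slaves_only:
        return [addr for addr, role in nodes if role == "slave"]
    if node is not None:
        return [addr for addr, _ in nodes if addr == node]
    return [addr for addr, _ in nodes]  # no selector: all nodes


def build_commands(name: str, value: Optional[str] = None,
                   write: bool = False) -> List[List[str]]:
    """Commands to send to each target node."""
    if value is None:
        return [["CONFIG", "GET", name]]   # no confvalue: just show it
    commands = [["CONFIG", "SET", name, value]]
    if write:
        # Assumption: -w persists the change with CONFIG REWRITE.
        commands.append(["CONFIG", "REWRITE"])
    return commands


nodes = [("10.45.6.24:7670", "master"), ("10.45.6.24:7680", "slave")]
print(select_targets(nodes, masters_only=True))
print(build_commands("loglevel", "debug", write=True))
```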

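Rolling data back to a point in time

When AOF persistence is enabled on the cache cluster, this tool can be used for data recovery. For example, if a large amount of cached data has been deleted or modified by mistake, stop the business applications and use the rc-recover.sh tool to restore ZCache to a specified point in time.

Log in on any host in the cluster as the system user that owns the cluster and run rc-recover.sh. The tool uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes and hosts in the cluster, and then invokes point-in-time recovery on all nodes concurrently.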

localhost:/usr/local/redis/bin $ ./rc-recover.sh
Usage: rc-recover.sh [OPTIONS]
   -g <groupid>       Recover group id.
   -t <timepoint>     Recover to the timepoint.
   -c                 Clear data before recover.

Examples:
   $ rc-recover.sh -g 1 -t 20170101010101
   $ rc-recover.sh -g 1 -t 20170101010101 -c

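Note: AOF persistence adds the configuration parameter aof-expire-time, the retention period for AOF files in seconds. The default value is 0, meaning that only the currently valid AOF file is kept and obsolete historical AOF files are deleted automatically. When the value is greater than 0, obsolete AOF files are kept for the specified period before they are deleted. [Obsolete historical AOF files are retained mainly for data backup and point-in-time recovery.]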
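
Before calling rc-recover.sh from automation, it can help to validate the -t value first. The sketch below assumes the time point is a 14-digit YYYYMMDDHHMMSS string, which is inferred from the example 20170101010101 and not confirmed by the usage text. The helper names are hypothetical.

```python
from datetime import datetime
from typing import List


def parse_timepoint(text: str) -> datetime:
    """Parse a time point, assuming the YYYYMMDDHHMMSS layout."""
    if len(text) != 14 or not text.isdigit():
        raise ValueError(f"expected 14 digits (YYYYMMDDHHMMSS): {text!r}")
    return datetime.strptime(text, "%Y%m%d%H%M%S")


def build_recover_args(group_id: int, timepoint: str,
                       clear_first: bool = False) -> List[str]:
    """Build the rc-recover.sh argument list from the documented options."""
    parse_timepoint(timepoint)  # raises ValueError on malformed input
    args = ["rc-recover.sh", "-g", str(group_id), "-t", timepoint]
    if clear_first:
        args.append("-c")
    return args


print(parse_timepoint("20170101010101"))
print(build_recover_args(1, "20170101010101", clear_first=True))
```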

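Cluster self-management

The automatic management tool (rc-automan.sh) monitors the running state of the ZCache server cluster. When it finds that a master node is down and the cluster cannot fail over on its own, it steps in and completes the master-slave failover. Depending on the configuration, it can also restart a node that has gone down.

The setting that controls this is automan:

# automan: cluster self-management setting
# 0 - automatic restart and failover both disabled
# 1 - automatic restart enabled
# 2 - automatic failover enabled
# 3 - automatic restart and failover both enabled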
automan 3
Usage: rc-automan.sh [-d|-i interval]
   -d
       show debug information
   -i interval
       check cluster status interval [default 10 (s)]
   -h
       show usage

Examples:
   1. manage zcache in login user
      $ rc-automan.sh
   2. check zcache status every 60 seconds
      $ rc-automan.sh -i 60
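
The automan values 0 to 3 listed above read naturally as a two-bit mask (1 for restart, 2 for failover, 3 for both). A short sketch of that interpretation follows. The bitmask reading is an inference from the value list, and the names Automan and describe are made up for illustration.

```python
from enum import IntFlag


class Automan(IntFlag):
    RESTART = 1    # automatically restart failed nodes
    FAILOVER = 2   # automatically fail over failed masters


def describe(value: int) -> dict:
    """Decode an automan setting into its two switches."""
    if value not in (0, 1, 2, 3):
        raise ValueError(f"automan must be 0..3, got {value}")
    mode = Automan(value)
    return {
        "auto_restart": bool(mode & Automan.RESTART),
        "auto_failover": bool(mode & Automan.FAILOVER),
    }


for value in range(4):
    print(value, describe(value))
```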

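The self-management tool steps in to complete a master-slave switch only when the ZCache server cluster cannot recover from a master failure on its own. [This works like a centralized distributed system, and it covers the case in which a decentralized system (ZCache server) cannot heal itself: when no more than half of the master nodes are alive, no election can take place.]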
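
The majority condition mentioned above is simple arithmetic, sketched below. The function names are illustrative, and the rule for when the tool intervenes is a simplified reading of the paragraph, not the tool's actual decision logic.

```python
def can_elect(alive_masters: int, total_masters: int) -> bool:
    """An election needs strictly more than half of the masters alive."""
    return alive_masters * 2 > total_masters


def needs_external_failover(alive_masters: int, total_masters: int) -> bool:
    """True when some master is down and the cluster cannot elect by itself."""
    return alive_masters < total_masters and not can_elect(
        alive_masters, total_masters)


print(can_elect(2, 3))                # True: 2 of 3 masters is a majority
print(needs_external_failover(1, 2))  # True: 1 of 2 is exactly half
```

With only two masters, losing either one leaves exactly half alive, so the cluster cannot elect a replacement by itself. That is the kind of situation the self-management tool is meant to handle.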

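The automatic management tool runs under the cluster user on every host that runs ZCache server, but only one instance (MASTER) performs the self-management work; the other instances stay in STANDBY. A STANDBY instance takes over as the new MASTER only when the MASTER process dies. The MASTER process also checks for STANDBY instances that have died and restarts them.

Note: the self-management tool does not need to be started manually; rc-start.sh starts it along with the cluster.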

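Viewing cluster node information

Cluster node information is viewed with the rc-tool.sh script, which is installed into $REDIS_HOME/bin on every host during cluster installation (REDIS_HOME defaults to /usr/local/redis).

To view cluster node information, log in on any host in the cluster as the system user that owns the cluster and run rc-tool.sh with the nodes argument. The script uses the logged-in system user to locate the configuration files of the cluster nodes on the current host, derives from them all nodes and hosts in the cluster, and displays them in a formatted table.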

localhost:/usr/local/redis/bin $ ./rc-tool.sh
Usage: ./rc-tool.sh <config|nodes|cmds> ip port


localhost:/usr/local/redis/bin $ ./rc-tool.sh nodes 10.45.6.24 7670

NODEID             SERVER                 ROLE            PING-TIME     PONG-TIME     EPOCH  STATE        SLOTS 
-------------------------------------------------------------------------------------------------------------------
6568131fda...      10.45.6.24:7670        myself,master   0             0             11     connected    0-909 5461-6372
abdf90c344...      10.45.6.24:7671        master          0             1510203448712 13     connected    2731-3640 9102 13654-14563
ce7adb006f...      10.45.6.24:7672        master          0             1510203447711 12     connected    910 8193-9101 10923-11833
259c81ba95...      10.45.82.64:7670       master          0             1510203445706 1      connected    3641-5460
e169a287a7...      10.45.82.64:7671       master          0             1510203446709 2      connected    9103-10922
9fe944c229...      10.45.82.64:7672       master          0             1510203445706 3      connected    14564-16383
4aeb672cac...      10.45.82.64:7680       master          0             1510203448714 8      connected    11834-13653
6d604078c1...      10.45.82.64:7681       master          0             1510203447712 6      connected    6373-8192
1223f1c83c...      10.45.82.64:7682       master          0             1510203448714 7      connected    911-2730
718125f4d9...      10.45.82.64:7690       master          0             1510203444697 15     connected    
2b7ea1255c...      10.45.82.64:7692       master          0             1510203448713 14     connected

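The tool formats the information for every node in the cluster, shows master-slave relationships as a tree, and sorts the output by master IP so that a given node is easy to find.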
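
The grouping and ordering described above can be sketched in a few lines. This is not the rc-tool.sh code: the tuple layout, the function names and the " |- " marker are all hypothetical. The sketch sorts IPs numerically with the standard ipaddress module, so that 10.45.61.100 sorts after 10.45.61.29 rather than before it, as plain string sorting would do.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

# (node_id, "ip:port", master_id or None for a master)
Node = Tuple[str, str, Optional[str]]


def addr_key(addr: str):
    """Sort key that compares the IP numerically, then the port."""
    host, _, port = addr.rpartition(":")
    return (ipaddress.ip_address(host), int(port))


def render_tree(nodes: List[Node]) -> List[str]:
    slaves: Dict[str, List[str]] = {}
    for _, addr, master_id in nodes:
        if master_id is not None:
            slaves.setdefault(master_id, []).append(addr)
    masters = [(node_id, addr) for node_id, addr, m in nodes if m is None]
    lines = []
    for node_id, addr in sorted(masters, key=lambda m: addr_key(m[1])):
        lines.append(addr)
        for slave_addr in sorted(slaves.get(node_id, []), key=addr_key):
            lines.append(" |- " + slave_addr)
    return lines


nodes = [
    ("a", "10.45.61.29:7380", None),
    ("b", "10.45.61.25:7380", None),
    ("c", "10.45.61.28:7381", "b"),
    ("d", "10.45.61.100:7380", None),
]
print("\n".join(render_tree(nodes)))
```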
