Scheduled Kubernetes Backups with Velero
LiuSw

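I. Introduction to Velero

Velero is a cloud-native disaster recovery and migration tool written in Go. It can safely back up, restore, and migrate Kubernetes cluster resources and persistent volumes. "Velero" is Spanish for "sailboat", in keeping with the nautical naming style of the Kubernetes community.

Velero currently offers the following features:

  • Backing up and restoring Kubernetes cluster data
  • Copying the resources of one Kubernetes cluster to another Kubernetes cluster
  • Replicating a production environment to development and test environments

Velero consists of two components, a server and a client:

  • Server: runs inside the Kubernetes cluster
  • Client: the velero command-line tool, which runs locally on a machine where kubectl and the cluster kubeconfig are already configured

Velero use cases

  • Disaster recovery: back up and restore a Kubernetes cluster
  • Migration: copy cluster resources to another cluster (for example, to keep the configuration of development, test, and production clusters in sync and simplify environment setup)

How Velero differs from an etcd backup

  • Backing up etcd directly captures every resource in the cluster as a whole, whereas Velero backs up at the level of individual Kubernetes objects.
  • Besides backing up the whole cluster, Velero can select what to back up or restore by Type, Namespace, and Label.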
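As a rough illustration of filtering by type, namespace, and label, the sketch below wraps three velero backup create invocations in small shell functions. The function names, the resource list, and the arguments are placeholders chosen for this example; --include-resources, --include-namespaces, and --selector are options of velero backup create.

```shell
#!/bin/sh
# Sketch: three ways to narrow a Velero backup. The function names and the
# resource list are illustrative assumptions, not part of Velero itself.

# By resource type: only Deployments and ConfigMaps, across all namespaces.
backup_by_type() {
  velero backup create "$1" --include-resources deployments,configmaps
}

# By namespace: everything inside one namespace.
backup_by_namespace() {
  velero backup create "$1" --include-namespaces "$2"
}

# By label: only objects that match a label selector such as app=nginx.
backup_by_label() {
  velero backup create "$1" --selector "$2"
}
```

For example, backup_by_label nginx-only app=nginx would create a backup named nginx-only that contains only objects labeled app=nginx.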

Velero creates a number of CRDs and related controllers in the Kubernetes cluster; backup and restore operations are in essence operations on these custom resources.

[root@master1 ~]# kubectl -n velero get crds -l component=velero

NAME CREATED AT
backups.velero.io 2021-08-06T08:29:34Z
backupstoragelocations.velero.io 2021-08-06T08:29:34Z
deletebackuprequests.velero.io 2021-08-06T08:29:34Z
downloadrequests.velero.io 2021-08-06T08:29:34Z
podvolumebackups.velero.io 2021-08-06T08:29:34Z
podvolumerestores.velero.io 2021-08-06T08:29:34Z
resticrepositories.velero.io 2021-08-06T08:29:34Z
restores.velero.io 2021-08-06T08:29:34Z
schedules.velero.io 2021-08-06T08:29:34Z
serverstatusrequests.velero.io 2021-08-06T08:29:35Z
volumesnapshotlocations.velero.io 2021-08-06T08:29:35Z

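When a user runs a backup command, the backup proceeds as follows (the numbers in parentheses refer to the steps in Velero's workflow diagram):

  1. The velero client calls the custom resource API to create a Backup object (1).
  2. The BackupController detects the new Backup object (2) and performs the backup (3).
  3. The backed-up cluster resources and volume snapshots are uploaded to Velero's backend storage (4) and (5).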
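Because a backup is just a Backup custom resource, step 1 can also be reproduced without the velero CLI. The sketch below prints a minimal Backup manifest; the function name backup_manifest and the default TTL are assumptions made for this example, and the manifest assumes Velero runs in the velero namespace as in the rest of this article.

```shell
#!/bin/sh
# Sketch: print a minimal Velero Backup manifest for one namespace.
# backup_manifest is a hypothetical helper name used only in this example.
backup_manifest() {
  bm_name=$1              # name of the Backup object
  bm_namespace=$2         # namespace whose resources should be backed up
  bm_ttl=${3:-72h0m0s}    # retention period; this default is an assumption
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: ${bm_name}
  namespace: velero
spec:
  includedNamespaces:
    - ${bm_namespace}
  ttl: ${bm_ttl}
EOF
}
```

Piping the output into kubectl, for example backup_manifest mybackup-01 nginx-example | kubectl apply -f -, creates the object that the BackupController then picks up.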

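II. Installing Velero

Installing Velero involves two parts: the velero CLI and the server. Prerequisites:

  • A production Kubernetes cluster (cluster1) and a disaster recovery Kubernetes cluster (cluster2); backing up and restoring within a single cluster also works
  • MinIO object storage deployed in the disaster recovery environment; a public cloud object store also works

1. Installing MinIO object storage

Velero stores its backup data in object storage; here MinIO stands in for a public cloud object store. One node in the disaster recovery environment hosts MinIO, and the Velero servers of both clusters point to the same bucket in that object store. During disaster recovery, cluster1 writes backups to the bucket and cluster2 reads them from the bucket to restore.

Method 1:

MinIO can be deployed inside the Kubernetes cluster with YAML manifests or the official MinIO Operator. The following instead runs MinIO outside the cluster with Docker: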

docker run -d --name minio \
--restart always \
-p 9000:9000 \
-p 9001:9001 \
-e "MINIO_ROOT_USER=minio" \
-e "MINIO_ROOT_PASSWORD=minio123" \
-v minio-data:/data \
minio/minio server /data --console-address ":9001"

Method 2 (the method used in this article):

1. Set up the MinIO service on CentOS

Download: https://dl.min.io/server/minio/release/linux-amd64/minio

mkdir -p /opt/minio/data
chmod +x ./minio
cp minio /usr/local/bin

2. Start MinIO directly, replacing the address below with the one for your environment

export MINIO_ROOT_USER=minio
export MINIO_ROOT_PASSWORD=minio123
minio server /opt/minio/data --address 192.168.11.159:9000 --console-address 192.168.11.159:9001

3. Once it works, register MinIO as a service that runs in the background and starts at boot

Create the configuration file /opt/minio/minio.conf, replacing the address below with the one for your environment:

cd /opt/minio
cat > minio.conf <<EOF
# Volume to be used for MinIO server.
MINIO_VOLUMES="/opt/minio/data"
# Use if you want to run MinIO on a custom port.
MINIO_OPTS="--address 192.168.11.159:9000 --console-address 192.168.11.159:9001"
# Root user (login name) of the server.
MINIO_ROOT_USER=minio
# Root password of the server.
MINIO_ROOT_PASSWORD=minio123
EOF

  • MINIO_VOLUMES: the path where data is stored
  • MINIO_OPTS: startup options; replace the address with your own
  • MINIO_ROOT_USER: the login user (the same variable as in step 2)
  • MINIO_ROOT_PASSWORD: the login password

Register MinIO as a systemd service on CentOS; make sure the paths below match your setup:

vi /lib/systemd/system/minio.service

[Unit]
Description=Minio
Documentation=https://docs.minio.io
Wants=network-online.target
After=network-online.target
AssertFileIsExecutable=/usr/local/bin/minio

[Service]
WorkingDirectory=/opt/minio/data
EnvironmentFile=-/opt/minio/minio.conf
ExecStartPre=/bin/bash -c "if [ -z \"${MINIO_VOLUMES}\" ]; then echo \"Variable MINIO_VOLUMES not set in /opt/minio/minio.conf\"; exit 1; fi"
ExecStart=/usr/local/bin/minio server $MINIO_OPTS $MINIO_VOLUMES
# Let systemd restart this service always
Restart=always
# Specifies the maximum file descriptor number that can be opened by this process
LimitNOFILE=65536
# Disable timeout logic and wait until process is stopped
TimeoutStopSec=infinity
SendSIGKILL=no

[Install]
WantedBy=multi-user.target

4. Start the service and enable it at boot

systemctl daemon-reload
systemctl start minio
systemctl status minio
systemctl enable minio

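5. Once MinIO is running, open http://ip:9001 in a browser (replace ip with the address of the MinIO host)

6. Log in and create a bucket named velero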
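The bucket can also be created from the command line instead of the web console. The sketch below assumes the MinIO client mc is installed, which this article does not otherwise use; the alias name backupstore and the function name are arbitrary choices for the example.

```shell
#!/bin/sh
# Sketch: create the Velero bucket with the MinIO client (mc).
# Assumes mc is on PATH; "backupstore" is an arbitrary alias name.
create_velero_bucket() {
  cvb_endpoint=$1            # e.g. http://192.168.11.159:9000
  cvb_user=$2                # MinIO root user
  cvb_password=$3            # MinIO root password
  cvb_bucket=${4:-velero}    # bucket name, defaults to velero

  # Register the server under an alias, then create the bucket;
  # --ignore-existing keeps the call from failing if the bucket already exists.
  mc alias set backupstore "$cvb_endpoint" "$cvb_user" "$cvb_password" &&
  mc mb --ignore-existing "backupstore/$cvb_bucket"
}
```

With the values used in this article, the call would be create_velero_bucket http://192.168.11.159:9000 minio minio123.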

2. Installing the Velero CLI

wget https://github.com/vmware-tanzu/velero/releases/download/v1.5.2/velero-v1.5.2-linux-amd64.tar.gz
tar -zxvf velero-v1.5.2-linux-amd64.tar.gz
cp velero-v1.5.2-linux-amd64/velero /usr/local/bin/

Check the Velero CLI version:

[root@master1 velero]# velero version

3. Installing the Velero server

1. Create the MinIO access credentials file locally

cat > credentials-velero <<EOF
[default]
aws_access_key_id = minio
aws_secret_access_key = minio123
EOF

2. Use the velero CLI to install Velero on both the production and the disaster recovery cluster, pointing them at the same object store (change the s3Url at the end to your MinIO address):

velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.2.0 \
--bucket velero \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://192.168.11.159:9000

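In an isolated intranet environment, import the Velero image velero/velero:v1.5.2 (and the plugin image velero/velero-plugin-for-aws:v1.2.0) onto the cluster nodes beforehand.

The --use-restic flag makes Velero back up persistent volume data to object storage with restic, which Velero supports out of the box; managed Kubernetes services from public cloud vendors instead offer CSI implementations and matching Velero plugins. The flag is not used in this walkthrough, and because --use-volume-snapshots=false is set as well, the backups taken here contain Kubernetes resources but not the data inside persistent volumes.

3. Check the pods created in the cluster

[root@master1 velero]# kubectl -n velero get pods
NAME READY STATUS RESTARTS AGE
velero-67fdfbdf65-7gfn5 1/1 Running 0 31m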

4. Check the Velero version

[root@master1 velero]# velero version
Client:
Version: v1.5.2
Git commit: e115e5a191b1fdb5d379b62a35916115e77124a4
Server:
Version: v1.5.2

Velero is now installed.

4. Uninstalling Velero

To reinstall Velero, uninstall it first:

kubectl delete namespace/velero clusterrolebinding/velero
kubectl delete crds -l component=velero

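III. Backing up with Velero

To back up an application and its PV in the cluster, first deploy the sample application that ships with the velero CLI:

[root@master1 velero]# cd velero-v1.5.2-linux-amd64/examples/nginx-app
[root@master1 nginx-app]# kubectl apply -f with-pv.yaml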

List the application and the persistent volume that were created. The volume is provisioned dynamically through a StorageClass, so the target cluster used for the restore should provide a StorageClass with the same name:

[root@master1 nginx-app]# kubectl -n nginx-example get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-5ccc99bffb-swvr2 2/2 Running 0 6m59s

[root@product-cluster nginx-app]# kubectl -n nginx-example get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-fd5eee69-ca7f-4e94-a749-7d4e04dc4a7c 50Mi RWO Delete Bound nginx-example/nginx-logs longhorn 6m54s

Run the backup, covering every resource in the nginx-example namespace:

velero backup create mybackup-01 --include-namespaces=nginx-example

Check the backup progress:

velero backup describe mybackup-01

List every resource in the backup:

velero backup describe mybackup-01 --details

Once the backup has finished, view its logs:

velero backup logs mybackup-01

Check the final status of the backup:

velero backup get
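When the backup is started from a script, its final status can be polled rather than checked by hand. The following is a minimal sketch that reads the phase of the Backup object with kubectl; the function name wait_for_backup, the default timeout, and the polling interval are assumptions made for this example, and Velero is assumed to run in the velero namespace.

```shell
#!/bin/sh
# Sketch: poll a Velero backup until it reaches a final phase.
# Returns 0 on Completed, 1 on a failed phase, 2 on timeout.
wait_for_backup() {
  wfb_name=$1               # backup name
  wfb_timeout=${2:-600}     # seconds to wait in total (assumed default)
  wfb_interval=${3:-10}     # seconds between checks (assumed default)
  wfb_elapsed=0

  while [ "$wfb_elapsed" -lt "$wfb_timeout" ]; do
    # The phase is stored in .status.phase of the Backup custom resource.
    wfb_phase=$(kubectl -n velero get backups.velero.io "$wfb_name" \
      -o jsonpath='{.status.phase}') || wfb_phase=""
    case "$wfb_phase" in
      Completed)
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $wfb_name ended in phase $wfb_phase" >&2
        return 1 ;;
    esac
    sleep "$wfb_interval"
    wfb_elapsed=$((wfb_elapsed + wfb_interval))
  done

  echo "timed out waiting for backup $wfb_name" >&2
  return 2
}
```

A typical call would be velero backup create mybackup-01 --include-namespaces=nginx-example followed by wait_for_backup mybackup-01, which avoids guessing how long the backup takes.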

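IV. Restoring with Velero

Delete some pods or deployments in the cluster (for example, the nginx application created above), then restore them from the backup:

velero restore create --from-backup=mybackup-01

Check the restored application and data:

kubectl -n nginx-example get pods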
NAME READY STATUS RESTARTS AGE
nginx-deployment-5ccc99bffb-swvr2 2/2 Running 0 7m10s

kubectl -n nginx-example get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-c9713a25-a3f6-4179-89df-fd6e27ea5055 5Gi RWO Delete Bound ns-panda/mysql-pv-claim longhorn 31h

V. Common Velero commands

Velero options (output of velero --help):

Velero is a tool for managing disaster recovery, specifically for Kubernetes
cluster resources. It provides a simple, configurable, and operationally robust
way to back up your application state and associated data.

If you're familiar with kubectl, Velero supports a similar model, allowing you to
execute commands such as 'velero get backup' and 'velero create schedule'. The same
operations can also be performed as 'velero backup get' and 'velero schedule create'.

Usage:
velero [command]

Available Commands:
backup Work with backups
backup-location Work with backup storage locations
bug Report a Velero bug
client Velero client related commands
completion Output shell completion code for the specified shell (bash or zsh).
create Create velero resources
delete Delete velero resources
describe Describe velero resources
get Get velero resources
help Help about any command
install Install Velero
plugin Work with plugins
restic Work with restic
restore Work with restores
schedule Work with schedules
snapshot-location Work with snapshot locations
version Print the velero version and associated image

Flags:
--add_dir_header If true, adds the file directory to the header
--alsologtostderr log to standard error as well as files
--features stringArray Comma-separated list of features to enable for this Velero process. Combines with values from $HOME/.config/velero/config.json if present
-h, --help help for velero
--kubeconfig string Path to the kubeconfig file to use to talk to the Kubernetes apiserver. If unset, try the environment variable KUBECONFIG, as well as in-cluster configuration
--kubecontext string The context to use to talk to the Kubernetes apiserver. If unset defaults to whatever your current-context is (kubectl config current-context)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--log_file string If non-empty, use this log file
--log_file_max_size uint Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
--logtostderr log to standard error instead of files (default true)
--master --kubeconfig (Deprecated: switch to --kubeconfig) The address of the Kubernetes API server. Overrides any value in kubeconfig. Only required if out-of-cluster.
-n, --namespace string The namespace in which Velero should operate (default "velero")
--skip_headers If true, avoid header prefixes in the log messages
--skip_log_headers If true, avoid headers when opening log files
--stderrthreshold severity logs at or above this threshold go to stderr (default 2)
-v, --v Level number for the log level verbosity
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging

Use "velero [command] --help" for more information about a command.

Back up all resources in the cluster:

velero backup create mybackup-01

Back up all resources in a single namespace:

velero backup create mybackup-01 --include-namespaces=mynamespace

Back up all resources in several namespaces:

velero backup create mybackup --include-namespaces=mynamespaceA,mynamespaceB

Create a backup that is deleted automatically after one hour:

velero backup create mybackup --include-namespaces=mynamespace --ttl=1h

List backups:

velero backup get

Show the details of the backup mybackup:

velero backup describe mybackup

Delete the backup mybackup:

velero backup delete mybackup

Restore from a backup; if the backup contains several namespaces, a single one can be restored:

velero restore create --from-backup=mybackup-01 --include-namespaces=mynamespace

Back up mynamespace every day at 01:00 and delete each backup after 72 hours:

velero schedule create myschedule --schedule="0 1 * * *" --ttl 72h --include-namespaces=mynamespace

Show the backup storage location configuration:

kubectl -n velero get BackupStorageLocation -o yaml

VI. Scheduled cluster health check and backup

The script below is meant to be run on a schedule:

#!/bin/bash
########################################################################################
# Purpose: check that the Kubernetes environment is healthy, then back up the cluster
# Author:  liuSw
# Date:    2021-12-08
# Version: v1.0
########################################################################################

# Register this script as a cron job
function addCron {
    # Absolute path of this script
    me=$(cd "$(dirname "$0")" && pwd)/$(basename "$0")

    # Crontab file of the current user
    cron=/var/spool/cron/$(whoami)

    # Add a daily 23:00 entry unless the script is already scheduled
    touch "${cron}"
    if [ "$(grep -c "$(basename "$0")" "${cron}")" -eq 0 ]; then
        echo "00 23 * * * ${me}" >>"${cron}"
    fi
}

# Uncomment the next line to install the cron job
#addCron

# Mail address for alerts (internal or external)
MailAddr=

# Log file
LogFile=/tmp/backup-k8s.log

# Logs are kept by default; uncomment to truncate the previous log on every run
# [ -f "${LogFile}" ] && : >"${LogFile}"

# Namespaces to back up (currently unused: the backup below covers every namespace except velero)
nameSpaces="kube-system,kubesphere-system"

# Backup timestamp, also used as the backup name
Date=$(date +%Y-%m-%d-%H-%M)

# Check that every node is in the Ready state
if [ "$(kubectl get nodes | grep -wv Ready | grep -vc NAME)" -ne 0 ]; then
    echo -e "\n$(date '+%Y-%m-%d %H:%M') [Error] K8s Node is not Ready." >>"${LogFile}"
    kubectl get nodes | grep -wv Ready | grep -v NAME >>"${LogFile}"
    #exit
    # Send mail
    #kubectl get nodes | grep -wv Ready | grep -v NAME | mail -s "Subject" ${MailAddr}
fi

# Check that the core Kubernetes components are Running
for p in kube-apiserver- kube-controller-manager- kube-proxy- kube-scheduler- etcd-; do
    if [ "$(kubectl get pod -n kube-system | grep "$p" | grep -vc "Running")" -ne 0 ]; then
        echo -e "\n$(date '+%Y-%m-%d %H:%M') [Error] K8s kube-pod is not Running." >>"${LogFile}"
        kubectl get pod -n kube-system | grep "$p" | grep -v "Running" >>"${LogFile}"
        # Send mail
        #kubectl get pod -n kube-system | grep "$p" | mail -s "Subject" ${MailAddr}
    fi
done

# Check for pods in a problem state (logged when more than 3 pods share a state)
for i in OutOfcpu Evicted Terminating Error Pending; do
    if [ "$(kubectl get pods -A | grep -c "${i}")" -gt 3 ]; then
        echo -e "\n$(date '+%Y-%m-%d %H:%M') [Error] K8s pod is Error." >>"${LogFile}"
        kubectl get pods -A | grep "${i}" >>"${LogFile}"
        # Send mail
        #kubectl get pods -A | grep "${i}" | mail -s "Subject" ${MailAddr}
    fi
done

# Back up the cluster (runs daily at 23:00 once the cron job is installed)
velero backup create "${Date}" --exclude-namespaces="velero" >>"${LogFile}"

# Give the backup time to finish before reading its status
sleep 60

# Read the backup status
if [ "$(velero backup get | grep -w "${Date}" | awk '{print $2}')" = "Completed" ]; then
    echo "$(date '+%Y-%m-%d %H:%M') [info] velero backup Successful." >>"${LogFile}"
else
    echo "$(date '+%Y-%m-%d %H:%M') [error] velero backup failed!" >>"${LogFile}"
    # Send mail
    #echo "[error] ${Date} velero backup failed!" | mail -s "Subject" ${MailAddr}
fi

# Read the root mailbox so that the "new mail" notice is not shown
cat /var/spool/mail/root >/dev/null 2>&1
exit 0
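The health checks in the script match status words anywhere in a line with grep, so a pod whose name happens to contain "Error" or "Pending" would be counted as well. A stricter variant, sketched below under the assumption that kubectl is called with --no-headers, compares only the STATUS column with awk. Both function names are made up for this example.

```shell
#!/bin/sh
# Sketch: column-based health checks. Both functions read kubectl output
# (produced with --no-headers) from standard input and print a count.

# Count nodes whose STATUS column (2nd field) is not Ready.
# "Ready,SchedulingDisabled" still counts as Ready.
count_not_ready_nodes() {
  awk '$2 !~ /^Ready(,|$)/ { n++ } END { print n + 0 }'
}

# Count pods whose STATUS column equals the given value. With
# "kubectl get pods -A --no-headers" the STATUS column is the 4th field.
count_pods_in_status() {
  awk -v want="$1" '$4 == want { n++ } END { print n + 0 }'
}
```

Usage would look like kubectl get nodes --no-headers | count_not_ready_nodes and kubectl get pods -A --no-headers | count_pods_in_status Pending, with the printed numbers compared against the same thresholds as in the script.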

Reference: https://blog.csdn.net/networken/article/details/120368546
