Flink Cluster Setup
Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.
Before installing Flink, a Hadoop cluster environment must already be in place.
Cluster list

| Server  | Address       | Role          | Notes (specs)          |
| ------- | ------------- | ------------- | ---------------------- |
| hadoop1 | 192.168.11.81 | master, slave | 32G RAM, 12C, 800G disk |
| hadoop2 | 192.168.11.82 | master, slave | 32G RAM, 12C, 800G disk |
| hadoop3 | 192.168.11.83 | slave         | 32G RAM, 12C, 800G disk |
1. Base environment setup
Disable the firewall
```bash
systemctl stop firewalld
systemctl disable firewalld
```
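An optional sanity check that firewalld is really off:

```bash
systemctl is-active firewalld   # expect "inactive"
firewall-cmd --state            # expect "not running"
```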
Disable SELinux
```bash
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0
```
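A quick check that the change took effect:

```bash
getenforce   # prints "Permissive" now; "Disabled" after the next reboot
```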
Add hosts entries
```bash
vi /etc/hosts
```

```
192.168.11.81 hadoop1
192.168.11.82 hadoop2
192.168.11.83 hadoop3
```
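start-cluster.sh starts the remote TaskManagers over SSH, so passwordless SSH from the node you launch the cluster on to every node should also be set up. A minimal sketch, assuming everything runs as root (as in the verification output later in this guide):

```bash
# Run once on hadoop1; accept the defaults when prompted.
ssh-keygen -t rsa
# Copy the public key to every node, including hadoop1 itself.
for host in hadoop1 hadoop2 hadoop3; do
  ssh-copy-id root@${host}
done
```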
2. Install the Flink cluster
Download the Flink package
```
https://archive.apache.org/dist/flink/
https://archive.apache.org/dist/flink/flink-1.14.4/flink-1.14.4-bin-scala_2.12.tgz
```
Upload and extract the Flink package
```bash
tar -zxvf flink-1.14.4-bin-scala_2.12.tgz -C /data
```
Configure environment variables
```bash
# Quote the heredoc delimiter ('EOF') so that $PATH and $FLINK_HOME are
# written to /etc/profile literally instead of being expanded right now.
cat >>/etc/profile <<'EOF'
# flink
export FLINK_HOME=/data/flink-1.14.4
export PATH=$PATH:$FLINK_HOME/bin
EOF
source /etc/profile
```
Configure Flink parameters
```bash
cd $FLINK_HOME/conf
vi flink-conf.yaml
```

```yaml
jobmanager.rpc.address: hadoop1
```
Configure the worker nodes
```bash
vi workers
```

```
hadoop2
hadoop3
```
Note: the configuration above must be copied to the other nodes, as sketched below.
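A minimal sketch of that copy step, assuming passwordless SSH as set up earlier:

```bash
# Push the configured Flink directory and the profile changes to the workers.
for host in hadoop2 hadoop3; do
  scp -r /data/flink-1.14.4 root@${host}:/data/
  scp /etc/profile root@${host}:/etc/profile
done
```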
3. Start the services
```bash
cd $FLINK_HOME/bin
./start-cluster.sh
```
4. Stop the services
```bash
cd $FLINK_HOME/bin
./stop-cluster.sh
```
5. Web UI
```
http://192.168.11.81:8081
```
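The same port also serves Flink's monitoring REST API, which is handy for scripted checks; for example, the cluster overview endpoint:

```bash
# Returns a JSON summary: TaskManager count, total/available slots, job counts.
curl http://192.168.11.81:8081/overview
```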
6. Configure Standalone HA mode for the Flink cluster
```bash
cd $FLINK_HOME/conf
vi masters
```

```
hadoop2:8081
hadoop1:8081
```
```bash
vi workers
```

```
hadoop1
hadoop2
hadoop3
```
```bash
cd $FLINK_HOME/conf
vi flink-conf.yaml
```

```yaml
jobmanager.rpc.address: hadoop1
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 3
parallelism.default: 5

high-availability: zookeeper
high-availability.storageDir: hdfs:///user/flink/recovery
high-availability.zookeeper.quorum: 192.168.11.160:2181,192.168.11.161:2181,192.168.11.162:2181
high-availability.zookeeper.path.root: /flink

state.backend: filesystem
state.checkpoints.dir: hdfs://hadoop1:9000/flink/flink-checkpoints
jobmanager.execution.failover-strategy: region

rest.port: 8081
rest.address: hadoop1

io.tmp.dirs: /tmp

jobmanager.archive.fs.dir: hdfs:///user/flink/recovery/completed-jobs/
historyserver.web.address: hadoop2
historyserver.web.port: 8082
historyserver.archive.fs.dir: hdfs:///user/flink/recovery/completed-jobs/
historyserver.archive.fs.refresh-interval: 1000
```
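The HDFS paths referenced above must exist and be writable by the user running Flink. A sketch, assuming HDFS is already up:

```bash
# Create the HA recovery, checkpoint, and job-archive directories.
hdfs dfs -mkdir -p /user/flink/recovery
hdfs dfs -mkdir -p /flink/flink-checkpoints
hdfs dfs -mkdir -p /user/flink/recovery/completed-jobs
```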
 
```bash
cd $FLINK_HOME/conf
vi zoo.cfg
```

```
server.1=192.168.11.160:2888:3888
server.2=192.168.11.161:2888:3888
server.3=192.168.11.162:2888:3888
```
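If no external ZooKeeper cluster is available, the Flink distribution ships helper scripts that start a quorum from this zoo.cfg (a sketch; an already-running external ZooKeeper works just as well):

```bash
cd $FLINK_HOME/bin
./start-zookeeper-quorum.sh
```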
Download the dependency package that matches your Hadoop and Flink versions, upload it to the $FLINK_HOME/lib directory, and finally distribute it to the other nodes; a sketch follows the download link below.
Download link
```
https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.6.5-10.0/
```
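A sketch of the download-and-distribute step; the jar file name below follows the Maven repository layout of the directory linked above:

```bash
cd $FLINK_HOME/lib
wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.6.5-10.0/flink-shaded-hadoop-2-uber-2.6.5-10.0.jar
# Copy the jar to the same directory on the other nodes.
for host in hadoop2 hadoop3; do
  scp flink-shaded-hadoop-2-uber-2.6.5-10.0.jar root@${host}:$FLINK_HOME/lib/
done
```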
Start the services
```bash
cd $FLINK_HOME/bin
./start-cluster.sh
```
Verify
```
[root@hadoop1 conf]# jps
7616 NodeManager
7074 DataNode
7331 ResourceManager
708 TaskManagerRunner
10340 TaskManagerRunner
25132 Jps
7955 Master
20599 TaskManagerRunner
6939 NameNode
9243 StandaloneSessionClusterEntrypoint
8123 TaskManagerRunner
9565 TaskManagerRunner
```
```
[root@hadoop2 ~]# jps
5378 Worker
7331 Jps
5140 NodeManager
6107 TaskManagerRunner
5022 DataNode
```
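As an optional HA check, kill the active JobManager and confirm that the standby listed in the masters file takes over (a sketch; `<pid>` is a placeholder for the PID printed by jps):

```bash
# On the node hosting the active JobManager:
jps | grep StandaloneSessionClusterEntrypoint   # note the PID
kill <pid>
# After a few seconds the standby should be leader; verify via its Web UI:
curl http://hadoop2:8081/overview
```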