
update deepops

k8s
liutension, 2 years ago
commit c6d6afca1d
21 changed files with 237 additions and 160 deletions
  1. deepops/README_zh.md (+111, -56)
  2. deepops/config.example/group_vars/all.yml (+11, -8)
  3. deepops/deepops-19.10.zip (BIN)
  4. deepops/playbooks/openi-octopus-add-node.yml (+3, -3)
  5. deepops/playbooks/openi-octopus-create-ha-k8s-cluster.yml (+3, -3)
  6. deepops/playbooks/openi-octopus-create-single-k8s-cluster.yml (+7, -7)
  7. deepops/requirements.yml (+3, -3)
  8. deepops/roles/checkAndResetAllNodeInK8s/tasks/main.yml (+10, -0)
  9. deepops/roles/facts/tasks/main.yml (+2, -7)
  10. deepops/roles/installCalico/templates/calico-v3.10.yaml.j2 (+1, -1)
  11. deepops/roles/installDocker/tasks/main.yml (+3, -9)
  12. deepops/roles/installNvidiaDriver/tasks/files/nvidia-docker (+34, -0)
  13. deepops/roles/joinK8sMaster/tasks/main.yml (+3, -1)
  14. deepops/roles/joinK8sNode/tasks/main.yml (+2, -0)
  15. deepops/roles/k8s-gpu-plugin/tasks/main.yml (+2, -7)
  16. deepops/roles/nfs/tasks/main.yml (+2, -12)
  17. deepops/roles/nfs/tasks/server.yml (+2, -4)
  18. deepops/roles/nvidia-dgx/handlers/main.yml (+2, -3)
  19. deepops/roles/setClusterCommon/files/kubeletForNvidiaPodGPUMetricsExporter (+1, -1)
  20. deepops/roles/setClusterCommon/files/rc.local (+0, -0)
  21. deepops/roles/setClusterCommon/tasks/main.yml (+35, -35)

deepops/README_zh.md (+111, -56)

@@ -22,6 +22,7 @@ The Openi-Octopus cluster uses ansible-playbook to automatically deploy a highly available K8s GPU

![Single-master K8s cluster](./singlek8s.png)


## Highly available K8s cluster

This solution uses Keepalived + Haproxy + Kubeadm to install a highly available K8s cluster
@@ -30,6 +31,7 @@ The Openi-Octopus cluster uses ansible-playbook to automatically deploy a highly available K8s GPU

![Highly available K8s cluster](./hak8s.png)


### Highly available load balancer cluster (load balancing layer)

This solution uses keepalived + haproxy to build a highly available load balancer cluster; a keepalived + haproxy pair is deployed on every master node.
@@ -55,6 +57,17 @@ haproxy together with keepalived implements, at the load balancer layer, a multi-haproxy Master-B

# Usage

## Prerequisites

1. All nodes that will join the K8s cluster can reach each other over ssh
2. The ssh user used by the installation scripts is set to the same ssh password on all nodes for the duration of the cluster installation
3. Every node needs a fixed IP
4. Every node can ping 114.114.114.114
5. Every node can ping www.baidu.com
6. All nodes can ping each other
7. The NIC carrying each node's internal IP must have the same name on every node (e.g. the internal-IP NIC is named "eno1" on all nodes); a quick check is sketched below
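
A quick manual check of these prerequisites, run on each node (eno1 is only an example NIC name; substitute your own):

```
$ ping -c 1 114.114.114.114 && ping -c 1 www.baidu.com   # external connectivity
$ ip -o link show eno1                                   # internal-IP NIC; the name must match on every node
```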


## 1. Download the deepops automated installation scripts onto the machine


@@ -66,21 +79,31 @@ $ enter the sudo password

```



## 3. Configure first, then install

# How to configure
# 1. How to configure

Install [ansible](http://www.ansible.com.cn/docs/intro_installation.html) on Ubuntu

```
$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt-get update
$ sudo apt-get install ansible
```

### 1. Generate the configuration folder config
### 2. Generate the configuration folder config

After the following commands finish, the required software is installed and a folder named config has been copied from the config.example folder; [ansible.cfg](./ansible.cfg) designates this folder as the configuration folder that actually drives the ansible installation

```
$ cd deepops
$ ./scripts/setup.sh
$ ./scripts/init_config.sh
```

### 2. K8s cluster node configuration
### 3. K8s cluster node configuration

```
$ vi config/inventory
@@ -90,7 +113,7 @@ $ vi config/inventory

Note: the single IP listed under the [kube-init-master] group must be the machine on which the automation scripts are run
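
A minimal sketch of the node configuration, showing only the groups mentioned in this README (the IP addresses are placeholders):

```
[kube-init-master]
192.168.1.10    # must be the machine running these automation scripts

[kube-worker-node]
192.168.1.21
192.168.1.22

[kube-label-node]
192.168.1.21
```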

### 3. K8s cluster parameter configuration
### 4. K8s cluster parameter configuration

```
$ vi config/group_vars/all.yml
@@ -99,6 +122,15 @@ $ vi config/group_vars/all.yml

For details, see the comments in the [K8s cluster parameter configuration file](./config.example/group_vars/all.yml)


### 4. If a worker node is an NVIDIA GPU node, the node must be prepared in advance

Install the NVIDIA GPU driver first and make sure the nvidia-smi command runs successfully on the node

```
$ nvidia-smi
```

# How to install the K8s cluster for the first time

### Before the first K8s cluster installation, fetch the NVIDIA GPU related automated installation scripts
@@ -183,9 +215,6 @@ $ enter the sudo password

### After the first K8s cluster installation, how to add worker nodes

#### Note: when adding an NVIDIA GPU node, first set the driver download URL parameter nvidia_driver_run_file_download_url in config/group_vars/all.yml (it carries an explanatory comment)



```
$ ansible-playbook playbooks/openi-octopus-add-node.yml -K
@@ -206,55 +235,26 @@ $ enter the sudo password

Convention 4: [kube-label-node] <= [kube-worker-node]. With the parameter kube_label_node_labels set in [all.yml](./config.example/group_vars/all.yml), the installation scripts label the nodes of the [kube-label-node] group (see the sketch below).
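
The labelling the scripts perform is, presumably, equivalent to labelling each node by hand; for a node named worker-1 and a label gpu=true (both placeholder values) that would be:

```
$ kubectl label node worker-1 gpu=true
```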

#### When adding an NVIDIA GPU worker node the error "ERROR: Module nvidia is in use!" appears; how to fix it?

1. Log in to the GPU server where the error occurred and switch to the root user
```
$ sudo su
```
2. Check which process is holding the nvidia kernel modules
```
$ lsof /dev/nvidia*

$ kill -9 $PID
```
3. Stop the nvidia kernel modules again
```
$ rmmod nvidia_drm
$ rmmod nvidia_modeset
$ rmmod nvidia_uvm
$ rmmod nvidia
```

4. Check that all nvidia kernel modules have been unloaded; no output means they are gone
## After the first K8s cluster installation, how to label K8s worker nodes

```

$ lsmod | grep nvidia

$ ansible-playbook playbooks/openi-octopus-label-node.yml -K
$ enter the sudo password
```

5. Manually run the .run installer file
This playbook exists so that new labels can be applied to worker nodes already in the cluster without adding new worker nodes. Before running the command, configure the labels as described in the section on adding worker nodes
The playbook acts on the [kube-label-node] group; here openi-octopus-label-node.yml has nothing to do with the [kube-worker-node] group

```
$ sh /root/NVIDIA-Linux-x86_64.run -a -s -Z -X
```
6. Verify that the driver now works
## How to change the node data directory of the ES cluster

1. The configuration item es_data_dir has been added to config/group_vars/all.yml
2. Run the playbook to re-apply the common configuration on all nodes
```
$ nvidia-smi
$ nvidia-container-cli -k -d /dev/tty info

$ ansible-playbook playbooks/openi_common_setting_for_all_node.yml -K
```


7. After the manual fix succeeds, run the playbook again

```
$ ansible-playbook playbooks/openi-octopus-add-node.yml -K
$ enter the sudo password
```


## After kubectl delete node $init-master, can another master be used as the init-master?

@@ -287,21 +287,76 @@ $ etcdctl endpoint status --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/k
$ etcdctl member remove $unHealthMemberID --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key
```

## Due to network problems inside mainland China, how to fix the install nvidia-docker step when the nvidia-docker file cannot be downloaded?

## After the first K8s cluster installation, how to label K8s worker nodes
##### The installation script fails with the following error
```
fatal: [v100-1]: FAILED! => changed=false
msg: 'Failed to connect to raw.githubusercontent.com at port 443: [Errno 110] Connection timed out'
```

##### Fix

1. Download the [nvidia-docker file](https://github.com/NVIDIA/nvidia-docker/blob/master/nvidia-docker)

```
$ ansible-playbook playbooks/openi-octopus-label-node.yml -K
$ enter the sudo password
$ wget https://raw.githubusercontent.com/NVIDIA/nvidia-docker/master/nvidia-docker
```

This playbook exists so that new labels can be applied to worker nodes already in the cluster without adding new worker nodes. Before running the command, configure the labels as described in the section on adding worker nodes
The playbook acts on the [kube-label-node] group; here openi-octopus-label-node.yml has nothing to do with the [kube-worker-node] group
2. Or use the file roles/installNvidiaDriver/files/nvidia-docker from this repository directly

## How to change the node data directory of the ES cluster
3. Copy nvidia-docker into the folder galaxy-roles/nvidia.nvidia_docker/files

Note: the galaxy-roles/nvidia.nvidia_docker folder is created after the installation command above (ansible-galaxy install -r requirements.yml) succeeds; you do not need to create it yourself
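
Assuming the repository copy from step 2 is used, the copy in step 3 can be done with (paths exactly as written in steps 2 and 3):

```
$ cp roles/installNvidiaDriver/files/nvidia-docker galaxy-roles/nvidia.nvidia_docker/files/
```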


4. Modify galaxy-roles/nvidia.nvidia_docker/tasks/main.yml

###### Replace the following task

1. The configuration item es_data_dir has been added to config/group_vars/all.yml
2. Run the playbook to re-apply the common configuration on all nodes
```
$ ansible-playbook playbooks/openi_common_setting_for_all_node.yml -K
```
- name: grab nvidia-docker wrapper
get_url:
url: "{{ nvidia_docker_wrapper_url }}"
dest: /usr/local/bin/nvidia-docker
mode: 0755
owner: root
group: root
environment: "{{proxy_env if proxy_env is defined else {}}}"
```
###### with the following

```
- name: grab nvidia-docker wrapper
copy:
src: nvidia-docker
dest: /usr/local/bin/nvidia-docker
mode: 0755
owner: root
group: root
```

5. Re-run the installation scripts to install the OpenI Octopus cluster



## How to fix errors in the install Docker step?

When installing docker-ce on a node, the error "E: Packages were downgraded and -y was used without --allow-downgrades" appears; how to fix it?

1. Check whether the "docker_ce_version" variable in config/group_vars/all.yml is lower than the docker version already installed on the node. If a newer docker is already installed, uninstall it and then continue running the scripts

2. Uninstall command:
```
$ sudo apt-get purge docker-ce docker-ce-cli containerd.io
```
3. rm -rf /var/lib/docker

4. If the error "cannot remove **: Device or resource busy" appears

5. Reboot the machine

6. Run rm -rf /var/lib/docker again
7. If it is still "cannot remove **: Device or resource busy"
8. umount /var/lib/docker
9. Run rm -rf /var/lib/docker again (the full sequence is collected below)
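
Steps 2-9 above, collected into one sequence (run on the affected node; reboot only if the umount does not release /var/lib/docker):

```
$ sudo apt-get purge docker-ce docker-ce-cli containerd.io
$ sudo rm -rf /var/lib/docker
# if "Device or resource busy" appears: umount (or reboot), then remove again
$ sudo umount /var/lib/docker
$ sudo rm -rf /var/lib/docker
```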

deepops/config.example/group_vars/all.yml (+11, -8)

@@ -64,18 +64,19 @@ kube_image_repo: "registry.aliyuncs.com/google_containers"

## Internal virtual IP range for Service resources in the K8s cluster
## Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.16.0.0/16
## Note that the service_addresses range must be kept separate from the pod_subnet address range
kube_service_addresses: 11.0.0.0/16

## Internal virtual IP range for Pod resources in the K8s cluster
## The Calico network plugin also depends on this setting
## internal network. Kubernetes will assign IP addresses from this range to individual pods.
## This network must be unused in your network infrastructure!
kube_pods_subnet: 10.0.0.0/24
kube_pods_subnet: 10.0.0.0/8

## kube-proxy mode, ipvs or iptables
## Kube-proxy proxyMode configuration.
## Can be ipvs, iptables.
kube_proxy_mode: ipvs
kube_proxy_mode: iptables

# ----------------------------------------- Config Calico -----------------------------------------------------------

@@ -85,14 +86,16 @@ kube_proxy_mode: ipvs
## String format: calico-$version (is the filename in roles/installCalico/templates)
kube_network_plugin: calico-v3.10

## Whether calico uses BGP mode or IPIP tunnel mode (note: if the cluster machines span multiple subnets, only IPIP mode can be used)
## Whether calico uses BGP mode or IPIP tunnel mode (note: if the cluster machines span multiple subnets, only CrossSubnet mode can be used)
## For IPIP mode set: "Always"
## For BGP mode set: "off"
## For BGP mode set: "Off"
## For cross-subnet mode set: "CrossSubnet" (mixed ipip-bgp mode: routes within a subnet use BGP, routes across subnets use IPIP)
## To check the node mesh status of the BGP network: calicoctl node status
## To view the details of one node in the BGP network: calicoctl get node $nodeName -o yaml
## To capture IP packets on a machine's NIC: tcpdump -i $nic_name host $host_name -n
## enable calico IPIP mode input "Always", enable bgp mode input "off"
calico_enable_IPIP: "off"
## enable calico IPIP mode input "Always", enable bgp mode input "Off"
## If the cluster has machines on different subnets, "CrossSubnet" mode must be used
calico_mode: "Off"

# Strategy calico uses to pick a usable physical NIC
# 1. Default: "first-found"
@@ -169,7 +172,7 @@ nv_docker_daemon_json:
# Choose the GPU driver version according to the server's GPU type and CUDA version. Note: select Linux x64 as the OS -> click Download -> go to the download page -> hover over the download button -> right click -> copy the link address
# This gives the download URL of the generic Linux 64-bit driver installer (the .run file)
# (Note: selecting the Ubuntu OS switches the download to a .deb file; installing manually that way usually also works, but in practice the .deb install mode can fail to install the driver, so these scripts only use the .run file install mode)
nvidia_driver_run_file_download_url: "http://cn.download.nvidia.com/tesla/418.116.00/NVIDIA-Linux-x86_64-418.116.00.run"
#nvidia_driver_run_file_download_url: "http://cn.download.nvidia.com/tesla/418.116.00/NVIDIA-Linux-x86_64-418.116.00.run"

# ------------------------------------------------ Config K8s Device Plugin For Nvidia GPU -------------------------------
# For the K8s resource manager to recognize NVIDIA GPUs, the NVIDIA GPU K8s device plugin must be installed


deepops/deepops-19.10.zip (BIN)


deepops/playbooks/openi-octopus-add-node.yml (+3, -3)

@@ -20,9 +20,9 @@
# ----------------------------------------------------------
# Install driver and container runtime on GPU servers
- include: install-nvidia-driver.yml
tags:
- nvidia
#- include: install-nvidia-driver.yml
# tags:
# - nvidia

- include: nvidia-docker.yml
tags:


deepops/playbooks/openi-octopus-create-ha-k8s-cluster.yml (+3, -3)

@@ -15,9 +15,9 @@
- include: check-and-reset-all-node-in-k8s.yml

# Install driver and container runtime on GPU servers
- include: install-nvidia-driver.yml
tags:
- nvidia
#- include: install-nvidia-driver.yml
# tags:
# - nvidia

- include: nvidia-docker.yml
tags:


deepops/playbooks/openi-octopus-create-single-k8s-cluster.yml (+7, -7)

@@ -10,19 +10,19 @@
- include: check-and-reset-all-node-in-k8s.yml

# Install driver and container runtime on GPU servers
- include: install-nvidia-driver.yml
tags:
- nvidia

- include: nvidia-docker.yml
tags:
- nvidia
#- include: install-nvidia-driver.yml
# tags:
# - nvidia

# Install Kubernetes
# for configuration, see: config/group_vars/all.yml

- include: create-single-k8s-cluster.yml

- include: nvidia-docker.yml
tags:
- nvidia

# config nvidia docker
- include: config-openi-octopus-nvidia-docker.yml
tags:


deepops/requirements.yml (+3, -3)

@@ -3,8 +3,8 @@
- src: dev-sec.ssh-hardening
version: "6.1.3"

- src: unxnn.users
version: '78fd08ca86678d00da376eaac909d22e1022a020'
#- src: unxnn.users
# version: '78fd08ca86678d00da376eaac909d22e1022a020'

- src: https://github.com/lukeyeager/ansible-role-hosts.git
version: '711ba98571f068a8bc6739fa1055ac38fc931010'
@@ -20,7 +20,7 @@
version: "v1.1.0"

- src: nvidia.nvidia_docker
version: "v1.1.0"
version: "v1.2.1"

- src: https://github.com/NVIDIA/ansible-role-enroot.git
verions: "v0.1.2"


deepops/roles/checkAndResetAllNodeInK8s/tasks/main.yml (+10, -0)

@@ -29,6 +29,16 @@
shell: kubeadm reset -f
ignore_errors: True

- name: stop kubelet
shell: systemctl stop kubelet
ignore_errors: True

- name: remove dir /var/lib/etcd
shell: rm -rf /var/lib/etcd

- name: remove dir /etc/kubernetes
shell: rm -rf /etc/kubernetes

- name: clear iptables
shell: iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X



deepops/roles/facts/tasks/main.yml (+2, -7)

@@ -1,11 +1,6 @@
---
- name: apt install pciutils
apt: name=pciutils update_cache=yes
when: ansible_os_family == 'Debian'

- name: yum install pciutils
yum: name=pciutils update_cache=yes
when: ansible_os_family == 'RedHat'
- name: apt-get install pciutils
shell: apt-get install pciutils -y

- name: create fact directory
file:


deepops/roles/installCalico/templates/calico-v3.10.yaml.j2 (+1, -1)

@@ -608,7 +608,7 @@ spec:
value: "{{calico_ip_autodetection_method}}"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "{{calico_enable_IPIP}}"
value: "{{calico_mode}}"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:


deepops/roles/installDocker/tasks/main.yml (+3, -9)

@@ -9,14 +9,8 @@
#- name: autoremove docker
# shell: apt-get autoremove -y docker-ce docker docker-engine docker.io

- name: install dokcer dependence
apt:
state: present
name:
- apt-transport-https
- ca-certificates
- curl
- software-properties-common
- name: apt-get install dokcer dependence
shell: apt-get install -y apt-transport-https ca-certificates curl software-properties-common

- name: add dokcer ali apt-key
shell: curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
@@ -41,7 +35,7 @@
mode: 0644

- name: install docker
shell: apt-get install -y --allow-downgrades docker-ce={{docker_ce_version}}
shell: apt-get install -y docker-ce={{docker_ce_version}}

- name: enable and restart docker
shell: systemctl enable docker && systemctl start docker


deepops/roles/installNvidiaDriver/tasks/files/nvidia-docker (+34, -0)

@@ -0,0 +1,34 @@
#! /bin/bash
# Copyright (c) 2017-2018, NVIDIA CORPORATION. All rights reserved.

NV_DOCKER=${NV_DOCKER:-"docker"}

DOCKER_ARGS=()
NV_DOCKER_ARGS=()
while [ $# -gt 0 ]; do
arg=$1
shift
DOCKER_ARGS+=("$arg")
case $arg in
run|create)
NV_DOCKER_ARGS+=("--runtime=nvidia")
if [ ! -z "${NV_GPU}" ]; then
NV_DOCKER_ARGS+=(-e NVIDIA_VISIBLE_DEVICES="${NV_GPU// /,}")
fi
break
;;
version)
printf "NVIDIA Docker: @VERSION@\n"
break
;;
--)
break
;;
esac
done

if [ ! -z $NV_DEBUG ]; then
set -x
fi

exec $NV_DOCKER "${DOCKER_ARGS[@]}" "${NV_DOCKER_ARGS[@]}" "$@"

deepops/roles/joinK8sMaster/tasks/main.yml (+3, -1)

@@ -36,8 +36,10 @@

- name: get ipvsadm info to check kube-proxy ipvs mode
shell: ipvsadm -Ln
when: kube_proxy_mode == "ipvs"
register: ipvsadmInfo

- name: show ipvsadm info to check kube-proxy ipvs mode
debug:
msg: "{{ipvsadmInfo}}"
msg: "{{ipvsadmInfo}}"
when: kube_proxy_mode == "ipvs"

deepops/roles/joinK8sNode/tasks/main.yml (+2, -0)

@@ -39,9 +39,11 @@

- name: get ipvsadm info to check kube-proxy ipvs mode
shell: ipvsadm -Ln
when: kube_proxy_mode == "ipvs"
register: ipvsadmInfo

- name: show ipvsadm info to check kube-proxy ipvs mode
debug:
msg: "{{ipvsadmInfo}}"
when: kube_proxy_mode == "ipvs"


deepops/roles/k8s-gpu-plugin/tasks/main.yml (+2, -7)

@@ -1,11 +1,6 @@
---
- name: install virtualenv
apt:
name: "{{ item }}"
state: present
with_items:
- "virtualenv"
- "python-setuptools"
- name: apt-get install virtualenv
shell: apt-get install virtualenv python-setuptools
when: ansible_distribution == 'Ubuntu'

- name: create location for python virtual env


deepops/roles/nfs/tasks/main.yml (+2, -12)

@@ -1,20 +1,10 @@
---
- name: install ubuntu packages
apt:
name: nfs-common
state: latest
- name: apt-get install ubuntu packages
shell: apt-get install nfs-common
when: ansible_os_family == "Debian"
tags:
- nfs

- name: install rhel packages
yum:
name: nfs-utils
state: latest
when: ansible_os_family == "RedHat"
tags:
- nfs

- name: configure idmapd domain for nfsv4
lineinfile:
dest: /etc/idmapd.conf


deepops/roles/nfs/tasks/server.yml (+2, -4)

@@ -1,8 +1,6 @@
---
- name: install ubuntu packages
apt:
name: nfs-kernel-server
state: latest
- name: apt-get install ubuntu packages
shell: apt-get install nfs-kernel-server
when: ansible_os_family == "Debian"
tags:
- nfs


deepops/roles/nvidia-dgx/handlers/main.yml (+2, -3)

@@ -1,7 +1,6 @@
---
- name: apt update
apt:
update_cache: yes
- name: apt-get update
shell: apt-get update
when: ansible_distribution == 'Ubuntu'

- name: restart cachefilesd


deepops/roles/setClusterCommon/files/kubeletForNvidiaPodGPUMetricsExporter (+1, -1)

@@ -1 +1 @@
KUBELET_EXTRA_ARGS=--feature-gates=KubeletPodResources=true
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --feature-gates=KubeletPodResources=true

deepops/roles/setClusterCommon/files/rc.local (+0, -0)


deepops/roles/setClusterCommon/tasks/main.yml (+35, -35)

@@ -1,24 +1,21 @@
---
- name: DNS config
copy:
src: resolv.conf
dest: /etc/resolv.conf
#- name: DNS config
# copy:
# src: resolv.conf
# dest: /etc/resolv.conf

- name: backup apt source
copy:
src: /etc/apt/sources.list
dest: /etc/apt/sources.list.old
#- name: backup apt source
# copy:
# src: /etc/apt/sources.list
# dest: /etc/apt/sources.list.old

- name: replace apt source to ali source
copy:
src: sources.list
dest: /etc/apt/sources.list
#- name: replace apt source to ali source
# copy:
# src: sources.list
# dest: /etc/apt/sources.list

- name: install curl
apt:
state: present
name:
- curl
- name: apt-get install curl
shell: apt-get install curl -y

- name: add ali k8s apt-key
shell: curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
@@ -29,17 +26,13 @@
dest: /etc/apt/sources.list.d/kubernetes.list

- name: apt-get update
apt:
update_cache: yes
shell: apt-get update

- name: apt fix broken pkg
shell: apt --fix-broken install

- name: install ufw
apt:
state: present
name:
- ufw
- name: apt-get install ufw
shell: apt-get install ufw -y

- name: disable firewall
shell: ufw disable
@@ -54,6 +47,11 @@
src: selinuxconfig
dest: /etc/selinux/config

- name: touch rc.local
copy:
src: rc.local
dest: /etc/rc.local

- name: persist allow Linux kernel forward net pkg (don't worry reboot)
lineinfile:
dest: /etc/rc.local
@@ -97,39 +95,41 @@
shell: sed -i 's/.*swap.*/#&/' /etc/fstab

#----- config for kubeproxy and keepalived to use ipvs
- name: install ipset
apt:
state: present
name:
- ipset

- name: install ipvsadm
apt:
state: present
name:
- ipvsadm
- name: apt-get install ipset
shell: apt-get install ipset -y
when: kube_proxy_mode == "ipvs"

- name: apt-get install ipvsadm
shell: apt-get install ipvsadm -y
when: kube_proxy_mode == "ipvs"

- name: mkdir ipvs modules dir
shell: mkdir -p /etc/sysconfig/modules
when: kube_proxy_mode == "ipvs"

- name: persist active ipvs (don't worry reboot)
copy:
src: ipvs.modules
dest: /etc/sysconfig/modules/ipvs.modules
when: kube_proxy_mode == "ipvs"

- name: chmod 755 /etc/sysconfig/modules/ipvs.modules
shell: chmod 755 /etc/sysconfig/modules/ipvs.modules
when: kube_proxy_mode == "ipvs"

- name: active ipvs.modules
shell: bash /etc/sysconfig/modules/ipvs.modules
when: kube_proxy_mode == "ipvs"

- name: get lsmod info
shell: lsmod | grep -e ip_vs -e nf_conntrack_ipv4
when: kube_proxy_mode == "ipvs"
register: lsmodinfo

- name: show lsmod info
debug:
msg: "{{lsmodinfo}}"
when: kube_proxy_mode == "ipvs"

#---- config for keepalived


