I’m currently working on a very large project at work that involves Kubernetes. I’ve had quite a bit of time to play with it, and I have been enjoying it very much. Although I run a lot of containerized applications on my local network at home, up to this point I have either been scheduling the containers by hand or using Rancher as my main scheduler. Honestly though, I have been so impressed with Kubernetes that I finally decided to spin up a local cluster to use in my homelab. Although it looked pretty straightforward, and in the end it really was, I ran into a number of issues that I wanted to touch on.
Two caveats before we get started. First, this is only intended for spinning up Kubernetes on your own VMs, bare metal, or Linode/DigitalOcean-type boxes. There are much better tools out there if you want to install a cluster in AWS (see Kops). Second, I will be focusing on CentOS 7 as the base operating system.
Installation
For the most part I will be following the same process as the Kubernetes documentation. However, I ran into some issues along the way that I will cover here so others don’t have to go back and forth.
Obviously the first thing you need to do is provision some VMs. I’m using KVM locally, but everyone’s provisioning method is different and I won’t go into that here. Once that is completed, there are some base tools needed to get things installed and running: docker, kubelet, kubectl, and kubeadm. To install these tools on CentOS, you need to add both the EPEL repository and the Google Cloud repository to your repos list. Below is the /etc/yum.repos.d/kubernetes.repo file:
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
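If you want to double-check that the new repo is actually being picked up before installing anything, a quick repo listing does the trick; the repo id should match the [kubernetes] section above:
# yum repolist enabled | grep -i kubernetes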
After the repo has been added, set SELinux to permissive mode, then install the packages and enable the services:
# setenforce 0
# yum install -y docker kubelet kubeadm kubectl kubernetes-cni
# systemctl enable docker && systemctl start docker
# systemctl enable kubelet && systemctl start kubelet
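One thing to keep in mind: setenforce 0 only lasts until the next reboot. To make the permissive setting persistent, something along these lines should do it (or just edit /etc/selinux/config by hand):
# sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config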
Although not explicitly stated in the documentation, I also had to turn on bridge-nf-call-iptables on the systems. Without it, the kubeadm init command (which we will get to in a second) fails with the error message /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1. I also had to open up ports 6443 and 10250 (as instructed).
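For reference, the following roughly covers both steps; the sysctl file name is my own choice, and the firewall lines assume firewalld is the active firewall (which the kubeadm preflight warning below confirms it was for me):
# modprobe br_netfilter
# echo "net.bridge.bridge-nf-call-iptables = 1" > /etc/sysctl.d/k8s.conf
# sysctl --system
And for the ports:
# firewall-cmd --permanent --add-port=6443/tcp
# firewall-cmd --permanent --add-port=10250/tcp
# firewall-cmd --reload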
Since this had to be done on every machine in the cluster, I wrote a small Ansible role that does this for me (which ended up proving very useful later).
Once all these initial setup steps have been run, run kubeadm init on the node that will be the master. The output should look similar to the below:
[root@k8sMaster ~]# kubeadm init
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.6.3
[init] Using Authorization mode: RBAC
[preflight] Running pre-flight checks
[preflight] WARNING: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [k8sMaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.20]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 47.787762 seconds
[apiclient] Waiting for at least one node to register
[apiclient] First node has registered after 4.010572 seconds
[token] Using token: 193201.a45fa7617359bfb9
[apiconfig] Created RBAC rules
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token 193201.sdfa32f3f259bfb9 192.168.1.20:6443
After the master is created and ready, make sure to follow the instructions about copying and exporting the admin config file. This is necessary in order to keep working with the cluster and to verify that everything is healthy. Just as the instructions say, it’s as easy as running these three commands:
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
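Note that the export only lives in the current shell; if you want it to survive logging out, appending it to your shell profile works (assuming bash):
$ echo "export KUBECONFIG=$HOME/admin.conf" >> $HOME/.bash_profile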
After those are run the kubectl command will now work and it is possible to inspect the pods in the kube-system namespace.
$ kubectl get po --namespace=kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
etcd-k8smaster 1/1 Running 0 21m 192.168.1.20 k8smaster
kube-apiserver-k8smaster 1/1 Running 0 20m 192.168.1.20 k8smaster
kube-controller-manager-k8smaster 1/1 Running 0 21m 192.168.1.20 k8smaster
kube-dns-3913472980-g4tgp 0/3 Pending 0 22m <none>
kube-proxy-gntvs 1/1 Running 0 22m 192.168.1.20 k8smaster
kube-scheduler-k8smaster 1/1 Running 0 21m 192.168.1.20 k8smaster
The only thing that isn’t running at this point is the kube-dns service, but this is okay, as a pod network overlay needs to be installed before the cluster has pod-to-pod communication. I got stuck here for quite a while (and had to rebuild my cluster a few times, thank you Ansible) but finally got it working by using the Weave overlay network.
kubectl apply -f https://git.io/weave-kube-1.6
At this point it’s a waiting game for all the pods to come up. It is done once the kube-dns pod shows Running.
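Rather than re-running the same command over and over, the -w flag on kubectl get will watch for changes as the pods cycle through their states:
$ kubectl get po --namespace=kube-system -o wide -w
Once things settle down, the listing ends up looking like this: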
$ kubectl get po --namespace=kube-system -o wide
NAME READY STATUS RESTARTS AGE
etcd-k8s-master 1/1 Running 0 1h
kube-apiserver-k8s-master 1/1 Running 0 1h
kube-controller-manager-k8s-master 1/1 Running 0 1h
kube-dns-3913472980-tvftd 3/3 Running 0 1h
kube-proxy-3tzqv 1/1 Running 0 1h
kube-scheduler-k8s-master 1/1 Running 0 1h
weave-net-77d51 2/2 Running 0 1h
Once the Pod Network is up and running, the other boxes in the cluster can be brought up. Run the join command from the output on the master (kubeadm join --token <token> <master-ip>:6443). The node will report that it has joined the cluster, and you can watch it come up via kubectl.
If you get an error here that says failed to check server version: Get https://192.168.1.20:6443/version: x509: certificate has expired or is not yet valid, check the date on both servers. I ran into this, and it was because the date was incorrect on the node that was trying to join the cluster.
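On CentOS 7 the quickest fix is usually to make sure chrony is installed and the clock is synced; the package and service names below assume a stock CentOS 7 box:
# yum install -y chrony
# systemctl enable chronyd && systemctl start chronyd
# timedatectl status
Once the clocks agree, the join goes through cleanly.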
$ kubectl get nodes -o wide
NAME STATUS AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION
k8smaster Ready 59m v1.6.3 <none> CentOS Linux 7 (Core) 3.10.0-514.16.1.el7.x86_64
k8snode1 Ready 34m v1.6.3 <none> CentOS Linux 7 (Core) 3.10.0-514.16.1.el7.x86_64
k8snode2 Ready 13m v1.6.3 <none> CentOS Linux 7 (Core) 3.10.0-514.16.1.el7.x86_64
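With all the nodes showing Ready, a quick way to sanity-check scheduling and the overlay network is to spin up a throwaway deployment and watch the pods land on the workers (nginx here is just an arbitrary image):
$ kubectl run nginx-test --image=nginx --replicas=2
$ kubectl get po -o wide
$ kubectl delete deployment nginx-test
If the pods reach Running on different nodes, the cluster and the pod network are doing their jobs.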
Final Notes
I ran into quite a few issues getting this cluster running. The first issue was on me, since I was installing Docker via an older Ansible role that downloaded an install script. It turns out that this would not work at all, and once I installed Docker via yum it worked just fine. The second major issue was the Pod Network. As I previously mentioned, I tried several of the different Pod Networks with varying success until I got Weave to work. Both Flannel and Calico seemed to work initially. Calico had an issue with the firewall (or so it seemed, since connections failed between the nodes). Flannel worked up until it was time to install and run the Dashboard pod, at which point I continually got an error message about the network being unreachable.
All in all it wasn’t that bad of an installation and I now have a fully functional Kubernetes cluster behind my firewall running on KVM/libvirt.