Sunday, June 14, 2020

Learning about k8s





Today it is crucial for agile teams to embrace a GitOps model in order to reach a higher level of maturity in their development process and thus evolve the product faster. Goals should be small enough that we can iterate over them.
When I write about “maturity levels” I refer to the automation of traceability, configuration and secret management, and to least-privilege access.
The term GitOps was coined by Alexis Richardson at Weaveworks around 2017, in a blog post where he explained the idea [1].
Some people see GitOps as the next level of the development process; I think it could be a level up from “DevOps”.
It is a new way of looking at the operation process.


Before diving into something practical, we need to recall the principles of GitOps:

Infrastructure as code (IaC): allows us to treat infrastructure configuration and code deployment the same way we usually manage the software development process, using a familiar tool: Git. In some references you will find definitions like “Git as a single source of truth”; this means using Git to operate almost everything, even configuration and secrets.

Self-Service: The GitOps flow aims to break the barrier between development and operations by automating the process and making it self-service (a pull request and a code review). In this way, we as developers get closer to the operation or deployment of environments in a simple way.

Declarative infrastructure configuration: Several references recommend a purely declarative infrastructure configuration. It is a simple and intuitive way to map the state of our deployment, and it lets us apply code review, comments and pull requests to configuration files.

Observability: We need software agents that ensure correctness and alert on divergence in our system. We need alerts and health checks that inform us if something went wrong with an element or application, that check the desired state of the system (stored in Git) against the actual state, and that do their best to self-heal.

Taking this into account, you can improve stability, reliability, consistency and standardization. It can also enhance the development experience by making the deployment process transparent (with an approved commit you can deploy in a simple way, because Git is the source of truth).

We could apply these principles without using Kubernetes; you can choose the tech stack that best fits your situation or platform. In this post we use minikube in a practical way to review GitOps topics and to see how Kubernetes (minikube) helps us understand the GitOps flow.

Kubernetes was first developed by a team at Google, building on their experience with Borg, Google's internal cluster manager, which dates back to around 2003 [2]. In the middle of 2014 Google presented an open-source system inspired by Borg called “Kubernetes”. Its definition was “Kubernetes is an open-source reference implementation of container cluster management.” [3] In many sources you will find the term Kubernetes abbreviated as “k8s”, so from now on I will refer to Kubernetes in this way.

The k8s cluster should subscribe to changes made in a repository, so the first requirement is a Git repository. You can use GitHub [5], GitLab [4] or Bitbucket [6]. Second, you need to install minikube, a tool that lets you run a local k8s cluster, try basic commands and test the development workflow [7].

Install minikube:  


curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube_latest_amd64.deb
sudo dpkg -i minikube_latest_amd64.deb


After this you should see something like this:


Selecting previously unselected package minikube.
(Reading database ... 332649 files and directories currently installed.)
Preparing to unpack minikube_latest_amd64.deb ...
Unpacking minikube (1.11.0) ...
Setting up minikube (1.11.0) ...


You can choose how minikube runs the cluster, that is, where k8s will run the pods (we will see this term later); you only have to set the driver.

The stable drivers are docker, virtualbox, VMware, Parallels, kvm, HyperKit and Hyper-V. The podman driver is experimental and under active development.


$ minikube start --vm-driver=virtualbox


If “vm-driver” is not set, minikube chooses a driver automatically; on my machine it used docker. Once you have created the cluster with a specific driver, you cannot start it with a different driver (you will get an error if you try). I will create the cluster with the docker driver, so I ran the command without setting the driver. The first run is a little slow because it pulls all the images; the next runs will be faster. You will see something like this:


Output:


➜  /tmp minikube start
😄  minikube v1.9.0 on Ubuntu 18.04
✨  Using the docker driver based on existing profile
🚜  Pulling base image ...
🔄  Restarting existing docker container for "minikube" ...
🐳  Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🌟  Enabling addons: dashboard, default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube"


Installing kubectl:

After that you need another tool, “kubectl”, which lets you manage (create, update and delete) the k8s resources. kubectl supports imperative as well as declarative commands. To install it you can use your package manager or download and install it in this way [8]:


$ curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl
$ chmod +x ./kubectl
$ sudo mv ./kubectl /usr/local/bin/kubectl
$ kubectl version --client


Output:


➜  /tmp kubectl version --client
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:52:00Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}


Let us go over the elements of a k8s cluster. At the first level you find a master and nodes. At least one node must exist; a node represents a virtual or physical machine. Each node has elements like the kubelet, kube-proxy, Pods, and some add-ons related to DNS and the UI [9].


Listing Nodes in a cluster:

➜  Desktop kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
minikube   Ready    master   34h   v1.18.0



K8s Cluster:

  • Master

  • nodes

K8s Node:

  • Kubelet

  • kube-proxy

  • Pod(s)

K8s Master:

  • etcd

  • Controller manager

  • API server

  • Scheduler

  • Cloud controller manager




A Pod is the smallest unit in a k8s cluster; it represents a running process in the cluster. You can have more than one container in a pod. In this example, the containers in each pod are docker containers, because Docker is the container runtime of the cluster we started with minikube.


Creating a Pod (this command adds the pod to the default namespace; we will come back to namespaces later):


➜  /code kubectl run db --image mongo
pod/db created

Another way to create a pod is to use a yml file that defines our deployment in a declarative way. This is the recommended approach because it follows one of the GitOps principles: in a declarative way, we define in a yml file the desired state we want.


In an imperative way we put steps one after another, or execute commands in order, to reach the desired state; for example, when I execute kubectl run and later kubectl delete.
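To make the difference concrete, here is a minimal sketch in Python. It does not talk to a real cluster: the ToyCluster class and its methods are hypothetical names invented for this illustration. It only assumes the behaviour described above, where imperative commands are steps that depend on the current state, while a declarative apply can be repeated safely.

```python
# Toy in-memory "cluster" that maps a pod name to its image.
# ToyCluster is a hypothetical class for illustration only; it does not
# use the Kubernetes API.
class ToyCluster:
    def __init__(self):
        self.pods = {}

    # Imperative style: each call is one step, and the caller must know
    # the current state before issuing it.
    def create(self, name, image):
        if name in self.pods:
            raise ValueError(f"pod {name!r} already exists")
        self.pods[name] = image

    def delete(self, name):
        del self.pods[name]  # raises KeyError if the pod is missing

    # Declarative style: the caller only states the desired result and
    # the cluster decides what has to change.
    def apply(self, name, image):
        if name not in self.pods:
            self.pods[name] = image
            return "created"
        if self.pods[name] != image:
            self.pods[name] = image
            return "configured"
        return "unchanged"


cluster = ToyCluster()
results = [
    cluster.apply("db", "mongo:3.3"),  # nothing exists yet
    cluster.apply("db", "mongo:3.3"),  # same desired state again
    cluster.apply("db", "mongo:4.2"),  # desired state changed
]
print(results)  # ['created', 'unchanged', 'configured']
```

Calling cluster.create("db", "mongo") at the end would raise an error because the pod already exists, which is the kind of state-dependent failure the declarative style avoids.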

The next links are yml examples written in a declarative way. The first one, nginx-pod, creates a pod using two docker images, nginx and docker/whalesay, but the final result is one docker container that shows “Hello kubernetes” in an html file. The second one creates another docker container with a mongodb instance. You should not mix declarative and imperative commands in a deployment.


  1. https://gitlab.com/j3nnn1/blogspot/-/blob/master/blogspot2/k8s/nginx-pod.yaml

  2. https://gitlab.com/j3nnn1/blogspot/-/blob/master/blogspot2/k8s/pod/db.yml

Creating a Pod in a declarative way (kubectl apply creates the object if it is missing and updates it otherwise):


kubectl apply -f pod/db.yml

Listing Pods in the default namespace:

Add -o wide to get more information, or use -o json or -o yaml to choose an output format.

➜  /code kubectl get pods                                           
NAME   READY   STATUS    RESTARTS   AGE
db     1/1     Running   0          91s
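The json output format is handy when a script needs the pod list. The sketch below is only an illustration: the sample string is a hand-written, heavily abbreviated stand-in for what kubectl get pods -o json returns (the real output has many more fields), and summarize is a hypothetical helper name.

```python
import json

# Hand-written, abbreviated sample of `kubectl get pods -o json` output.
# Only the fields used below are included; this is an assumption made
# for the example, not captured output.
sample = """
{
  "kind": "List",
  "items": [
    {
      "metadata": {"name": "db", "namespace": "default"},
      "status": {"phase": "Running"}
    }
  ]
}
"""


def summarize(raw):
    """Return (name, namespace, phase) for every pod in the JSON list."""
    data = json.loads(raw)
    return [
        (
            item["metadata"]["name"],
            item["metadata"]["namespace"],
            item["status"]["phase"],
        )
        for item in data["items"]
    ]


pods_summary = summarize(sample)
print(pods_summary)  # [('db', 'default', 'Running')]
```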


Listing pods running in a cluster:


kubectl get pods --all-namespaces


Describe POD:

The first time I read about this command, I related it to docker inspect. When we need information about a pod, we can get it with this command. The output is extensive; the most relevant items are the namespace, the status (which could be Pending or Running), the node, the Start Time, the Restart Count, and the Events related to the pod:

➜  k8s git:(master) ✗  kubectl describe pod db  
Name:         db
Namespace:    default
Priority:     0
Node:         minikube/172.17.0.2
Start Time:   Thu, 11 Jun 2020 16:07:26 -0300
Labels:       type=db
              vendor=MongoLabs
Annotations:  <none>
Status:       Pending
IP:          
IPs:          <none>
Containers:
  db:
    Container ID: 
    Image:         mongo:3.3

Events:
  Type    Reason     Age        From               Message
  ----    ------     ----       ----               -------
  Normal  Scheduled  <unknown>  default-scheduler  Successfully assigned default/db to minikube
  Normal  Pulling    7m47s      kubelet, minikube  Pulling image "mongo:3.3"
  Normal  Pulled     6m57s      kubelet, minikube  Successfully pulled image "mongo:3.3"
  Normal  Created    6m56s      kubelet, minikube  Created container db
  Normal  Started    6m56s      kubelet, minikube  Started container db


...


A Pod in “Running” state means that its containers started successfully, and the db Pod is ready to serve requests. If the db pod in your cluster is running, we can try accessing it or executing the well-known “ps” command to prove that it is actually working. We will kill the mongodb process (changing the deployment state) and observe that k8s does what is necessary to bring back the desired state defined in the yml file:

Executing a command inside a Pod runs it in the first container by default; use -c to specify a container (remember that a pod can have multiple containers).

➜  k8s git:(master) ✗ kubectl exec db -- ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.8  0.2 286084 59136 ?        Ssl  19:08   0:26 mongod --rest --httpinterface
root        29  0.0  0.0  17496  2036 ?        Rs   19:59   0:00 ps aux


Accessing the Pod (running a shell inside the container):

➜ kubectl exec -it db --  sh 

Killing a process inside the container and getting the pod status:

You can see the RESTARTS column increase; this k8s feature helps us keep the application available and stable. Remember the fourth principle, where a software agent checks the actual state against the desired state of the application and takes action to reach the desired state defined in the yaml file.

➜  k8s git:(master) ✗ kubectl exec -it db -- pkill mongod
➜  k8s git:(master) ✗ kubectl get pods                  
NAME   READY   STATUS      RESTARTS   AGE
db     0/1     Completed   2          75m

To delete the pod:

After this exercise we can delete the Pod and move on to a new and final exercise. If the pod belongs to a deployment, you can also delete the entire deployment and the pod will be deleted too:

➜ kubectl delete pod db
pod "db" deleted

➜ kubectl delete deployment db

➜ kubectl delete -f pod/db.yml #declarative way




A deployment describes a set of identical pods (replicas) and the images they use, all defined in a yml file called the manifest. Deployments can be organized by namespace. You can use labels to select objects/resources, and annotations to attach additional metadata to them.

In the example we set three replicas, some metadata (“environment” and “organization”), and the nginx image for the container. The first part of a declarative yml file with a deployment looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-declarative
  annotations:
    environment: prod
    organization: marketing
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
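One detail of this manifest is easy to get wrong: the labels under spec.selector.matchLabels have to match the labels of the pod template, otherwise the API server rejects the deployment. The following Python sketch checks that rule on the manifest above, rewritten by hand as a plain dict so that no YAML library is needed. The function name selector_matches_template is a hypothetical helper, not part of any k8s client.

```python
# The manifest from the text, rewritten as a Python dict for illustration.
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {
        "name": "nginx-declarative",
        "annotations": {"environment": "prod", "organization": "marketing"},
    },
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "nginx"}},
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {"containers": [{"name": "nginx", "image": "nginx:latest"}]},
        },
    },
}


def selector_matches_template(deployment):
    """True if every selector label is present on the pod template."""
    selector = deployment["spec"]["selector"]["matchLabels"]
    labels = deployment["spec"]["template"]["metadata"]["labels"]
    return all(labels.get(key) == value for key, value in selector.items())


ok = selector_matches_template(manifest)
print(ok)  # True
```

A check like this could run in a code review pipeline before the manifest reaches the cluster, which fits the pull request flow described earlier.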

Creating a deployment:

Download the file from:

https://gitlab.com/j3nnn1/blogspot/-/blob/master/blogspot2/k8s/step2/declarative-deployment.yaml


➜  step2 git:(master) ✗ kubectl apply -f declarative-deployment.yaml
deployment.apps/nginx-declarative created

Listing deployments:

➜  step2 git:(master) ✗ kubectl get deployments
NAME                READY   UP-TO-DATE   AVAILABLE   AGE
nginx-declarative   3/3     3            3           63s

Updating a deployment:

If you edit declarative-deployment.yaml, you will not see the change in the k8s cluster immediately. To send the changes made in the file to the cluster you need to execute kubectl apply or kubectl patch. An alternative is to re-run kubectl apply periodically, for example with the watch utility, so that changes made to declarative-deployment.yaml are picked up. When you update the deployment, the output message differs slightly from the one shown when you created it.


➜  step2 git:(master) ✗ kubectl apply -f declarative-deployment.yaml
deployment.apps/nginx-declarative configured



In a GitOps flow there are elements that run an endless loop and continuously reconcile the desired state with the actual state, checking whether a change was made; these elements are called controllers. There are many patterns to implement a controller [10]. K8s comes with built-in controllers, and you can also write your own in different languages; a controller should use the API server to manage the cluster. It is recommended to have many simple controllers instead of one monolithic controller, i.e. one controller for each k8s resource to manage. And when we talk about automation we need to include a new term, the operator [11], which lets us automate tasks beyond what Kubernetes itself provides.
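The reconcile idea can be sketched in a few lines of Python. This is a toy model and not how the built-in controllers are implemented: the pod list is a plain Python list, and reconcile and new_name are hypothetical names chosen for the example. A real controller would read and change state through the API server and run the pass in an endless loop.

```python
import itertools


def reconcile(desired_replicas, pods, new_name):
    """Run one reconcile pass and return the actions that were taken."""
    actions = []
    # Too few pods: start new ones until the desired count is reached.
    while len(pods) < desired_replicas:
        name = new_name()
        pods.append(name)
        actions.append(f"start {name}")
    # Too many pods: stop the surplus.
    while len(pods) > desired_replicas:
        name = pods.pop()
        actions.append(f"stop {name}")
    return actions


counter = itertools.count(1)


def new_name():
    # Hypothetical naming scheme: nginx-1, nginx-2, ...
    return f"nginx-{next(counter)}"


pods = []                               # actual state: nothing running
first = reconcile(3, pods, new_name)    # desired state: three replicas
pods.remove("nginx-2")                  # simulate a pod that died
second = reconcile(3, pods, new_name)   # the loop notices the divergence
third = reconcile(3, pods, new_name)    # states match, nothing to do

print(first)   # ['start nginx-1', 'start nginx-2', 'start nginx-3']
print(second)  # ['start nginx-4']
print(third)   # []
```

Each pass only compares the desired count with the actual list, so it does not matter why the states diverged; this is the same behaviour we saw when the mongod process was killed and the pod was restarted.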

There are many tools to implement CI/CD. Some of them are: garden.io, Skaffold, Draft, Squash, Helm charts, Ksonnet, Flux CD and Argo CD. The path is the same: you should choose the option most suitable for your environment or kind of software application. There is no unique way or magic equation to apply these principles; each company has different needs and structure.

1 https://www.weave.works/blog/gitops-operations-by-pull-request

2 https://research.google/pubs/pub43438/

3 https://github.com/kubernetes/kubernetes/commit/2c4b3a562ce34cddc3f8218a2c4d11c7310e6d56

4 https://about.gitlab.com/

5 https://github.com/

6 https://bitbucket.org/

7 https://minikube.sigs.k8s.io/docs/  “minikube is local Kubernetes, focusing on making it easy to learn and develop for Kubernetes”

8 https://kubernetes.io/docs/tasks/tools/install-kubectl/

9 https://kubernetes.io/docs/concepts/overview/components/

10 https://kubernetes.io/docs/concepts/extend-kubernetes/extend-cluster/#extension-patterns

11 https://kubernetes.io/docs/concepts/extend-kubernetes/operator/



P.S.: Sorry for any grammar mistakes in the text; feel free to send corrections.