Bootstrapping with Kubernetes
A book for condensing my knowledge of Kubernetes and sharing it with others.
I'll try my best to help you appreciate how beautiful Kubernetes is.
Introduction
Kubernetes has become the go-to solution for orchestrating containerized workloads, owing to its elegant design and proven performance. This book is intended for those who wish to learn Kubernetes from scratch, understand its concepts, and use it with other open source tools to build scalable applications.
There is a plethora of resources and articles available online for learning Kubernetes. Honestly, I haven't found one that gives a consistent and comprehensive explanation of the underlying concepts and how to use them with other tools. This book is an attempt to fill that gap: it distinguishes itself by providing a hands-on approach to getting into the Kubernetes ecosystem.
The supplementary code for this book is available on GitHub.
Concepts
We cover the basic concepts of Kubernetes in this chapter.
After familiarizing ourselves with what a cluster is, we then delve into the architecture of Kubernetes. A basic understanding of the architecture is essential if you truly want to appreciate the beauty of Kubernetes. With this knowledge, we can easily understand what happens in the background when we interact with Kubernetes.
Before going ahead with the architecture, we'll first cover some basic concepts like Nodes, Pods, Services, Deployments, and more. These are important for understanding the underlying architecture of Kubernetes.
For me, at the beginning, understanding these concepts was a bit overwhelming. If you feel the same, don't worry, the rest of the book will help you utilize these concepts in practical settings.
Cluster and Kubernetes
What is a Cluster?
Let's say you design a system of services which interact with each other to provide a functionality. You have containerized these services and want to deploy them.
Assume you have three spare laptops and want to run your services on them. Ideally, you would connect the laptops together and distribute your services across them. This interconnected set of laptops is what we call a Cluster. In the context of Kubernetes, a cluster is a set of machines that run your containerized applications.
What is Kubernetes?
According to the official documentation
Kubernetes, also known as K8s, is an open source system for automating deployment, scaling, and management of containerized applications.
Let's simplify this a bit.
When you're deploying applications on a cluster of machines you would like to have the following issues addressed:
- High Availability: You want your applications to be highly available. If an instance of an application goes down, another instance should take over.
- Scalability: You want your applications to scale up and down with the load.
- Networking: You want the applications to be able to communicate with each other.

... and many more.
This is the job of Kubernetes. It provides you with a set of tools to manage your applications on a cluster of machines.
Nodes
A node is simply a machine.
As an example, imagine you have three spare laptops and you create a cluster out of them using Kubernetes. Each of these laptops is a node in the cluster.
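As a quick preview (the kubectl command-line tool is introduced properly later in the book), listing the nodes of a running cluster is a single command; treat this as a sketch for now:

kubectl get nodes -o wide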
Pods
Pods are the simplest Kubernetes resources that you can manage. A pod is a group of one or more containers that share network and storage.
Why not just manage containers?
You might be wondering why we need pods when we can manage containers directly.
Though it might seem reasonable to schedule containers directly, issues arise when containers share dependencies and must be placed together so they can share networking and storage. If we only managed individual containers, interdependent containers in a large cluster could end up scheduled on different, far-apart machines, introducing delays within the system. Hence, pods were introduced as an abstraction over containers.
This multi-container capability of pods is not commonly used when deploying applications: you would usually end up running a single container inside a pod. However, it has also given rise to design patterns like the sidecar (we'll cover this later) used in tools like Istio (also later).
From here on, unless specified otherwise, we will treat a pod as running a single container.
Services
Services are how any communication with a pod happens in Kubernetes.
Why do we need services?
Let's say you manage to get your applications running in pods, with one pod for the frontend application and one for the backend. These pods need to communicate with each other.
How do you do that?
You look around and find out that every pod is assigned an IP address. You hardcode these IP addresses in your application code and deploy it. Would the pods still have the same IP addresses when they are rescheduled? Most likely not.
You want to hand over the responsibility of managing IP addresses to Kubernetes, and just program against an endpoint, like http://frontend or http://backend, and let Kubernetes handle the rest.
This is where services come in.
A service exposes a pod or a set of pods as a network service. This service has an IP Address and a DNS name. You can use this DNS name to communicate with the pods.
TL;DR: any request that should go to a pod should be directed through a service.
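To make this concrete, here is a minimal sketch of a Service manifest. The label app: frontend and the container port 3000 are hypothetical placeholders; we'll write a real one in the architecture walkthrough:

apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
  ports:
  - protocol: TCP
    port: 80
    targetPort: 3000

With this in place, other pods in the cluster can reach the selected pods at http://frontend, and Kubernetes keeps the mapping up to date as pods come and go.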
Replica Sets
Replica Sets manage the number of replicas of a pod running at any given time.
Why do we need Replica Sets?
Say you designed an application and deployed it on one or more pods. And you've set up a service to communicate with these pods. Ideally, you would want your application to be always available. That means, if a pod goes down, another pod should take its place.
How do you ensure this?
Do you manually check if a pod is down and start another one? That's not a good idea, especially when you have thousands of pods running in your cluster.
Replica Sets handle just this. You specify the intended scenario like "I'd like to have 3 replicas of this pod running at all times", and the Replica Set ensures that this is the case.
A Replica Set continuously monitors the number of replicas of a pod and tries to match it with the desired number of replicas. If the number of replicas is less than the desired number, it starts a new pod. If the number of replicas is more than the desired number, it stops a pod.
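For illustration, a minimal sketch of a ReplicaSet manifest might look like the following; the name my-app and image my-app:v1 are hypothetical, and in practice you'll rarely write one by hand, as the next section explains:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:v1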
Deployments
Deployments control the Replica Sets.
Why do we need Deployments?
Say you have a Replica Set managing the number of replicas of a pod. And you've set up a service to communicate with these pods.
Now, you create a new version of your application and want to deploy it. You could simply update the pod template in the Replica Set and let the Replica Set handle the rest.
But what if the new version of the application has a bug? Or what if you want to roll back to the previous version?
This is where Deployments come in.
A Deployment is a higher-level concept that manages Replica Sets. It allows you to deploy new versions of your application, roll back to a previous version, and scale the application up and down.
A deployment is how you truly manage your application on Kubernetes.
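As a sketch of what this looks like day to day (assuming a hypothetical Deployment named my-app with a container also named my-app), these kubectl commands drive a new rollout, a rollback, and scaling:

kubectl set image deployment/my-app my-app=my-app:v2   # roll out a new version
kubectl rollout status deployment/my-app               # watch the rollout progress
kubectl rollout undo deployment/my-app                 # roll back to the previous version
kubectl scale deployment/my-app --replicas=5           # scale up to five replicas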
Quick Recap
Here are the key concepts we have learned so far:
Cluster and Kubernetes
A cluster is a group of machines that work together.
Kubernetes is a container orchestration tool that helps manage these machines.
Nodes
A node is simply a machine.
Pods
A pod runs one or more containers.
Services
A service exposes an application running in a set of pods as a network service.
Replica Sets
Replica Sets manage the number of replicas of a pod running at any given time.
Deployments
Deployments control the Replica Sets, allowing versioning and scaling of applications.
Now let's understand the architecture of Kubernetes.
Architecture
Control Plane - Data Plane Design Pattern
On a very high level, the Kubernetes architecture can be divided into two planes: the control plane and the data plane. This design pattern is used in distributed systems as a way of separating concerns. To put it simply,
The control plane makes decisions and the data plane carries out those decisions.
Keeping this in mind, let's dive into the architecture of Kubernetes.
The Kubernetes Control Plane
The Kubernetes control plane has the following components:
- API Server
- etcd
- Scheduler
- Controller Manager
- Cloud Controller Manager
1. API Server
The API Server is how you interact with Kubernetes. In the following chapters, you will use a command-line tool called kubectl. This tool communicates with the API Server and directs the cluster to do what you want.
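Every kubectl command ultimately becomes an HTTP request to the API Server. For example, once you have a cluster, these commands ask the API Server about itself:

kubectl cluster-info     # prints the API Server endpoint
kubectl api-resources    # lists the resource types the API Server exposes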
2. etcd
etcd is a key-value store. The d in etcd stands for distributed.
Kubernetes uses etcd to store data for all of its resources, such as pods, services, deployments, and more. Kubernetes uses etcd because it's a reliable way to store data for distributed systems: you can have multiple instances of etcd running at the same time, synchronizing data between them.
Remind me to use etcd as an example for distributed systems in the future.
3. Scheduler
Its job is straightforward: it schedules pods to run on nodes.
When you create a pod, you don't have to worry about where it runs. The Scheduler monitors the cluster and assigns pods to nodes. The Scheduler reads from etcd through the API Server to get information about the cluster and make decisions about where to run pods.
4. Controller Manager
It is a collection of controllers. Each controller is responsible for managing a specific resource in the cluster.
For example, the ReplicaSet controller is responsible for managing ReplicaSets, and the Deployment controller is responsible for managing Deployments.
5. Cloud Controller Manager
This component helps you run your Kubernetes cluster on a cloud provider. It abstracts the cloud provider's APIs away from the core Kubernetes code.
For example, if you're running Kubernetes on AWS, the Cloud Controller Manager will help you interact with AWS APIs. If you're running Kubernetes on GCP, the Cloud Controller Manager will help you interact with GCP APIs.
Now that we have seen the control plane, let's move on to the data plane.
The Kubernetes Data Plane
The data plane is where the actual work happens. It mainly consists of nodes which run the following components:
- Kubelet
- Kube Proxy
- Container Runtime
1. Kubelet
The Kubelet runs as a Linux service on each node. It is responsible for registering the node with the API Server and managing the containers on the node. Note that the Kubelet only manages containers created through Kubernetes; any other containers running on the node are not its concern.
2. Kube Proxy
It handles network-related operations. Remember we talked about services in the previous chapter? The Kube Proxy handles just that.
3. Container Runtime
It is responsible for running containers on the nodes.
Now that we've gone through the definitions, let's see how it looks in a diagram.
The following diagram is taken from the official Kubernetes documentation; I've added a distinction to show the control plane and the data plane.
To show the interaction between the components, we'll go through the behind-the-scenes workings of a Kubernetes deployment in the next chapter. This will help you truly understand and appreciate how the control plane and data plane work together.
Architecture - Behind the Scenes
We have gone through the high-level architecture of Kubernetes and learned the purpose of each component. Now let's go a bit deeper and understand how everything comes together behind the scenes.
I'll take the following scenario and show you how it works in the background.
You have three spare laptops running Linux, and you want to create a Kubernetes cluster using them. You are tasked with building a web application consisting of two microservices: a frontend and a backend. You want to deploy these microservices on the cluster such that each has three replicas and they can communicate with each other. To keep it simple, I'm not going to use any cloud services or databases.
Note that this is an example scenario, don't run these commands on your machines. This is just to give you an idea of the underlying steps.
Setting up the Cluster
To create a Kubernetes cluster using these laptops, you need to install some Kubernetes-specific software on each one. Let's call these laptops nodes. We have three nodes: node-1 will be our Master Node, and node-2 and node-3 will be our Worker Nodes.
Master Node - Control Plane
The Master Node will have the following Control Plane components:
- API Server: The API Server is how any interaction with the cluster happens.
- Scheduler: The Scheduler is responsible for scheduling the pods on the nodes.
- Controller Manager: The Controller Manager is responsible for managing the controllers that manage the resources in the cluster.
- etcd: etcd is a distributed key-value store that Kubernetes uses to store all of its data.
- Cloud Controller Manager: We'll skip this component for simplicity. It's not as important from the perspective of understanding the architecture in this scenario.
Worker Nodes - Data Plane
The following components run on all worker nodes in the cluster; they make up the Data Plane of Kubernetes:
- Container Runtime: Essentially, running any workload on Kubernetes comes down to spinning up a container. Therefore, each node in the cluster should have a container runtime.
- Kubelet: The Kubelet is responsible for interacting with the API Server and the container runtime. It's the Kubelet's job to make sure that the containers are running as expected.
- Kube Proxy: The Kube Proxy will handle all the network-related operations within the node.
It must be noted that you can run the data plane components on the Master Node as well. I'm not showing that for the sake of simplicity.
Now that we have set up the cluster, let's move on to creating the microservices.
Designing the Microservices
We have two microservices, a frontend microservice and a backend microservice.
Frontend Microservice
This is a simple Node.js server that serves a static HTML page. The frontend microservice listens on port 3000. This application makes REST API calls to the backend microservice.
And you've designed it such that it takes the endpoint of the backend microservice as an environment variable. We'll call this environment variable BACKEND_ENDPOINT.
Backend Microservice
This is a simple Go server that serves a JSON response. The backend microservice listens on port 8080.
Designing the Deployment and Services
We have the following requirements for the frontend microservice:
- It should have three replicas running at all times.
- It should be able to communicate with the backend microservice.
We have the following requirements for the backend microservice:
- It should have three replicas running at all times.
- It should be reachable by the frontend microservice.
Frontend Microservice Deployment
Using the requirements, we design the following specification.
Don't worry about the details of the specification; we'll go through them in the next chapter. For now, we're only concerned with a few key points.
The following is the deployment specification for the frontend microservice:
frontend-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: frontend:v0.0.1
        env:
        - name: BACKEND_ENDPOINT
          value: http://backend
You can note that:

- replicas: 3 specifies that we need three replicas of the frontend microservice running at all times.
- template: ... specifies the pod template, i.e. the properties the pods should have.
- containers: ... specifies the containers running within the pod: which image to use and which environment variables to set. Notice that we've set the BACKEND_ENDPOINT environment variable to http://backend; we'll see where this comes from in the Services section. The image: frontend:v0.0.1 line specifies the image to use for the container.
Backend Microservice Deployment
The following is the deployment specification for the backend microservice:
backend-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: backend:v0.0.1
        ports:
        - containerPort: 8080
You can note that:

- replicas: 3 specifies that we need three replicas of the backend microservice running at all times.
- template: ... specifies the pod template, i.e. the properties the pods should have.
- containers: ... specifies the container running within the pod. The image: backend:v0.0.1 line specifies the image to use for the container.
To make the backend microservice reachable by the frontend (at http://backend), we need to create a service for it. The following is the service specification for the backend microservice:
backend-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
You can note that:

- selector: ... specifies that the service will only target the pods with the label app: backend. Any traffic to this service will be routed to the pods with this label.
- ports: ... specifies which port on the service receives the traffic and which port on the pod the traffic should be forwarded to.
Now that we have the deployment and service specifications, let's see how everything comes together.
Deploying everything
To deploy anything on the Kubernetes cluster, you would use the kubectl command-line tool. This tool helps you interact with the Kubernetes API Server.
To deploy the resources in this setup, head over to the Master Node, open a terminal, and run the following commands:
kubectl apply -f frontend-deployment.yaml
kubectl apply -f backend-deployment.yaml
kubectl apply -f backend-service.yaml
Behind the scenes, the following happens (I mention etcd and the API Server a lot, just to highlight how important these components are):
- In node-1, kubectl serializes the YAML files into a JSON payload and sends it to the API Server. So all the files, frontend-deployment.yaml, backend-deployment.yaml, and backend-service.yaml, are sent to the API Server by kubectl after being converted to JSON.
- The API Server in node-1 receives the payload and processes it. It validates the specifications and writes the data to etcd. Each of our specifications, for the frontend deployment, the backend deployment, and the backend service, is stored in etcd. The API Server is the only component that directly interacts with etcd.
- In node-1, the Controller Manager, specifically the Deployment Controller, sees the new deployment specifications in etcd (via the API Server). It then creates/updates the Replica Sets in etcd (via the API Server) for the frontend and backend deployments, which define how many replicas of the pods should be running at any given time. The Replica Set Controller (another controller) watches the Replica Sets, sees that there are no pods running for the frontend and backend deployments, and creates the pod specifications in etcd (via the API Server).
- The Scheduler in node-1 sees the new pod specifications in etcd. It then assigns the pods to the worker nodes. In this case, let's say it assigns two frontend pods to node-2 and one to node-3, and one backend pod to node-2 and two to node-3.
- The Kubelet in node-2 and node-3 sees the new pod specifications in etcd (via the API Server). It then creates the containers for the pods. Running the containers is the job of the container runtime; the Kubelet interacts with the container runtime to create the containers.
- The Kube Proxy in node-2 and node-3 sees the new service specification in etcd (via the API Server). It then configures the network rules on the node to route traffic to the backend pods, which basically involves setting up iptables rules.
And that's how everything comes together behind the scenes.
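Once you have a real cluster (we set one up in the next chapter), you could watch this chain of events yourself with commands along these lines; the resource names match the hypothetical frontend/backend example above:

kubectl get deployments,replicasets,pods -o wide            # the objects each controller created
kubectl describe deployment frontend                        # shows the ReplicaSet and rollout events
kubectl get events --sort-by=.metadata.creationTimestamp    # scheduling and container events, in order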
Installation and Setup
In this chapter, we'll discuss how to get Kubernetes up and running. I will cover three methods to set up a Kubernetes cluster:
- Minikube: For those who want to get started with Kubernetes quickly and easily. This method is recommended for beginners; you won't have to worry about setting up a cluster from scratch.
- A local Kubernetes setup from scratch: We'll use three laptops and create a Kubernetes cluster from scratch. This is a more advanced method, and it'll help you understand the internals of Kubernetes better.
- CloudLab: If you have access to CloudLab, you can use this guide to set up a 3-node Kubernetes cluster on CloudLab.
Kubernetes Components
If you're going with the second or third method, you should know about the common components created during installation. These will help you understand the workings of Kubernetes better and come in handy when troubleshooting.
- Container runtime: Essentially, running Kubernetes comes down to running containers on your machines. The container runtime is responsible for running the containers. In our case, we have Docker installed. Though Docker has its own runtime, called containerd, Kubernetes requires a runtime that implements the Container Runtime Interface (CRI). So we'll be installing cri-dockerd, which is a CRI implementation for Docker. The installation steps specify a flag --cri-socket=unix:///var/run/cri-dockerd.sock; this flag tells Kubernetes to use cri-dockerd as the container runtime.
- Pod Network CIDR: Every pod in the cluster gets an IP address. The Pod Network CIDR specifies the range of IP addresses that can be assigned to pods. We use 192.168.0.0/16, which is a pool of 65,536 IP addresses. You must be careful while choosing this range, as it should not overlap with your local network; for most cases, this range works fine. The --pod-network-cidr flag is used to specify this range (see the check after this list).
- kubeadm: This is the tool used to bootstrap the Kubernetes cluster. It's used to set up the control plane nodes and the worker nodes. Running kubeadm init on a node sets up a control plane on that node, i.e. makes it a master node. Running kubeadm join on a node joins it to the master node, i.e. makes it a worker node. The job of kubeadm ends once the cluster is set up.
- kubectl: This is the tool used to manage the resources in the Kubernetes cluster. It's a command-line tool that communicates with the Kubernetes API server.
- kubelet: This is responsible for managing the containers created by Kubernetes on the node. It runs as a service in the background on every node and communicates with the master node to get the work assigned to it. For a kubelet to start, the kubeadm init or kubeadm join command must have been run on the node.
- kube-proxy: While the kubelet manages the containers, kube-proxy manages the networking. It's responsible for routing traffic to the correct container, managing the iptables rules on the node.
- Container Network Interface (CNI): This is a plugin that provides networking capabilities to the pods. It's responsible for assigning IP addresses to the pods and providing network policies. We'll be using Calico as the CNI plugin in this guide.
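For example, once the cluster is up (later in this chapter), you can check the slice of the Pod Network CIDR carved out for each node; a small sketch, assuming a node named master:

kubectl describe node master | grep -i podcidr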
Minikube
This is the easiest way to get started with Kubernetes.
Head on to the Minikube installation page and just follow the instructions; be sure to have the dependencies satisfied. The official documentation has a lot of information on how to use Minikube. Here are some commands you might find useful:
Once you have Minikube installed, you can start a cluster by running the following command:
minikube start
Minikube has a concept called profile. You can create a new profile by running the following command:
minikube start -p <profile-name>
To set this profile as the default, run:
minikube profile <profile-name>
This isolates the cluster from other clusters you might have running on your machine.
You can also simulate a cluster with three nodes by running the following command:
minikube start -p <profile-name> --nodes 3
To view the Kubernetes dashboard, run the following command:
minikube dashboard
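A few more commands that come in handy for managing the profile afterwards:

minikube status -p <profile-name>   # check the state of the cluster
minikube stop -p <profile-name>     # stop the cluster without deleting it
minikube delete -p <profile-name>   # delete the cluster and its profile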
A local Kubernetes setup from scratch
This is one of the most interesting experiments I've done with Kubernetes. I had three Linux machines lying around and decided to create a Kubernetes cluster from scratch. This was a great learning experience, and I highly recommend you try this out if you have the resources. I wanted everything to be interconnected over my home Wi-Fi network. This is how I did it:
Machines used
Three laptops running Ubuntu 22.04 LTS, each with Docker installed and connected to the same Wi-Fi network.
Common setup
Install Docker on all machines
I followed the official Docker documentation here. Here are some points to note to ease the process:
- If apt-get update fails with an error reading from the docker.list file, make sure that the URL in the file ends with ubuntu and not debian.
- Follow the post-installation instructions to run Docker as a non-root user. Restart the machine after this step; you'll then be able to run Docker commands without sudo.
Install CRI-Dockerd on all machines
This is a Kubernetes requirement: a container runtime that implements the Container Runtime Interface (CRI) must be available on all machines. I used the following method to install cri-dockerd:
- Open the terminal and run the following command to download the deb package:
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.14/cri-dockerd_0.3.14.3-0.ubuntu-jammy_amd64.deb
- Install the package by running
sudo dpkg -i cri-dockerd_0.3.14.3-0.ubuntu-jammy_amd64.deb
- After the installation is complete, run
sudo systemctl enable cri-docker
sudo systemctl start cri-docker
- Verify the installation by running
sudo systemctl status cri-docker
Now that the container runtime is installed on all machines, we can proceed to set up the Kubernetes cluster.
Install kubeadm, kubelet, and kubectl on all machines
First, turn off swap on all machines by running:
sudo swapoff -a
Go to the official installation instructions and follow the steps to install kubeadm, kubelet, and kubectl on all machines.
At this point we've set up the common requirements on all machines. Now we can proceed to set up the Kubernetes cluster.
Setting up the Master Node
- Run the following command to initialize the master node:
kubeadm init --cri-socket=unix:///var/run/cri-dockerd.sock --pod-network-cidr=192.168.0.0/16
The --cri-socket=unix:///var/run/cri-dockerd.sock flag is used to specify which CRI to use; since we're using cri-dockerd, we need to specify its socket path.
The --pod-network-cidr=192.168.0.0/16 flag is used to specify the range of IP addresses that pods can use. This is a pool of 65,536 IP addresses that Kubernetes can assign to pods.
- After the command completes, you'll see a message indicating that the control plane was initialized successfully.
- Next, we'll install Calico as the CNI plugin. A CNI plugin provides networking and security services to pods; Calico is a popular choice for Kubernetes clusters. It assigns IP addresses to pods and provides network policies. Head on here and set up Calico on the Kubernetes cluster.
Great! Now our master node is set up. Let's move on to setting up the worker nodes. Instead of running kubeadm init on the worker nodes, we run kubeadm join to join them to the master node. To get the join command, run the following command on the master node:
kubeadm token create --print-join-command
This will print the join command. Copy this command.
Setting up the Worker Nodes
- Add the parameter --cri-socket=unix:///var/run/cri-dockerd.sock to the join command and run it on the worker nodes with sudo. This will join the worker nodes to the master node.
- Verify that the worker nodes have joined the cluster by running:
kubectl get nodes
This will show the master node and the worker nodes.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane 4h3m v1.30.2
worker1 Ready <none> 3h53m v1.30.2
worker2 Ready <none> 3h52m v1.30.2
And that's it! We have a Kubernetes cluster up and running from scratch. You can now deploy applications to this cluster and experiment with Kubernetes.
From now on, I'll be using this setup to demonstrate various Kubernetes concepts, but you can use the three-node Minikube cluster mentioned here to simulate this setup.
CloudLab - For Researchers and Educators
If you have access to CloudLab or a similar infrastructure where you can create virtual machines, you can use it to set up a Kubernetes cluster.
Prerequisites
- Access to CloudLab or a similar infrastructure where you can create virtual machines.
- kubectl installed on your local machine. Refer to the docs here.
Creating CloudLab Experiment
- Head on here and instantiate this profile on CloudLab. This profile will create three nodes for you: one master node and two worker nodes.
- Clone the cloudlab-kubernetes repository locally. Follow the instructions in the README to set up the Kubernetes cluster and configure the nodes. This step will also configure kubectl on your local machine to connect to the Kubernetes cluster.
- Verify the installation by running:
kubectl get nodes
If everything is set up correctly, you should see the master and worker nodes listed as shown below:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane 4h3m v1.30.2
worker1 Ready <none> 3h53m v1.30.2
worker2 Ready <none> 3h52m v1.30.2
Common Resources
In this chapter, we'll discuss some common resources that you will encounter while working with Kubernetes. These resources are mostly used to deploy applications over the cluster.
References
All resources for this chapter are available in the rutu-sh/bootstrapping-with-kubernetes-examples repository.
Declarative vs. Imperative object management
Kubernetes provides two ways to manage objects: declarative and imperative.
Imperative object management
In imperative object management, you tell Kubernetes exactly what to do. You provide the command and Kubernetes executes it.
Declarative object management
In declarative object management, you tell Kubernetes what you want to achieve. You provide a configuration file and Kubernetes makes sure that the cluster matches the desired state.
The difference
The difference between the two is subtle. Let's understand this with examples.
Imperative pod creation
Let's understand imperative object management with an example.
Run the following command to create a pod named simple-pod:
cd bootstrapping-with-kubernetes-examples/deploy/simple-pod
kubectl create -f pod.yaml
It should give the following output:
$ kubectl create -f pod.yaml
pod/simple-pod created
In this command, you're telling Kubernetes to perform exactly the create operation on the pod.yaml file. This will create a pod named simple-pod with no labels, as shown below:
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-pod 1/1 Running 0 25m <none>
Now, say you want to add a label app=simple-pod to the pod. You can do this by running the following command:
kubectl label pod simple-pod app=simple-pod
This will add the label app=simple-pod to the pod.
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-pod 1/1 Running 0 27m app=simple-pod
In this command, you're telling Kubernetes to perform exactly the label operation on the simple-pod pod.
Another way to add the label would be to edit the metadata field in the pod.yaml file and add the label there, as shown below:
metadata:
  name: simple-pod
  labels:
    app: simple-pod
Now, if you run create on the updated configuration again, kubectl will give the following output:
$ kubectl create -f pod.yaml
Error from server (AlreadyExists): error when creating "pod.yaml": pods "simple-pod" already exists
This happens because the create operation only creates the object if it doesn't already exist; it won't update an existing object. This creates issues when you are working with existing objects and want to update them imperatively: you need to keep using commands like kubectl label pod simple-pod app=simple-pod, kubectl edit pod simple-pod, and so on.
This behavior is not ideal when you want to manage objects in a more reliable and consistent way. This is where declarative object management helps.
Before going into declarative object management, let's delete the simple-pod pod:
kubectl delete -f pod.yaml
Declarative pod creation
Using declarative object management, you provide a configuration file that describes the desired state of the object. Kubernetes will make sure that the cluster matches the desired state.
Let's understand this with an example.
kubectl apply -f pod.yaml
This will again create a pod named simple-pod with no labels.
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-pod 1/1 Running 0 48s <none>
Now, to add the label app=simple-pod to the pod, you can edit the pod.yaml file and add the label there, as shown below:
metadata:
  name: simple-pod
  labels:
    app: simple-pod
Now, if you run apply on the updated configuration again, kubectl will update the pod with the new label:
$ kubectl apply -f pod.yaml
pod/simple-pod configured
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-pod 1/1 Running 0 2m42s app=simple-pod
The benefit here is that you don't have to worry about the current state of the object. You just provide the desired state and Kubernetes will make sure that the cluster matches the desired state. This is more reliable and easier to manage.
Also, if you run the apply command again without making any changes, the command will still run successfully.
Delete the simple-pod pod:
kubectl delete -f pod.yaml
Which one to use?
Declarative object management is the recommended way to manage objects in Kubernetes. This way, you don't have to keep track of the current state of the object; Kubernetes will do that for you. All you need to do is specify the desired state.
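A useful companion to apply is kubectl diff, which previews what would change before you apply it; a quick sketch using the same pod.yaml:

kubectl diff -f pod.yaml    # shows the difference between the live object and the file
kubectl apply -f pod.yaml   # reconciles the live object with the file; safe to re-run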
Kube System Components
Note: This chapter is only intended for a deeper understanding of Kubernetes and can be skipped.
There are some default resources created when you set up a Kubernetes cluster, called the Kubernetes System Components. These components are responsible for the working of the cluster.
List the components by running the following command:
kubectl get all -n kube-system -o wide
Here's a sample output:
$ kubectl get all -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/coredns-7db6d8ff4d-792wd 1/1 Running 0 20m 192.168.219.65 master <none> <none>
pod/coredns-7db6d8ff4d-nvxsf 1/1 Running 0 20m 192.168.219.68 master <none> <none>
pod/etcd-master 1/1 Running 0 20m 192.168.1.1 master <none> <none>
pod/kube-apiserver-master 1/1 Running 0 20m 192.168.1.1 master <none> <none>
pod/kube-controller-manager-master 1/1 Running 0 20m 192.168.1.1 master <none> <none>
pod/kube-proxy-9l64r 1/1 Running 0 20m 192.168.1.1 master <none> <none>
pod/kube-proxy-svnvd 1/1 Running 0 17m ***.***.***.** worker2 <none> <none>
pod/kube-proxy-zfvgt 1/1 Running 0 19m ***.***.***.** worker1 <none> <none>
pod/kube-scheduler-master 1/1 Running 0 20m 192.168.1.1 master <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 20m k8s-app=kube-dns
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
daemonset.apps/kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 20m kube-proxy registry.k8s.io/kube-proxy:v1.30.2 k8s-app=kube-proxy
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/coredns 2/2 2 2 20m coredns registry.k8s.io/coredns/coredns:v1.11.1 k8s-app=kube-dns
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/coredns-7db6d8ff4d 2 2 2 20m coredns registry.k8s.io/coredns/coredns:v1.11.1 k8s-app=kube-dns,pod-template-hash=7db6d8ff4d
I've hidden the IP addresses for security reasons.
The following chapters will cover each of these components.
kube-dns
Kubernetes has a built-in DNS service that helps in resolving DNS names. This service is called kube-dns.
There is a DNS record for each service and pod created in the cluster. The DNS server is responsible for resolving the DNS names to the IP addresses.
Internally, Kubernetes uses CoreDNS as the DNS server. Let's see the components that make up the kube-dns service:
CoreDNS Deployment: This is the deployment for the CoreDNS server. It manages the CoreDNS replica sets.
$ kubectl get deployments -n kube-system -l k8s-app=kube-dns -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
coredns 2/2 2 2 21m coredns registry.k8s.io/coredns/coredns:v1.11.1 k8s-app=kube-dns
CoreDNS Replica Set: This is the replica set for the CoreDNS server. It manages the CoreDNS pods.
kubectl get replicaset -n kube-system -l k8s-app=kube-dns -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
coredns-7db6d8ff4d 2 2 2 21m coredns registry.k8s.io/coredns/coredns:v1.11.1 k8s-app=kube-dns,pod-template-hash=7db6d8ff4d
CoreDNS Pods: These are the pods that run the CoreDNS server. There are usually two pods running in the cluster. These pods are labeled with k8s-app=kube-dns.
$ kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
coredns-7db6d8ff4d-792wd 1/1 Running 0 157m 192.168.219.65 master <none> <none> k8s-app=kube-dns,pod-template-hash=7db6d8ff4d
coredns-7db6d8ff4d-nvxsf 1/1 Running 0 157m 192.168.219.68 master <none> <none> k8s-app=kube-dns,pod-template-hash=7db6d8ff4d
The Service kube-dns is a ClusterIP service that exposes the CoreDNS server to the cluster. A ClusterIP Service is only accessible within the cluster.
$ kubectl get svc -n kube-system -l k8s-app=kube-dns -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 22m k8s-app=kube-dns
As shown above, this service uses the k8s-app=kube-dns label selector, which is the same label used by the CoreDNS pods.
DNS Resolution in Kubernetes
The DNS server, i.e. the CoreDNS pods, runs in the kube-system namespace (in this setup, on the master node). The DNS server is responsible for resolving DNS names to IP addresses. Whenever the kubelet creates a pod, it injects a DNS configuration file, /etc/resolv.conf, into the pod. This file contains the IP address of the DNS server and the search domains.
Run the following command to create a simple pod and open a shell inside it:
kubectl run -i --tty alpine --image=alpine --restart=Never -- sh
Once you're inside the pod, view the /etc/resolv.conf file:
cat /etc/resolv.conf
You'll see the IP address of the DNS server and the search domains.
Let's test the DNS resolution. Run the following command to resolve the IP address of the kube-dns service:
nslookup kube-dns.kube-system.svc.cluster.local
Here's the sample output:
$ kubectl run -i --tty alpine --image=alpine --restart=Never -- sh
If you don't see a command prompt, try pressing enter.
/ #
/ #
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ #
/ #
/ # nslookup kube-dns.kube-system.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10:53
Name: kube-dns.kube-system.svc.cluster.local
Address: 10.96.0.10
From the /etc/resolv.conf file, you can see that the DNS server is located at 10.96.0.10 on port 53, which is the kube-dns service. When you run the nslookup command, the request is sent to this DNS server. The search field in the /etc/resolv.conf file specifies the search domains to append to the DNS query.
Putting it all together
Kubernetes has an internal DNS service, called kube-dns. This service is responsible for resolving DNS names to IP addresses.
Kubernetes creates DNS records for:
- Services
- Pods
Kubernetes uses CoreDNS as the DNS server. The CoreDNS server runs as a deployment and is exposed via a ClusterIP service called kube-dns. In this setup, the DNS server pods are scheduled on the master node.
The kubelet is responsible for injecting the DNS configuration file into the pod. The /etc/resolv.conf file contains the IP address of the kube-dns service (which fronts the DNS server) and the search domains. Whenever a request is made, the pod appends the search domains to the DNS query until the DNS server resolves the name.
Whenever a pod has to make a request, it looks up the /etc/resolv.conf file to find the IP of the DNS server. It then sends the DNS query to the DNS server, which resolves the name and sends the IP address back to the pod. The pod then makes its request to the resolved IP address.
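For example, assuming a Service named backend exists in the default namespace, a pod in that namespace can use just the short name; the search domains from /etc/resolv.conf expand it to the fully qualified name before the query reaches the DNS server:

nslookup backend                             # expanded via the search domains
nslookup backend.default.svc.cluster.local   # the fully qualified form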
etcd
etcd is a consistent and highly available distributed key-value store used as Kubernetes' backing store for all cluster data.
Run the following command to check the status of the etcd cluster:
kubectl get pods -n kube-system -l component=etcd
$ kubectl get pods -n kube-system -l component=etcd -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
etcd-master 1/1 Running 0 4h24m 192.168.1.1 master <none> <none> component=etcd,tier=control-plane
Open a shell in the etcd pod by running the following command:
kubectl exec -it etcd-master -n kube-system -- /bin/sh
Set the environment variables:
export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
View the keys created for the Services in the etcd cluster by running the following command:
etcdctl get /registry/services/specs --prefix --keys-only
Similarly, view the pods in the cluster by running the following command:
etcdctl get /registry/pods --prefix --keys-only
To view the data for each key, remove the --keys-only flag.
Here's the sample output:
$ kubectl exec -it etcd-master -n kube-system -- /bin/sh
sh-5.2#
sh-5.2# export ETCDCTL_API=3
sh-5.2# export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
sh-5.2# export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
sh-5.2# export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
sh-5.2#
sh-5.2# etcdctl get /registry/services/specs --prefix --keys-only
/registry/services/specs/calico-apiserver/calico-api
/registry/services/specs/calico-system/calico-kube-controllers-metrics
/registry/services/specs/calico-system/calico-typha
/registry/services/specs/default/kubernetes
/registry/services/specs/kube-system/kube-dns
sh-5.2#
sh-5.2# etcdctl get /registry/pods/ --prefix --keys-only
/registry/pods/calico-apiserver/calico-apiserver-7cb798b74b-dmrx6
/registry/pods/calico-apiserver/calico-apiserver-7cb798b74b-s59cs
/registry/pods/calico-system/calico-kube-controllers-798d6c8f99-6mq5x
/registry/pods/calico-system/calico-node-gbxp7
/registry/pods/calico-system/calico-node-gpbrj
/registry/pods/calico-system/calico-node-wkjxt
/registry/pods/calico-system/calico-typha-65f8578fc5-8vshb
/registry/pods/calico-system/calico-typha-65f8578fc5-z5srw
/registry/pods/calico-system/csi-node-driver-6s4zr
/registry/pods/calico-system/csi-node-driver-7lqxb
/registry/pods/calico-system/csi-node-driver-kxp7r
/registry/pods/kube-system/coredns-7db6d8ff4d-792wd
/registry/pods/kube-system/coredns-7db6d8ff4d-nvxsf
/registry/pods/kube-system/etcd-master
/registry/pods/kube-system/kube-apiserver-master
/registry/pods/kube-system/kube-controller-manager-master
/registry/pods/kube-system/kube-proxy-9l64r
/registry/pods/kube-system/kube-proxy-svnvd
/registry/pods/kube-system/kube-proxy-zfvgt
/registry/pods/kube-system/kube-scheduler-master
/registry/pods/tigera-operator/tigera-operator-76ff79f7fd-2qddl
sh-5.2#
Here, we can see the keys created for resources like services and pods in etcd. This component is crucial for the functioning of the Kubernetes cluster: since the other Kubernetes components are themselves stateless, etcd is what stores the state of the cluster.
kube-apiserver
This is the component that provides a RESTful interface on the Kubernetes control plane. It is used to manage the lifecycle of resources in the cluster.
Run the following command to check the status of the kube-apiserver component:
kubectl get pods -n kube-system -l component=kube-apiserver --show-labels
$ kubectl get pods -n kube-system -l component=kube-apiserver --show-labels
NAME READY STATUS RESTARTS AGE LABELS
kube-apiserver-master 1/1 Running 0 14h component=kube-apiserver,tier=control-plane
This pod runs on the master node and listens on port 6443 for incoming connections.
for incoming connections.
The kubectl command-line interface communicates with the kube-apiserver using the .kube/config file. This file contains the details of the cluster, including the server address, the certificate authority, and the user credentials.
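You can inspect the context kubectl is currently using (with certificate data redacted by default) like this:

kubectl config view --minify    # shows only the current context's cluster, user, and server address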
To view the kube-apiserver logs, run the following command:
kubectl logs kube-apiserver-master -n kube-system
And since this is a RESTful interface that provides a set of APIs, you can get the API documentation by running the following command:
kubectl get --raw /openapi/v2 > openapi.json
You can then import the openapi.json file into a REST client like Postman to view the API documentation.
kube-controller-manager
The kube-controller-manager is responsible for running the controllers that regulate the state of the cluster. It polls the kube-apiserver for changes to the cluster state and makes changes to the cluster to match the desired state.
View the status of the kube-controller-manager component by running the following command:
kubectl get pods -n kube-system -l component=kube-controller-manager --show-labels
Here's a sample output:
$ kubectl get pods -n kube-system -l component=kube-controller-manager --show-labels
NAME READY STATUS RESTARTS AGE LABELS
kube-controller-manager-master 1/1 Running 0 110m component=kube-controller-manager,tier=control-plane
Next, view the logs of the kube-controller-manager component by running the following command:
kubectl logs kube-controller-manager-master -n kube-system --tail 10
Here's a sample of output logs showing that the replicaset-controller is polling the kube-apiserver for changes:
$ kubectl logs kube-controller-manager-master -n kube-system --tail 10
I0709 13:30:10.948824 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="68.198µs"
I0709 13:30:38.878189 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="119.836µs"
I0709 13:30:38.919270 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="74.039µs"
I0709 13:30:39.476992 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="102.913µs"
I0709 13:30:39.567695 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="92.123µs"
I0709 13:30:40.495761 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="109.909µs"
I0709 13:30:40.523994 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="81.243µs"
I0709 13:30:40.555419 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="77.062µs"
I0709 13:30:40.595865 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="83.422µs"
I0709 13:30:40.609762 1 replica_set.go:676] "Finished syncing" logger="replicaset-controller" kind="ReplicaSet" key="default/simple-deployment-794f78c89" duration="63.266µs"
From this, you can see that the replicaset-controller is hitting the kube-apiserver to sync the state of the ReplicaSet.
kube-proxy
The kube-proxy is a network proxy that runs on each node in the cluster. It is responsible for managing the network for the pods running on the node. It maintains network rules on the host and performs connection forwarding.
View the status of the kube-proxy component by running the following command:
kubectl get pods -n kube-system -l k8s-app=kube-proxy --show-labels
Here's a sample output for a 3-node cluster:
$ kubectl get pods -n kube-system -l k8s-app=kube-proxy --show-labels -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
kube-proxy-7trr5 1/1 Running 0 3h51m ***.***.***.*** worker1 <none> <none> controller-revision-hash=669fc44fbc,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-8w9sn 1/1 Running 0 3h49m ***.***.***.*** worker2 <none> <none> controller-revision-hash=669fc44fbc,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-zp79v 1/1 Running 0 3h52m 192.168.1.1 master <none> <none> controller-revision-hash=669fc44fbc,k8s-app=kube-proxy,pod-template-generation=1
In this case, we have three nodes: master, worker1, and worker2. The kube-proxy runs as a pod on each of these nodes. The state of the kube-proxy is managed by the kubelet on the node.
The kube-proxy can work in three modes:

- iptables: This is the default mode. It uses iptables rules to manage the network and picks backend pods at random when load balancing. We'll use this mode in our cluster (see the check after this list).
- ipvs: It uses the ipvs kernel module to manage the network and allows for more efficient load balancing, with algorithms such as round robin.
- userspace: In this mode the routing takes place in userspace. It is legacy and not commonly used.
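On a kubeadm-based cluster like ours, the configured mode is recorded in the kube-proxy ConfigMap (an empty mode field means the default, iptables); a quick way to check:

kubectl get configmap kube-proxy -n kube-system -o yaml | grep "mode:"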
kube-scheduler
The kube-scheduler is responsible for scheduling pods onto nodes, so that you don't have to monitor the cluster's state and assign pods to nodes yourself.
Run the following command to check the status of the kube-scheduler component:
kubectl get pods -n kube-system -l component=kube-scheduler --show-labels
$ kubectl get pods -n kube-system -l component=kube-scheduler --show-labels -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
kube-scheduler-master 1/1 Running 0 5h45m 192.168.1.1 master <none> <none> component=kube-scheduler,tier=control-plane
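Each scheduling decision is also recorded as an event. Assuming some pods were created recently, you can see which node the scheduler picked for them:

kubectl get events --field-selector reason=Scheduled --sort-by=.metadata.creationTimestamp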
Workloads
We'll cover the following workloads in this chapter:
Pods
Pods are the smallest deployable units you'll create in Kubernetes. We already know what a pod is, so let's go ahead and create a pod.
Navigate to the simple-pod directory:
cd bootstrapping-with-kubernetes-examples/deploy/simple-pod
The pod.yaml file in this directory contains the following configuration (I've updated the labels here):
apiVersion: v1
kind: Pod
metadata:
  name: simple-pod
  labels:
    env: dev
    app: simple-restapi-server
spec:
  containers:
  - name: apiserver
    image: rutush10/simple-restapi-server-py:v0.0.1
    ports:
    - containerPort: 8000
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
To create the pod, run:
kubectl apply -f pod.yaml
To check if the pod is running, run:
kubectl get pods
It should give the following output:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
simple-pod 1/1 Running 0 60m
Understanding the pod manifest
Now let's understand the specifications in the pod.yaml file:
- apiVersion: This field specifies where the object is defined. In this case, it's defined in the v1 version of the Kubernetes API. This field is mandatory for all Kubernetes objects, as it helps the API server locate the object definition.
- kind: This field specifies the type of object you're creating. In this case, it's a Pod.
- metadata: This field specifies the additional metadata that should be associated with the pod.
  - name: This field specifies the name of the pod. In this case, it's simple-pod.
  - labels: This field specifies the labels attached to the pod. Labels are key-value pairs that can be used to filter and select resources. In this case, the pod is labeled with env: dev and app: simple-restapi-server.
- spec: This field specifies the desired configuration of the pod.
  - containers: This is a list of containers that should run in the pod. In this case, there's only one container, named apiserver. For every container you specify in the pod, you'll typically set the following fields (name and image are required):
    - name: This field specifies the name of the container. In this case, it's apiserver.
    - image: This field specifies the image that should be used to create the container. In this case, it's rutush10/simple-restapi-server-py:v0.0.1.
    - ports: This field specifies the ports exposed by the container. In this case, the container exposes port 8000.
    - resources: This field specifies the resource requests and limits for the container. In this case, the container requests a minimum of 64Mi memory and 250m CPU and can use a maximum of 128Mi memory and 500m CPU.
Simply put, we are telling Kubernetes to create a pod named simple-pod with a single container named apiserver that runs the rutush10/simple-restapi-server-py:v0.0.1 image and exposes port 8000. The pod is also labeled with env: dev and app: simple-restapi-server.
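Since labels are the primary way to select resources, you can use them to filter for this pod once it's created:

kubectl get pods -l app=simple-restapi-server,env=dev --show-labels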
You can read more about the pod spec here.
Cleaning up
To delete the pod, run:
kubectl delete -f pod.yaml
Summary
In this section, you learned how to create a pod in Kubernetes. You saw how to write a pod manifest and create a pod using kubectl. You also learned about the different fields in the pod manifest and what they mean.
Pods - A deeper dive
In this chapter, we'll take a deeper look into pods by inspecting the internals of a pod. We'll explore the configuration of the pod's container, the network configuration, and the filesystem of the pod. Next, we'll look at the lifecycle of a pod and how it interacts with the Kubernetes API server.
Launch a pod
Navigate to the simple-pod directory:
cd bootstrapping-with-kubernetes-examples/deploy/simple-pod
Create a pod by running the following command:
kubectl apply -f pod.yaml
$ kubectl apply -f pod.yaml
pod/simple-pod created
List the pods to check if the pod is running:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
simple-pod 1/1 Running 0 70s 192.168.189.66 worker2 <none> <none>
Inspecting the Pod
The pod is running on the worker2 node. Let's see the details of the pod.
SSH into the worker2 node and run the following command:
docker container ls
worker2$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0f5939ad0388 rutush10/simple-restapi-server-py "/bin/sh -c '\"python…" 2 minutes ago Up 2 minutes k8s_simple-pod_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0
82ebd1e524c3 registry.k8s.io/pause:3.9 "/pause" 3 minutes ago Up 3 minutes k8s_POD_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0
e846a7697b13 calico/node-driver-registrar "/usr/bin/node-drive…" 34 minutes ago Up 34 minutes k8s_csi-node-driver-registrar_csi-node-driver-nbxnx_calico-system_7e156333-7fee-4ae9-bc6c-9364dcb91f7f_0
33001e4e0e05 calico/csi "/usr/bin/csi-driver…" 34 minutes ago Up 34 minutes k8s_calico-csi_csi-node-driver-nbxnx_calico-system_7e156333-7fee-4ae9-bc6c-9364dcb91f7f_0
d928c0344731 registry.k8s.io/pause:3.9 "/pause" 34 minutes ago Up 34 minutes k8s_POD_csi-node-driver-nbxnx_calico-system_7e156333-7fee-4ae9-bc6c-9364dcb91f7f_1
31c2a5be5eba calico/node "start_runit" 34 minutes ago Up 34 minutes k8s_calico-node_calico-node-c7ggl_calico-system_8ce36768-9d3d-42c0-ad21-ad532dc4ae8c_0
5c8359b97e29 registry.k8s.io/kube-proxy "/usr/local/bin/kube…" 34 minutes ago Up 34 minutes k8s_kube-proxy_kube-proxy-p9dvm_kube-system_d12c6ee0-e804-4b9c-933d-7f49e32629a8_0
4cfb7018165f registry.k8s.io/pause:3.9 "/pause" 34 minutes ago Up 34 minutes k8s_POD_calico-node-c7ggl_calico-system_8ce36768-9d3d-42c0-ad21-ad532dc4ae8c_0
fa97f998bb9f registry.k8s.io/pause:3.9 "/pause" 34 minutes ago Up 34 minutes k8s_POD_kube-proxy-p9dvm_kube-system_d12c6ee0-e804-4b9c-933d-7f49e32629a8_0
We can see that the pod's container is running on the worker2 node with the name k8s_simple-pod_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0.
This container name can be broken down as follows:
k8s
: This is a prefix added by Kubernetes to all the containers it creates.
simple-pod
: This is the name of the pod's container, as specified in the manifest file.
simple-pod
: This is the name of the pod, as specified in the manifest file.
default
: This is the namespace in which the pod is running.
a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a
: This is the UID of the pod.
0
: This is the restart count of the container. It starts at 0 and is incremented each time the container is restarted.
We can find the pod's UID by running the following command:
kubectl get pod simple-pod -n default -o json | jq '.metadata.uid'
$ kubectl get pod simple-pod -n default -o json | jq '.metadata.uid'
"a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a"
Let's inspect the container by running the following command:
worker-2:~$ docker inspect k8s_simple-pod_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0
The output will be a JSON object with all the details of the container. Here is a snippet of the output:
[
{
"Id": "0f5939ad0388a0780b65a1b208c6eaa65e6a99030a17cee62e1b676b4962501a",
"Created": "2024-11-14T14:25:29.844718436Z",
"Path": "/bin/sh",
"Args": [
"-c",
"\"python3\" \"main.py\""
],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 28263,
"ExitCode": 0,
"Error": "",
"StartedAt": "2024-11-14T14:25:30.488961484Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "sha256:90745eeb9a750a5fb4a92e804c7ab09727cf3bd8615e005333cf2f4fb15dafe4",
"ResolvConfPath": "/var/lib/docker/containers/82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578/hostname",
"HostsPath": "/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/etc-hosts",
"LogPath": "/var/lib/docker/containers/0f5939ad0388a0780b65a1b208c6eaa65e6a99030a17cee62e1b676b4962501a/0f5939ad0388a0780b65a1b208c6eaa65e6a99030a17cee62e1b676b4962501a-json.log",
"Name": "/k8s_simple-pod_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0",
"RestartCount": 0,
"Driver": "overlay2",
"Platform": "linux",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "docker-default",
"ExecIDs": null,
"HostConfig": {
"Binds": [
"/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/volumes/kubernetes.io~projected/kube-api-access-qj5tq:/var/run/secrets/kubernetes.io/serviceaccount:ro",
"/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/etc-hosts:/etc/hosts",
"/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/containers/simple-pod/fbf17dce:/dev/termination-log"
],
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"NetworkMode": "container:82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578",
"PortBindings": null,
"RestartPolicy": {
"Name": "no",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"ConsoleSize": [
0,
0
],
"CapAdd": null,
"CapDrop": null,
"CgroupnsMode": "private",
"Dns": null,
"DnsOptions": null,
"DnsSearch": null,
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "container:82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 999,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": [
"seccomp=unconfined"
],
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"Isolation": "",
"CpuShares": 256,
"Memory": 134217728,
"NanoCpus": 0,
"CgroupParent": "kubepods-burstable-poda4259cb7_26fc_47eb_87e9_d3e57ba7bb0a.slice",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 100000,
"CpuQuota": 50000,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DeviceCgroupRules": null,
"DeviceRequests": null,
"MemoryReservation": 0,
"MemorySwap": 134217728,
"MemorySwappiness": null,
"OomKillDisable": null,
"PidsLimit": null,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"MaskedPaths": [
"/proc/asound",
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware",
"/sys/devices/virtual/powercap"
],
"ReadonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
},
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay2/03cf4b8e9ea55337533d965b2d9e16778d4978f60812fa73bcc5749fb9f0bc20-init/diff:/var/lib/docker/overlay2/6a2c58abab266b7735aea477fb0ce40cd1882b2577ce0d8e5875a21379f217ef/diff:/var/lib/docker/overlay2/abb1bdc38b6369baf88ad83dac5aae128ecd6341ad7da8382f2f2466c3a32a7e/diff:/var/lib/docker/overlay2/623b0f4639ffb69a660668abadbb64b6e9db6828c9a781119856c2d5c0bb2d1e/diff:/var/lib/docker/overlay2/9dd03fb186fefb00540cb6d8fea2df15096733a19ed77ca0487dd1a97394115c/diff:/var/lib/docker/overlay2/07a611603cc0c62094196c9f8cf03de6c2bd41c9466e740ca9ee83f186f28da4/diff:/var/lib/docker/overlay2/f2b16c1c2ea43dfec7099840bf751af30ab16aa6d88f185e0557f3508e6d3543/diff:/var/lib/docker/overlay2/c3dd7c202af8747f01db40546eee4fb56ffbc389ee1b0e874de13b8875d44d67/diff:/var/lib/docker/overlay2/703d954d4c1bf29a37a836118dc97d6ae4d2542e28d94cb1823f4101bcd34db0/diff:/var/lib/docker/overlay2/82b9cbd96041be35306dd401329fa8c7036711134b01c1bc618d1a9a74afd213/diff:/var/lib/docker/overlay2/ccdebad004d4a8ba3dd4e2fa117fd1d67d324aefb992d497a43533c1ac0fe34a/diff",
"MergedDir": "/var/lib/docker/overlay2/03cf4b8e9ea55337533d965b2d9e16778d4978f60812fa73bcc5749fb9f0bc20/merged",
"UpperDir": "/var/lib/docker/overlay2/03cf4b8e9ea55337533d965b2d9e16778d4978f60812fa73bcc5749fb9f0bc20/diff",
"WorkDir": "/var/lib/docker/overlay2/03cf4b8e9ea55337533d965b2d9e16778d4978f60812fa73bcc5749fb9f0bc20/work"
},
"Name": "overlay2"
},
"Mounts": [
{
"Type": "bind",
"Source": "/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/volumes/kubernetes.io~projected/kube-api-access-qj5tq",
"Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
"Mode": "ro",
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/etc-hosts",
"Destination": "/etc/hosts",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/containers/simple-pod/fbf17dce",
"Destination": "/dev/termination-log",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
],
"Config": {
"Hostname": "simple-pod",
"Domainname": "",
"User": "0",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"KUBERNETES_SERVICE_HOST=10.96.0.1",
"KUBERNETES_SERVICE_PORT=443",
"KUBERNETES_SERVICE_PORT_HTTPS=443",
"KUBERNETES_PORT=tcp://10.96.0.1:443",
"KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443",
"KUBERNETES_PORT_443_TCP_PROTO=tcp",
"KUBERNETES_PORT_443_TCP_PORT=443",
"KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1",
"PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"LANG=C.UTF-8",
"GPG_KEY=E3FF2839C048B25C084DEBE9B26995E310250568",
"PYTHON_VERSION=3.9.19",
"PYTHON_PIP_VERSION=23.0.1",
"PYTHON_SETUPTOOLS_VERSION=58.1.0",
"PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/dbf0c85f76fb6e1ab42aa672ffca6f0a675d9ee4/public/get-pip.py",
"PYTHON_GET_PIP_SHA256=dfe9fd5c28dc98b5ac17979a953ea550cec37ae1b47a5116007395bfacff2ab9",
"PYTHONPATH=/app/src:"
],
"Cmd": [
"/bin/sh",
"-c",
"\"python3\" \"main.py\""
],
"Healthcheck": {
"Test": [
"NONE"
]
},
"Image": "rutush10/simple-restapi-server-py@sha256:be0b72b57e7a8e222eafe92fff61145352ef90c67819cd33c5edffbc4978f81c",
"Volumes": null,
"WorkingDir": "/app/src",
"Entrypoint": null,
"OnBuild": null,
"Labels": {
"annotation.io.kubernetes.container.hash": "62b86e75",
"annotation.io.kubernetes.container.ports": "[{\"containerPort\":8000,\"protocol\":\"TCP\"}]",
"annotation.io.kubernetes.container.restartCount": "0",
"annotation.io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
"annotation.io.kubernetes.container.terminationMessagePolicy": "File",
"annotation.io.kubernetes.pod.terminationGracePeriod": "30",
"io.kubernetes.container.logpath": "/var/log/pods/default_simple-pod_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/simple-pod/0.log",
"io.kubernetes.container.name": "simple-pod",
"io.kubernetes.docker.type": "container",
"io.kubernetes.pod.name": "simple-pod",
"io.kubernetes.pod.namespace": "default",
"io.kubernetes.pod.uid": "a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a",
"io.kubernetes.sandbox.id": "82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578"
},
"StopSignal": "SIGINT"
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "",
"SandboxKey": "",
"Ports": {},
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {}
}
}
]
Let's look at the following details of the container:
"Image": "sha256:90745eeb9a750a5fb4a92e804c7ab09727cf3bd8615e005333cf2f4fb15dafe4",
"ResolvConfPath": "/var/lib/docker/containers/82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578/hostname",
"HostsPath": "/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/etc-hosts",
"LogPath": "/var/lib/docker/containers/0f5939ad0388a0780b65a1b208c6eaa65e6a99030a17cee62e1b676b4962501a/0f5939ad0388a0780b65a1b208c6eaa65e6a99030a17cee62e1b676b4962501a-json.log",
"Name": "/k8s_simple-pod_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0",
Image
: This is the hash of the image that the container is running.
ResolvConfPath
: This is the path to the container's resolv.conf file. It holds the DNS configuration, pointing to the CoreDNS service the container should use for name resolution.
HostnamePath
: This is the path of the file containing the hostname of the container.
HostsPath
: This is the path of the Kubernetes-managed hosts file of the container. It maps the pod's IP address to its hostname.
LogPath
: This is the path of the log file of the container.
Name
: This is the name of the container.
Notice that the ResolvConfPath and HostnamePath are part of another container with the ID 82ebd...
, with name k8s_POD_simple-pod_default_a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a_0
. This is the pause container, also called the Sandbox container. This is a container used by Kubernetes to manage the lifecycle of the network namespace of the pod. If the worker container dies, the pause container will help in restarting another worker container with the same network configuration. All the containers in the pod share the network namespace of the pause container.
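A quick way to confirm this sharing is to compare the container's NetworkMode with the sandbox container's ID. The sketch below reuses the container IDs from the listing above; substitute the IDs from your own node:
worker2$ docker inspect --format '{{ .HostConfig.NetworkMode }}' 0f5939ad0388
container:82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578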
Let's look at the resolv.conf
file of the container by running the following command:
worker2$ sudo cat /var/lib/docker/containers/82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
The resolv.conf
file contains details regarding the DNS configuration of the container. It contains the IP address of the coredns ClusterIP service that the container should use for DNS resolution. We verify this by running the following command to fetch the IP address of the coredns ClusterIP service:
$ kubectl get svc kube-dns -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 48m
The IP address of the coredns ClusterIP service is 10.96.0.10
which matches the IP address in the resolv.conf
file.
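You can also read the same file from inside the pod itself, without SSHing into the node. This assumes the container image provides cat, which the Python base image used here does:
kubectl exec simple-pod -- cat /etc/resolv.conf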
The etc-hosts
has the following content:
worker2$ sudo cat /var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/etc-hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
192.168.189.66 simple-pod
This file contains static mapping of IP addresses to hostnames.
127.0.0.1 localhost
: This is the loopback address of the container. It allows the container to communicate with itself.
::1 localhost ip6-localhost ip6-loopback
: This is the IPv6 loopback address of the container.
fe00::0 ip6-localnet
: This is a conventional hosts-file alias for the local IPv6 network.
fe00::0 ip6-mcastprefix
: This is a conventional hosts-file alias for the IPv6 multicast prefix.
fe00::1 ip6-allnodes
: This is a conventional hosts-file alias for the all-nodes IPv6 address.
fe00::2 ip6-allrouters
: This is a conventional hosts-file alias for the all-routers IPv6 address.
192.168.189.66 simple-pod
: This is the IPv4 address assigned to the pod. It allows any other service or pod in the cluster to reach the simple-pod
pod using the hostname simple-pod
.
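Similarly, you can confirm that the container sees this file at /etc/hosts by reading it from inside the pod:
kubectl exec simple-pod -- cat /etc/hosts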
The HostConfig
section specifies how the container is configured to run on the Host.
Let's take a look at the Volume mounts of the container:
"HostConfig": {
"Binds": [
"/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/volumes/kubernetes.io~projected/kube-api-access-qj5tq:/var/run/secrets/kubernetes.io/serviceaccount:ro",
"/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/etc-hosts:/etc/hosts",
"/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/containers/simple-pod/fbf17dce:/dev/termination-log"
],
}
The Binds
field contains the list of volumes that are mounted to the container. Here are the details of the volumes:
/var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/volumes/kubernetes.io~projected/kube-api-access-qj5tq:/var/run/secrets/kubernetes.io/serviceaccount:ro
: This is the volume that contains the certificate, namespace, and the token required for the pod to communicate with the Kubernetes API server. You can find the details of the volume by running the following command:
worker2$ cd /var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/volumes/kubernetes.io~projected/kube-api-access-qj5tq
worker2$ ls
ca.crt namespace token
The ca.crt file contains the certificate of the Kubernetes API server, the namespace file contains the namespace of the pod, and the token file contains the token required for the pod to authenticate with the Kubernetes API server.
Read the contents of the ca.crt file by running the following command in the node running the pod:
worker2$ cat /var/lib/kubelet/pods/a4259cb7-26fc-47eb-87e9-d3e57ba7bb0a/volumes/kubernetes.io~projected/kube-api-access-qj5tq/ca.crt
Now run the following command on your local machine to get the ca.crt:
$ kubectl describe configmap kube-root-ca.crt
This should match the contents of the ca.crt file in the volume mounted to the container.
The kube-root-ca.crt
ConfigMap holds the cluster CA certificate, which the kubelet projects into the pod. It allows containerized applications to verify TLS connections to cluster components whose serving certificates are signed by the cluster CA, most importantly the Kubernetes API server.
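To see these credentials in action, you can call the API server from inside the pod using the mounted token and CA certificate. This is only a sketch of the standard in-cluster access pattern; it assumes curl is available in the container image, which may not be the case for the image used here:
kubectl exec -it simple-pod -- sh
# Inside the pod:
SA=/var/run/secrets/kubernetes.io/serviceaccount
curl --cacert $SA/ca.crt -H "Authorization: Bearer $(cat $SA/token)" https://kubernetes.default.svc/api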
Next, let's take a look at the "NetworkMode" field of the container:
"HostConfig": {
"NetworkMode": "container:82ebd1e524c3b8920acb3cd0196cb34410cc80e44145626359b538ef2aff8578",
}
As discussed earlier, the worker containers share the network namespace of the pause container. This is specified by the NetworkMode
field.
The Config
section contains the configuration of the container. It contains the details like environment variables, command, health check, image, etc. The Labels
field contains the associated labels of the container, which are used by Kubelet to manage the lifecycle of the container.
The NetworkSettings
section contains the details of the network configuration of the container. It contains details like the IP address, MAC address, and the network namespace of the container. Since this container shares the network namespace of the pause container, the IP address and MAC address are not assigned to the container but are specified in the pause (sandbox) container.
Conclusion
On inspecting the Pod we can observe that the pod is more than just a container. It has a lot of configurations and settings that are automatically managed by Kubernetes. From the network configuration to the filesystem, Kubernetes manages everything for the pod. This allows more flexibility and scalability in managing the containers. Imagine managing all these configurations manually for each container, it would be a nightmare. Kubernetes abstracts all these complexities and provides a simple interface to manage the containers.
Summary
In this chapter, we took a deeper dive and understood how a pod is configured and managed by Kubernetes. We inspected the internals of the pod and understood how the network configuration, filesystem, and lifecycle of the pod are managed. We also understood how the pod interacts with the Kubernetes API server.
Replica Sets
Replica Sets are the resources that help you make sure that a specific number of pods are running at any given time. If a pod fails, the Replica Set will create a new one to replace it.
Managing pods with Replica Sets
Replica Sets use selectors, like labels, to identify the pods they should manage. When you create a Replica Set, you specify a selector. This selector tells the Replica Set which pods to manage.
Navigate to the simple-replicaset
directory:
cd bootstrapping-with-kubernetes-examples/deploy/simple-replicaset
The replicaset.yaml
file in this directory contains the following configuration:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: simple-replicaset
labels:
env: dev
app: simple-replicaset
spec:
replicas: 3
selector:
matchLabels:
app: simple-replicaset
template:
metadata:
labels:
app: simple-replicaset
spec:
containers:
- name: apiserver
image: rutush10/simple-restapi-server-py:v0.0.1
ports:
- containerPort: 8000
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 200m
memory: 200Mi
To create the Replica Set, run the following command:
kubectl apply -f replicaset.yaml
You can check the status of the Replica Set using the following command:
kubectl get replicaset
$ kubectl get replicaset
NAME DESIRED CURRENT READY AGE
simple-replicaset 3 3 3 6m54s
To check the pods managed by the Replica Set, run:
kubectl get pods --show-labels
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-replicaset-bzgpf 1/1 Running 0 7m47s app=simple-replicaset
simple-replicaset-hj9th 1/1 Running 0 7m47s app=simple-replicaset
simple-replicaset-xm2nm 1/1 Running 0 7m47s app=simple-replicaset
There is a file named pod.yaml
in the same folder. Go ahead and create a pod using the following command:
kubectl apply -f pod.yaml
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-pod 1/1 Running 0 3s <none>
simple-replicaset-bzgpf 1/1 Running 0 8m58s app=simple-replicaset
simple-replicaset-hj9th 1/1 Running 0 8m58s app=simple-replicaset
simple-replicaset-xm2nm 1/1 Running 0 8m58s app=simple-replicaset
You can see that the pod simple-pod
is not managed by the Replica Set, as it doesn't have the label app: simple-replicaset
.
To see the Replica Set's label selector in action, we will add the label app=simple-replicaset
to the pod. Update the metadata
field in the pod.yaml
file and add the label there, as shown below:
metadata:
name: simple-pod
labels:
app: simple-replicaset
Now, if you run apply
on the updated configuration again, kubectl will give the following output:
$ kubectl apply -f pod.yaml
pod/simple-pod configured
Check the pods again:
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-pod 1/1 Terminating 0 36s app=simple-replicaset
simple-replicaset-bzgpf 1/1 Running 0 9m31s app=simple-replicaset
simple-replicaset-hj9th 1/1 Running 0 9m31s app=simple-replicaset
simple-replicaset-xm2nm 1/1 Running 0 9m31s app=simple-replicaset
You can see that the pod simple-pod
is in Terminating
state. This happens because the ReplicaSet controller
detected that the pod has the label app: simple-replicaset
and since there are already 3 pods managed by the Replica Set, it terminated the pod simple-pod
.
Similarly, go ahead and remove the label app=simple-replicaset
from a pod and see what happens.
kubectl label pod simple-replicaset-bzgpf app-
And check the pods again:
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-replicaset-bzgpf 1/1 Running 0 14m <none>
simple-replicaset-brbmr 0/1 ContainerCreating 0 2s app=simple-replicaset
simple-replicaset-hj9th 1/1 Running 0 14m app=simple-replicaset
simple-replicaset-xm2nm 1/1 Running 0 14m app=simple-replicaset
A new pod simple-replicaset-brbmr
is created by the Replica Set to maintain the desired state of 3 pods. While the old pod simple-replicaset-bzgpf
is still running, it is no longer managed by the Replica Set.
This is an example of how labels
and selectors
are used by Kubernetes to manage resources. The same concept is used by other resources like Deployments
and Services
, as we'll see in the upcoming sections.
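The same label selector can be used directly on the command line. For example, to list only the pods that the Replica Set matches (a handy sanity check when debugging selectors):
kubectl get pods -l app=simple-replicaset --show-labels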
Understanding the Replica Set manifest
Let's break down the replicaset.yaml
manifest:
apiVersion: apps/v1
: This tells Kubernetes to use the apps/v1
API group.
kind: ReplicaSet
: This tells Kubernetes that we are creating a Replica Set.
metadata
: This is the metadata for the Replica Set.
name: simple-replicaset
: The name of the Replica Set.
labels
: The labels for the Replica Set. Here, we have the labels env: dev
and app: simple-replicaset
.
spec
: This is the specification for the Replica Set.
replicas: 3
: This tells the Replica Set that we want 3 replicas of the pod.
selector
: This is the selector for the Replica Set.
matchLabels
: This tells the Replica Set to manage pods with the label app: simple-replicaset
.
template
: This is the template for the pods which will be managed by the Replica Set.
metadata
: This is the metadata for the pod.
labels
: The labels for the pod. Here, we have the label app: simple-replicaset
. Make sure that the labels in the pod template match the labels in the selector.
spec
: This is the specification for the pod. It is similar to the one we defined in the previous section.
containers
: This is the list of containers in the pod.
name: apiserver
: The name of the container.
image: rutush10/simple-restapi-server-py:v0.0.1
: The image for the container.
ports
: The ports for the container. Here, we are exposing port 8000.
Here's the visual representation of the state of the system. The pod simple-pod
(created above, before making the label changes) is not managed by the Replica Set. The Replica Set simple-replicaset
manages the pods simple-replicaset-bzgpf
, simple-replicaset-hj9th
, and simple-replicaset-xm2nm
.
You can learn more about the replica set spec here.
Cleaning up
To clean up the resources created in this section, run the following commands:
kubectl delete -f replicaset.yaml
kubectl delete -f pod.yaml
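Note that the pod you removed the label from earlier (simple-replicaset-bzgpf in the output above; the name will differ in your cluster) is no longer owned by the Replica Set, so the commands above won't delete it. Remove it explicitly:
kubectl delete pod simple-replicaset-bzgpf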
Summary
In this section, you learned about Replica Sets and how they help you manage pods. You saw how Replica Sets use labels and selectors to identify the pods they should manage. You also saw how Replica Sets maintain the desired number of pods by creating new pods or terminating existing ones. And finally, you learned how to write a Replica Set manifest and create a Replica Set using kubectl
.
Deployments
A Deployment manages Replica Sets. It helps manage the lifecycle of the application by providing features like rolling updates, rollbacks, and scaling.
Managing applications with Deployments
Deployments create Replica Sets, which in turn create and manage Pods. Deployments are used to manage the lifecycle of Pods. They help in creating new Pods, updating existing Pods, and deleting old Pods.
Let's create a Deployment for the simple REST API server we created earlier. Navigate to the simple-deployment
directory:
cd bootstrapping-with-kubernetes-examples/deploy/simple-deployment
The deployment.yaml
file in this directory contains the following configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
name: simple-deployment
spec:
replicas: 3
selector:
matchLabels:
app: simple-deployment
template:
metadata:
labels:
app: simple-deployment
spec:
containers:
- name: apiserver
image: rutush10/simple-restapi-server-py:v0.0.1
ports:
- containerPort: 8000
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 200m
memory: 200Mi
Create the Deployment by running the following command:
kubectl apply -f deployment.yaml
You can check the status of the Deployment using the following command:
kubectl get deployments
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
simple-deployment 3/3 3 3 17s
This Deployment creates a Replica Set, which you can list with kubectl get rs --show-labels:
NAME DESIRED CURRENT READY AGE LABELS
simple-deployment-794f78c89 3 3 3 76s app=simple-deployment,pod-template-hash=794f78c89
The Replica Set, in turn, creates and manages Pods, which you can see with kubectl get pods --show-labels:
NAME READY STATUS RESTARTS AGE LABELS
simple-deployment-794f78c89-4tl2w 1/1 Running 0 107s app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-794f78c89-rkc27 1/1 Running 0 107s app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-794f78c89-szjfg 1/1 Running 0 107s app=simple-deployment,pod-template-hash=794f78c89
You can see that the Replica Set created by the deployment has an additional label pod-template-hash
. This label is used by the Deployment Controller to manage Replica Sets when you make changes to the Deployment.
Let's see how to update the Deployment with a new version v0.0.2
of the simple REST API server.
Updating the Deployment
To update the Deployment, you need to change the image version in the deployment.yaml
file.
Update the deployment.yaml
file as shown below:
apiVersion: apps/v1
kind: Deployment
metadata:
name: simple-deployment
spec:
replicas: 3
selector:
matchLabels:
app: simple-deployment
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
template:
metadata:
labels:
app: simple-deployment
spec:
containers:
- name: apiserver
image: rutush10/simple-restapi-server-py:v0.0.2
ports:
- containerPort: 8000
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 200m
memory: 200Mi
Apply the changes to the Deployment:
kubectl apply -f deployment.yaml
You can check the status of the Deployment using the following command:
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-deployment-794f78c89-4tl2w 1/1 Running 0 13m app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-794f78c89-rkc27 1/1 Terminating 0 13m app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-794f78c89-szjfg 1/1 Running 0 13m app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-98d7d96b-dxc25 0/1 ContainerCreating 0 2s app=simple-deployment,pod-template-hash=98d7d96b
simple-deployment-98d7d96b-w46tx 0/1 ContainerCreating 0 2s app=simple-deployment,pod-template-hash=98d7d96b
You can see that the Deployment is updating the Pods. The old Pods are terminated in a controlled manner, governed by the maxUnavailable and maxSurge settings, so the application stays available throughout the rollout. The new Pods are created with the updated image version v0.0.2
.
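You can also follow the rollout from the command line. The commands below are standard kubectl rollout subcommands; the revision numbers in the history will depend on how many updates you have applied:
kubectl rollout status deployment/simple-deployment
kubectl rollout history deployment/simple-deployment
kubectl rollout undo deployment/simple-deployment   # roll back to the previous revision if needed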
Understanding the Deployment manifest
Let's break down the deployment.yaml
file:
apiVersion: apps/v1
: This tells Kubernetes to use the apps/v1
API group.
kind: Deployment
: This specifies the type of object we're creating, which is a Deployment.
metadata
: This field specifies the additional metadata that should be associated with the Deployment.
name
: This field specifies the name of the Deployment. In this case, it's simple-deployment
.
spec
: This field specifies the desired state of the Deployment.
replicas: 3
: This field specifies the number of Pods that should be running at all times. In this case, it's 3
.
selector
: This field specifies how the Replica Set created by the Deployment should select the Pods it manages.
matchLabels
: This field specifies the labels that the Replica Set should match to manage the Pods. In this case, it's app: simple-deployment
.
template
: This field specifies the Pod template that should be used to create the Pods.
metadata
: This field specifies the metadata that should be attached to the Pods created by the Deployment.
labels
: This field specifies the labels attached to the Pods. In this case, the Pods are labeled with app: simple-deployment
.
spec
: This field specifies the desired configuration of the Pods.
containers
: This field specifies the containers that should run in the Pods.
name: apiserver
: This field specifies the name of the container. In this case, it's apiserver
.
image: rutush10/simple-restapi-server-py:v0.0.1
: This field specifies the image that should be used for the container. In this case, it's rutush10/simple-restapi-server-py:v0.0.1
.
ports
: This field specifies the ports that should be exposed by the container.
containerPort: 8000
: This field specifies that port 8000
should be exposed by the container.
resources
: This field specifies the resource requests and limits for the container. In this case, it's 100m
CPU and 100Mi
memory for requests, and 200m
CPU and 200Mi
memory for limits.
You can learn more about the Deployment spec here.
Cleaning up
To clean up the resources created in this section, run the following commands:
kubectl delete -f deployment.yaml
Summary
In this section, you learned about Deployments and how they help you manage the lifecycle of Pods. You saw how Deployments create Replica Sets and manage Pods. You also learned how to update a Deployment with a new version of the application. Finally, you learned how to write a Deployment manifest and create a Deployment using kubectl
.
Namespaces
In Kubernetes, a namespace is a way to partition the resources in a cluster. Namespaces are intended for environments with multiple users, projects, or teams, and they keep one group's resources from interfering with another's.
When we installed Kubernetes, a few namespaces were created automatically. Let's list them with the following command:
kubectl get ns
You should see the following output:
$ kubectl get ns
NAME STATUS AGE
calico-apiserver Active 10m
calico-system Active 11m
default Active 11m
kube-node-lease Active 11m
kube-public Active 11m
kube-system Active 11m
tigera-operator Active 11m
Let's go through the namespaces we see here:
kube-system: This namespace contains the core Kubernetes resources which form the control plane. Let's list the resources in this namespace:
kubectl get all -n kube-system
$ kubectl get all -n kube-system
NAME READY STATUS RESTARTS AGE
pod/coredns-55cb58b774-f7bds 1/1 Running 0 9m20s
pod/coredns-55cb58b774-fmv59 1/1 Running 0 9m20s
pod/etcd-master 1/1 Running 0 9m35s
pod/kube-apiserver-master 1/1 Running 0 9m34s
pod/kube-controller-manager-master 1/1 Running 0 9m34s
pod/kube-proxy-qm8lq 1/1 Running 0 7m28s
pod/kube-proxy-sjdsf 1/1 Running 0 9m20s
pod/kube-proxy-xjcf7 1/1 Running 0 6m2s
pod/kube-scheduler-master 1/1 Running 0 9m36s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 9m34s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 9m34s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 2/2 2 2 9m34s
NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-55cb58b774 2 2 2 9m20s
As you can see from the output above, the kube-system
namespace holds important Kubernetes resources like the DNS server, etcd, the API server, the controller manager, kube-proxy, and the scheduler. This namespace is managed by Kubernetes and should not be modified by users.
kube-public: This namespace is readable by everyone, including non-authenticated users. By default, there are no resources created in this namespace.
kubectl get all -n kube-public
$ kubectl get all -n kube-public
No resources found in kube-public namespace.
kube-node-lease: This namespace contains the lease objects associated with each node. A lease is how a node tells the control plane that it is alive. The node sends a heartbeat to the control plane to extend its lease. If the control plane does not receive a heartbeat from the node, it assumes that the node is dead and reschedules the pods running on that node.
To list the leases in the kube-node-lease
namespace, run the following command:
kubectl get leases -n kube-node-lease
$ kubectl get leases -n kube-node-lease
NAME HOLDER AGE
master master 28m
worker1 worker1 26m
worker2 worker2 25m
default: This is the default namespace for objects created without any namespace specified. For example, if you create a pod without specifying a namespace, it will be created in the default
namespace.
The namespaces calico-apiserver
, calico-system
, and tigera-operator
are created by the Calico CNI plugin.
Creating a Namespace
Navigate to the bootstrapping-with-kubernetes-examples/deploy/simple-namespace
directory and observe the namespace.yaml
file:
apiVersion: v1
kind: Namespace
metadata:
name: simple-namespace
labels:
name: simple-namespace
Note: The manifests are available here
Create the namespace by running the following command:
kubectl apply -f namespace.yaml
$ kubectl apply -f namespace.yaml
namespace/simple-namespace created
To list the namespaces, run the following command:
kubectl get ns
$ kubectl get ns
NAME STATUS AGE
calico-apiserver Active 93m
calico-system Active 94m
default Active 95m
kube-node-lease Active 95m
kube-public Active 95m
kube-system Active 95m
simple-namespace Active 114s
tigera-operator Active 95m
Understanding the Namespace Manifest
Now let's understand the specifications in the namespace.yaml
file:
apiVersion
: This field specifies where the object is defined. In this case, it's defined in the v1
version of the Kubernetes API. This field is mandatory for all Kubernetes objects as it helps the API server to locate the object definition.
kind
: This field specifies the type of object you're creating. In this case, it's a Namespace
.
metadata
: This field specifies the details about the Namespace.
name
: This is the name assigned to the namespace.
labels
: This is a map of key-value pairs that will be associated with the namespace.
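As an aside, kubectl can generate an equivalent manifest for you, which is handy for scaffolding; note that the label above would still need to be added by hand:
kubectl create namespace simple-namespace --dry-run=client -o yaml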
Cleaning up
To delete the namespace, run the following command:
kubectl delete -f namespace.yaml
Summary
In this chapter you learned about namespaces in Kubernetes. You learned how to list the namespaces and create new namespaces. In the next chapter, we will see how resource quotas can be used to limit the resources consumed by a namespace. We will use namespaces throughout the book, so more examples will follow.
Resource Quotas
In an environment where multiple users or teams share the same Kubernetes cluster, it is important to ensure that resources are fairly distributed. This is where resource quotas help. In the previous chapter, we discussed how namespaces logically partition the cluster's resources. Resource quotas cap the amount of resources that can be consumed within a namespace.
Note: The manifests used in this chapter are available here.
Resource quotas can be set for the following:
-
Compute: This includes CPU and memory.
-
Storage: This includes PersistentVolumeClaims, storage classes, and persistent volumes.
-
Object Count: This resource quota sets restrictions on the number of objects that can be created for a particular resource type.
Resource quotas are defined using the ResourceQuota
object.
With resource quotas you set hard limits on resources. A hard limit is a strict limit that cannot be exceeded within the namespace: once the quota is reached, the API server rejects any new resource that would push usage past the limit.
It should be noted that resource quotas are applied per namespace and not per pod, i.e., if you set a hard limit of 100Gi of memory and 10 CPUs for a namespace, the total memory and the total CPU consumed by all the pods in the namespace must not exceed those limits.
Creating a Resource Quota
Let's use the application we deployed in the previous chapter to understand how resource quotas work. We will start by creating a namespace and then apply a resource quota to it. For the purpose of demonstration, we will use deliberately low limits. Next, we will deploy the application as a replica set in the namespace and see how the resource quota is enforced.
Navigate to the simple-resource-quota
directory:
cd bootstrapping-with-kubernetes-examples/deploy/simple-resource-quota
The namespace.yaml
file in this directory contains the following configuration:
apiVersion: v1
kind: Namespace
metadata:
name: simple-namespace
labels:
name: simple-namespace
This file creates a namespace named simple-namespace
.
Create the namespace by running the following command:
kubectl apply -f namespace.yaml
$ kubectl apply -f namespace.yaml
namespace/simple-namespace created
Next, observe the resource-quota.yaml
file in the same directory:
apiVersion: v1
kind: ResourceQuota
metadata:
name: simple-resource-quota
namespace: simple-namespace
spec:
hard:
cpu: "3"
memory: 10Gi
pods: "3"
This file specifies that the namespace simple-namespace
has a hard limit of 3 CPUs, 10Gi of memory, and 3 pods. This means that:
- The sum of memory consumed by all the pods in the namespace should not exceed 10Gi.
- The sum of CPU consumed by all the pods in the namespace should not exceed 3.
- There cannot be more than 3 pods in the namespace.
Create the resource quota by running the following command:
kubectl apply -f resource-quota.yaml
$ kubectl apply -f resource-quota.yaml
resourcequota/simple-resource-quota created
Now that the namespace and resource quota are created, let's deploy the application. The replica-set.yaml
file in the same directory contains the following configuration:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: simple-replicaset
namespace: simple-namespace
labels:
env: dev
app: simple-replicaset
spec:
replicas: 3
selector:
matchLabels:
app: simple-replicaset
template:
metadata:
labels:
app: simple-replicaset
spec:
containers:
- name: apiserver
image: rutush10/simple-restapi-server-py:v0.0.1
ports:
- containerPort: 8000
resources:
requests:
cpu: "1"
memory: "2Gi"
limits:
cpu: "1"
memory: "2Gi"
This file specifies that the application should be deployed as a replica set with 3 replicas. Each pod in the replica set is configured to consume 1 CPU and 2Gi of memory.
The overall resource consumption will be 3 CPUs and 6Gi of memory, which is within the resource quota limits set for the namespace.
Create the replica set by running the following command:
kubectl apply -f replica-set.yaml
$ kubectl apply -f replica-set.yaml
replicaset.apps/simple-replicaset created
To check if the replica set is running, run:
kubectl get rs -n simple-namespace
$ kubectl get rs -n simple-namespace
NAME DESIRED CURRENT READY AGE
simple-replicaset 3 3 3 2m30s
This shows that the replica set is running with 3 replicas.
Now let's check how much of the resource quota is being consumed. Run the following command:
kubectl describe quota -n simple-namespace
$ kubectl describe quota -n simple-namespace
Name: simple-resource-quota
Namespace: simple-namespace
Resource Used Hard
-------- ---- ----
cpu 3 3
memory 6Gi 10Gi
pods 3 3
This shows that the resource quota is being consumed as expected. The CPU and memory limits are being enforced.
Now let's modify the replica set to consume more resources than the quota allows. Update the replica-set.yaml
file to have 4 replicas (which would require 4 CPUs and 8Gi of memory) and apply the changes:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
name: simple-replicaset
namespace: simple-namespace
labels:
env: dev
app: simple-replicaset
spec:
replicas: 4
selector:
matchLabels:
app: simple-replicaset
template:
metadata:
labels:
app: simple-replicaset
spec:
containers:
- name: apiserver
image: rutush10/simple-restapi-server-py:v0.0.1
ports:
- containerPort: 8000
resources:
requests:
cpu: "1"
memory: "2Gi"
limits:
cpu: "1"
memory: "2Gi"
Apply the changes by running:
kubectl apply -f replica-set.yaml
$ kubectl apply -f replica-set.yaml
replicaset.apps/simple-replicaset configured
Now check the Replica Set again:
kubectl get rs -n simple-namespace
$ kubectl get rs -n simple-namespace
NAME DESIRED CURRENT READY AGE
simple-replicaset 4 3 3 17m
This shows that the resource quota is being enforced. The replica set could not scale to 4 replicas as the resource quota allows only 3 pods.
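To see why the fourth pod never appeared, check the Replica Set's events. The quota admission controller rejects the pod creation with a message along the lines of "forbidden: exceeded quota" (the exact wording varies between versions):
kubectl describe rs simple-replicaset -n simple-namespace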
Check the resource quota again:
kubectl describe quota -n simple-namespace
$ kubectl describe quota -n simple-namespace
Name: simple-resource-quota
Namespace: simple-namespace
Resource Used Hard
-------- ---- ----
cpu 3 3
memory 6Gi 10Gi
pods 3 3
The quota usage is unchanged: 3 pods, 3 CPUs, and 6Gi of memory. The fourth replica was never admitted, so the hard limits were not exceeded.
Understanding the Resource Quota Manifest
Now let's understand the specifications in the resource-quota.yaml
file:
apiVersion
: This field specifies where the object is defined. In this case, it's defined in the v1
version of the Kubernetes API. This field is mandatory for all Kubernetes objects as it helps the API server to locate the object definition.
kind
: This field specifies the type of object you're creating. In this case, it's a ResourceQuota
.
metadata
: This field specifies the details about the ResourceQuota.
name
: This is the name assigned to the ResourceQuota.
namespace
: This is the namespace to which the ResourceQuota is applied.
spec
: This field specifies the desired state of the ResourceQuota. In this case, it defines hard limits for CPU, memory, and pods.
hard
: This field specifies the hard limits for the resources, i.e. the maximum amounts that can be consumed within the namespace.
cpu
: This is the hard limit for CPU.
memory
: This is the hard limit for memory.
pods
: This is the hard limit for the number of pods that can be created in the namespace.
Cleaning up
To delete the namespace and the resource quota, run the following commands:
kubectl delete -f replica-set.yaml
kubectl delete -f resource-quota.yaml
kubectl delete -f namespace.yaml
Summary
In this chapter, we discussed how resource quotas can be applied to limit the resources used by the namespace. We created a namespace and applied a resource quota to it. We then deployed an application as a replica set in the namespace and saw how the resource quota was enforced. Resource quotas are an important tool to ensure fair distribution of resources in a multi-tenant environment.
Networking
Services
A Service is an abstraction that enables communication to the Pods. It provides a stable endpoint for the pods so that we don't have to worry about the dynamic allocation of IP addresses to the Pods. Services can be of different types, such as ClusterIP, NodePort, LoadBalancer, and ExternalName. Here, I'll cover only the ClusterIP and NodePort type services.
ClusterIP Service
Creating a ClusterIP Service
We'll use the Deployment created in the previous section to run the application. Then, we'll use a Service to route traffic to the Pods created by the Deployment.
Navigate to the simple-service
directory:
cd bootstrapping-with-kubernetes-examples/deploy/simple-service
The service.yaml
file in this directory contains the following configuration:
apiVersion: v1
kind: Service
metadata:
name: backend-service
spec:
type: ClusterIP
selector:
app: simple-deployment
ports:
- protocol: TCP
port: 8000
targetPort: 8000
Start the deployment and service with the following commands:
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Here's a sample output:
$ kubectl apply -f deployment.yaml
deployment.apps/simple-deployment created
$ kubectl apply -f service.yaml
service/backend-service created
Verify if the service and deployment are created:
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
simple-deployment 3/3 3 3 10m
$ kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
simple-deployment-794f78c89-9h7jh 1/1 Running 0 10m app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-794f78c89-hs2s9 1/1 Running 0 10m app=simple-deployment,pod-template-hash=794f78c89
simple-deployment-794f78c89-tgnhl 1/1 Running 0 10m app=simple-deployment,pod-template-hash=794f78c89
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
backend-service ClusterIP 10.98.60.140 <none> 8000/TCP 11m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 20m
Let's send a request to the Service.
To send a request to a ClusterIP Service, we need to be inside the Kubernetes cluster. So we'll use the kubectl
command to run a Pod with an interactive shell:
kubectl run -i --tty --rm debug --image=alpine -- sh
Install curl
apk add curl
Send a request to the Service:
curl -i http://backend-service:8000/rest/v1/health/
Here's the output:
$ kubectl run -i --tty --rm debug --image=alpine -- sh
If you don't see a command prompt, try pressing enter.
/ #
/ # apk add curl
fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/community/x86_64/APKINDEX.tar.gz
(1/10) Installing ca-certificates (20240226-r0)
(2/10) Installing brotli-libs (1.1.0-r2)
(3/10) Installing c-ares (1.28.1-r0)
(4/10) Installing libunistring (1.2-r0)
(5/10) Installing libidn2 (2.3.7-r0)
(6/10) Installing nghttp2-libs (1.62.1-r0)
(7/10) Installing libpsl (0.21.5-r1)
(8/10) Installing zstd-libs (1.5.6-r0)
(9/10) Installing libcurl (8.8.0-r0)
(10/10) Installing curl (8.8.0-r0)
Executing busybox-1.36.1-r29.trigger
Executing ca-certificates-20240226-r0.trigger
OK: 13 MiB in 24 packages
/ #
/ #
/ # curl -i http://backend-service:8000/rest/v1/health/
HTTP/1.1 200 OK
date: Thu, 04 Jul 2024 04:48:21 GMT
server: uvicorn
content-length: 15
content-type: application/json
{"status":"ok"}/ #
/ #
Here's a breakdown of what happened here:
- Kubernetes has a built-in DNS Server that creates DNS records for the Services. The resolution happens by the Service name. In this case, the Service name is
backend-service
. - When you create a Service, you specify a
selector
field. Based on this, the endpoint slice controller creates Endpoint objects. The Endpoint object contains the IP addresses of the Pods that match the selector. You can see the Endpoint object created for the Service using the following command:
$ kubectl get endpoints backend-service
NAME ENDPOINTS AGE
backend-service 192.168.189.66:8000,192.168.189.67:8000,192.168.235.130:8000 38m
- You can see to which Pods the Service is routing the traffic. Run the following command to see the IP addresses of the Pods:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
simple-deployment-794f78c89-9h7jh 1/1 Running 0 39m 192.168.189.66 worker2 <none> <none>
simple-deployment-794f78c89-hs2s9 1/1 Running 0 39m 192.168.189.67 worker2 <none> <none>
simple-deployment-794f78c89-tgnhl 1/1 Running 0 39m 192.168.235.130 worker1 <none> <none>
- Along with this, a DNS record for the Service is created by the DNS server, pointing to the Service's ClusterIP.
- When you send a curl http://backend-service:8000/rest/v1/health/ request, the DNS server resolves the Service name to the ClusterIP of the Service. kube-proxy then routes the request to one of the Pods behind it, as verified below.
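You can verify the DNS step from the same debug pod. Alpine's busybox ships an nslookup applet, so no extra package is needed; the name should resolve to the Service's ClusterIP (10.98.60.140 above), not to the individual Pod IPs:
/ # nslookup backend-service.default.svc.cluster.local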
Understanding the Service manifest
Let's break down the service.yaml
file:
apiVersion: v1
: This tells Kubernetes to use the v1
API version.
kind: Service
: This specifies the type of object we're creating, which is a Service.
metadata
: This field specifies the additional metadata that should be associated with the Service.
name
: This field specifies the name of the Service. In this case, it's backend-service
.
spec
: This field specifies the specification of the Service.
type
: This field specifies the type of the Service. In this case, it's ClusterIP
.
selector
: This field specifies the selector that the Service uses to route the traffic to the Pods. In this case, the selector is app: simple-deployment
.
ports
: This field specifies the ports that the Service listens on.
protocol
: This field specifies the protocol that the Service listens on. In this case, it's TCP
.
port
: This field specifies the port on which the Service listens. In this case, it's 8000
.
targetPort
: This field specifies the port on the Pods to which the traffic is routed. In this case, it's 8000
.
name
: [Optional] This field specifies the name of the port. In this case, it's not specified.
Cleaning up
To clean up the resources created in this section, run the following commands:
kubectl delete -f deployment.yaml
kubectl delete -f service.yaml
Summary
In this section, we created a ClusterIP Service to route traffic to the Pods created by the Deployment. We then sent a request to the Service from inside the cluster to see how the traffic is routed to the Pods.
NodePort Service
Creating a NodePort Service
A NodePort Service exposes a specified port on all the nodes in the cluster. Any traffic that comes to the Node on that port is routed to the selected Pods.
To Create a NodePort Service, update the service.yaml
file to have the following configuration:
apiVersion: v1
kind: Service
metadata:
name: backend-service
spec:
type: NodePort
selector:
app: simple-deployment
ports:
- protocol: TCP
port: 8000
targetPort: 8000
nodePort: 30001
Apply the changes to the Service:
kubectl apply -f service.yaml
Now, we'll send a request to the Node directly. First, get the nodes where the Pods are running:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
simple-deployment-794f78c89-cpwdd 1/1 Running 0 7m4s 192.168.189.66 worker2 <none> <none>
simple-deployment-794f78c89-gq7ts 1/1 Running 0 7m4s 192.168.235.130 worker1 <none> <none>
simple-deployment-794f78c89-nfzzk 1/1 Running 0 7m4s 192.168.219.71 master <none> <none>
In this case, the Pods are running on all the nodes. So, you can send a request to any of the nodes.
Next get the IP address of the node:
If you're using Minikube, you can use the following command:
minikube ip
If you're hosting the cluster locally, you can use the following command:
kubectl get nodes -o wide
If you're using CloudLab, you can visit the node's page and get the IP under Control IP
.
Now, send a request to the Node:
curl -i http://<node-ip>:30001/rest/v1/health/
$ curl -i http://<node-ip>:30001/rest/v1/health/
HTTP/1.1 200 OK
date: Fri, 05 Jul 2024 18:44:34 GMT
server: uvicorn
content-length: 15
content-type: application/json
{"status":"ok"}
Here, we're hitting the node port 30001
. This is the port we've specified in the service.yaml
file. The traffic is routed to the Pods based on the selector.
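You can also look up the node port that a Service exposes without opening the manifest; the jsonpath index 0 below refers to the single port entry defined above:
kubectl get svc backend-service -o jsonpath='{.spec.ports[0].nodePort}'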
Understanding the Service manifest
Let's break down the service.yaml
file:
apiVersion: v1
: This tells Kubernetes to use the v1
API version.
kind: Service
: This specifies the type of object we're creating, which is a Service.
metadata
: This field specifies the additional metadata that should be associated with the Service.
name
: This field specifies the name of the Service. In this case, it's backend-service
.
spec
: This field specifies the specification of the Service.
type
: This field specifies the type of the Service. In this case, it's NodePort
.
selector
: This field specifies the selector that the Service uses to route the traffic to the Pods. In this case, the selector is app: simple-deployment
.
ports
: This field specifies the ports that the Service listens on.
protocol
: This field specifies the protocol that the Service listens on. In this case, it's TCP
.
port
: This field specifies the port on which the Service listens. In this case, it's 8000
. This is the port that pods inside the cluster use to talk to the Service.
targetPort
: This field specifies the port on the Pods to which the traffic is routed. In this case, it's 8000
.
nodePort
: This field specifies the port on the Node to which the traffic is routed. In this case, it's 30001
.
name
: [Optional] This field specifies the name of the port. In this case, it's not specified.
To access a NodePort service from within a cluster, you can use the same method as for a ClusterIP service. You can use the Service name to access the service.
$ kubectl run -i --tty --rm debug --image=alpine -- sh
If you don't see a command prompt, try pressing enter.
/ # apk add curl
fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/community/x86_64/APKINDEX.tar.gz
(1/10) Installing ca-certificates (20240226-r0)
(2/10) Installing brotli-libs (1.1.0-r2)
(3/10) Installing c-ares (1.28.1-r0)
(4/10) Installing libunistring (1.2-r0)
(5/10) Installing libidn2 (2.3.7-r0)
(6/10) Installing nghttp2-libs (1.62.1-r0)
(7/10) Installing libpsl (0.21.5-r1)
(8/10) Installing zstd-libs (1.5.6-r0)
(9/10) Installing libcurl (8.8.0-r0)
(10/10) Installing curl (8.8.0-r0)
Executing busybox-1.36.1-r29.trigger
Executing ca-certificates-20240226-r0.trigger
OK: 13 MiB in 24 packages
/ #
/ # curl -i ^C
/ # curl -i http://backend-service:8000/rest/v1/health/
HTTP/1.1 200 OK
date: Fri, 05 Jul 2024 21:12:04 GMT
server: uvicorn
content-length: 15
content-type: application/json
{"status":"ok"}/ #
/ #
Note that here the port is 8000
and not 30001
. This value is defined by the port
field in the service.yaml
file.
Cleaning up
To clean up the resources created in this section, run the following commands:
kubectl delete -f deployment.yaml
kubectl delete -f service.yaml
Summary
In this section, we learned how to create a NodePort Service. We also learned how to access the Service from outside the cluster. We used the nodePort
field to specify the port on the Node to which the traffic is routed. We also learned how to access the Service from within the cluster using the Service name.
Using Port Names
This feature is not talked about much, but it's a useful one nevertheless. You can use port names in the Service manifest. This decouples the port number from the Service configuration and provides a more human-readable way to reference ports. Here's an example of how you'd use port names:
apiVersion: apps/v1
kind: Deployment
metadata:
name: simple-deployment
spec:
replicas: 3
selector:
matchLabels:
app: simple-deployment
template:
metadata:
labels:
app: simple-deployment
spec:
containers:
- name: apiserver
image: rutush10/simple-restapi-server-py:v0.0.1
ports:
- containerPort: 8000
name: apiserver-http
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 200m
memory: 200Mi
---
apiVersion: v1
kind: Service
metadata:
name: backend-service
spec:
type: NodePort
selector:
app: simple-deployment
ports:
- protocol: TCP
port: 8000
targetPort: apiserver-http
nodePort: 30001
name: backend-http
The ---
is called the YAML separator. It's used to separate multiple documents in a single file. In this case, we have two documents: the Deployment and the Service.
First, we define the port name apiserver-http
in the Deployment. Then, we use this port name in the Service, specifying the targetPort
as apiserver-http
. This way, we can use the port name to access the Service. The benefit of this approach is that you can change the port number in the Deployment without changing the Service configuration.
. Also, you can name the Service port itself; here, we've named it backend-http
. Another resource, like a Pod, can access the Service using the port name backend-http
, or another resource like an Ingress
can use the port name to route the traffic to the Service.
Summary
In this chapter, we learned about the different types of Services in Kubernetes. We started with the ClusterIP Service, which routes traffic within the cluster. Then, we moved on to the NodePort Service, which exposes the Service on a specific port on each node. Finally, we saw how port names can decouple port numbers from the Service configuration.
Services - A deeper dive
Now that we have a basic understanding of Kubernetes Services, let's go a level deeper and figure out how they work under the hood.
Note: This chapter is intended only for a deeper understanding of Kubernetes and can be skipped.
Developing Applications
In this chapter, we will discuss how to structure your application code in a way that is easy to maintain and scale. The code can be found in the rutu-sh/bootstrapping-with-kubernetes-examples repository.
You can use the application structure as a reference to build and deploy your own applications.
To begin with, we will discuss the Controller Service Repository pattern.
Controller Service Repository Pattern
This pattern of structuring code is a common practice in the software industry. It separates concerns and keeps the codebase maintainable, so that new features can be added with minimal changes to the existing code.
Let's see what each of these components does:
-
Controller: This is the entrypoint to the application. If you're writing any APIs, this is where you write the code to handle the request. The only responsibility of the controller is to take external input, validate it, and pass it to the service layer.
-
Service: This is where you implement the business logic. The service layer is not concerned with how the request/response is handled or how the data is stored. It is only concerned with the business logic. This layer is called by the controller layer and it interacts with the repository layer to handle data.
-
Repository: This is where you implement all the low level logic for managing the data. It generally provides a CRUD interface to interact with the database.
In a typical microservice, this is how the flow of a request looks:
- The request comes to the controller.
- The controller validates the request and calls the appropriate function in the service layer. Additionally, it may transform the request data to a format that the service layer understands.
- The service layer performs the business logic. To do this, whenever any data operation is required, it interacts with the repository layer to manage data.
- The repository layer interacts with the database to manage data.
This pattern separates the concerns of each layer and makes the codebase maintainable.
Let's see how that can be useful for you.
Consider the following scenario:
You are building a backend application for a user management system.
The application should have the following features:
- Add a new user
- Get a user by ID
- Update a user
- Delete a user
- Get all users
And a user is defined by the following properties:
struct User {
    ID String,
    Name String,
    Email String,
    Age Integer
}
You assume that the application won't have a lot of users initially, so you implement it using a simple Flask + SQLite stack.
Here is how you would structure your code in the Controller Service Repository pattern (I've removed all __init__.py files for brevity):
.
├── app
│   ├── controller
│   │   └── controller.py
│   ├── common
│   │   ├── common.py
│   │   └── config.py
│   ├── models
│   │   └── user.py
│   ├── repository
│   │   └── repository.py
│   ├── service
│   │   └── service.py
│   └── main.py
Here is, in brief, what each file implements:
common/config.py
# Configure the values of HOST, PORT, DATABASE_ENDPOINT, etc. using environment variables
common/common.py
# Define common functions like logger, etc.
models/user.py
from datetime import datetime

class User:
    # User model
    id: str
    name: str
    email: str
    age: int

class UserDB:
    # Database representation of User
    id: str
    name: str
    email: str
    age: int
    created_at: datetime
    updated_at: datetime

class UserRequest:
    # Request representation of User
    name: str
    email: str
    age: int
repository/repository.py
from typing import List

from models.user import User, UserDB

def add_user(user: User):
    # Add a new user to the database
    # Generate the UserDB object from the User object with created_at and updated_at set to current time
    # Add the user_db to the database

def get_user_by_id(user_id: str) -> User:
    # Get a user by ID from the database
    # Convert the UserDB object to User object and return it

def update_user(user_id: str, user: User) -> User:
    # Update a user in the database
    # Generate the updated UserDB object from the User object with updated_at set to current time
    # Update the user_db in the database
    return user

def delete_user(user_id: str):
    # Delete a user from the database
    # Delete the user_db from the database

def get_users() -> List[User]:
    # Get all users from the database
    # Convert the UserDB objects to User objects and return them
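To make this more concrete, here is a minimal sketch of the first two repository functions using the SQLite stack chosen above. It assumes the models can be constructed with keyword arguments (e.g. as dataclasses), that a users table with the UserDB columns already exists, and a hypothetical users.db file path; the remaining functions follow the same shape.

import sqlite3
from datetime import datetime, timezone

from models.user import User  # assumed import path, following the layout above

DB_PATH = "users.db"  # hypothetical database file


def _connect() -> sqlite3.Connection:
    # Assumes a `users` table with the UserDB columns already exists
    return sqlite3.connect(DB_PATH)


def add_user(user: User):
    # Build the UserDB fields and insert the row; timestamps are set here
    now = datetime.now(timezone.utc).isoformat()
    with _connect() as conn:
        conn.execute(
            "INSERT INTO users (id, name, email, age, created_at, updated_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (user.id, user.name, user.email, user.age, now, now),
        )


def get_user_by_id(user_id: str) -> User:
    # Fetch one row and map it back to the User model, dropping the DB-only columns
    with _connect() as conn:
        row = conn.execute(
            "SELECT id, name, email, age FROM users WHERE id = ?", (user_id,)
        ).fetchone()
    return User(id=row[0], name=row[1], email=row[2], age=row[3])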
service/service.py
from typing import List

from models.user import User, UserRequest

def add_user(user_request: UserRequest) -> User:
    # Generate the User object from the UserRequest object
    # Call the repository function to add the user to the database
    # Return the User object

def get_user_by_id(user_id: str) -> User:
    # Call the repository function to get the user by ID
    # Return the User object

def update_user(user_id: str, user_request: UserRequest) -> User:
    # Generate the User object from the UserRequest object
    # Call the repository function to update the user in the database
    # Return the User object

def delete_user(user_id: str):
    # Call the repository function to delete the user from the database

def get_users() -> List[User]:
    # Call the repository function to get all users from the database
    # Return the list of User objects
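As a sketch of how thin this layer stays, here is what add_user and get_user_by_id might look like, assuming the same models and the repository module from the layout above:

import uuid

from models.user import User, UserRequest  # assumed import path, following the layout above
from repository import repository


def add_user(user_request: UserRequest) -> User:
    # Business rule: the service generates the ID, not the caller
    user = User(
        id=str(uuid.uuid4()),
        name=user_request.name,
        email=user_request.email,
        age=user_request.age,
    )
    repository.add_user(user)
    return user


def get_user_by_id(user_id: str) -> User:
    # No extra business logic yet; delegate straight to the repository
    return repository.get_user_by_id(user_id)

Note that nothing here knows whether the data lives in SQLite or anywhere else; that detail stays inside the repository layer.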
controller/controller.py
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/user', methods=['POST'])
def add_user():
    # Get the user data from the request
    # Call the service function to add the user
    # Return the user data

@app.route('/user/<user_id>', methods=['GET'])
def get_user_by_id(user_id):
    # Call the service function to get the user by ID
    # Return the user data

@app.route('/user/<user_id>', methods=['PUT'])
def update_user(user_id):
    # Get the user data from the request
    # Call the service function to update the user
    # Return the user data

@app.route('/user/<user_id>', methods=['DELETE'])
def delete_user(user_id):
    # Call the service function to delete the user
    # Return the success message

@app.route('/users', methods=['GET'])
def get_users():
    # Call the service function to get all users
    # Return the list of users
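For illustration, here is how the first two handlers might be filled in, assuming the service module and models above, and that the models expose their fields via __dict__ (as dataclasses do):

from flask import Flask, request, jsonify

from models.user import UserRequest  # assumed import paths, following the layout above
from service import service

app = Flask(__name__)


@app.route('/user', methods=['POST'])
def add_user():
    # Validate and transform the incoming JSON, then hand off to the service layer
    body = request.get_json()
    user_request = UserRequest(name=body["name"], email=body["email"], age=body["age"])
    user = service.add_user(user_request)
    return jsonify(vars(user)), 201


@app.route('/user/<user_id>', methods=['GET'])
def get_user_by_id(user_id):
    # The controller only translates between HTTP and the service layer
    user = service.get_user_by_id(user_id)
    return jsonify(vars(user)), 200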
main.py
from controller.controller import app
from common.config import HOST, PORT

if __name__ == '__main__':
    app.run(host=HOST, port=PORT)
Now let's first understand the importance of the Repository layer.
You notice that the application is working fine, but the database is slow. You decide to switch to a faster database like PostgreSQL.
OR
You decide to push your application to the cloud and use a managed database service like AWS RDS.
To do this, you would only need to change the implementation of the repository functions in the repository.py file. The service and controller layers would remain the same.
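For example, a SQLite-backed add_user in repository.py could be rewritten for PostgreSQL while keeping the exact same signature. This is only a sketch, assuming the psycopg2 driver, the same users table, and a hypothetical DATABASE_URL environment variable:

import os
from datetime import datetime, timezone

import psycopg2  # assumes the psycopg2 driver is installed

from models.user import User  # assumed import path, following the layout above

# Hypothetical connection string, e.g. pointing at AWS RDS or a local PostgreSQL instance
DATABASE_URL = os.environ["DATABASE_URL"]


def add_user(user: User):
    # Same signature as before; only the storage details change
    now = datetime.now(timezone.utc)
    with psycopg2.connect(DATABASE_URL) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO users (id, name, email, age, created_at, updated_at) "
                "VALUES (%s, %s, %s, %s, %s, %s)",
                (user.id, user.name, user.email, user.age, now, now),
            )

The service and controller code calling add_user does not change at all.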
Now let's understand the importance of the Service layer.
You decide to add a new feature to the application. You want to call another microservice to send an email to the user when a new user is added.
To do this, you would only need to change the implementation of the service functions in the service.py file. The controller and repository layers would remain the same.
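In the sketch below, only add_user in service.py grows a new step. The email microservice endpoint and payload are hypothetical, and the requests library is assumed to be available:

import uuid

import requests  # assumes the requests library is installed

from models.user import User, UserRequest  # assumed import path, following the layout above
from repository import repository

# Hypothetical endpoint of the email microservice
EMAIL_SERVICE_URL = "http://email-service/send"


def add_user(user_request: UserRequest) -> User:
    user = User(
        id=str(uuid.uuid4()),
        name=user_request.name,
        email=user_request.email,
        age=user_request.age,
    )
    repository.add_user(user)
    # New business rule: notify the user via the email microservice
    requests.post(EMAIL_SERVICE_URL, json={"to": user.email, "template": "welcome"})
    return user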
To understand the importance of the Controller layer, let's say you decide to switch from Flask to FastAPI. You would only need to change the implementation of the controller functions in the controller.py file. The service and repository layers would remain the same.
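As a sketch, the same two endpoints rewritten for FastAPI might look like this; the service and repository modules are untouched, and the request schema shown here is an assumption:

from fastapi import FastAPI
from pydantic import BaseModel

from models.user import UserRequest  # assumed import paths, following the layout above
from service import service

app = FastAPI()


class UserRequestBody(BaseModel):
    # Request schema mirroring the UserRequest model used by the service layer
    name: str
    email: str
    age: int


@app.post("/user", status_code=201)
def add_user(body: UserRequestBody):
    # Same responsibilities as the Flask handler: validate input, call the service, return the result
    user = service.add_user(UserRequest(name=body.name, email=body.email, age=body.age))
    return vars(user)


@app.get("/user/{user_id}")
def get_user_by_id(user_id: str):
    return vars(service.get_user_by_id(user_id))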
This is the power of the Controller Service Repository pattern. It makes your codebase maintainable and scalable.
You can use this pattern to structure your code in any language or framework.
Building a Python FastAPI application
In this chapter, we will build a simple FastAPI application in Python using the Controller Service Repository pattern, referencing the scenario discussed in the previous chapter.
Reference
The reference code is available in the rutu-sh/bootstrapping-with-kubernetes-examples repository.
Application Structure
Here is how you would structure your code in the Controller Service Repository pattern:
.
├── build
│   ├── Dockerfile
│   └── requirements.txt
├── docs
│   └── README.md
├── src
│   ├── common
│   │   ├── __init__.py
│   │   ├── common.py
│   │   └── config.py
│   ├── controller
│   │   ├── __init__.py
│   │   ├── health_check_controller.py
│   │   └── user_controller.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── errors.py
│   │   └── models.py
│   ├── repository
│   │   ├── __init__.py
│   │   ├── db_common.py
│   │   └── user_repository.py
│   ├── service
│   │   ├── __init__.py
│   │   └── user_service.py
│   └── main.py
└── Makefile
At the top level, this structure is divided into the following:
- build: Contains build-specific files like Dockerfile and requirements.txt.
- docs: Contains documentation for the application.
- src: Contains the source code for the application.
- Makefile: Contains commands to build, run, and push the application.
Application Components
The application contains the following components:
- common:
  - common.py: Contains common functions like logger, etc.
  - config.py: Contains configuration values like HOST, PORT, DATABASE_HOST, etc.
- models:
  - errors.py: Contains custom exceptions.
  - models.py: Contains the data models used by the different layers.

The above two components are used by the following components:
- controller: Contains the routers and request handlers for processing incoming requests.
  - health_check_controller.py: Contains the health check routers.
  - user_controller.py: Contains the user routers.
- service: Contains the business logic for the application.
  - user_service.py: Contains the user service logic.
- repository: Contains the database interaction logic.
  - db_common.py: Contains common database functions.
  - user_repository.py: Contains the user repository logic.
- main.py: Contains the FastAPI application setup and configuration. This is the entry point of the application.
The flow of the application is as follows:
- The FastAPI application is started in main.py. It imports the routers from the controller package, and based on the incoming request, the respective router is called (a minimal sketch of this wiring follows this list).
- The router calls the appropriate function in the service layer. The service layer interacts with the repository layer to perform data operations.
- The repository layer interacts with the database to perform CRUD operations.
- The repository returns the data to the service layer, which processes it and returns it to the router. The router sends the response back to the client.
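Here is a minimal sketch of how main.py might wire this together. It assumes each controller module exposes a FastAPI APIRouter named router and that common.config defines HOST and PORT; the actual code in the reference repository may differ in its details.

from fastapi import FastAPI

from common.config import HOST, PORT
# Assumed: each controller module exposes an APIRouter named `router`
from controller.health_check_controller import router as health_router
from controller.user_controller import router as user_router

app = FastAPI()

# Register the routers; incoming requests are dispatched to the matching controller
app.include_router(health_router)
app.include_router(user_router)

if __name__ == "__main__":
    # Run the app with uvicorn when invoked directly (assumes uvicorn is installed)
    import uvicorn

    uvicorn.run(app, host=HOST, port=PORT)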
This structure separates the concerns of each layer and makes the codebase maintainable. You can use this structure as a reference to build and deploy your own applications.
If you move to a different database, you will only need to change the implementation of the repository layer. Since the service and controller layers are agnostic to the database, you won't need to make any changes there. If you want to add a new feature, you can add it to the service layer. The controller layer will call the new function, and the repository layer will interact with the database to perform the operation. If you want to change the request/response format, you can do it in the controller layer. The service and repository layers will remain unaffected.
Summary
In this chapter, we discussed how to structure your application code in a way that is easy to maintain and scale. We used the Controller Service Repository pattern to build a simple FastAPI application. You can use the application structure as a reference to build and deploy your own applications.