I am writing a series of blog posts about troubleshooting Kubernetes. One reason Kubernetes is so complex is that troubleshooting requires gathering information at many levels. It’s like trying to find the other end of a string in a tangled ball of string.

Today, I got this alert in my Slack channel, and I had no idea what it meant.

Annotations message: There are 2 different versions of Kubernetes components running.

Let’s dig into it.

I am using the Prometheus Operator Helm chart. This gives me a very good all-in-one Prometheus/Grafana/Alerting solution.

This alert is one of the default alerts that come with the package.

The alert sends me to this runbook:

- runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch

However, this doesn’t really tell me anything besides the name of the alert.

Let’s look at the alert itself and the query in the Prometheus UI:

alert: KubeVersionMismatch
expr: count(count by(gitVersion) (kubernetes_build_info{job!="kube-dns"})) > 1

All I know so far is that different versions of Kubernetes are running, but I am still not sure exactly what it is referring to.

Here is the query:

count(count
  by(gitVersion) (kubernetes_build_info{job!="kube-dns"})) > 1
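
To make the nested count easier to reason about, here is a minimal Python sketch of what the expression computes. It does not talk to Prometheus. The label sets, job names, and version strings are made up for illustration, and the function names are my own.

```python
from collections import Counter

# Hypothetical label sets standing in for kubernetes_build_info series.
# The job names and version strings are invented for this example.
series = [
    {"job": "apiserver", "gitVersion": "v1.15.9"},
    {"job": "apiserver", "gitVersion": "v1.16.8"},
    {"job": "kubelet", "gitVersion": "v1.15.9"},
    {"job": "kube-dns", "gitVersion": "v1.14.0"},
]


def distinct_versions(series):
    """Mimic count(count by(gitVersion) (...{job!="kube-dns"}))."""
    # Selector: drop every series whose job label is kube-dns.
    selected = [s for s in series if s.get("job") != "kube-dns"]
    # Inner count by(gitVersion): one group per version.
    per_version = Counter(s["gitVersion"] for s in selected)
    # Outer count: the number of groups, i.e. distinct versions.
    return len(per_version)


def alert_fires(series):
    # The "> 1" comparison: true when more than one version is present.
    return distinct_versions(series) > 1
```

The inner count collapses the series into one group per gitVersion, and the outer count counts those groups, so the alert condition boils down to "more than one distinct version".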

Let’s run this query ourselves and get more details on what it returns:

[Image: query]

Ah right, it returns a number. We can eliminate most of the query to see what the underlying data is. Change the query to:

kubernetes_build_info{job!="kube-dns"}

This would give you:

[Image: query]

Yup, the data does show that different gitVersion values are being returned.
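
If you would rather check this from a script than from the UI, the same data is available through the Prometheus HTTP API at /api/v1/query. The sketch below only builds the URL and parses a response body; it makes no network call. The base URL in the usage note and the sample body are assumptions for illustration, not output from my cluster.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def versions_in_response(body):
    """Return the sorted distinct gitVersion labels in an instant-query response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    results = payload["data"]["result"]
    return sorted(
        {r["metric"]["gitVersion"] for r in results if "gitVersion" in r["metric"]}
    )


# Hypothetical, trimmed response body in the shape the API returns for a vector.
SAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"job": "apiserver", "gitVersion": "v1.15.9"},
             "value": [0, "1"]},
            {"metric": {"job": "apiserver", "gitVersion": "v1.16.8"},
             "value": [0, "1"]},
            {"metric": {"job": "kubelet", "gitVersion": "v1.15.9"},
             "value": [0, "1"]},
        ],
    },
})
```

Assuming Prometheus is reachable at http://localhost:9090 (for example through a port-forward), you would fetch build_query_url("http://localhost:9090", 'kubernetes_build_info{job!="kube-dns"}') with any HTTP client and pass the body to versions_in_response.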

However, what does this mean?

Looking at the data some more, you will notice this key/value pair:

job="apiserver"

This series comes from the apiserver job, which is the Kubernetes API server. The data says that 2 different versions of the Kubernetes API server are currently running in this cluster. This might or might not be a concern. If you are in the middle of upgrading your Kubernetes cluster, it is not, because Kubernetes can run in a mixed-version state. You probably shouldn’t leave it in this state, but it will be fine. If you are not upgrading your cluster and no changes are planned, then you will definitely want to investigate.
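
Spotting the job by eye works for a small result set. For a larger one, a rough sketch like the following groups the label sets by job and keeps only the jobs that report more than one version. The input data and the function name are hypothetical.

```python
from collections import defaultdict


def mixed_version_jobs(series):
    """Map each job to its sorted versions, keeping only jobs with more than one."""
    versions_by_job = defaultdict(set)
    for labels in series:
        versions_by_job[labels["job"]].add(labels["gitVersion"])
    return {
        job: sorted(versions)
        for job, versions in versions_by_job.items()
        if len(versions) > 1
    }


# Hypothetical label sets; the version strings are invented for this example.
series = [
    {"job": "apiserver", "gitVersion": "v1.15.9"},
    {"job": "apiserver", "gitVersion": "v1.16.8"},
    {"job": "kubelet", "gitVersion": "v1.15.9"},
]
```

With this sample data, only the apiserver job is returned, which matches the situation described above. A job that is missing from the result can still differ from other jobs, so this only answers "which job is mixed within itself".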

As it turns out, I am running on GKE and it is automatically upgrading the k8s master for me.

Using k8sBot to troubleshoot

I created k8sBot because I’ve spent countless hours fixing Kubernetes configuration issues. It was frustrating to spend time looking at multiple Kubernetes resources to figure out what was wrong. There were many times when my eyes would skim right over the error and I would feel terrible when I finally did find the error (minutes or hours later). Troubleshooting Kubernetes is a prime example of when robots are better than humans!

k8sBot can help you troubleshoot with our easy point-and-click user interface directly in Slack so the whole team knows what’s going on:

[Image: k8sBot workflow for an ImagePullBackOff pod]

Now, anyone can get meaningful Kubernetes information with @k8sbot. It’s just one click to retrieve pod status, get pod logs, and get troubleshooting recommendations based on real-time information from your cluster’s Kubernetes API.

Learn more about k8sBot, a point-and-click interface for Kubernetes in Slack, or sign up for a free 30-day trial.
