How to use Prometheus and k8sBot
This is us eating our own dog food. How else are we going to find these use cases? =)
Prometheus sent an alert to our Slack channel telling us that the deployment for one of our GCP Marketplace backend pods had been in a non-ready state for longer than an hour.
From there in Slack, I can ask the bot to list the pods:
Then I see that the pod k8sbot-gcp-marketplace-backend-server-5f54ddbdd5-6bbrf is in an ImagePullBackOff state.
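If you want to do the same check without the bot, here is a minimal Python sketch of the idea: scan a pod list for containers stuck waiting on an image pull. It assumes you have JSON in the shape that `kubectl get pods -o json` returns. The sample data below is hypothetical (the container name `server`, the second pod, and the timestamp are made up for illustration), and this is not how k8sBot itself is implemented.

```python
import json

# Hypothetical, trimmed-down example of `kubectl get pods -o json` output.
# The container name "server" and the second pod are illustrative only.
SAMPLE = """
{
  "items": [
    {
      "metadata": {"name": "k8sbot-gcp-marketplace-backend-server-5f54ddbdd5-6bbrf"},
      "status": {
        "containerStatuses": [
          {"name": "server", "ready": false,
           "state": {"waiting": {"reason": "ImagePullBackOff"}}}
        ]
      }
    },
    {
      "metadata": {"name": "example-healthy-pod"},
      "status": {
        "containerStatuses": [
          {"name": "server", "ready": true,
           "state": {"running": {"startedAt": "2019-05-01T00:00:00Z"}}}
        ]
      }
    }
  ]
}
"""

# Waiting reasons that Kubernetes reports when an image cannot be pulled.
IMAGE_PULL_REASONS = {"ImagePullBackOff", "ErrImagePull"}


def pods_failing_image_pull(pod_list):
    """Return (pod, container, reason) tuples for containers stuck on an image pull."""
    failing = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            # "waiting" is only present while the container has not started.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in IMAGE_PULL_REASONS:
                failing.append(
                    (pod["metadata"]["name"], status["name"], waiting["reason"])
                )
    return failing


if __name__ == "__main__":
    for pod, container, reason in pods_failing_image_pull(json.loads(SAMPLE)):
        print(f"{pod} ({container}): {reason}")
```

In practice you would feed it real output, for example by piping `kubectl get pods -o json` into a script that reads from standard input.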
I then ask @k8sBot to describe this pod from the drop-down menu, and it returns the output in Slack:
This tells me that the pod is failing to pull its image, and one specific event that k8sBot brought back gives a really good clue about what happened:
unauthorized: incorrect username or password
That jogged my memory: Docker Hub had one of its databases compromised, and they sent out emails asking everyone to reset their passwords:
https://success.docker.com/article/docker-hub-user-notification
So I did. However, the reset had some downstream effects that I did not know about at the time, like this one. The final fix was to update the password used to pull these images, and we are back!
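For anyone hitting the same thing, here is a rough sketch of what "updating the password" can look like, assuming the credentials live in a Kubernetes image pull secret of type `kubernetes.io/dockerconfigjson` (the post does not say exactly where ours were stored, so treat that as an assumption). The secret holds a base64-encoded Docker config under the `.dockerconfigjson` key, and the Python below rebuilds that value for a new password. The username and password are placeholders.

```python
import base64
import json

# Registry key that Docker Hub credentials are conventionally stored under.
DOCKER_HUB = "https://index.docker.io/v1/"


def docker_config_json(registry, username, password):
    """Build the Docker config JSON that an image pull secret wraps."""
    # The "auth" field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    config = {
        "auths": {
            registry: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config)


def secret_data_value(registry, username, password):
    """Base64-encode the config, as expected under data['.dockerconfigjson']."""
    raw = docker_config_json(registry, username, password)
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Placeholder credentials; substitute your own after the password reset.
    print(secret_data_value(DOCKER_HUB, "example-user", "new-password"))
```

`kubectl create secret docker-registry` with `--docker-server`, `--docker-username`, and `--docker-password` produces the same structure for you. The sketch is mainly useful for seeing what is inside the secret when you have to debug an `unauthorized` error like the one above.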
Need personalized help?
ManagedKube provides DevOps consulting services that help you leverage the power of Docker/Kubernetes in building highly resilient, secure, and scalable fully automated CI/CD workflows.
Schedule a free 15 minute consultation today by e-mailing us: consulting@managedkube.com
Contact me if you have any questions about this or want to chat. I am happy to start a dialog or help out: blogs@managedkube.com {::nomarkdown}
Learn more about integrating Kubernetes apps
{:/nomarkdown}
GKE | Prometheus | k8sBot