As I’ve been putting more and more time into my monitoring setup, I wanted to add some simple uptime monitoring. Grafana Cloud does offer Synthetic Monitoring, but it is overkill for plain uptime checks, and the way its free tier works I would only be able to monitor a single web application. I also wanted metrics on things like latency, SSL certificates, and domain renewals. Finally, I run quite a bit of infrastructure behind my firewall that I wanted to monitor as well. After searching around for about an hour, I decided to write something myself.
Now, a big caveat: you really shouldn’t monitor anything mission critical from within your infrastructure. If you lose your entire infrastructure (cloud outage, ISP outage, etc.), you won’t get the alerts, because your monitoring lives in the same place. That said, there are plenty of times when you want internal monitoring. Just think through your use cases.
Uptime Prometheus
Uptime Prometheus is a simple application I wrote that takes a YAML configuration file listing URLs or domains and runs various checks on them. So far I have written uptime checks (simple up/down plus latency), SSL checks, and domain renewal checks. It then exposes the results in a format that any Prometheus scraper can retrieve.
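To give a rough idea of what "a format Prometheus can scrape" means, here is a minimal Python sketch of an up/down and latency check rendered in the Prometheus text exposition format. This is not the actual Uptime Prometheus code. The metric names `uptime_up` and `uptime_latency_seconds`, the `CheckResult` type, and both functions are made up for illustration, and the list of URLs would come from the YAML file rather than being passed in by hand.

```python
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one check (hypothetical type, for illustration only)."""
    url: str
    up: bool
    latency_seconds: float


def check_url(url: str, timeout: float = 5.0) -> CheckResult:
    """Fetch the URL once and record whether it answered and how long it took."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            up = 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all OSError subclasses.
        up = False
    return CheckResult(url, up, time.monotonic() - start)


def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(results: list[CheckResult]) -> str:
    """Render check results as Prometheus text exposition format."""
    lines = [
        "# HELP uptime_up Whether the last check succeeded (1) or failed (0).",
        "# TYPE uptime_up gauge",
    ]
    for r in results:
        lines.append(f'uptime_up{{url="{escape_label(r.url)}"}} {int(r.up)}')
    lines += [
        "# HELP uptime_latency_seconds Duration of the last check in seconds.",
        "# TYPE uptime_latency_seconds gauge",
    ]
    for r in results:
        lines.append(
            f'uptime_latency_seconds{{url="{escape_label(r.url)}"}} {r.latency_seconds:.3f}'
        )
    return "\n".join(lines) + "\n"
```

Calling `render_metrics([check_url(u) for u in urls])` and serving the resulting text over HTTP at a path such as `/metrics` would be enough for a Prometheus scrape job to pick it up. SSL and domain renewal checks could follow the same pattern, for example as gauges holding the number of days until expiry.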
All the instructions are in the README file, so I won’t rehash them here. Hopefully this helps other people out!