Simple Uptime Monitoring in Kubernetes
EDIT March 08, 2024: updated with the latest version of the container, a netcat example, and keeping track of state in the uptime-status PVC.
I will keep this brief: I have implemented uptime monitoring in Kubernetes and I don’t know of any solution that’s simpler.
There are two ingredients for this system:
- NTFY for push notifications
- My bash-uptime script/container
I will share my specs for them and assume both will run in the same cluster.
NTFY
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ntfy
  name: ntfy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ntfy
  template:
    metadata:
      labels:
        app: ntfy
    spec:
      nodeSelector:
        kubernetes.io/hostname: nix-nvidia
      containers:
        - image: docker.io/binwiederhier/ntfy:v2.8.0
          name: ntfy
          command: ["ntfy"]
          args: ["serve", "--cache-file", "/var/cache/ntfy/cache.db"]
          ports:
            - name: ntfy
              containerPort: 80
          volumeMounts:
            - name: ntfy-cache
              mountPath: /var/cache/ntfy
            - name: ntfy-etc
              mountPath: /etc/ntfy
          env:
            - name: NTFY_BASE_URL
              value: "http://ntfy.barn-banana.ts.net"
            - name: NTFY_UPSTREAM_BASE_URL
              value: "https://ntfy.sh"
            - name: TZ
              value: "America/Denver"
      volumes:
        - name: ntfy-cache
          hostPath:
            path: /opt/ntfy/cache
            type: Directory
        - name: ntfy-etc
          hostPath:
            path: /opt/ntfy/etc
            type: Directory
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ntfy
  name: ntfy
  namespace: default
  annotations:
    tailscale.com/expose: "true"
    tailscale.com/hostname: "ntfy"
    tailscale.com/tags: "tag:http"
spec:
  ports:
    - name: ntfy
      port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: ntfy
  type: ClusterIP
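Since ntfy's publish API is just an HTTP POST to a topic URL, you can smoke-test the deployment from any pod in the cluster before wiring up monitoring. This posts to the same in-cluster address and topic that the CronJob below uses:

# Publish a test message to the uptime-notifications topic
curl -d "ntfy is alive" http://ntfy.default/uptime-notifications

If your phone is subscribed to the topic, the message should show up immediately.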
bash-uptime
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uptime-status
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: uptime-config
  namespace: monitoring
data:
  uptime.yaml: |
    global:
      track_status: true
      status_dir: "/status"
    ping:
      hosts:
        - nix-backups
        - nix-drive
        - mac-mini
        - nix-precision
        - nixos-matrix
      options: "-c 1 -W 1"
      silent: "true"
    curl:
      urls:
        - "http://syncthing.syncthing"
        - "http://redlib.default"
        - "http://cloudtube.default"
        - "http://second.default"
        - "http://home-assistant.default"
        - "http://some-bad-service-fake-not-real.net"
      options: "-LI --silent"
      silent: "true"
    netcat:
      services:
        - rustdesk.default:22000
        - protonmail-bridge.default:25
        - protonmail-bridge.default:143
        - syncthing.syncthing:80
        - syncthing.syncthing:22000
      options: "-vz"
      silent: "true"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: uptime
  namespace: monitoring
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      activeDeadlineSeconds: 30
      template:
        spec:
          containers:
            - image: docker.io/heywoodlh/bash-uptime:0.0.4
              name: uptime
              command:
                - "/bin/bash"
                - "-c"
              args:
                - "/app/uptime.sh | xargs -r -I {} curl -d \"{}\" http://ntfy.default/uptime-notifications"
              volumeMounts:
                - name: uptime-config
                  mountPath: /app/uptime.yaml
                  subPath: uptime.yaml
                - name: uptime-status
                  mountPath: /status
          restartPolicy: OnFailure
          volumes:
            - name: uptime-config
              configMap:
                name: uptime-config
                items:
                  - key: uptime.yaml
                    path: uptime.yaml
            - name: uptime-status
              persistentVolumeClaim:
                claimName: uptime-status
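For reference, each section of uptime.yaml maps onto an ordinary shell check using the listed options, and anything /app/uptime.sh prints is piped through xargs into one ntfy notification per line. Here is a rough sketch of what a single run amounts to (the failure messages are illustrative, not the script's actual output; the real script also persists state under /status, the uptime-status PVC):

# ping section: one packet, one-second timeout
ping -c 1 -W 1 nix-backups || echo "ping failed: nix-backups"
# curl section: HEAD request, following redirects
curl -LI --silent http://syncthing.syncthing >/dev/null || echo "curl failed: http://syncthing.syncthing"
# netcat section: verbose zero-I/O connection test
nc -vz rustdesk.default 22000 || echo "netcat failed: rustdesk.default:22000"

# every failure line the script prints becomes a push notification
/app/uptime.sh | xargs -r -I {} curl -d "{}" http://ntfy.default/uptime-notifications

The -r flag on xargs means nothing is posted when the script prints nothing, so a fully healthy run stays silent.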
Conclusion/more info
That’s it!
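If you want to exercise the whole pipeline without waiting for the five-minute schedule, trigger a one-off Job from the CronJob and read its logs (uptime-manual is just an arbitrary Job name):

kubectl -n monitoring create job uptime-manual --from=cronjob/uptime
kubectl -n monitoring logs -l job-name=uptime-manual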
If you want additional information or a reference for my cluster, feel free to check out my Nix-managed cluster configuration here: heywoodlh/flakes
Written on February 8, 2024
linux kubernetes monitoring alert uptime