Published: Jul 31, 2025 by Isaac Johnson
Today we’re going to cover backing up very large Git repos to a local fileserver, specifically an NFS mount on a NAS. I’ll then look at upgrading our Uptime Kuma to the much improved 2.0 beta3, contrasting some of the new features in the 2.x line. I’ll cover the cutover and usage, including setting up Grafana Cloud and using Grafana IRM (formerly OnCall).
I’ll show how to fix (for good) the feedback form, which uses a GitHub repository dispatch, with a new non-expiring, narrowly scoped GitHub token. Lastly, we’ll touch on using Datadog to help find and fix some containers that were really slowing down our primary production Kubernetes cluster.
But first, let’s show how to use a Synology NAS as a Git server (for backups).
Git Repos on a NAS
One of the first things I wanted to tackle was backing up my very large Git repo to a locally controlled NAS. I’m always worried about what might happen if something errantly deleted it or force-pushed over its history.
First, in the Synology NAS, under “Control Panel” and “Shared Folder”, create a home for Git repos
I’ll give it a name and disable the recycle bin
I’m going to add data integrity checking
I’ll share it with my ijohnson user as well, then create it
Back on the NAS, I’ll SSH as ijohnson and create a folder for my destination repo.
ijohnson@sirnasilot:~$ cd /volume1/gitrepos/
ijohnson@sirnasilot:/volume1/gitrepos$ mkdir jekyll-blog
ijohnson@sirnasilot:/volume1/gitrepos$
We need to initialize it as a bare repository for pushes to work
ijohnson@sirnasilot:/volume1/gitrepos/jekyll-blog$ git init --bare
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in /volume1/gitrepos/jekyll-blog/
Now, back on my PC, I can add a new origin and push the repo to the NAS for safekeeping
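For reference, the new remote points at the NAS over SSH. Adding it looks something like this (a sketch inferred from the push output below; adjust the user and path to match your NAS):
$ git remote add originnas4 ssh://ijohnson@192.168.1.116/volume1/gitrepos/jekyll-blog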
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ git push originnas4 main
ijohnson@192.168.1.116's password:
fatal: '/volume1/gitrepos/jekyll-blog' does not appear to be a git repository
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ git push originnas4 main
ijohnson@192.168.1.116's password:
Enumerating objects: 20217, done.
Counting objects: 100% (20217/20217), done.
Delta compression using up to 16 threads
Compressing objects: 100% (17709/17709), done.
Writing objects: 100% (20217/20217), 5.05 GiB | 28.17 MiB/s, done.
Total 20217 (delta 2776), reused 16824 (delta 2325)
remote: Resolving deltas: 100% (2776/2776), done.
To ssh://192.168.1.116:/volume1/gitrepos/jekyll-blog
* [new branch] main -> main
I can see this (very large) repo is now safely stored
ijohnson@sirnasilot:/volume1/gitrepos$ du -chs ./jekyll-blog/
5.1G ./jekyll-blog/
5.1G total
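A backup like this only helps if it stays current, so it’s worth scheduling the push. A minimal sketch using cron on the dev box (the schedule is my own choice, and it assumes an SSH key is configured so the push is non-interactive):
# crontab -e: push main to the NAS mirror nightly at 2am
0 2 * * * cd /home/builder/Workspaces/jekyll-blog && git push originnas4 main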
Uptime Kuma
Our current Uptime Kuma runs in a Docker container on the Docker host
$ kubectl get endpoints | grep uptime
uptime-external-ip 192.168.1.99:3101 505d
On that host, I can see we are running version 1.23.3
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1700cf832175 louislam/uptime-kuma:1.23.3 "/usr/bin/dumb-init …" 21 months ago Up 3 weeks (healthy) 0.0.0.0:3101->3001/tcp, :::3101->3001/tcp uptime-kuma-1233b
I can look at the listing on Docker Hub and see we have up to 2.0 beta3 available now
Let’s test that before we commit to using a beta version
I’ll pull it down
$ docker pull louislam/uptime-kuma:2.0.0-beta.3
2.0.0-beta.3: Pulling from louislam/uptime-kuma
61320b01ae5e: Already exists
057bf83be68a: Pull complete
d34dc2c1b56b: Pull complete
b12d1e6fd3ba: Pull complete
d9d139bf2ac2: Pull complete
dee3e095122f: Pull complete
b3bb6e0218a8: Pull complete
34358a26e336: Pull complete
e0ca47558051: Pull complete
8fcf95f1cbdd: Pull complete
92a41d2e8868: Pull complete
18df373464a0: Pull complete
6de0eb6f16fa: Pull complete
a3a168ba7927: Pull complete
Digest: sha256:0c699f39b1952652e809966618c731d7c37b11bcbd7bdc240f263e84f14e1786
Status: Downloaded newer image for louislam/uptime-kuma:2.0.0-beta.3
docker.io/louislam/uptime-kuma:2.0.0-beta.3
Then start with:
$ docker run -d --restart=unless-stopped -p 3001:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:2.0.0-beta.3
In my case I needed to stop and remove the old container
$ docker stop uptime-kuma
uptime-kuma
$ docker rm uptime-kuma
uptime-kuma
$ docker run -d --restart=unless-stopped -p 3001:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:2.0.0-beta.3
8d441bce2f0cefe2c26f4d40a99552c3e936b886d3307f9a8cb42dc73b012937
This is new… I have a database picker on first launch
SQLite matches what has always been used, and the embedded MariaDB is new. I found it interesting that we can now use an external database as well
After I picked a database and it initialized, I was then prompted to create a user
The first feature that I noticed was a new parameter we can use in our monitors “cachebuster”:
I was going to compare Notification Providers, but there are too many new ones to list. You can view them all in server/notification-providers or components/notifications on GitHub
In settings, there is a new “Monitor Toast notifications” section
There is a new section altogether called “Remote Browsers”
The only things really missing are those wonkier features that were already marked deprecated, like backups
Lastly, I noticed there are some new Monitor types for SMTP and SNMP
I’m tempted to start fresh, as I have accumulated a few volumes so far
builder@builder-T100:~$ docker volume ls | grep uptime
local uptime-kuma
local uptime-kuma1233
local uptime-kuma1233b
And the current volume is half a gig
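For reference, per-volume sizes can be checked with Docker itself; the verbose disk usage output includes a VOLUMES section:
$ docker system df -v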
Approach 1: Copy
I’m somewhat worried about an upgrade going wrong and losing my Uptime configuration. I would be a lot more comfortable creating a “new” instance and migrating over to it.
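Before touching anything, a point-in-time copy of the existing data is cheap insurance. A minimal sketch using a throwaway Alpine container (assuming the live data is in the uptime-kuma1233b volume that backs the current container):
$ docker run --rm -v uptime-kuma1233b:/data -v $(pwd):/backup alpine tar czf /backup/uptime-kuma-1233b.tgz -C /data .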
I’ll make a new volume just for this 2.0 instance then launch on a new port. I plan to keep an eye on tags in case a new 2.0 comes out
builder@builder-T100:~$ docker volume create uptime-kuma2
uptime-kuma2
builder@builder-T100:~$ docker run -d --restart=unless-stopped -p 3201:3001 -v uptime-kuma2:/app/data --name uptime-kuma2 louislam/uptime-kuma:2.0.0-beta.3
Unable to find image 'louislam/uptime-kuma:2.0.0-beta.3' locally
2.0.0-beta.3: Pulling from louislam/uptime-kuma
61320b01ae5e: Pull complete
057bf83be68a: Pull complete
d34dc2c1b56b: Pull complete
b12d1e6fd3ba: Pull complete
d9d139bf2ac2: Pull complete
dee3e095122f: Pull complete
b3bb6e0218a8: Pull complete
34358a26e336: Pull complete
e0ca47558051: Pull complete
8fcf95f1cbdd: Pull complete
92a41d2e8868: Pull complete
18df373464a0: Pull complete
6de0eb6f16fa: Pull complete
a3a168ba7927: Pull complete
Digest: sha256:0c699f39b1952652e809966618c731d7c37b11bcbd7bdc240f263e84f14e1786
Status: Downloaded newer image for louislam/uptime-kuma:2.0.0-beta.3
7667c491a40531e6fc9909f9b7d6fdaf30c4bb3fe2bbccf10fa37efb7da72385
I can now do the setup steps again
I think I will use an external MariaDB (MySQL) database this time.
I went to my NAS and logged into MariaDB 10
root@SassyNassy:~# /usr/local/mariadb10/bin/mysql -u root -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 851089
Server version: 10.3.32-MariaDB Source distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> SELECT User, Host FROM mysql.user WHERE Host <> 'localhost';
+---------+-----------+
| User | Host |
+---------+-----------+
| forgejo | % |
| phpiam | % |
| yourls | % |
| root | 127.0.0.1 |
| root | ::1 |
+---------+-----------+
5 rows in set (0.200 sec)
MariaDB [(none)]>
I can now create a new database and user for Kuma, then grant that user all privileges on the new database
MariaDB [(none)]> CREATE DATABASE kuma;
Query OK, 1 row affected (0.020 sec)
MariaDB [(none)]> CREATE USER 'kuma'@'%' IDENTIFIED BY 'NOTTHEPASSWORD';
Query OK, 0 rows affected (0.432 sec)
MariaDB [(none)]> GRANT ALL PRIVILEGES ON kuma.* TO 'kuma'@'%';
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.013 sec)
MariaDB [(none)]>
(Note: the default port for MySQL/MariaDB is 3306, but my Synology serves it on 3307.)
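Before pointing Kuma at it, a quick connectivity check from the Docker host doesn’t hurt. A sketch, assuming the mysql client is installed and $NAS_HOST stands in for whatever resolves to the MariaDB NAS:
$ mysql -h "$NAS_HOST" -P 3307 -u kuma -p kuma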
I can now see it is setting up the DB
Lastly, I create my admin account
I now have a blank instance on the new port
As I was replicating alerts, I kept finding new things
For instance, Discord now has a “message type” option that didn’t exist before
Or with Ping tests, there is now a Per-Ping timeout value (default 2)
As I added notification channels, I made sure to test them
Eventually I got everything organized and updated in the new Uptime system
I should now be able to just switch the external IP for the endpoint in Kubernetes to redirect my existing Uptime ingress
$ kubectl get endpoints | grep uptime
uptime-external-ip 192.168.1.99:3101 512d
$ kubectl get endpoints uptime-external-ip
NAME ENDPOINTS AGE
uptime-external-ip 192.168.1.99:3101 512d
I can just edit it
$ kubectl edit endpoints uptime-external-ip
endpoints/uptime-external-ip edited
and change the value from the older 3101 to 3201
Yet, that did not work
I changed it back and it was happy. I might have to approach this in a bigger way
I’ll dump the old Service (and its Endpoints) to a YAML file
$ kubectl get svc uptime-external-ip -o yaml > uptime.svc.yaml
$ kubectl get endpoints uptime-external-ip -o yaml >> uptime.svc.yaml
I then cleaned it up and made a “new” version
$ cat ./uptime.svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: uptime-external-ip-new
spec:
  clusterIP: None
  clusterIPs:
  - None
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  - IPv6
  ipFamilyPolicy: RequireDualStack
  ports:
  - name: uptimepnew
    port: 80
    protocol: TCP
    targetPort: 3201
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: uptime-external-ip-new
subsets:
- addresses:
  - ip: 192.168.1.100
  ports:
  - name: uptimepnew
    port: 3201
    protocol: TCP
This meant giving new names to the Service, the Endpoints, and the port.
$ kubectl apply -f ./uptime.svc.yaml
service/uptime-external-ip-new created
endpoints/uptime-external-ip-new created
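A quick sanity check that both new objects exist and the port wiring looks right:
$ kubectl get svc uptime-external-ip-new
$ kubectl get endpoints uptime-external-ip-new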
I then pulled the ingress and updated the service name (yes, I misspelled the filename, I know)
$ kubectl get ingress uptimeingress -o yaml > uptimeimgres.yaml.bak
$ kubectl get ingress uptimeingress -o yaml > uptimeimgres.yaml
$ vi uptimeimgres.yaml
$ diff uptimeimgres.yaml uptimeimgres.yaml.bak
12c12
< nginx.org/websocket-services: uptime-external-ip-new
---
> nginx.org/websocket-services: uptime-external-ip
28c28
< name: uptime-external-ip-new
---
> name: uptime-external-ip
$ kubectl apply -f ./uptimeimgres.yaml
ingress.networking.k8s.io/uptimeingress configured
Now I can see the updated version
The last bit of cleanup is to power down the old so we don’t have two systems alerting us
One quick note: so this does not errantly start up next time, let’s also set it to not restart
builder@builder-T100:~$ docker stop uptime-kuma-1233b
uptime-kuma-1233b
builder@builder-T100:~$ docker update --restart=no uptime-kuma-1233b
uptime-kuma-1233b
Feedback form
I worry about the feedback form dropping off from time to time.
I did a quick test but didn’t see the action trigger.
Digging into the bearer token, I found it had expired. I realized this is really best served by a narrowly scoped, fine-grained token (not a classic one).
I recreated it, this time without an expiration, but scoped to just the one repo
I did a quick test:
$ curl -X POST -H "Accept: application/vnd.github+json" -H "Authorization: Bearer github_pat_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" -H "X-GitHub-Api-Version: 2022-11-28" https://api.github.com/repos/idjohnson/workflowTriggerTest/dispatches -d '{"event_type":"on-demand-payload","client_payload":{"summary":"this is the summary","description":"this is the descriptionn","userid":"isaac@freshbrewed.science"}}'
Which triggered the flow.
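On the receiving side, a repository_dispatch workflow keys off that event_type. A minimal sketch of what such a listener looks like (the actual workflow in workflowTriggerTest may differ):
name: Feedback intake
on:
  repository_dispatch:
    types: [on-demand-payload]
jobs:
  create-issue:
    runs-on: ubuntu-latest
    steps:
      # The client_payload fields from the curl above are available on the event
      - run: echo "${{ github.event.client_payload.summary }}"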
This is bundled in app.c3f9f951.js and app.c3f9f951.js.map. I could add this to this blog’s post update, but I might as well update it right now
$ aws s3 cp ./app.c3f9f951.js s3://freshbrewed.science/app.c3f9f951.js --acl public-read
upload: ./app.c3f9f951.js to s3://freshbrewed.science/app.c3f9f951.js
$ aws s3 cp ./app.c3f9f951.js.map s3://freshbrewed.science/app.c3f9f951.js.map --acl public-read
upload: ./app.c3f9f951.js.map to s3://freshbrewed.science/app.c3f9f951.js.map
And to make it live, we need to expire the old version in CloudFront (create an invalidation)
$ aws cloudfront create-invalidation --distribution-id E3U2HCN2ZRTBZN --paths "/app.c3f9f951.js"
{
    "Location": "https://cloudfront.amazonaws.com/2020-05-31/distribution/E3U2HCN2ZRTBZN/invalidation/I9H1WA1X1VEYPZZJEFE70TWGHC",
    "Invalidation": {
        "Id": "I9H1WA1X1VEYPZZJEFE70TWGHC",
        "Status": "InProgress",
        "CreateTime": "2025-07-28T17:08:23.655000+00:00",
        "InvalidationBatch": {
            "Paths": {
                "Quantity": 1,
                "Items": [
                    "/app.c3f9f951.js"
                ]
            },
            "CallerReference": "cli-1753722502-508817"
        }
    }
}
$ aws cloudfront create-invalidation --distribution-id E3U2HCN2ZRTBZN --paths "/app.c3f9f951.js.map"
{
    "Location": "https://cloudfront.amazonaws.com/2020-05-31/distribution/E3U2HCN2ZRTBZN/invalidation/I88YRKYP1WL9WYC88Q84VUII7N",
    "Invalidation": {
        "Id": "I88YRKYP1WL9WYC88Q84VUII7N",
        "Status": "InProgress",
        "CreateTime": "2025-07-28T17:08:33.171000+00:00",
        "InvalidationBatch": {
            "Paths": {
                "Quantity": 1,
                "Items": [
                    "/app.c3f9f951.js.map"
                ]
            },
            "CallerReference": "cli-1753722512-908545"
        }
    }
}
Now a quick test kicks the flow off
And I can see it created the issues
This should be good from now on (knock on wood) because it’s a narrowly scoped, non-expiring token.
Grafana Cloud
Looking at the Grafana SaaS options, there is a really nice free tier
We can use a new username/password or one of our existing IdPs
I then need to pick a name
It will then create the instance
Once done, we will end up on a landing page
I want to use Grafana OnCall alerting, which is now part of Grafana IRM
Let’s start by going to “Integrations”
From there, add an integration
Select “Webhook”
I can now give it a name and description to indicate this will be used by Uptime Kuma
This creates the integration which has an HTTP endpoint we can use
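The endpoint can be smoke-tested with a plain POST before wiring anything up. A sketch, where $WEBHOOK_URL is the URL copied from the integration page and the payload keys are just illustrative:
$ curl -X POST "$WEBHOOK_URL" -H "Content-Type: application/json" -d '{"title":"Test alert from Uptime Kuma","state":"alerting","message":"smoke test"}'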
I can add that to PagerDuty as a new “Grafana OnCall” notification endpoint
Now if I test it
I can see it show up in Grafana IRM
If I click that second one which came from Uptime Kuma, we can see the details
I can force a real alert by using a utility box and changing its IP or port.
I changed the IP back to simulate a “restored”
I see PagerDuty auto-resolved
but I had to manually resolve it in Grafana IRM
Datadog
I noticed my cluster was responding quite slowly. Something seemed to really be weighing it down.
I went to my Datadog dashboard to see the “Kubernetes Pod Overview” hoping to identify the cause
I had assumed it might be the full Grafana stack at fault since I know that can get pretty resource-intensive.
However, I was surprised to find the biggest offender was the Krita app we launched last week.
Since I said in that blog post that it wasn’t performant enough and had already moved to local Docker, let’s go ahead and kill that deployment now.
Besides CPU, we should also check memory-intensive pods to make sure we are actually using them
Zulip was something I wrote about a year ago.
While it is still up, I never really used it because I find my existing systems functional enough
It’s a pretty heavy stack
$ kubectl get svc -n zulip
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
zulip ClusterIP 10.43.45.82 <none> 80/TCP 343d
zulip-memcached ClusterIP 10.43.81.179 <none> 11211/TCP 343d
zulip-postgresql ClusterIP 10.43.175.252 <none> 5432/TCP 343d
zulip-postgresql-hl ClusterIP None <none> 5432/TCP 343d
zulip-rabbitmq ClusterIP 10.43.21.254 <none> 5672/TCP,4369/TCP,25672/TCP,15672/TCP 343d
zulip-rabbitmq-headless ClusterIP None <none> 4369/TCP,5672/TCP,25672/TCP,15672/TCP 343d
zulip-redis-headless ClusterIP None <none> 6379/TCP 343d
zulip-redis-master ClusterIP 10.43.150.84 <none> 6379/TCP 343d
$ kubectl get po -n zulip
NAME READY STATUS RESTARTS AGE
zulip-0 1/1 Running 1 (37d ago) 343d
zulip-memcached-74f9bfc676-4v6rv 1/1 Running 1 (37d ago) 343d
zulip-postgresql-0 1/1 Running 1 (37d ago) 343d
zulip-rabbitmq-0 1/1 Running 1 (37d ago) 343d
zulip-redis-master-0 1/1 Running 1 (37d ago) 343d
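To confirm locally what Datadog was showing, kubectl top gives a quick per-pod view (assuming metrics-server is running in the cluster):
$ kubectl top pods -n zulip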
I tried asking nicely at first
$ helm delete zulip -n zulip
Error: failed to delete release: zulip
I did see it clean things up, but I’ll delete the namespace to be sure
$ kubectl get po -n zulip
No resources found in zulip namespace.
$ kubectl get svc -n zulip
No resources found in zulip namespace.
$ kubectl delete ns zulip
namespace "zulip" deleted
I’m still seeing my master node pegged and I’m a bit worried. I fired up btop, which showed me what I already knew: it’s 100% pegged and the temperature is at about 141°F (around 61°C).
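For a numeric read on temperatures outside of btop, lm-sensors works too; a sketch assuming a Debian-based node (it reports in Celsius):
$ sudo apt-get install -y lm-sensors
$ sensors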
I added a fan and didn’t really see the temp change
But I do see the ping time improving in Uptime Kuma so perhaps that is a good sign
Looking at the nodes, I can see most have been up a month, but one hasn’t rebooted in three
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ ssh builder@192.168.1.33 uptime
builder@192.168.1.33's password:
08:17:17 up 95 days, 14:24, 1 user, load average: 1.62, 2.88, 4.87
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ ssh builder@192.168.1.34 uptime
08:17:41 up 37 days, 21:25, 1 user, load average: 40.60, 39.92, 41.14
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ ssh builder@192.168.1.215 uptime
builder@192.168.1.215's password:
08:17:58 up 37 days, 21:20, 1 user, load average: 2.42, 3.87, 4.84
builder@DESKTOP-QADGF36:~/Workspaces/jekyll-blog$ ssh hp@192.168.1.57 uptime
hp@192.168.1.57's password:
08:18:28 up 37 days, 21:24, 1 user, load average: 2.72, 4.50, 5.74
The load average on the master node (.34) shows it’s very overburdened (around 40 processes waiting for CPU across the 1, 5, and 15 minute intervals).
One of the biggest is SonarQube. I spun this up hoping to use it, but they have limited the Community Edition so much that I don’t see the point
I can remove it
builder@DESKTOP-QADGF36:~$ helm list -n sonarqube
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
sonarqube sonarqube 1 2024-04-16 06:38:12.595519 -0500 CDT deployed sonarqube-10.5.0+2748 10.5.0
builder@DESKTOP-QADGF36:~$ helm delete sonarqube -n sonarqube
release "sonarqube" uninstalled
While the CPU still shows 100%, I do find the system perhaps getting better.
(Flash to several days later.)
It’s still bogged down and not improving.
I finally took the plunge and rebooted. A few pods needed a kick in the pants, but most came back up. Once everything was finally running, we can see the CPU dropped back down considerably
Summary
Today we tackled a few things, including creating a Git repo backup locally on our NAS. We looked at the differences between Uptime Kuma 2.x and the last 1.x release before migrating our monitoring and alerting to Uptime Kuma 2.0 beta3.
We fixed a problematic feedback form by using a non-expiring, narrowly scoped GitHub token (so hopefully this will be the last time).
We set up Grafana Cloud and used Grafana IRM (formerly OnCall) with the new Uptime Kuma (comparing it to PagerDuty). Lastly, we started to dig into performance issues with our primary Kubernetes cluster by looking at data in Datadog.