Blog

  • Scaling GitHub Actions, Docker Caching, and Smarter Security Scans — Lessons From My Home Lab

    Over the last few weeks, my development environment has been a bit of a rollercoaster—in a good way. In my previous blog, I talked about experimenting with GitHub Actions runners. That journey has evolved from running random Ubuntu runners → running runners inside VMs → running them in Kubernetes → and now testing out GitHub ARC (Actions Runner Controller) to dynamically request runner capacity on demand.

    Honestly, it’s been a blast. But along the way, I hit some real-world challenges that mirror the same ones I see in the enterprise security space.


    Centralized Security Scanning + Hitting Rate Limits

    In my home lab, I run what I call Centralized GitHub Actions (there’s a rough sketch of one job after this list):

    • pull security scanning Docker images
    • run the scans
    • send results to DefectDojo
    • repeat this across multiple repos
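
    To make that concrete, here’s a rough sketch of what one of those jobs boils down to, with Trivy standing in for the scanner and results going to DefectDojo’s import-scan endpoint. The DefectDojo URL, product, and engagement names are placeholders for whatever your instance uses:

    # Pull the scanner image, scan the checked-out repo, write JSON findings
    docker pull aquasec/trivy:latest
    docker run --rm -v "$PWD:/src" aquasec/trivy:latest fs --format json -o /src/trivy.json /src

    # Ship the findings to DefectDojo (URL and names are placeholders)
    curl -sf -X POST "https://defectdojo.lab.example/api/v2/import-scan/" \
      -H "Authorization: Token $DEFECTDOJO_API_KEY" \
      -F "scan_type=Trivy Scan" \
      -F "product_name=my-repo" \
      -F "engagement_name=ci" \
      -F "auto_create_context=true" \
      -F "file=@trivy.json"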

    Since I love experimenting with open-source security tooling, I try to run just about every scanner I can get my hands on.

    And then I hit the wall.

    Docker pull limits. Pip download limits. Everything.

    Docker Hub’s limits are designed for normal users. I am not a normal user when it comes to automated pulls:

    • Unauthenticated Docker Hub pull limit: ~100 pulls per 6 hours (per IP)
    • Authenticated (free account): ~200 pulls per 6 hours
    • GitHub-hosted runners get special higher limits (but only when the request originates from GitHub’s infra)
    • On-prem GitHub Actions runners get none of those benefits

    So I burned through my pull quota constantly.
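
    If you want to see where you stand, Docker Hub reports your quota in response headers on the manifest endpoint. Docker documents this trick using their ratelimitpreview/test image (jq assumed available):

    # Grab a token, then read the rate-limit headers off a HEAD request
    TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
    curl -s --head -H "Authorization: Bearer $TOKEN" \
      "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" | grep -i ratelimit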


    Exploring Solutions to Rate Limits

    In enterprise environments, people usually take one of three approaches:

    1. Use AWS ECR or another paid cloud registry

    You pay for it, but you get predictable throughput and higher pull limits.

    2. Use a vendor-managed registry appliance

    Good for enterprise scale. Not worth it for a home lab.

    3. Build a Docker Proxy Cache (the path I picked)

    A Docker proxy cache isn’t a full registry—it’s more like a caching reverse proxy:

    • First pull: fetch from Docker Hub and store locally
    • Subsequent pulls: instant, local, no external rate limit hit

    I deployed mine per Kubernetes cluster at first, got it mostly working, and then moved toward centralizing it so all runners call the same cache endpoint.
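
    For reference, the simplest version of this is Docker’s own registry image running in pull-through-cache mode, with every runner’s Docker daemon pointed at it as a mirror. A minimal sketch, with a made-up hostname for the cache:

    # Run the registry as a pull-through cache in front of Docker Hub
    docker run -d --name docker-cache -p 5000:5000 \
      -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
      registry:2

    # On each runner/node, point dockerd at the cache and restart it
    cat <<'EOF' > /etc/docker/daemon.json
    {
      "registry-mirrors": ["http://docker-cache.lab.example:5000"]
    }
    EOF

    One caveat: registry-mirrors only applies to Docker Hub images, so pulls from ghcr.io or quay.io still go direct unless you front those registries separately.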

    Then pip started complaining about SSL certificates…
    …so I fixed that too.
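
    I won’t pretend there’s one universal fix there, but the usual options are either pointing pip at the CA bundle your proxy presents or, more bluntly, marking the index host as trusted. Hostnames and paths below are placeholders:

    # Option 1: trust the proxy's CA properly
    pip config set global.cert /etc/ssl/certs/lab-proxy-ca.pem

    # Option 2 (blunter): pin the index and skip verification for that host
    pip config set global.index-url https://pypi-cache.lab.example/simple
    pip config set global.trusted-host pypi-cache.lab.example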

    Now it mostly works—but I still want to revisit the idea of a true self-hosted registry with caching capabilities. Not sure if it’s worth the complexity yet.


    Enterprise Problems Show Up in Home Labs

    One eye-opening lesson:
    The exact same problems I see in enterprise CI/CD show up in my personal environment.

    Why?

    Because rate limits, scanner behavior, and Docker pull patterns don’t care whether you’re a Fortune 500 or a guy in his garage. The constraints are identical.


    Rethinking How We Run Security Scans

    Running every scanner on every pull request sounded awesome… until I tried it.

    Security scans are computationally expensive. Some scanners take minutes; some take forever. Running everything on every PR is:

    • wasteful
    • slow
    • not developer-friendly
    • and honestly not necessary

    A more practical pattern is emerging:

    1. Base Scans → Run on a Cron Schedule

    This keeps a high-level view of system health.

    2. PR Scans → Run selectively

    Only run the scanners that add value during development.

    3. Adaptive Scans → Run based on the diff

    Imagine (roughly what the sketch after this list does):

    • If secrets are detected → run secret scanners
    • If Dockerfile changes → run container hardening checks
    • If a lot of files changed → bump up scan intensity
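
    A first pass at this doesn’t need anything clever; a shell step that inspects the diff and sets flags for later jobs covers it. A rough sketch, with illustrative triggers and a placeholder base branch:

    # Decide which scanners to run based on what the PR actually touched
    CHANGED=$(git diff --name-only "origin/${BASE_BRANCH:-main}"...HEAD)

    RUN_CONTAINER_CHECKS=false
    RUN_SECRET_SCAN=false
    SCAN_INTENSITY=light

    echo "$CHANGED" | grep -qE '(^|/)Dockerfile' && RUN_CONTAINER_CHECKS=true
    echo "$CHANGED" | grep -qiE '\.(env|pem|key)$' && RUN_SECRET_SCAN=true
    [ "$(echo "$CHANGED" | wc -l)" -gt 50 ] && SCAN_INTENSITY=full

    echo "container=$RUN_CONTAINER_CHECKS secrets=$RUN_SECRET_SCAN intensity=$SCAN_INTENSITY"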

    This “context-aware scanning” is something I want to explore. Developers could even toggle this:

    • “Always run full security scans for my PRs.”
    • “Run lightweight scans unless something looks suspicious.”

    That flexibility is powerful.


    Running Scans Outside CI Jobs Using Webhooks

    One of the coolest things I’ve rediscovered:
    GitHub Webhooks let you run security scans outside the CI job entirely.

    That means:

    • CI stays fast
    • scanners can run asynchronously
    • failures don’t block merges
    • logs stay out of the GitHub Actions UI clutter

    When I was first setting up ARC, I noticed that every DefectDojo upload job appeared in the GitHub Actions queue—even though they didn’t belong there.

    This made it obvious:

    CI jobs should not handle everything.

    Sometimes you want:

    • CI job finishes →
    • a webhook triggers →
    • async scanners run somewhere else →
    • results go to DefectDojo
    • developers stay unblocked

    This is something I want to build into my workflow logic.
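
    On the GitHub side this is just a repository webhook pointed at whatever service receives the event; registering one is a single API call. A sketch, with a placeholder receiver URL and workflow_run as the trigger event:

    curl -s -X POST "https://api.github.com/repos/OWNER/REPO/hooks" \
      -H "Authorization: Bearer $GITHUB_TOKEN" \
      -H "Accept: application/vnd.github+json" \
      -d '{
            "name": "web",
            "active": true,
            "events": ["workflow_run"],
            "config": {
              "url": "https://scan-runner.lab.example/github-webhook",
              "content_type": "json",
              "secret": "'"$WEBHOOK_SECRET"'"
            }
          }'

    The receiver verifies the X-Hub-Signature-256 header against that secret, kicks off the heavier scanners on its own schedule, and uploads the results to DefectDojo without ever touching the Actions queue.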


    To-Do List From This Work

    A few tasks emerged from this whole process:

    • Improve centralized Docker caching
    • Explore hybrid scanning (cron + PR-based + adaptive)
    • Build logic to run certain scans outside the CI job
    • Add a PR flag to allow developers to request extra scans
    • Clean up DefectDojo upload jobs to avoid cluttering the CI timeline
  • How I Built Automatic WordPress Failover Between On-Prem and AWS Using Coolify and Route53

    Running WordPress on-premise is great for control and performance, but if your home lab or self-hosted hardware goes down, your website shouldn’t go offline with it. In this guide, I’ll show you how I built a fully automated failover setup where traffic seamlessly moves from an on-prem server to AWS — and back — with no downtime.

    The best part?
    It uses tools you’re probably already familiar with:

    • Coolify (on-prem & AWS)
    • Route53 DNS failover
    • A custom health check endpoint
    • A lightweight PHP status script
    • Zero manual intervention once deployed

    Here’s how it works.


    🚀 Architecture Overview

    For this example, we’ll use a fake domain:

    examplefailover.com
    

    And two WordPress servers:

    Environment   Location   IP Address
    Primary       On-Prem    10.10.10.10
    Failover      AWS EC2    203.0.113.25

    DNS is hosted in Route53, and SSL certificates are issued using the Route53 DNS-01 method.

    Here’s the traffic flow:

    1. Route53 checks the on-prem server’s health using a dedicated endpoint.
    2. If healthy → traffic goes to on-prem.
    3. If unhealthy → traffic automatically fails over to AWS.
    4. When restored → traffic automatically returns to on-prem.

    You get real-world high availability without running load balancers or Kubernetes.


    🔧 Step 1: Configure SSL on Both Coolify Instances

    Both Coolify deployments (on-prem & AWS) need valid SSL certificates for:

    examplefailover.com
    www.examplefailover.com
    

    Inside each Coolify instance:

    1. Open Settings → Domains & SSL → ACME DNS Providers
    2. Add your Route53 IAM credentials
    3. Add domains to your WordPress app:
      • examplefailover.com
      • www.examplefailover.com
    4. Click Enable SSL

    Using DNS-01 validation means both servers can generate certificates no matter which one DNS currently points at.


    🔧 Step 2: Create a Reliable Health Check Endpoint

    Using your homepage for health checks is risky. WordPress crashes, plugin errors, or PHP upgrades can accidentally trigger failover.

    The fix is to build a dedicated health check endpoint that bypasses WordPress entirely:

    https://examplefailover.com/healthcheck/index.php
    

    Create the directory inside WordPress’s volume:

    mkdir -p /var/lib/docker/volumes/<YOUR_VOLUME_NAME>/_data/healthcheck
    

    Add a smart PHP health script:

    /healthcheck/index.php

    <?php
    header('Content-Type: application/json');
    
    $server_type = getenv('SERVER_TYPE') ?: 'unknown';
    
    $response = [
        "status" => "healthy",
        "server" => $server_type,
        "hostname" => gethostname(),
        "ip" => $_SERVER['SERVER_ADDR'] ?? 'unknown',
        "time" => date('Y-m-d H:i:s'),
    ];
    
    echo json_encode($response);
    

    Tag each environment in Coolify:

    On-Prem:

    SERVER_TYPE=on-prem
    

    AWS:

    SERVER_TYPE=aws-failover
    

    Now loading the endpoint returns clear JSON:

    {
      "status": "healthy",
      "server": "on-prem",
      "hostname": "coolify-primary",
      "ip": "10.10.10.10",
      "time": "2025-12-07 14:33:12"
    }
    

    This helps you debug and verify which server is responding.
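
    Since both servers answer for the same hostname, you can check each one directly by pinning the IP with curl instead of trusting whatever DNS currently returns:

    # Hit the on-prem box, then the AWS box, regardless of what DNS resolves to
    curl -s --resolve examplefailover.com:443:10.10.10.10 https://examplefailover.com/healthcheck/index.php
    curl -s --resolve examplefailover.com:443:203.0.113.25 https://examplefailover.com/healthcheck/index.php

    The first should report "server": "on-prem" and the second "server": "aws-failover".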


    🔧 Step 3: Create a Route53 Health Check

    Go to:

    Route53 → Health checks → Create health check

    Use:

    • Protocol: HTTPS
    • Domain: examplefailover.com
    • Path: /healthcheck/index.php
    • Port: 443
    • Request interval: 30 seconds
    • Failure threshold: 3
    • Optional string matching: healthy

    If the endpoint fails, Route53 marks the server as UNHEALTHY.
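
    If you prefer the CLI over the console, the same health check can be created with aws route53 create-health-check; the config below mirrors the settings above (the caller reference just needs to be unique):

    aws route53 create-health-check \
      --caller-reference "examplefailover-onprem-$(date +%s)" \
      --health-check-config '{
        "Type": "HTTPS_STR_MATCH",
        "FullyQualifiedDomainName": "examplefailover.com",
        "IPAddress": "10.10.10.10",
        "Port": 443,
        "ResourcePath": "/healthcheck/index.php",
        "SearchString": "healthy",
        "RequestInterval": 30,
        "FailureThreshold": 3,
        "EnableSNI": true
      }'

    With IPAddress set, Route53 probes the on-prem box directly while still sending examplefailover.com as the Host/SNI name, which matches the certificate.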


    🔧 Step 4: Set Up DNS Failover in Route53

    You will create two A records for each domain name: a primary and a secondary.

    Root domain (examplefailover.com)

    Primary (on-prem)

    • Type: A
    • Value: 10.10.10.10
    • Routing policy: Failover → Primary
    • Health check: Use the one created above

    Secondary (AWS)

    • Type: A
    • Value: 203.0.113.25
    • Routing policy: Failover → Secondary
    • Health check: None

    Repeat the same for www.examplefailover.com.

    This ensures both the root and www records fail over correctly.
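
    For reference, here’s roughly what those two records look like via the CLI (zone ID and health check ID are placeholders; run the same change batch again for www):

    aws route53 change-resource-record-sets \
      --hosted-zone-id ZXXXXXXXXXXXXXX \
      --change-batch '{
        "Changes": [
          { "Action": "UPSERT", "ResourceRecordSet": {
              "Name": "examplefailover.com", "Type": "A", "TTL": 60,
              "SetIdentifier": "primary-onprem", "Failover": "PRIMARY",
              "HealthCheckId": "<HEALTH_CHECK_ID>",
              "ResourceRecords": [ { "Value": "10.10.10.10" } ] } },
          { "Action": "UPSERT", "ResourceRecordSet": {
              "Name": "examplefailover.com", "Type": "A", "TTL": 60,
              "SetIdentifier": "secondary-aws", "Failover": "SECONDARY",
              "ResourceRecords": [ { "Value": "203.0.113.25" } ] } }
        ]
      }'

    Keeping the TTL low (60 seconds here) is what makes the switch feel fast to visitors once the health check flips.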


    ✔️ Failover Behavior Explained

    Normal Operation

    Healthcheck OK → Route53 routes traffic to on-prem (10.10.10.10)
    

    On-Prem Fails

    Healthcheck FAIL → Route53 routes traffic to AWS (203.0.113.25)
    

    On-Prem Recovers

    Healthcheck returns OK → Route53 routes traffic back to on-prem
    

    Visitors experience zero downtime — it’s seamless.
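
    An easy way to watch a failover test play out is to poll DNS and the health endpoint side by side (dig and watch assumed available):

    # Watch which IP the record resolves to and which server actually answers
    watch -n 10 'dig +short examplefailover.com; curl -s https://examplefailover.com/healthcheck/index.php'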


    💡 Why This Setup Works So Well

    • Zero cloud load balancers required
    • No need for highly available networking gear
    • Coolify deploys identical apps in both environments
    • DNS-01 SSL validation avoids certificate conflicts
    • Dedicated health endpoint avoids WordPress false positives
    • Route53’s global health check network ensures accuracy
    • Failover is fast and automatic

    This approach gives you “cloud-level high availability” with simple, inexpensive tools.


    🎉 Conclusion

    Pairing Coolify with Route53 failover lets you build a robust, self-healing WordPress environment without complex infrastructure. Whether you’re self-hosting for fun or running a real production site, combining:

    • on-prem hardware
    • AWS failover
    • automated SSL
    • a dedicated health check
    • and smart DNS logic

    allows your site to stay online under almost any circumstance.