GOTRS 0.5.1 shipped with a production-ready Helm chart. Here’s what went into making it robust.
The Problem
We started with Kustomize manifests in a k8s/ directory. They worked for simple deployments but became unwieldy as configuration options grew. Database selection, replica counts, …
The latest GOTRS release focused on something that doesn’t make for exciting screenshots but matters enormously: trust in the development workflow.
The Problem
Every developer has a slightly different local environment. Different Go versions, different database clients, different shell …
Cloud computing consumes enormous energy. As infrastructure scales, so does environmental impact. We started measuring and optimising for carbon emissions alongside cost and performance.
The Problem
Cloud made infrastructure invisible, including its environmental cost. Spinning up resources was …
AI promises to revolutionise everything, including operations. After a year of experimentation, we’ve found where it genuinely helps and where it’s still hype. Spoiler: it’s not replacing engineers anytime soon.
The Problem
Alert fatigue persisted despite tuning. Hundreds of alerts …
“Who owns this service?” shouldn’t require Slack archaeology. Backstage gave us a single place for service catalogues, documentation, and developer workflows. The portal became the starting point for everything.
The Problem
Tribal knowledge dominated. Which team owns the payment …
OpenTofu 1.7 introduced client-side state encryption—a feature the community requested from Terraform for years without success. For us, it solved a compliance problem that previously required workarounds.
The Problem
Terraform state contains secrets. Database passwords, API keys, and sensitive …
SolarWinds, Log4Shell, and countless smaller incidents proved that software supply chains are attack vectors. Compliance frameworks now require provenance verification. We implemented SLSA and Sigstore to meet requirements and build genuine trust.
The Problem
“Where did this binary come …
HashiCorp’s August 2023 license change sent shockwaves through the infrastructure-as-code community. Terraform moved from MPL to BSL, and within weeks, OpenTofu emerged as an open-source fork. We had decisions to make.
The Problem
The Business Source License isn’t open source. …
Traditional observability requires instrumentation. Add libraries, modify code, redeploy. eBPF offers visibility into systems you can’t or won’t change, directly from the kernel.
The Problem
Instrumenting legacy applications was impractical. Some had no source code access. Others were …
“You build it, you run it” sounds empowering until developers spend more time on infrastructure than features. Platform engineering offers a middle path between centralised ops and full developer responsibility.
The Problem
DevOps promised developer autonomy. The reality? Developers …
Secrets end up everywhere: environment variables, config files, CI systems, developer laptops. Centralising them isn’t just about security—it’s about knowing what credentials exist and who can access them.
The Problem
Credential sprawl was rampant. The same database password existed in …
Cloud bills have a way of growing faster than the applications they support. When finance asked for a 30% reduction, we had to find savings without compromising reliability.
The Problem
The bill had grown organically. Resources provisioned for load tests were never deleted. Development environments …
After years of imperative deployment scripts and kubectl commands in CI pipelines, we adopted GitOps. The shift was more cultural than technical, and the benefits exceeded expectations.
The Problem
Deployment scripts grew organically. Each application had slight variations. Some used Helm, others …
December 2021 delivered Log4Shell, and the subsequent weeks were chaos. A month later, we’re reflecting on what worked, what didn’t, and what we’re changing permanently.
The Problem
The vulnerability itself was severe: remote code execution with trivial exploitation. But the real …
Everyone talks about observability, but most organisations have monitoring with extra steps. We spent a year building genuine observability and learned what actually matters.
The Problem
We had monitoring. Lots of it. Dashboards for everything. Alert fatigue was constant. When incidents occurred, we …
Terraform state is deceptively simple until you have multiple teams, dozens of repositories, and hundreds of resources. Then it becomes your biggest operational challenge.
The Problem
Local state files don’t scale. The moment two people run terraform apply simultaneously, you have a race …
Running multiple teams on a shared Kubernetes cluster sounds efficient until one team’s runaway pod consumes all the cluster resources. We learned this the hard way.
The Problem
Namespaces provide logical separation but not isolation. By default, pods in one namespace can communicate with pods …
After years of maintaining Jenkins servers, we finally made the switch to GitHub Actions. Here’s why it was worth the effort.
The Problem
Jenkins served us well for a decade, but the maintenance burden grew unsustainable. Plugin updates broke builds. Java version conflicts caused headaches. …