When an application gets breached, the first thing an attacker does is exfiltrate whatever data they can reach. From customer data to authentication tokens, secrets are target number one (goodbye to your GitHub PATs, your Slack bots in #general channels, your Mailchimp tokens, your OpenAI keys, and so on). You’ll have to revoke everything, because you must assume the attacker already had access.


When you think about secret management, tools like OpenBao, External Secrets Operator, or sidecars that inject secrets come to mind. They work, but the application still ends up with the secret content in memory (and so does an attacker who compromises it).

While doing my GitHub browsing (believe it or not, you find gems on the homepage), I stumbled on a young project that caught my eye: Kloak.

Kloak transparently intercepts outbound TLS traffic in Kubernetes using eBPF uprobes, replacing placeholders with real secrets at the kernel level just before encryption. Applications never handle actual credentials, and no sidecars or code changes are required. It’s a radically different approach: the application never sees the actual values, and that’s exactly what makes the difference in case of a compromise.

This project was open-sourced recently (it may have even started in April 2026), so I’m among the first to write about it in depth — though not for long given the traction it’s already getting.

In this article, I’ll walk you through my Kloak PoC, the issues I ran into, and how I resolved them. But first, let’s revisit the original problem.

The problem: your application knows your secrets

Let’s take a concrete example — an application calling an external API (simulated by httpbin):

import os
import httpx

api_key = os.environ["API_TOKEN"]  # super-secret-bearer-token-12345

response = httpx.get(
    "https://httpbin.org/headers",
    headers={"Authorization": f"Bearer {api_key}"}
)

Even with OpenBao, the secret ends up in plaintext in the process memory. If the container is compromised (by an RCE, for example), the attacker can read API_TOKEN directly from the environment or memory.

Kloak solves this by never giving the real secret to the application. Instead, the application receives a placeholder like kloak:63KNA74T0FV868SK01KQAHH78 — and the eBPF program replaces that placeholder with the real secret when the data passes through SSL_write, just before encryption. httpbin.org receives super-secret-bearer-token-12345, but the process never saw it.

That’s where things get a bit magical: the application isn’t even aware of the secret it’s using.


Architecture-wise, Kloak is split into two distinct planes.

Control Plane:

  • Manages Shadow Secrets, synchronizes the eBPF maps.
  • Intercepts pod creation and rewrites volume mounts via a Mutating Admission Webhook.

Data Plane:

  • Attaches hooks on SSL_write and crypto/tls.(*Conn).Write.
  • Handles DNS kprobes, which capture DNS responses to map IPs to hostnames (more on that below).

The Controller runs as a DaemonSet. It watches secrets labeled getkloak.io/enabled=true, creates “Shadow Secrets” (the name for the fake secrets managed by Kloak) with kloak:<UUID> placeholders, and synchronizes the real values into eBPF maps in kernel space.

The Webhook is a Mutating Admission Webhook. When a pod starts, it automatically rewrites the secret mounts to point to the Shadow Secrets. The application mounts the file, reads kloak:Y1R3B718AGD3X4ZC01KQAHQ26, and uses it as if it were the real secret.

One limitation worth noting: the mount point must be a file, not an environment variable. An issue is open and the feature should arrive soon!

When the application calls SSL_write with the placeholder in the buffer, the eBPF uprobe intercepts the call, checks that the destination is authorized (I’ll come back to this), and rewrites the placeholder with the real value before OpenSSL encrypts the content.
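To make the rewrite step concrete, here is a minimal Python sketch of the decision the uprobe makes (the real logic lives in eBPF and is keyed on maps synced by the controller; the map layout and names below are my own illustration, not Kloak's actual structures):

```python
# Hypothetical sketch of the uprobe's rewrite decision, in Python for
# readability. SECRET_MAP stands in for the kernel-side secret map:
# placeholder -> (real value, allowed destinations).
SECRET_MAP = {
    b"kloak:63KNA74T0FV868SK01KQAHH78": (
        b"super-secret-bearer-token-12345",  # real value, same length
        {"httpbin.org"},                     # authorized destinations
    ),
}

def rewrite_buffer(buf: bytes, dest_host: str) -> bytes:
    """Replace a placeholder with the real secret only if the resolved
    destination is authorized; otherwise the placeholder goes out as-is."""
    for placeholder, (secret, allowed_hosts) in SECRET_MAP.items():
        if placeholder in buf and dest_host in allowed_hosts:
            # Placeholder and secret have identical lengths, so the
            # plaintext size (and thus the TLS record size) is unchanged.
            buf = buf.replace(placeholder, secret)
    return buf

print(rewrite_buffer(b"Bearer kloak:63KNA74T0FV868SK01KQAHH78", "httpbin.org"))
print(rewrite_buffer(b"Bearer kloak:63KNA74T0FV868SK01KQAHH78", "evil.example"))
```

The second call shows the failure mode Kloak relies on: an unauthorized destination receives the placeholder, which is useless to it.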

Regarding Shadow Secrets: you won’t need to adapt your Helm charts or kustomize configurations — the controller generates them and modifies your pods to point to the fake ones (on the ArgoCD side, you’ll just need to switch to server-side apply).
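For readers who haven't written a mutating webhook, here is a sketch of the kind of AdmissionReview response such a webhook returns. Only the -kloak suffix convention is taken from the behavior described above; the rest is a generic webhook skeleton of my own, not Kloak's code:

```python
import base64
import json

def mutate_pod(review: dict) -> dict:
    """For every secret volume in the pod, emit a JSONPatch operation that
    points secretName at the shadow secret (<name>-kloak)."""
    pod = review["request"]["object"]
    patch = []
    for i, vol in enumerate(pod["spec"].get("volumes", [])):
        secret = vol.get("secret")
        if secret and not secret["secretName"].endswith("-kloak"):
            patch.append({
                "op": "replace",
                "path": f"/spec/volumes/{i}/secret/secretName",
                "value": secret["secretName"] + "-kloak",
            })
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": review["request"]["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            # The API server expects the patch base64-encoded.
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }

review = {"request": {"uid": "abc-123", "object": {
    "spec": {"volumes": [{"name": "token-secret",
                          "secret": {"secretName": "mytoken"}}]}}}}
resp = mutate_pod(review)["response"]
print(json.loads(base64.b64decode(resp["patch"])))
```

The decoded patch contains a single replace operation turning mytoken into mytoken-kloak, which is exactly the rewrite you'll observe later with kubectl describe pod.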

Installation

As usual, a Helm chart is available:

helm repo add kloak https://chart.getkloak.io
helm repo update
helm install kloak kloak/kloak -n kloak-system --create-namespace

Usage

You always start from a pre-existing secret — Kloak only creates its Shadow Secrets. To do so, add the label getkloak.io/enabled=true on the Secret. The getkloak.io/hosts and getkloak.io/port labels let you restrict which destinations the secret can be sent to.

Forcing a destination is optional — you can simply hide the content without restricting where it goes.

apiVersion: v1
kind: Secret
metadata:
  name: mytoken
  labels:
    getkloak.io/enabled: "true"
    getkloak.io/hosts: "httpbin.org"
    getkloak.io/port: "443"
stringData:
  token: "super-secret-bearer-token-12345"

As soon as this Secret is applied, the Kloak Controller automatically creates a Shadow Secret with a kloak:<UUID> placeholder of the same length as the real values.

kubectl get secrets -n kloak-test
NAME            TYPE     DATA   AGE
mytoken         Opaque   1      4s
mytoken-kloak   Opaque   1      4s

The placeholder is length-matched character by character:

# Original secret
kubectl get secret mytoken -o jsonpath='{.data.token}' | base64 -d
super-secret-bearer-token-12345

# Shadow secret
kubectl get secret mytoken-kloak -o jsonpath='{.data.token}' | base64 -d
kloak:Y1R3B718AGD3X4ZC01KQAHQ26
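For illustration, here is how such a length-matched placeholder can be generated. The kloak: prefix and the length-matching come from the observed behavior above; the ID alphabet and generation scheme are my guess, not Kloak's actual implementation:

```python
import secrets
import string

def make_placeholder(real_value: str) -> str:
    """Generate a 'kloak:<ID>' placeholder with exactly the same length as
    the real secret, so the in-place buffer rewrite never changes sizes.
    (The ID alphabet here is illustrative; Kloak's scheme may differ.)"""
    id_len = len(real_value) - len("kloak:")
    if id_len <= 0:
        raise ValueError("secret too short to length-match a placeholder")
    alphabet = string.ascii_uppercase + string.digits
    return "kloak:" + "".join(secrets.choice(alphabet) for _ in range(id_len))

token = "super-secret-bearer-token-12345"
placeholder = make_placeholder(token)
print(len(token), len(placeholder))  # both 31
```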

On the application side, you need to enable Kloak on the namespace (or directly on the pod with the label getkloak.io/enabled=true). The Webhook then rewrites the volume mount automatically.

To enable Kloak, simply add a label to the namespace or pods directly:

# Enable Kloak on the entire namespace
kubectl label namespace kloak-test getkloak.io/enabled=true

And now you can deploy your application as if nothing changed:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    spec:
      volumes:
        - name: token-secret
          secret:
            secretName: mytoken          # ← the real Secret
      containers:
        - name: agent
          image: ghcr.io/unetassedecafe/test-app:latest
          volumeMounts:
            - name: token-secret
              mountPath: /var/secrets
              readOnly: true

The Webhook intercepts this pod at startup and rewrites the secretName to point to the Shadow Secret. You can verify it via kubectl describe pod:

kubectl describe pod httpbin-test -n kloak-test | grep -A3 "Volumes:"
Volumes:
  token-secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  mytoken-kloak   # ← automatically rewritten by the webhook

The application mounts the file and reads kloak:Y1R3B718AGD3X4ZC01KQAHQ26.

Kernel-space injection

Here’s a test Python pod that reads the secret from the volume and sends it in an Authorization header to httpbin.org/headers (which I use mainly to simulate an external API and see what it receives in the response).

import urllib.request, json

# Read the (shadow) secret mounted by Kloak
api_key = open("/run/secrets/api/api-key").read().strip()
print(f"[APP] Secret read from file: {api_key}")
print(f"[APP] Length: {len(api_key)} characters")

req = urllib.request.Request(
    "https://httpbin.org/headers",
    headers={"Authorization": f"Bearer {api_key}"}
)
with urllib.request.urlopen(req) as resp:
    data = json.loads(resp.read())
    auth_header = data["headers"].get("Authorization", "")
    print(f"[NETWORK] Header received by httpbin.org: {auth_header}")

Result:

[APP] Secret read from file: kloak:Y1R3B718AGD3X4ZC01KQAHQ26
[APP] Length: 31 characters
[NETWORK] Header received by httpbin.org: Bearer super-secret-bearer-token-12345

The application read kloak:Y1R3B718AGD3X4ZC01KQAHQ26 and sent it as-is. It’s the eBPF uprobe hooked on SSL_write that replaced the placeholder with super-secret-bearer-token-12345 just before encryption — without my (magnificent) application ever knowing.

The DNS verification chain

This is the most interesting part of Kloak. Annotating a secret with getkloak.io/hosts: "httpbin.org" isn’t enough if an attacker can redirect it to their own malicious server to retrieve the secret content.

But Kloak has a function that ensures the current TCP connection actually corresponds to httpbin.org and not a server that intercepted the DNS resolution.

Kloak uses a complete verification chain:

  1. DNS capture: a kprobe on udp_recvmsg intercepts all DNS responses on the node. For hostnames in getkloak.io/hosts, the resolved IPs are stored in dns_ip_map with their TTL.
  2. Connection tracking: tracepoints on sys_enter/exit_connect record each TCP connection with the fd → IP mapping in conn_ip_map. If the IP is in dns_ip_map, the fd is marked as verified.
  3. Resolution at write time: at the SSL_write uprobe, Kloak chains fd → IP → hostname to identify the real destination of the TLS connection.
  4. Filtering: if the resolved hostname matches getkloak.io/hosts, it rewrites. Otherwise, the kloak:... placeholder is sent as-is — the remote server receives an invalid value, and the secret doesn’t leak.
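The four steps above can be simulated end to end in a few lines, with plain dicts standing in for the BPF maps (map names follow the article; the IPs and file descriptors are made up for the example):

```python
# Simulating Kloak's verification chain: dicts stand in for BPF maps.
dns_ip_map = {}    # IP -> hostname (filled by the DNS kprobe)
conn_ip_map = {}   # fd -> IP      (filled by the connect tracepoints)
watched_hosts = {"httpbin.org"}

def on_dns_response(hostname: str, ip: str) -> None:
    # Step 1: only IPs resolved for watched hostnames are recorded.
    if hostname in watched_hosts:
        dns_ip_map[ip] = hostname

def on_connect(fd: int, ip: str) -> None:
    # Step 2: every TCP connection records its fd -> destination IP.
    conn_ip_map[fd] = ip

def should_rewrite(fd: int) -> bool:
    # Steps 3-4: chain fd -> IP -> hostname; rewrite only if watched.
    hostname = dns_ip_map.get(conn_ip_map.get(fd))
    return hostname in watched_hosts

on_dns_response("httpbin.org", "34.198.212.59")      # stored
on_dns_response("postman-echo.com", "3.224.64.88")   # ignored (not watched)
on_connect(3, "34.198.212.59")
on_connect(4, "3.224.64.88")
print(should_rewrite(3))  # True:  placeholder gets replaced
print(should_rewrite(4))  # False: placeholder sent as-is
```

Note what this implies for the debugging later in the article: if the DNS kprobe never sees the responses, dns_ip_map stays empty and should_rewrite is always False for watched hosts.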

Info

When I first read the explanation above in the documentation, I struggled a bit, so here’s my understanding after digging into it, along with a summary.

  • A kprobe (kernel probe) is a hook point placed on a Linux kernel function. When the kernel executes that function (here udp_recvmsg), you can inspect the content of the received UDP packet.
  • An uprobe (user probe) does the same thing but in user space: you hook an eBPF program onto a function of a shared library (SSL_write in libssl.so.3). The program fires on every call, with access to arguments and, crucially, the data buffer before encryption.
  • A tracepoint is a static hook point defined in the kernel (e.g. sched_process_exec for each exec(), sys_enter_connect for each TCP connection).
  • BPF maps are data structures shared between kernel space and user space. Kloak uses several: secret_map (placeholder → real value), dns_ip_map (IP → hostname), conn_ip_map (fd → IP), cgroup_map (tracked cgroups). The Go Controller updates them from user space; the eBPF programs read them from kernel space.

But that’s all in theory — in practice, I ran into some issues…

The PoC (and the start of the drama)

You thought this was a “presentation” article — it’s not! I’m going to walk you through how my tests actually went instead. You’re allowed to feel betrayed.

The first cluster I tested on was a simple kind cluster running on Docker Desktop for macOS. Injection works. The real secret is correctly sent to httpbin.org (which is what we want).

=== httpbin.org (authorized in the annotation) ===
    "Authorization": "Bearer super-secret-bearer-token-12345"

However, host filtering does not block httpbin.une-pause-cafe.fr — the real secret gets sent there too. The reason is in the Controller logs:

{"msg":"Added trusted DNS server","ip":"10.96.0.10","source":"kube-dns auto-discovery"}
{"msg":"DNS server whitelist enabled","count":1}

Kloak only listens for DNS responses from kube-dns. In a kind cluster on macOS, the udp_recvmsg kprobe doesn’t capture those packets inside the LinuxKit VM — dns_ip_map stays empty and Kloak injects toward all hosts. No choice: I needed a real cluster… which is why I turned to Talos (don’t act surprised, you all saw it coming).

Talos cluster (kernel 6.12): diagnosis and bug fix

Injection doesn’t work in v0.1.1. The eBPF programs load correctly, the Controller detects pods, but Successfully attached TLS uprobes never appears in the logs.

So the authorized server receives the Kloak UID instead of the real secret:

    "Authorization": "Bearer kloak:RZFT5K6YRGJ49D9H01KQA",

The diagnosis reveals the root cause: the Controller looks for the container’s cgroup by testing a list of path patterns (pkg/cgroups/utils.go). On Talos, Guaranteed QoS cgroups are organized like this in cgroupfs (which isn’t necessarily standard everywhere):

/sys/fs/cgroup/kubepods/pod<UID>/<containerID>/

To give a bit more detail on this, Kubernetes distributes pods into three QoS classes based on their requests/limits (you might know Denis and I have a talk on this topic 😁):

QoS Class    Condition                                          cgroup layout
Guaranteed   requests == limits for all containers              kubepods/pod<UID>/<CID>/
Burstable    at least one request defined, not equal to limits  kubepods/burstable/pod<UID>/<CID>/
BestEffort   no request/limit                                   kubepods/besteffort/pod<UID>/<CID>/

Most production workloads are Guaranteed (requests = limits). The pods in our test (httpbin-test with no resources:) are BestEffort — which is why they worked even without the fix.

The pattern list was testing /sys/fs/cgroup/kubepods/pod<UID> (the pod cgroup) before descending into the container directory. So the Controller was grabbing the inode of the pod’s parent cgroup, not the container’s:

# What the controller logged:
{"msg":"tracking container cgroup","cgroupID":336626}  ← pod inode

# Actual inode of the container cgroup:
stat /sys/fs/cgroup/kubepods/pod.../containerID/
  Inode: 336699  ← never found

The sched_process_exec tracepoint filters execs by exact cgroup ID. With the wrong ID, no syscall matches and uprobes are never attached. Effectively, it’s as if the container wasn’t set up for Kloak: it runs, but it ignores the uprobes.

The fix: add the missing pattern in FindContainerCgroupPath before the pod-level fallback:

 // containerd cgroupfs driver (k3s default)
 filepath.Join(cgroupRoot, "kubepods", "burstable",
     fmt.Sprintf("pod%s", podUID), containerID),
 filepath.Join(cgroupRoot, "kubepods", "besteffort",
     fmt.Sprintf("pod%s", podUID), containerID),
 filepath.Join(cgroupRoot, "kubepods", "guaranteed",
     fmt.Sprintf("pod%s", podUID), containerID),
+// Talos places Guaranteed containers directly under kubepods/pod<UID> without a QoS subdirectory
+filepath.Join(cgroupRoot, "kubepods",
+    fmt.Sprintf("pod%s", podUID), containerID),
 // pod-level fallback (last resort)
 filepath.Join(cgroupRoot, "kubepods", "pod"+podUID),
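In Python pseudocode, the corrected lookup reads as follows (the real code is Go, in pkg/cgroups/utils.go; the candidate list is abridged and the helper names are mine). It also shows that the "cgroup ID" eBPF filters on is simply the inode number of the cgroup directory:

```python
import os

def cgroup_id(path: str) -> int:
    # The cgroup ID used for eBPF filtering is the cgroupfs inode number.
    return os.stat(path).st_ino

def find_container_cgroup(root: str, pod_uid: str, cid: str):
    """Candidate paths tried in order. Without the third entry, the
    pod-level fallback matched first on Talos and returned the *pod*
    inode, so sched_process_exec never matched the container's execs."""
    candidates = [
        os.path.join(root, "kubepods", "burstable", f"pod{pod_uid}", cid),
        os.path.join(root, "kubepods", "besteffort", f"pod{pod_uid}", cid),
        os.path.join(root, "kubepods", f"pod{pod_uid}", cid),  # Talos: Guaranteed, no QoS subdir
        os.path.join(root, "kubepods", f"pod{pod_uid}"),       # pod-level fallback (last resort)
    ]
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

The container-level path now wins over the pod-level fallback, which is the whole fix.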

I’ll be honest — before finding this, it took me quite a few hours of debugging with DevPod and an LLM to help 😅! I opened a PR upstream so the fix benefits everyone.

Testing injection after the fix on Talos

With the patched image, the Controller correctly attaches the uprobes:

{"msg":"tracking container cgroup","cgroupID":347503}
{"msg":"Successfully attached TLS uprobes","pid":170996,"container":"curl"}

And injection works end-to-end. The pod sees the placeholder, httpbin.org receives the real secret:

$ kubectl exec -n kloak-test httpbin-test -- sh -c \
    'TOKEN=$(cat /var/secrets/token) && \
     echo "Pod sees: $TOKEN" && \
     curl -s -H "Authorization: Bearer $TOKEN" https://httpbin.org/headers | grep -i auth'

Pod sees: kloak:63KNA74T0FV868SK01KQAHH78
# httpbin.org receives:
    "Authorization": "Bearer super-secret-bearer-token-12345"

Info

The fix covers Talos and any distribution that places Guaranteed QoS containers directly under kubepods/pod<UID>/<containerID> without a QoS subdirectory. In v0.1.1, the existing cgroupfs patterns already cover burstable, besteffort, and guaranteed inside QoS subdirectories — which works for k3s and kubeadm.

Now that we’ve validated that, let’s test the second major feature: only replacing with the real secret when communicating with the right domain.

$ kubectl exec -n kloak-test httpbin-test -- \
 sh -c 'TOKEN=$(cat /var/secrets/token) && \
 echo "Pod sees: $TOKEN" && \
 curl -sk -H "Authorization: Bearer $TOKEN" https://httpbin.org/headers | grep -i auth && \
 curl -sk -H "Authorization: Bearer $TOKEN" https://postman-echo.com/headers | grep -i auth'

Pod sees: kloak:63KNA74T0FV868SK01KQAHH78
# httpbin.org (authorized in the annotation)
    "Authorization": "Bearer super-secret-bearer-token-12345"
# postman-echo.com (not authorized)
    "Authorization": "Bearer super-secret-bearer-token-12345"

TLS injection works. But when I use the label getkloak.io/hosts: "httpbin.org" on the secret, the real token reaches both httpbin.org and postman-echo.com — filtering isn’t working…

Digging into the logs

First thing to do: enable trace logging on the DaemonSet.

kubectl set env daemonset/kloak-controller -n kloak-system KLOAK_LOG_LEVEL=trace

The trace level exposes two interesting things: the detail of each secret sync to eBPF maps, and the internal eBPF program counters, dumped every 5 seconds.

{"msg":"synced secret into eBPF map","secret":"mytoken","key":"token","hostLen":11,"port":0,"protocol":0}
{"msg":"secret sync complete","enabledSecrets":1,"bpfKeys":2,"pruned":0,"watchedHosts":1}

hostLen:11 (the length of "httpbin.org") shows the Controller correctly loaded the host filter, and watchedHosts:1 confirms the watched_hosts map is active. Yet injection still reaches postman-echo.com: the filter is configured, but it isn’t seeing the right traffic. Time to check the source of truth: the eBPF counters.

Analyzing the eBPF counters

The eBPF counters reveal the cause. They’re exposed in the Controller’s trace logs (enabled via KLOAK_LOG_LEVEL=trace), automatically dumped every 5 seconds:

# Enable trace logs if not already done
kubectl set env daemonset/kloak-controller -n kloak-system KLOAK_LOG_LEVEL=trace

# Read eBPF counters
kubectl logs -n kloak-system -l app.kubernetes.io/component=controller --tail=100 \
  | grep "eBPF debug counter"

In the output, you find these counters:

{"name":"kprobe_dport53",     "count":1854}
{"name":"kretprobe_read_ok",  "count":187}
{"name":"dns_parse_entry",    "count":187}
{"name":"dns_not_response",   "count":1852}
{"name":"dns_no_answers",     "count":1}
{"name":"dns_not_watched",    "count":1}

Decoding, because it’s not immediately obvious:

  • kprobe_dport53: 1854 — the kprobe on udp_recvmsg captured 1854 packets involving port 53.
  • dns_not_response: 1852 — almost all captured packets have QR=0 (a DNS query, not a response); the kprobe mostly sees queries.
  • kretprobe_read_ok: 187 / dns_parse_entry: 187 — only 187 complete UDP reads made it to the DNS parser…
  • …and of those 187, 185 are still non-responses. Only 2 packets pass the QR filter, and of those 2: one has no DNS answer (dns_no_answers), one resolves a hostname that isn’t watched (dns_not_watched).

The dns_watched_hit counter never appears — it’s the one that should signal that a DNS response resolved a watched hostname (and therefore allow the replacement).
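If you'd rather not eyeball the log dump, a few lines of Python collect the counters for you (the name/count field names are taken from the output above; the rest is my own helper, not a Kloak tool):

```python
import json

def parse_counters(log_lines):
    """Collect eBPF debug counters from controller trace logs: one JSON
    object per line with 'name' and 'count' fields; non-JSON lines are
    skipped."""
    counters = {}
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if "name" in entry and "count" in entry:
            counters[entry["name"]] = entry["count"]
    return counters

sample = [
    '{"name":"kprobe_dport53","count":1854}',
    '{"name":"dns_not_response","count":1852}',
    "some non-JSON log line",
]
counters = parse_counters(sample)
# dns_watched_hit being absent from the dump is the red flag
print(counters.get("dns_watched_hit", 0))  # 0
```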

In short: the DNS kprobe isn’t matching any domain. And you know which “actor” in my cluster handles networking a bit differently?

Here’s a new challenger: Cilium

This Talos cluster uses Cilium as the CNI. Cilium integrates a DNS proxy that sits between pods and kube-dns. The DNS flow with Cilium looks like this:

Pod → Cilium DNS proxy (port 53, node network namespace)
                  ↓
           kube-dns (resolution)
                  ↓
      Cilium DNS proxy (receives the response)
                  ↓
         Pod (receives the response... but via Cilium eBPF, not raw UDP)

Cilium intercepts the vast majority of DNS responses before they bubble up through the pod’s UDP socket. Kloak’s udp_recvmsg kprobe only sees a handful of responses (around 5%) — so dns_ip_map is almost always empty for watched names. Filtering becomes non-deterministic: it works when a DNS response happened to pass before Cilium’s fast-path, and silently fails otherwise.

TL;DR: Cilium’s eBPF fast-path bypasses Kloak’s filters in almost all cases.

I don’t have the low-level skills to inspect this from the kernel side directly, so the only way to validate this theory was to test it.

With Flannel: it works

Not many options left: I installed a cluster without Cilium, using only Flannel (sorry Joseph, but I had no choice here).

I could also have disabled eBPF on Cilium, but let’s keep it simple.

With the label getkloak.io/hosts=httpbin.org applied, the results:

# httpbin.org (authorized)
$ kubectl exec kloak-test-pod -n kloak-test -- curl -sk \
    -H "Authorization: Bearer $(cat /var/secrets/mytoken/token)" \
    https://httpbin.org/headers | grep Auth
    "Authorization": "Bearer super-secret-bearer-token-12345"

# postman-echo.com (not authorized)
$ kubectl exec kloak-test-pod -n kloak-test -- curl -sk \
    -H "Authorization: Bearer $(cat /var/secrets/mytoken/token)" \
    https://postman-echo.com/headers | grep auth
    "authorization": "Bearer kloak:46S12JCN0HS2GX3101KQATMNV"

The real secret only goes to httpbin.org. The placeholder is sent to postman-echo.com. Filtering works 🎉!

The eBPF counters confirm that on Flannel, the DNS kprobe correctly sees the responses:

{"name":"dns_parse_entry",   "count":619}
{"name":"dns_not_response",  "count":611}
{"name":"dns_watched_hit",   "count":3} // The metric we didn't have before
{"name":"dns_answer_stored", "count":24}

dns_answer_stored:24 — DNS responses are correctly landing in dns_ip_map 🎉!

Plot twist: Cilium is actually compatible

This section wasn’t in my original draft, but rather than rewriting the whole article, I prefer to leave the debug section as-is and add a short erratum.

While writing this article, I opened an issue and the Kloak team responded quickly. According to them, the problem comes from the fact that Kloak only trusts DNS responses from kube-dns, whereas with Cilium, responses arrive from the Cilium proxy rather than directly from kube-dns (which validated my initial theory).

Two solutions exist:

Option 1 — trustedServers: manually configure the Cilium DNS proxy IP in Kloak’s Helm values:

controller:
  dns:
    trustedServers:
      - <Cilium DNS proxy IP>

Option 2 — Cilium transparent mode: enable --dnsproxy-enable-transparent-mode on Cilium. With this flag, Cilium preserves the kube-dns source IP in the DNS responses it proxies — so Kloak recognizes them as trusted.

# In cilium-config (ConfigMap) or Cilium Helm values
dnsproxy-enable-transparent-mode: "true"

I retested on my Talos + Cilium 1.17 clusters with this flag enabled (it’s actually the default on recent installations), and host filtering works correctly:

Pod sees   : kloak:Y94DFH4VHAHNFN1P01KQASMXC

=== httpbin.org (authorized) ===
"Authorization": "Bearer super-secret-bearer-token-12345"  ✅

=== postman-echo.com (not authorized) ===
"Authorization": "Bearer kloak:Y94DFH4VHAHNFN1P01KQASMXC"  ✅

Kloak is therefore Cilium-compatible, provided transparent mode is enabled (which should be the case on recent Cilium installations).

Supported runtimes

One thing I haven’t covered: not all languages handle SSL requests the same way. Kloak hooks into TLS functions depending on the runtime:

Runtime                         TLS library                      Hook point
Python, Rust, Ruby, PHP, curl   OpenSSL (libssl.so)              SSL_write / SSL_write_ex
Node.js                         BoringSSL (statically linked)    SSL_write
Go                              native crypto/tls                crypto/tls.(*Conn).Write

For Go, it’s more complex: the runtime handles TLS itself in crypto/tls without going through OpenSSL, so the rewrite path is different (still supported, but it required a language-specific implementation).

For .NET folks: the runtime goes through OpenSSL (hence libssl.so), so Kloak should work the same way as for Python. However, this runtime isn’t listed in the officially supported runtimes and I wasn’t able to test it myself.

Conclusion

In the end, the initial promise is delivered. It took me several hours of debugging to really understand what was going on — and I’m not exactly the most comfortable with low-level stuff, so I definitely had to grind (even though the solutions were ultimately within reach).

What initially seemed like a no-go with Cilium turned out to be compatible. The team responded quickly to my issue and the solution exists, as detailed in the section above. At the time of writing, Kloak is only two weeks old, and the team’s responsiveness is already very encouraging!

While Kloak is primarily aimed at LLM agents running on clusters, I think the problem it solves is interesting on the infra side too — having an application use secrets without ever reading them is genuinely elegant. I hope this half-debug, half-presentation article was useful, and maybe it’ll inspire you to give Kloak a try.

Enjoy your coffee! ☕️