Copy Fail: What Detection Engineers Actually Need to Know
TL;DR: Your logging probably misses this one. Here's what to hunt and why getting to a real alert is harder than it should be.
What the exploit actually does
Copy Fail (CVE-2026-31431) is a logic flaw in authencesn, a kernel AEAD wrapper used by IPsec. The exploit binds an AF_ALG socket (the kernel's userspace crypto interface) to authencesn(hmac(sha256),cbc(aes)), uses splice() (a syscall that moves file data between descriptors without copying) to feed the kernel's in-memory copy of a setuid binary into the crypto scatterlist, and triggers a decryption operation. Setuid-root binaries run with root privileges regardless of who calls them, which is why su and sudo are the common targets.
A bug in authencesn writes 4 attacker-controlled bytes past the intended output boundary, landing in those in-memory pages. recvmsg() returns an error because the HMAC fails, but the write already happened. The exploit repeats this for each chunk of shellcode, then calls execve("/usr/bin/su"). The shellcode runs as root.
The disk file is never touched. Standard file integrity monitoring compares on-disk checksums and sees nothing. The corruption is gone on reboot, but the root shell is real.
Interim mitigation: disable the algif_aead module. The patch is kernel commit a664bf3d603d. Update your distribution's kernel package.
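To confirm the interim mitigation on a given host, one option is to check whether the module is currently loaded. The sketch below parses the text of /proc/modules. The function name is illustrative, and it assumes algif_aead is built as a loadable module; a kernel with it compiled in would not list it there.

```python
def is_module_loaded(proc_modules_text: str, module: str = "algif_aead") -> bool:
    """Return True if `module` is listed in the given /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field on each line is the module name.
        if fields and fields[0] == module:
            return True
    return False
```

On a live host, pass it the contents of /proc/modules, for example `is_module_loaded(open("/proc/modules").read())`. A result of False only says the module is not loaded right now. Preventing it from loading again needs a separate control, such as a modprobe.d `install algif_aead /bin/false` entry, which is one common approach rather than one this write-up prescribes.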
What your logging actually sees
The exploit chain lives entirely in syscall space: socket(AF_ALG), splice(), a recvmsg() that returns an error, then execve(). No network traffic. No disk writes.
Default auditd ships with no rules, so out of the box you have no coverage. CIS Level 2 and DISA STIG hardening profiles add execve logging for setuid binaries, so you catch the final exec, but not the setup chain that made it dangerous.
Microsoft Defender for Endpoint (MDE) on Linux surfaces process lineage by default. DeviceNetworkEvents won't see AF_ALG socket creation: AF_ALG is not a network socket family, and MDE has no DeviceSyscallEvents table. CrowdStrike Falcon's behavioral engine may fire on a post-exploitation UID transition, but the AF_ALG and splice chain isn't in any customer-queryable schema.
Across all of these: you're detecting the outcome, not the exploit. Root is already obtained before your alert fires.
Start by hunting it: the KQL
The proof of concept (PoC) is Python. Python directly parenting su or sudo is unusual enough to hunt on right now. That signal has a short shelf life: rewriting the exploit in C takes hours, and a frontier AI model given the write-up and the working PoC can produce a stripped binary with no Python fingerprint in minutes.
The durable signal is any unprivileged process parenting su, sudo, newgrp, or passwd where the parent isn't in your expected caller set. For most environments that set is short: bash, sh, zsh, fish, ksh, sshd, login, tmux, screen, cron, systemd.
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName in~ ("su", "sudo", "newgrp", "passwd")
| where InitiatingProcessAccountName != "root"
| where InitiatingProcessFileName !in~ (
    "bash", "sh", "zsh", "fish", "ksh",
    "sshd", "login", "tmux", "screen",
    "cron", "systemd", "su", "sudo"
  )
| project
    Timestamp,
    DeviceName,
    AccountName,
    FileName,
    ProcessCommandLine,
    InitiatingProcessFileName,
    InitiatingProcessAccountName,
    InitiatingProcessCommandLine
| order by Timestamp desc
Run this as a hunt, not an alert. Ansible, deployment tooling, and automation accounts will fire. Exclude known automation accounts (not process names, which are trivial to spoof) until false positives are manageable, then promote to an alert. Document it as post-exploitation detection. You're catching fast, not catching early.
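If you export process events and want to apply the same logic outside the portal, the filter is small enough to sketch in Python. This is a rough equivalent of the query above, not a replacement for it. The dictionary keys mirror the DeviceProcessEvents columns, while `is_suspicious` and the `automation_accounts` parameter are hypothetical names added here to show the account-based exclusion.

```python
SETUID_TARGETS = {"su", "sudo", "newgrp", "passwd"}
EXPECTED_PARENTS = {
    "bash", "sh", "zsh", "fish", "ksh",
    "sshd", "login", "tmux", "screen",
    "cron", "systemd", "su", "sudo",
}


def is_suspicious(event: dict, automation_accounts: frozenset = frozenset()) -> bool:
    """Flag an unprivileged, unexpected parent launching a setuid target."""
    child = event["FileName"].lower()
    parent = event["InitiatingProcessFileName"].lower()
    parent_account = event["InitiatingProcessAccountName"]

    if child not in SETUID_TARGETS:
        return False
    if parent_account == "root":
        return False
    # Exclude by account, not by process name: names are trivial to spoof.
    if parent_account in automation_accounts:
        return False
    return parent not in EXPECTED_PARENTS
```

For example, an event where python3 running as an ordinary user parents su is flagged, while the same event with bash as the parent, or with the parent account listed in `automation_accounts`, is not.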
The higher-fidelity option
Full detection of the exploit chain requires auditd with explicit syscall rules forwarding to your SIEM. Three rules cover it: socket with a0=38 (AF_ALG family), splice, and execve of setuid targets. Join all three on audit session ID (ses, the identifier auditd uses to group events from a single login session) to scope results to a single user session.
authencesn has no legitimate unprivileged userspace callers, so the AF_ALG socket rule fires on essentially nothing except this exploit class. splice is high volume on its own. Use it only as a join key against the socket events.
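The join described above can be sketched offline against raw audit log lines. The rules in the comment are an assumption about how the three events might be tagged: the key names are invented for this example, and the rule syntax should be checked against your auditctl version before use. The code matches on those keys, so it only works if your rules set them.

```python
import re
from collections import defaultdict

# Hypothetical auditd rules this sketch assumes (key names are invented):
#   -a always,exit -F arch=b64 -S socket -F a0=38 -k copyfail_afalg
#   -a always,exit -F arch=b64 -S splice -k copyfail_splice
#   -a always,exit -F path=/usr/bin/su -F perm=x -k copyfail_setuid_exec
# Note: audit records print a0 in hex, so AF_ALG (38) shows up as a0=26.
REQUIRED_KEYS = {"copyfail_afalg", "copyfail_splice", "copyfail_setuid_exec"}

# Matches name=value pairs, where value is quoted or a run of non-spaces.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_record(line: str) -> dict:
    """Split one audit log line into a field dict, stripping quotes."""
    return {name: value.strip('"') for name, value in FIELD_RE.findall(line)}


def sessions_with_full_chain(lines) -> list:
    """Return sorted session IDs (ses) that produced all three keyed events."""
    keys_by_session = defaultdict(set)
    for line in lines:
        record = parse_record(line)
        if record.get("type") != "SYSCALL":
            continue
        ses, key = record.get("ses"), record.get("key")
        if ses is not None and key in REQUIRED_KEYS:
            keys_by_session[ses].add(key)
    return sorted(
        ses for ses, keys in keys_by_session.items() if keys == REQUIRED_KEYS
    )
```

This is a set-based join, so it does not check event order or timing; a production rule would likely add a time window. Processes with no login session typically share the unset ses value 4294967295, so on CI runners or other daemon-launched workloads you may need a different join key, such as pid or ppid.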
Most organizations don't have auditd configured with syscall rules or forwarding to a SIEM, and no default hardening profile ships these rules. If you have Linux multi-tenant hosts, CI runners, or Kubernetes nodes in scope, this is the right investment. For everyone else, the hunting query is where you are for now.
The actual takeaway
The parent-child heuristic isn't specific to Copy Fail. Any kernel LPE that execs a setuid binary as its payload looks the same. Build this detection. Don't treat this LPE as a moderate severity vulnerability.