Linux Dealing With x86 32-bit Software Security Issue For Intel TDX & AMD SEV
AMD Secure Encrypted Virtualization (SEV) and Intel Trust Domain Extensions (TDX) are intended to help provide better security for virtual machines and are key elements of both companies' investments around confidential computing. It turns out they have a common enemy in their VM security goals: x86 32-bit software.
Patches merged on Thursday for Linux 6.7 address an attack vector arising from 32-bit software support and potential misuse by VMMs. Intel TDX and AMD SEV are not only about protecting VMs from other VMs, but also about ensuring separation from the VMM/hypervisor itself. Due to x86 32-bit semantics, there are problems in this increasingly security-focused world. One of the patches merged yesterday for Linux 6.7 ends up disabling 32-bit support by default when running on TDX and SEV. Intel Linux engineer Kirill Shutemov explained in the patch message:
"The INT 0x80 instruction is used for 32-bit x86 Linux syscalls. The kernel expects to receive a software interrupt as a result of the INT 0x80 instruction. However, an external interrupt on the same vector triggers the same handler.
The kernel interprets an external interrupt on vector 0x80 as a 32-bit system call that came from userspace.
A VMM can inject external interrupts on any arbitrary vector at any time. This remains true even for TDX and SEV guests where the VMM is untrusted.
Put together, this allows an untrusted VMM to trigger int80 syscall handling at any given point. The content of the guest register file at that moment defines what syscall is triggered and its arguments. It opens the guest OS to manipulation from the VMM side.
Disable 32-bit emulation by default for TDX and SEV. User can override it with the ia32_emulation=y command line option."
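To make the problem described in the patch message concrete, here is a minimal toy model in Python. It is not kernel code: the `Registers` class, the tiny `SYSCALLS` table, and the `deliver()` helper are all hypothetical names invented for illustration. The only facts it relies on are that the handler on vector 0x80 runs regardless of where the interrupt came from, and that the 32-bit ABI takes the syscall number from EAX and the first argument from EBX.

```python
from dataclasses import dataclass

IA32_SYSCALL_VECTOR = 0x80  # vector used by the INT 0x80 instruction

# Hypothetical, heavily trimmed syscall table (i386 numbering), for illustration only.
SYSCALLS = {1: "exit", 3: "read", 4: "write"}


@dataclass
class Registers:
    eax: int  # syscall number in the 32-bit ABI
    ebx: int  # first syscall argument


def naive_int80_handler(regs: Registers):
    """Model of a handler that trusts every arrival on vector 0x80."""
    return (SYSCALLS.get(regs.eax, "unknown"), regs.ebx)


def deliver(vector: int, regs: Registers, source: str):
    """Dispatch an interrupt. 'source' is "software" or "external".

    The source is deliberately ignored here: that is the flaw being modeled.
    """
    if vector == IA32_SYSCALL_VECTOR:
        return naive_int80_handler(regs)
    return None


# Whatever happens to sit in the registers decides which syscall runs.
regs = Registers(eax=4, ebx=1)
print(deliver(0x80, regs, "software"))  # a genuine INT 0x80 from userspace
print(deliver(0x80, regs, "external"))  # an interrupt injected by the VMM
```

Both calls take the same path in this model, which is the point: without extra information the handler has no way to tell a real syscall from an injected interrupt.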
In a follow-up patch, Shutemov then restored 32-bit emulation on TDX, with some additional changes:
"32-bit emulation was disabled on TDX to prevent a possible attack by a VMM injecting an interrupt on vector 0x80.
Now that int80_emulation() has a check for external interrupts the limitation can be lifted.
To distinguish software interrupts from external ones, int80_emulation() checks the APIC ISR bit relevant to the 0x80 vector. For software interrupts, this bit will be 0.
On TDX, the VAPIC state (including ISR) is protected and cannot be manipulated by the VMM. The ISR bit is set by the microcode flow during the handling of posted interrupts."
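The ISR check described in the quote can be sketched the same way. This is a simplified Python model, not the kernel's int80_emulation(): the helper names are hypothetical, the ISR is represented as a plain list of eight 32-bit words (256 vectors, 32 per register), and what the real handler does after detecting an external interrupt is not covered here, so the model simply returns a string.

```python
IA32_SYSCALL_VECTOR = 0x80
BITS_PER_REGISTER = 32
ISR_REGISTERS = 256 // BITS_PER_REGISTER  # 8 words cover all 256 vectors


def isr_bit_set(isr, vector):
    """Return True if the in-service bit for 'vector' is set."""
    reg = vector // BITS_PER_REGISTER   # 0x80 // 32 == 4
    bit = vector % BITS_PER_REGISTER    # 0x80 % 32 == 0
    return bool((isr[reg] >> bit) & 1)


def int80_emulation_model(isr, eax):
    """Hypothetical model: only service vector 0x80 if it was a software interrupt."""
    if isr_bit_set(isr, IA32_SYSCALL_VECTOR):
        # The ISR bit is only set for interrupts delivered through the APIC,
        # so this arrival was external, not an INT 0x80 instruction.
        return "rejected: external interrupt"
    return f"syscall {eax}"


isr = [0] * ISR_REGISTERS
print(int80_emulation_model(isr, 4))   # software interrupt: ISR bit is clear

isr[4] |= 1                            # model an external interrupt on 0x80
print(int80_emulation_model(isr, 4))   # now refused
```

The check is only meaningful if the VMM cannot tamper with the ISR, which is why the quote stresses that the VAPIC state is protected on TDX.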
With the patches as of yesterday, x86 32-bit software support remains disabled by default for VMs running under AMD SEV.
All of these patches, which harden AMD SEV and Intel TDX guests against int 0x80 misuse by the VMM, are set to be back-ported to supported kernel versions since Linux 6.0.
On the plus side, this code cleanup did lead to the kernel getting rid of a bunch of Assembly-written entry code and replacing it with C code.