Egalito: Layout-Agnostic Binary Recompilation

Session: Dynamic compilation--Who moved my cheese?

Authors: David Williams-King (Columbia University); Hidenori Kobayashi (Canon Inc.); Kent Williams-King (Brown University); Graham Patterson (Bloomberg L.C.); Frank Spano (Bloomberg L.C.); Yu Jian Wu (Columbia University); Junfeng Yang (Columbia University); Vasileios Kemerlis (Brown University)

Comprehensive analysis of all executable code, with fast turn-around time for transformations, requires operating directly on binaries; doing so enables profiling, security hardening, and architectural adaptation. Disassembling binaries is difficult, however, and prior work relies either on a process virtual machine that translates references on the fly or on inefficient binary code patching. Our Egalito recompiler leverages metadata present in current stripped x86_64 and ARM64 binaries to generate a complete disassembly, and allows arbitrary modifications that may affect program layout, without any constraints from the original binary. We use our own layout-agnostic intermediate representation, which is low-level enough to make the regeneration of output code predictable, yet supports a dual high-level representation for sophisticated analysis. We demonstrate nine binary tools, including a novel continuous code randomization technique in which Egalito transforms itself, and software emulation of the control-flow integrity in upcoming hardware. We evaluated Egalito on a large set of Debian packages, completely analyzing 99.9% of a selection of 867 executables and libraries; a majority of 149 applicable Debian packages pass all tests under Egalito. On SPEC CPU 2006, thanks to our binary optimizations, Egalito achieves a 1.7% performance speedup.