AvA: Accelerated Virtualization of Accelerators

Session: Virtualized acceleration--Don't keep it real!

Authors: Hangchen Yu (The University of Texas at Austin); Arthur Peters (The University of Texas at Austin); Amogh Akshintala (The University of North Carolina at Chapel Hill); Christopher Rossbach (The University of Texas at Austin & VMware Research)

Applications are migrating en masse to the cloud, while accelerators such as GPUs, TPUs, and FPGAs proliferate in the wake of Moore's Law. These trends are in conflict: cloud applications run on virtual platforms, but existing virtualization techniques have not provided production-ready solutions for accelerators. As a result, cloud providers expose accelerators by dedicating physical devices to individual guests, sacrificing multi-tenancy and consolidation. We present AvA, which addresses the limitations of existing virtualization techniques through automated construction of hypervisor-managed virtual accelerator stacks. AvA combines a DSL for describing APIs and sharing policies, device-agnostic runtime components, and a compiler that generates accelerator-specific components such as guest libraries and API servers. AvA uses Hypervisor Interposed Remote Acceleration (HIRA), a new technique that enables the hypervisor to enforce the sharing policies expressed in the specification. We use AvA to virtualize nine accelerators and eleven framework APIs, including six for which no virtualization support has previously been explored. AvA provides near-native performance, can enforce sharing policies that are not possible with current techniques, and requires orders of magnitude less developer effort than hand-built virtualization support.
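
To make the approach concrete, below is a minimal, hypothetical sketch of how an API-description DSL might annotate a single accelerator call in C. The annotation macros (AVA_IN, AVA_OUT, AVA_BUFFER) and the function accel_launch are illustrative assumptions, not AvA's actual specification language.

    #include <stddef.h>

    /* Illustrative annotations; in a real DSL these would carry semantic
     * information for the compiler rather than expand to nothing. */
    #define AVA_IN            /* argument data flows guest -> host         */
    #define AVA_OUT           /* argument data flows host -> guest         */
    #define AVA_BUFFER(count) /* argument is a buffer of `count` elements  */

    /* Description of one call of a hypothetical accelerator API. From a
     * description like this, a compiler in AvA's style could generate a
     * guest-side stub that marshals `input`, forwards the call through a
     * hypervisor-interposed transport, and unmarshals `output`, plus a
     * host-side API server that invokes the real library. */
    int accel_launch(AVA_IN AVA_BUFFER(len) const float *input,
                     size_t len,
                     AVA_OUT AVA_BUFFER(len) float *output);

Because every forwarded call crosses the hypervisor-interposed transport, the hypervisor gains a natural point at which to apply the sharing policies named in the specification.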