0sim: Preparing System Software for a World with Terabyte-scale Memories

Session: Huge memories and distributed databases--Now I remember!

Authors: Mark Mansi (University of Wisconsin - Madison); Michael Swift (University of Wisconsin - Madison)

Recent advances in memory technologies mean that commodity machines may soon have terabytes of memory; however, such machines remain expensive and uncommon today. Hence, few programmers and researchers can debug and prototype fixes for scalability problems or explore new system behavior caused by terabyte-scale memories. To enable rapid, early prototyping and exploration of system software for such machines, we built and open-sourced the 0sim simulator. 0sim uses virtualization to simulate the execution of huge workloads on modest machines. Our key observation is that many workloads follow the same control flow regardless of their input. We call such workloads data-oblivious. 0sim harnesses data-obliviousness to make huge simulations feasible and fast via memory compression. 0sim is accurate enough for many tasks and can simulate a guest system 20-30x larger than the host with 8x-100x slowdown for the workloads we observed, with more compressible workloads running faster. For example, we simulate a 1TB machine on a 31GB machine, and a 4TB machine on a 160GB machine. We give case studies to demonstrate the utility of 0sim. For example, we find that for mixed workloads, the Linux kernel can create irreparable fragmentation despite dozens of GBs of free memory, and we use 0sim to debug unexpected failures of memcached with huge memories.