Program

Wednesday, March 2, 2022

8:40 – 9:00 Opening
9:00 – 10:00 Keynote: Michael Franz, University of California, Irvine
10:00 – 10:20 Break
10:20 – 12:00 Session 1A: Accelerators
Session 1B: Address and Memory
12:00 – 13:40 Lunch
13:40 – 15:00 Session 2A: GPU and Data Analytics
Session 2B: Privacy and Software Security
15:00 – 15:20 Break
15:20 – 16:00 Session 3A: Hardware Security (1)
Session 3B: Misc.
16:00 – 18:00 Poster Session

Thursday, March 3, 2022

8:40 – 9:40 Keynote: Tim Harris, Microsoft
9:40 – 10:00 Break
10:00 – 11:40 Session 4A: Systems for Machine Learning
Session 4B: Operating Systems
11:40 – 13:20 Lunch
13:20 – 14:40 Session 5A: Quantum Computing
Session 5B: Data Center and Cloud Services
14:40 – 15:00 Break
15:00 – 16:20 Session 6A: Accelerating Emerging Applications
Session 6B: Bugs (1)
18:00 Olympic Museum

Friday, March 4, 2022

8:40 – 9:40 Keynote: Phillip Stanley-Marbell, University of Cambridge
9:40 – 10:00 Break
10:00 – 11:40 Session 7A: Serverless Computing
Session 7B: Bugs (2)
11:40 – 13:20 Lunch
13:20 – 14:40 Session 8A: Non-traditional Computing and Reconfigurable Hardware
Session 8B: Synthesis and Compilation
14:40 – 15:00 Break
15:00 – 16:20 Session 9A: Hardware Security (2)
Session 9B: Smart Networking
16:20 – 16:40 Closing Remarks

Session Details

Session 1A: Accelerators
Session Chair:
Accelerating Task-Parallel Workloads by Recovering Program Structure
DOTA: Detect and Omit Weak Attentions for Scalable Transformer Acceleration
A Full-stack Search Technique for Domain Optimized Deep Learning Accelerators
FINGERS: Exploiting Fine-Grained Parallelism in Graph Mining Accelerators
BiSon-e: A Lightweight and High-Performance Accelerator for Narrow Integer Linear Algebra Computing on the Edge
Session 1B: Address and Memory
Session Chair:
Software-defined Address Mapping: a Case on 3D Memory
Parallel Virtualized Memory Translation with Nested Elastic Cuckoo Page Tables
CARAT CAKE: Replacing Paging via Compiler/Kernel Cooperation
NVAlloc: Rethinking Heap Metadata Management in Persistent Memory Allocators
Every Walk’s a Hit: Making Page Walks Single-Access Cache Hits
Session 2A: GPU and Data Analytics
Session Chair:
GPM: Leveraging Persistent Memory from a GPU
GPUReplay: A 50-KB GPU Stack for Client ML
ValueExpert: Exploring Value Patterns in GPU-accelerated Applications
SparseCore: Stream ISA and Processor Specialization for Sparse Computation
Streaming Semi-Structured Data with Bit-Parallel Skipping
Session 2B: Privacy and Software Security
Session Chair:
MineSweeper: a “clean sweep” for drop-in use-after-free prevention
Revizor: Testing Black-box CPUs against Speculation Contracts
Protecting Adaptive Sampling from Information Leakage on Low-Power Sensors
One Size Does Not Fit All: Security Hardening of MIPS Embedded Systems via Static Binary Debloating for Shared Libraries
ViK: Practical Mitigation of Temporal Memory Safety Violations through Object ID Inspection
Session 3A: Hardware Security (1)
Session Chair:
Eavesdropping User Credentials via GPU Side Channels on Smartphones
CRISP: Critical Slice Prefetching
Session 3B: Misc.
Session Chair:
Pinned Loads: Taming Speculative Loads in Secure Processors
DAGguise: Mitigating Memory Timing Side Channels
Session 4A: Systems for Machine Learning
Session Chair:
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation
AStitch: Enabling A New Multi-Dimensional Optimization Space for Memory-Intensive ML Training and Inference on Modern SIMT Architectures
NASPipe: High Performance and Reproducible Pipeline Parallel Supernet Training via Causal Synchronous Parallel
VELTAIR: Towards High-Performance Multi-Tenant Deep Learning Services via Adaptive Compilation and Scheduling
Breaking the Computation and Communication Abstraction Barrier in Distributed Machine Learning Workloads
Session 4B: Operating Systems
Session Chair:
Clio: A Hardware-Software Co-Designed Disaggregated Memory System
Enzian: An Open, General, CPU/FPGA Platform for Systems Software Research
UCSA²: Untrusted Cores in a Shared System
FlexOS: Towards Flexible OS Isolation
Adelie: Continuous Address Space Layout Re-Randomization for Linux Drivers
Session 5A: Quantum Computing
Session Chair:
Suppressing ZZ Crosstalk of Quantum Computers through Pulse and Scheduling Co-Optimization
QUEST: Systematically Approximating Quantum Circuits for Higher Output Fidelity
HAMMER: Boosting Fidelity of Noisy Quantum Circuits by Exploiting Hamming Behavior of Erroneous Outcomes
LILLIPUT: A Lightweight Low-Latency Lookup-Table Based Decoder for Near-term Quantum Error Correction
Paulihedral: A Generalized Block-Wise Compiler Optimization Framework For Quantum Simulation Kernels
Session 5B: Data Center and Cloud Services
Session Chair:
Astraea: Towards QoS-Aware and Resource-Efficient Multi-stage GPU Services
Memory-Harvesting VMs in Cloud Platforms
IOCost: Block IO Control for Containers in Datacenters
TMO: Transparent Memory Offloading in Datacenters
SOL: Safe On-Node Learning in Cloud Platforms
Session 6A: Accelerating Emerging Applications
Session Chair:
GenStore: An In-storage Processing System for Genome Sequence Analysis
ProSE: The Architecture and Design of a Protein Discovery Engine
A One-for-All and $O(V\log(V))$-cost Solution for Parallel Merge Style Operations on Sorted Key-Value Arrays
Client-Optimized Algorithms and Acceleration for Encrypted Compute Offloading
Session 6B: Bugs (1)
Session Chair:
Finding Missed Optimizations through the Lens of Dead Code Elimination
A Tree Clock Data Structure for Causal Orderings in Concurrent Executions
RSSD: Defend Against Ransomware with Hardware-Isolated Network-Storage Codesign and Post-Attack Analysis
Creating Concise and Efficient Dynamic Analyses with ALDA
Session 7A: Serverless Computing
Session Chair:
IceBreaker: Warming Serverless Functions Better with Heterogeneity
INFless: A Native Serverless System for Low-latency, High-throughput Inference
FaaSFlow: Enable Efficient Workflow Execution for Function-as-a-Service
Serverless Computing on Heterogeneous Computers
CoolEdge: Hotspot-relievable Warm Water Cooling for Energy-efficient Edge Datacenters
Session 7B: Bugs (2)
Session Chair:
Yashme: Detecting Persistency Races
EXAMINER: Automatically Locating Inconsistent Instructions between Real Devices and CPU Emulators for ARM
Path-Sensitive and Alias-Aware Typestate Analysis for Detecting OS Bugs
Efficiently Detecting Concurrency Bugs in Persistent Memory Programs
Who Goes First? Detecting Go Concurrency Bugs via Message Reordering
Session 8A: Non-traditional Computing and Reconfigurable Hardware
Session Chair:
CryoWire: Wire-Driven Microarchitecture Designs for Cryogenic Computing
REVAMP: A Systematic Framework for Heterogeneous CGRA Realization
SLID: Fast FPGA Compilation to Make Reconfigurable Acceleration Compatible with Modern Incremental Refinement Software Development
Debugging in the Brave New World of Reconfigurable Hardware
Temporal and SFQ Pulse-Streams Encoding for Area-Efficient Superconducting Accelerators
Session 8B: Synthesis and Compilation
Session Chair:
Understanding and Exploiting Optimal Function Inlining
CirFix: Automatically Repairing Defects in Hardware Design Code
Vector Instruction Selection for Digital Signal Processors Using Program Synthesis
HeteroGen: Transpiling C to Heterogeneous HLS Code with Automated Test Generation and Program Repair
Tree Traversal Synthesis Using Domain-Specific Symbolic Compilation
Session 9A: Hardware Security (2)
Session Chair:
SRAM Has No Chill: Exploiting Power Domain Separation to Steal On-chip Secrets
Randomized Row-Swap: Mitigating Row Hammer by Breaking Spatial Correlation Between Aggressor and Victim Rows
ShEF: Shielded Enclaves for Cloud FPGAs
Invisible Bits: Hiding Secret Messages in SRAM’s Analog Domain
Session 9B: Smart Networking
Session Chair:
Taurus: A Data Plane Architecture for Per-Packet ML
FlexDriver: A Network Driver for Your Accelerator
The Benefits of General-Purpose On-NIC Memory
Domain Specific Run Time Optimization for Software Data Planes