Remote attendees: Please access the live stream of the program and asynchronous Q&A for individual papers via Whova at this URL. For the best remote experience we recommend using a laptop or desktop computer. Register for remote attendance here. Here is a guide for remote attendees.
In-person attendees: Please download the Whova app to your mobile device here. The app gives you access to the conference agenda (with links to floor maps), online networking, asynchronous Q&A for individual papers, and important announcements. Here is a guide for in-person attendees.
Links to lightning talks can be found under each paper in the program below. The conference proceedings will be freely accessible on the ACM Digital Library, with no membership or paywall, for one month starting on the first day of the conference.
ASPLOS 2023 Proceedings
Volume 1: https://doi.org/10.1145/3567955
Volume 2: https://doi.org/10.1145/3575693
Volume 3: https://doi.org/10.1145/3582016
Sunday, 6:00 PM PDT – 9:00 PM PDT: Welcome Reception
Location: Pavilion Ballroom (3rd floor)
Day 1: Monday, March 27
8:00 AM PDT – 8:40 AM PDT: Breakfast
Location: Junior Ballroom & Pavilion Ballroom (3rd floor)
8:40 AM PDT – 9:00 AM PDT: Opening Remarks
Location: Grand AB (GB level, below ground)
9:00 AM PDT – 10:00 AM PDT: Keynote 1 by Azalia Mirhoseini (Anthropic / Stanford Univ.)
Pushing the Limits of Scaling Laws in the Age of Generative Models
Location: Grand AB (GB level, below ground)
Abstract
The emergence of powerful generative AI (e.g., large language and vision models) would not have been possible without recent advances in computing systems and accelerators. This talk sheds light on the important role that generative AI itself can play in designing the next generation of computing systems and hardware, which in turn will fuel the next generation of AI breakthroughs. Concretely, I will discuss our work on learned optimization for hardware resource allocation and model mapping, which inspired a new and ongoing trend of policy gradient methods for solving combinatorial optimization problems in computer systems; a generalizable deep reinforcement learning method for chip floorplanning that saved several weeks of design-cycle time for Google TPUs; and an automated framework for full-stack HW/SW co-design that yielded drastic Perf/TCO improvements in custom accelerator design. Finally, I will discuss the opportunities and challenges for future computing systems in the era of large generative models.
Bio
Azalia Mirhoseini is a Member of Technical Staff at Anthropic, and an incoming Assistant Professor of Computer Science at Stanford University. Previously, she was a Staff Research Scientist and Team Lead at Google Brain, where she co-founded the Machine Learning for Systems Team. Azalia has published more than 40 peer-reviewed papers at scientific venues such as Nature, ICML, ICLR, NeurIPS, UAI, ASPLOS, SIGMETRICS, DAC, DATE, and ICCAD. She has received a number of awards, including the MIT Technology Review 35 under 35 award, the Best Ph.D. Thesis Award at Rice, and a Gold Medal in the National Math Olympiad in Iran. Her work has been covered in various media outlets, including WIRED, CNBC, ABC News, MIT Technology Review, and IEEE Spectrum.
10:00 AM PDT – 10:20 AM PDT: Coffee Break
Location: Grand Foyer (GB level, below ground)
10:20 AM PDT – 12:00 PM PDT
Session 1A: Systems for ML
Location: Grand AB (GB level, below ground)
Session 1B: Shared Memory/Mem Consistency
Location: Grand C (GB level, below ground)
Session 1C: IoT/Embedded/Mobile
Location: Grand D (GB level, below ground)
12:00 PM PDT – 1:00 PM PDT: Lunch
Location: Junior Ballroom & Pavilion Ballroom (3rd floor)
1:00 PM PDT – 2:40 PM PDT
Session 2A: Compiler Techniques & Optimization
Location: Grand AB (GB level, below ground)
Session 2B: GPUs
Location: Grand C (GB level, below ground)
Session 2C: Clouds/Datacenters
Location: Grand D (GB level, below ground)
2:40 PM PDT – 3:00 PM PDT: Coffee Break
Location: Grand Foyer (GB level, below ground)
3:00 PM PDT – 3:50 PM PDT
Session 3A: Sustainability
Location: Grand AB (GB level, below ground)
Session 3B: Accelerators A
Location: Grand C (GB level, below ground)
Session 3C: Persistence
Location: Grand D (GB level, below ground)
3:50 PM PDT – 5:10 PM PDT: Poster Session 1
Location: Junior Ballroom & Junior Foyer (3rd floor)
5:10 PM PDT – 6:10 PM PDT: WACI: Wild and Crazy Ideas
Location: Grand AB (GB level, below ground)
6:10 PM PDT – 7:10 PM PDT: Business Meeting
Location: Grand AB (GB level, below ground)
Day 2: Tuesday, March 28
8:00 AM PDT – 9:00 AM PDT: Breakfast
Location: Junior Ballroom & Pavilion Ballroom (3rd floor)
9:00 AM PDT – 10:00 AM PDT: Keynote 2 by Abhishek Bhattacharjee (Yale Univ.)
Direct Mind-Machine Teaming
Location: Grand AB (GB level, below ground)
Abstract
Direct mind-machine teaming will help us treat brain disorders, augment the healthy brain, and shed light on how the brain as an organ gives rise to the mind. Delivering on this promise requires the design of computer systems that delicately balance the tight power, latency, and bandwidth trade-offs needed to effectively decode brain activity, stimulate biological neurons, and control assistive devices.
This talk presents my group's design of a standardized and general computer architecture for future brain interfacing. Our design enables the treatment of several neurological disorders (most notably, epilepsy and movement disorders) and lays the groundwork for brain interfacing techniques that can help augment cognitive control and decision-making in the healthy brain. Central to our design is end-to-end hardware acceleration, from the microarchitectural to the distributed system level. Key insights are undergirded via detailed physical synthesis models and chip tape-outs in a 12nm CMOS process.
Bio
Abhishek Bhattacharjee is an Associate Professor of Computer Science at Yale University. His work on hardware optimizations for memory translation has influenced the design of TLBs in AMD CPUs, starting with the Zen 1 architecture, and in NVIDIA's GPUs, starting with the Ampere architecture. His work on software optimizations for memory translation has been shipped in the Linux OS since the 4.14 kernel. More recently, Abhishek has been building flexible and low-power architectures for brain-computer interfacing — the topic of this talk.
10:00 AM PDT – 10:20 AM PDT: Coffee Break
Location: Grand Foyer (GB level, below ground)
10:20 AM PDT – 12:00 PM PDT
Session 4A: Design Tools
Location: Grand AB (GB level, below ground)
Session 4B: Memory Mgmt. / Near Data Processing
Location: Grand C (GB level, below ground)
Session 4C: Tensor Computation
Location: Grand D (GB level, below ground)
12:00 PM PDT – 1:40 PM PDT: Lunch
Location: Junior Ballroom & Pavilion Ballroom (3rd floor)
1:40 PM PDT – 3:20 PM PDT
Session 5A: Debugging
Location: Grand AB (GB level, below ground)
Session 5B: Storage
Location: Grand C (GB level, below ground)
Session 5C: Machine Learning
Location: Grand D (GB level, below ground)
3:20 PM PDT – 3:40 PM PDT: Coffee Break
Location: Grand Foyer (GB level, below ground)
3:40 PM PDT – 4:30 PM PDT
Session 6A: Quantum A
Location: Grand AB (GB level, below ground)
Session 6B: Networking
Location: Grand C (GB level, below ground)
Session 6C: Graphs A
Location: Grand D (GB level, below ground)
4:30 PM PDT – 6:00 PM PDT: Poster Session 2
Location: Junior Ballroom & Junior Foyer (3rd floor)
6:30 PM PDT – 11:00 PM PDT: Excursion, Banquet, and Awards: Vancouver Aquarium
- Buses depart starting at 5:45 PM
- Awards Ceremony: 7:00 PM – 7:30 PM in Upper Teck
Day 3: Wednesday, March 29
9:00 AM PDT – 10:00 AM PDT: Keynote 3 by Bryan Catanzaro (NVIDIA)
Language Models — The Most Important Compute Challenge of Our Time
Location: Grand AB (GB level, below ground)
Abstract
ChatGPT recently became one of the fastest-growing new applications in history, thanks to its intriguing text generation capabilities, which can answer questions, write poetry, and even solve problems. Large Language Models are now being integrated in fundamental ways into products across the tech industry. The possibilities are extraordinary, but much research remains to make these systems reliable and trustworthy, and to integrate them into applications seamlessly. The computational challenges behind large language modeling are also significant. Systems for training and deploying these models must be highly scalable and run at extreme efficiency, because the amount of work necessary to converge a model can be extraordinarily large. The cost of deploying these models is a barrier to their adoption and must be lowered significantly. In this talk, I'll discuss the work we have been doing at NVIDIA to optimize systems for Large Language Model training and inference, and highlight some of the challenges that remain for future work.
Bio
Bryan Catanzaro is Vice President of Applied Deep Learning Research at NVIDIA, where he leads a team of AI researchers working on chip design, audio and speech, language modeling, and graphics and vision, with the goal of finding practical new ways to use AI for NVIDIA's products and workflows. DLSS, Megatron, cuDNN, Pascaline, WaveGlow, and DeepSpeech are some of the projects he has helped create. Bryan received his PhD in EECS from the University of California, Berkeley.