Accessible Foundation Models: Systems, Algorithms, and Science

Talk

Tim Dettmers

Talk Series:

Visitors

Time:

04.11.2024 11:00 to 12:00

Location:

IRB 4105 or https://umd.zoom.us/j/95853135696?pwd=VVEwMVpxeElXeEw0ckVlSWNOMVhXdz09

URL:

https://talks.cs.umd.edu/talks/3760

The ever-increasing scale of foundation models, such as ChatGPT and AlphaFold, has revolutionized AI and science more generally. However, increasing scale also steadily raises computational barriers, blocking almost everyone from studying, adapting, or otherwise using these models for anything beyond static API queries. In this talk, I will present research that significantly lowers these barriers for a wide range of use cases, including inference algorithms that are used to make predictions after training, finetuning approaches that adapt a trained model to new data, and finally, full training of foundation models from scratch. For inference, I will describe our LLM.int8() algorithm, which showed how to enable high-precision 8-bit matrix multiplication that is both fast and memory efficient. LLM.int8() is based on the discovery and characterization of sparse outlier sub-networks that only emerge at large model scales but are crucial for effective Int8 quantization. For finetuning, I will introduce the QLoRA algorithm, which pushes such quantization much further to unlock finetuning of very large models on a single GPU by updating only a small set of parameters while keeping most of the network in a new information-theoretically optimal 4-bit representation. For full training, I will present SWARM parallelism, which allows collaborative training of foundation models across continents on standard internet infrastructure while still being 80% as effective as the prohibitively expensive supercomputers that are currently used. Finally, I will close by outlining my plans to make foundation models 100x more accessible, which will be needed to maintain truly open AI-based scientific innovation as models continue to scale.
