18-667: Algorithms for Large-scale Distributed Machine Learning and Optimization
The objective of this course is to introduce students to state-of-the-art algorithms in large-scale machine learning and distributed optimization. Students will read and critique a curated set of research papers. A key discussion topic will be distributed stochastic gradient descent and how to scale it to federated learning frameworks. Topics to be covered include, but are not limited to: mini-batch SGD and its convergence analysis, momentum and variance reduction methods, synchronous and asynchronous SGD, local-update SGD, gradient compression/quantization, differential privacy in federated learning, decentralized SGD, and hyperparameter optimization. A foundation in undergraduate-level probability and linear algebra is strongly recommended as a prerequisite.
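To illustrate the first topic on the list, mini-batch SGD can be sketched in a few lines of NumPy. This is a hypothetical example for orientation only, not course material; the synthetic least-squares problem, batch size, and step size are all made-up choices:

```python
import numpy as np

# Hypothetical illustration: mini-batch SGD on a least-squares problem.
rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X @ w_star + noise
n, d = 1000, 5
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.01 * rng.normal(size=n)

def minibatch_sgd(X, y, batch_size=32, lr=0.05, epochs=50):
    """Minimize (1/n) * ||X w - y||^2 by sampling mini-batches without replacement."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)          # reshuffle each epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Stochastic gradient of the mini-batch mean-squared error
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)
            w -= lr * grad
    return w

w_hat = minibatch_sgd(X, y)
print(np.linalg.norm(w_hat - w_star))  # distance to the ground-truth weights
```

The gradient is averaged over the batch, so its variance shrinks as the batch grows; this batch-size/variance trade-off is exactly what the convergence analyses covered in the course quantify.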
Last Modified: 2020-12-17 3:10PM
This course is currently being offered.
- Spring 2021
- Fall 2014
- Fall 2013
- Spring 2013
- Fall 2012
- Fall 2011
- Fall 2010
- Fall 2009