Course Information

18-667: Algorithms for Large-scale Distributed Machine Learning and Optimization

Units:

Description:

This course introduces students to state-of-the-art algorithms in large-scale machine learning and distributed optimization. Students will read and critique a curated set of research papers. A key discussion topic will be distributed stochastic gradient descent (SGD) and how to scale it to federated learning frameworks. Topics include, but are not limited to: mini-batch SGD and its convergence analysis, momentum and variance reduction methods, synchronous and asynchronous SGD, local-update SGD, gradient compression/quantization, differential privacy in federated learning, decentralized SGD, and hyperparameter optimization. Foundational knowledge of undergraduate probability and linear algebra is strongly recommended.

Prerequisites: 18-661 or 10-701 or 10-601 or 10-301

Last Modified: 2025-07-17 11:03AM

Semesters offered:

Fall 2025
Fall 2024
Spring 2021
Fall 2014
Fall 2013
Spring 2013
Fall 2012
Fall 2011
Fall 2010
Fall 2009

Electrical and Computer Engineering

College of Engineering
