Carnegie Mellon University

Electrical and Computer Engineering

College of Engineering

Course Information

18-667: Algorithms for Large-scale Distributed Machine Learning and Optimization

This course introduces students to state-of-the-art algorithms in large-scale machine learning and distributed optimization. Students will read and critique a curated set of research papers. A key discussion topic will be distributed stochastic gradient descent (SGD) and how to scale it to federated learning frameworks. Topics include, but are not limited to: mini-batch SGD and its convergence analysis, momentum and variance reduction methods, synchronous and asynchronous SGD, local-update SGD, gradient compression/quantization, differential privacy in federated learning, decentralized SGD, and hyperparameter optimization. Foundational knowledge of undergraduate probability and linear algebra is strongly recommended as a prerequisite.

Last Modified: 2020-12-17 3:10PM

Semesters offered:

  • Spring 2021
  • Fall 2014
  • Fall 2013
  • Spring 2013
  • Fall 2012
  • Fall 2011
  • Fall 2010
  • Fall 2009