Course Information
18-495: Speech Technology for Conversational AI
This course provides both practical and theoretical knowledge of how speech processing technologies can be leveraged to build a conversational AI system. It covers speech recognition, speaker recognition, speech synthesis, speech enhancement, speech translation, spoken dialogue systems, speech foundation models, and other speech and audio processing tasks. In practical sessions, students will learn either to build functional speech recognition and synthesis systems or to use existing large speech and language models, integrating them into a speech interface with existing toolkits. The course will also present the algorithms, techniques, evaluation metrics, and limitations of state-of-the-art speech systems in detail. It is designed particularly for students who want to learn how to process actual data for real-world applications by applying AI and machine learning techniques while remaining aware of current technology limitations.
Last Modified: 2024-12-11 11:52AM
Current session:
This course is currently being offered.
Semesters offered:
- Spring 2025
- Spring 2024
- Spring 2023
- Fall 2021
- Fall 2020