
DT2112 Speech technology

Introduction

Speech technology is a 7.5 ECTS credit course given by the Department of Speech, Music and Hearing (TMH) at the School of Computer Science and Communication (CSC).

The language of the course is English. The lectures, lecture slides, and literature will be in English; instructions for the examination will be available in both English and Swedish.

The course focuses on interaction between humans and computers through speech communication. It presents applications such as computers that speak and understand speech, speaker verification as a method of personal identification, and multimodal dialogue systems. To explain the background for such applications, the course describes the basic concepts of human communication regarding speech, language and hearing, as well as digital signal analysis and statistical methods for the analysis and classification of speech.

The course will give the students theoretical and practical introductions to
- linguistic theory and phonetics.
- the basics of physiology and acoustics of speech as a base for speech technology models.
- measuring techniques and signal processing in speech analysis.
- the physiology of hearing, psychoacoustics and speech perception with applications in speech understanding systems.
- methods for automatic speaker verification.
- evaluation of speech communication systems.
- studies and experiments with text-to-speech and speech-to-text in systems for human-computer interaction, especially multimodal dialogue systems.

Aim

After completing the course, the students should be able to
* Give short descriptions of speech from the acoustic, phonetic, and linguistic perspectives for use in speech technology applications.
* Explain how computers recognize speech and speakers and describe common methods to do so, such as HMMs and neural networks.
* Give an overview of different methods used to produce speech with computers and how a computer-animated face may be used to improve speech perception.
* Exemplify speech-driven dialogue systems and choose a type of system based on the area of application.
* Give an account of available state-of-the-art speech technology and its applications.
* Summarize the current research areas in speech technology and how scientific results may be applied in, for example, mobile systems and IT.


Course responsible: Joakim Gustafson, jocke@speech.kth.se, 790 89 65