The control-synthesis approach for making expressive and controllable neural music synthesizers

Nicolas Jonason

Abstract

Many synthesizers are designed to emulate other instruments. Recently, progress has been made in using neural audio generation models to create synthesizers that learn efficiently from data. However, past approaches compromise either the modeling of inter-note timbre dependencies or ease of use. I address this issue with the control-synthesis approach, a method for turning audio generation models into easily controllable synthesizers. In this approach, a control model transforms user input into a set of intermediate features on which a synthesis model is conditioned. I demonstrate the approach by implementing a MIDI-controllable synthesizer that can be trained on unannotated audio to emulate a target instrument.
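
To make the two-stage pipeline concrete, the sketch below chains a control model and a synthesis model in PyTorch. It is a minimal illustration only: the module names, the GRU-based control network, and the feature dimensions are assumptions for this sketch, not the implementation described in the thesis, and the synthesis model is left as a generic module.

```python
import torch
import torch.nn as nn


class ControlModel(nn.Module):
    """Hypothetical control model: maps frame-level user input
    (e.g. MIDI-derived features) to intermediate control features."""

    def __init__(self, in_dim: int, hidden: int = 256, ctrl_dim: int = 2):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, ctrl_dim)

    def forward(self, user_input: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(user_input)  # (batch, frames, hidden)
        return self.proj(h)          # (batch, frames, ctrl_dim)


class ControlSynthesisSynth(nn.Module):
    """Chains a control model with a synthesis model: the synthesis
    model is conditioned only on the intermediate features, never
    on the raw user input."""

    def __init__(self, control: nn.Module, synthesis: nn.Module):
        super().__init__()
        self.control = control
        self.synthesis = synthesis

    def forward(self, user_input: torch.Tensor) -> torch.Tensor:
        ctrl = self.control(user_input)  # intermediate features
        return self.synthesis(ctrl)      # audio output
```

Because the synthesis model only ever sees the intermediate features, it can be trained on features extracted from unannotated audio, while the control model is trained separately to produce those same features from user input such as MIDI.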