Deep Generative Models for Music: Playing in the Latent Space

Thesis Type Master
Thesis Status Currently running
Student Johannes Ebster
Thesis Supervisor

Artificial intelligence (AI) cannot surpass human intelligence before it masters the fine arts, such as painting, poetry and music. This is one motivation for research on generative systems that produce previously unseen and meaningful pictures, text or music. Currently, the best-performing systems are built using various neural network architectures and trained in a supervised machine learning setting using human-made art as the training data. Instead of merely letting such systems entertain us with their artificial creativity, we can also embed them into the workflow of human artists, where they may help by sparking creativity or by continuing an existing idea in a sensible way. A model suited to this task has to offer a certain level of interactivity and control, rather than producing something that is valid on its own but has no connection to the context at hand. This master's thesis investigates the capabilities of deep generative models for music in terms of control, adaptability, interactivity and related concepts. Existing models are extended or adapted to provide more control. Moreover, we design an original purpose-built model and compare it to previous models. The existing models as well as the altered and original ones are evaluated in a user study.