Generating Convincing Harmony Parts with Simple Long Short-Term Memory Networks

Andrei Faitas; Synne Engdahl Baumann; Torgrim Rudland Næss; Jim Torresen; Charles Patrick Martin

Proceedings of the International Conference on New Interfaces for Musical Expression

Year: 2019
Location: Porto Alegre, Brazil
Pages: 325–330
DOI: 10.5281/zenodo.3672980 (Link to paper and supplementary files)
PDF: http://www.nime.org/proceedings/2019/nime2019_paper062.pdf

Abstract

Generating convincing music via deep neural networks is a challenging problem that shows promise for many applications including interactive musical creation. One part of this challenge is the problem of generating convincing accompaniment parts to a given melody, as could be used in an automatic accompaniment system. Despite much progress in this area, systems that can automatically learn to generate interesting-sounding, as well as harmonically plausible, accompanying melodies remain somewhat elusive. In this paper we explore the problem of sequence-to-sequence music generation where a human user provides a sequence of notes, and a neural network model responds with a harmonically suitable sequence of equal length. We consider two sequence-to-sequence models: one featuring a standard unidirectional long short-term memory (LSTM) architecture, and the other featuring a bidirectional LSTM, both successfully trained to produce a sequence based on the given input. Both are fairly dated models, as part of the investigation is to see what can be achieved with such models. The models are evaluated and compared via a qualitative study in which 106 respondents listened to eight random samples from our set of generated music, as well as two human samples. From the results we see a preference for the sequences generated by the bidirectional model as well as an indication that these sequences sound more human.

Citation

Andrei Faitas, Synne Engdahl Baumann, Torgrim Rudland Næss, Jim Torresen, and Charles Patrick Martin. 2019. Generating Convincing Harmony Parts with Simple Long Short-Term Memory Networks. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.3672980

BibTeX Entry

@inproceedings{Faitas2019,
 abstract = {Generating convincing music via deep neural networks is a challenging problem that shows promise for many applications including interactive musical creation. One part of this challenge is the problem of generating convincing accompaniment parts to a given melody, as could be used in an automatic accompaniment system. Despite much progress in this area, systems that can automatically learn to generate interesting-sounding, as well as harmonically plausible, accompanying melodies remain somewhat elusive. In this paper we explore the problem of sequence-to-sequence music generation where a human user provides a sequence of notes, and a neural network model responds with a harmonically suitable sequence of equal length. We consider two sequence-to-sequence models: one featuring a standard unidirectional long short-term memory (LSTM) architecture, and the other featuring a bidirectional LSTM, both successfully trained to produce a sequence based on the given input. Both are fairly dated models, as part of the investigation is to see what can be achieved with such models. The models are evaluated and compared via a qualitative study in which 106 respondents listened to eight random samples from our set of generated music, as well as two human samples. From the results we see a preference for the sequences generated by the bidirectional model as well as an indication that these sequences sound more human.},
 address = {Porto Alegre, Brazil},
 author = {Andrei Faitas and Synne Engdahl Baumann and Torgrim Rudland Næss and Jim Torresen and Charles Patrick Martin},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.3672980},
 editor = {Marcelo Queiroz and Anna Xambó Sedó},
 issn = {2220-4806},
 month = {June},
 pages = {325--330},
 publisher = {UFRGS},
 title = {Generating Convincing Harmony Parts with Simple Long Short-Term Memory Networks},
 url = {http://www.nime.org/proceedings/2019/nime2019_paper062.pdf},
 year = {2019}
}