Improvise+=Chain: Listening to the Ensemble Improvisation of an Autoregressive Generative Model

Atsuya Kobayashi, Ryo Nishikado, and Nao Tokui

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract:

This paper describes Improvise+=Chain, an audio-visual installation artwork of autonomous musical performance driven by artificial intelligence. The work is designed to give the audience an experience of exploring the differences between human musicians and AI-based virtual musicians. Using a transformer decoder, we developed a four-track (melody, bass, chords and accompaniment, and drums) symbolic music generation model. The model generates each track in real time to create an endless chain of phrases, while 3D visuals and LED lights represent the attention information between the four tracks, i.e., the four virtual musicians, computed within the model. By visualizing the only information through which the virtual musicians can communicate, whereas human players interact through multiple modalities during a performance, this work aims to highlight differences between humans and artificial intelligence in music jams for viewers to consider.
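The following is a minimal, hypothetical sketch (not the authors' released code) of how such per-track attention information could be extracted from an autoregressive decoder and reduced to a 4x4 "who listens to whom" matrix for driving visuals. It uses PyTorch's MultiheadAttention as a stand-in for one decoder layer; the four-track token layout, segment lengths, and model dimensions are illustrative assumptions.

import torch
import torch.nn as nn

NUM_TRACKS = 4            # melody, bass, chords/accompaniment, drums (assumed order)
TOKENS_PER_TRACK = 16     # assumed fixed token segment per track
SEQ_LEN = NUM_TRACKS * TOKENS_PER_TRACK
D_MODEL = 64              # illustrative model width

attn = nn.MultiheadAttention(embed_dim=D_MODEL, num_heads=4, batch_first=True)
x = torch.randn(1, SEQ_LEN, D_MODEL)  # stand-in for an embedded token sequence

# Causal mask: each position may attend only to itself and earlier tokens,
# matching autoregressive generation.
causal = torch.triu(torch.ones(SEQ_LEN, SEQ_LEN, dtype=torch.bool), diagonal=1)
_, weights = attn(x, x, x, attn_mask=causal, need_weights=True,
                  average_attn_weights=True)   # shape: (1, SEQ_LEN, SEQ_LEN)

# Sum the attention mass inside each (query track, key track) block to obtain
# a NUM_TRACKS x NUM_TRACKS matrix, then row-normalize it.
w = weights[0]
track_attn = w.reshape(NUM_TRACKS, TOKENS_PER_TRACK,
                       NUM_TRACKS, TOKENS_PER_TRACK).sum(dim=(1, 3))
track_attn = track_attn / track_attn.sum(dim=-1, keepdim=True)
print(track_attn)  # row i: how strongly virtual musician i attends to each track

A matrix like track_attn could then be mapped to the brightness of LED lights or the edges of a 3D graph, one node per virtual musician, though the actual mapping used in the installation is not specified here.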

Citation:

Atsuya Kobayashi, Ryo Nishikado, and Nao Tokui. 2023. Improvise+=Chain: Listening to the Ensemble Improvisation of an Autoregressive Generative Model. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.11189329

BibTeX Entry:

@inproceedings{nime2023_94,
 abstract = {This paper describes Improvise+=Chain, an audio-visual installation artwork of autonomous musical performance driven by artificial intelligence. The work is designed to give the audience an experience of exploring the differences between human musicians and AI-based virtual musicians. Using a transformer decoder, we developed a four-track (melody, bass, chords and accompaniment, and drums) symbolic music generation model. The model generates each track in real time to create an endless chain of phrases, while 3D visuals and LED lights represent the attention information between the four tracks, i.e., the four virtual musicians, computed within the model. By visualizing the only information through which the virtual musicians can communicate, whereas human players interact through multiple modalities during a performance, this work aims to highlight differences between humans and artificial intelligence in music jams for viewers to consider.},
 address = {Mexico City, Mexico},
 articleno = {94},
 author = {Atsuya Kobayashi and Ryo Nishikado and Nao Tokui},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.11189329},
 editor = {Miguel Ortiz and Adnan Marquez-Borbon},
 issn = {2220-4806},
 month = {May},
 numpages = {4},
 pages = {633--636},
 title = {Improvise+=Chain: Listening to the Ensemble Improvisation of an Autoregressive Generative Model},
 track = {Demos},
 url = {http://nime.org/proceedings/2023/nime2023_94.pdf},
 year = {2023}
}