Real-Time Co-Creation of Expressive Music Performances Using Speech and Gestures

Ilya Borovik and Vladimir Viro

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract:

We present a system for interactive co-creation of expressive performances of notated music using speech and gestures. The system provides real-time or near-real-time dialog-based control of performance rendering and supports interaction in multiple modalities. It is accessible via smartphones to people regardless of their musical background. The system is trained on sheet music and associated performances, in particular using notated performance directions and user-system interaction data to ground performance directions in performances. Users can listen to an autonomously generated performance or actively engage in the performance process. A speech- and gesture-based feedback loop and online learning from past user interactions improve the accuracy of performance rendering control. There are two important assumptions behind our approach: a) that many people can express nuanced aspects of expressive performance using natural human expressive faculties, such as speech, voice, and gesture, and b) that by doing so and hearing the music follow their direction with low latency, they can enjoy playing music that would otherwise be inaccessible to them. The ultimate goal of this work is to enable fulfilling and accessible music-making experiences for the large number of people who are not currently musically active.
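To make the described feedback loop concrete, the sketch below illustrates one plausible way spoken directions could steer rendering parameters (tempo, dynamics, articulation) and be refined online from user feedback. It is a minimal, hypothetical Python sketch under assumed names and numeric offsets; it is not the authors' implementation, whose grounding is learned from notated directions and interaction data.

```python
# Hypothetical sketch of a dialog-based control loop: recognized spoken directions
# are mapped to performance-rendering parameters, and an online update nudges the
# mapping toward the change the user actually wanted. All names and offsets are
# illustrative assumptions.

from dataclasses import dataclass, field


@dataclass
class RenderingParams:
    tempo_scale: float = 1.0      # multiplier on notated tempo
    velocity_scale: float = 1.0   # multiplier on note velocity (dynamics)
    articulation: float = 1.0     # ratio of sounded to notated duration


@dataclass
class DirectionController:
    # Grounding of a few verbal directions in parameter offsets; in the paper such
    # grounding is learned from sheet music, performances, and interaction data.
    grounding: dict = field(default_factory=lambda: {
        "faster":   {"tempo_scale": +0.10},
        "slower":   {"tempo_scale": -0.10},
        "louder":   {"velocity_scale": +0.10},
        "softer":   {"velocity_scale": -0.10},
        "legato":   {"articulation": +0.15},
        "staccato": {"articulation": -0.25},
    })
    learning_rate: float = 0.2

    def apply(self, params: RenderingParams, direction: str) -> RenderingParams:
        """Apply a recognized direction to the current rendering parameters."""
        for name, delta in self.grounding.get(direction, {}).items():
            setattr(params, name, getattr(params, name) + delta)
        return params

    def update(self, direction: str, param: str, observed_delta: float) -> None:
        """Online update: move the grounded offset toward the change inferred
        from user feedback (e.g. a follow-up correction or gesture)."""
        offsets = self.grounding.setdefault(direction, {})
        current = offsets.get(param, 0.0)
        offsets[param] = current + self.learning_rate * (observed_delta - current)


if __name__ == "__main__":
    controller = DirectionController()
    params = RenderingParams()

    # User says "softer"; the renderer would scale note velocities accordingly.
    params = controller.apply(params, "softer")
    print(params)  # velocity_scale is now 0.9

    # A follow-up "much softer" is treated as feedback that the grounded offset
    # for "softer" should be larger, and the mapping is adjusted online.
    controller.update("softer", "velocity_scale", observed_delta=-0.2)
    print(controller.grounding["softer"])
```

In this toy version the "grounding" is a hand-written lookup table with an exponential-moving-average update; the system described in the paper learns such mappings from data and closes the loop with low-latency audio rendering.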

Citation:

Ilya Borovik and Vladimir Viro. 2023. Real-Time Co-Creation of Expressive Music Performances Using Speech and Gestures. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.11189321

BibTeX Entry:

@inproceedings{nime2023_91,
 abstract = {We present a system for interactive co-creation of expressive performances of notated music using speech and gestures. The system provides real-time or near-real-time dialog-based control of performance rendering and interaction in multiple modalities. It is accessible to people regardless of their musical background via smartphones. The system is trained using sheet music and associated performances, in particular using notated performance directions and user-system interaction data to ground performance directions in performances. Users can listen to an autonomously generated performance or actively engage in the performance process. A speech- and gesture-based feedback loop and online learning from past user interactions improve the accuracy of the performance rendering control. There are two important assumptions behind our approach: a) that many people can express nuanced aspects of expressive performance using natural human expressive faculties, such as speech, voice, and gesture, and b) that by doing so and hearing the music follow their direction with low latency, they can enjoy playing the music that would otherwise be inaccessible to them. The ultimate goal of this work is to enable fulfilling and accessible music making experiences for a large number of people who are not currently musically active.},
 address = {Mexico City, Mexico},
 articleno = {91},
 author = {Ilya Borovik and Vladimir Viro},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.11189321},
 editor = {Miguel Ortiz and Adnan Marquez-Borbon},
 issn = {2220-4806},
 month = {May},
 numpages = {6},
 pages = {620--625},
 title = {Real-Time Co-Creation of Expressive Music Performances Using Speech and Gestures},
 track = {Work in Progress},
 url = {http://nime.org/proceedings/2023/nime2023_91.pdf},
 year = {2023}
}