Human-in-the-Loop: Crossmodal AI Alignment between Movement and Audio Latent Spaces for Expressive Sonification in Dance Performance
Koray Tahiroğlu, Mikael Hokkanen, and Ariana Marta
Proceedings of the International Conference on New Interfaces for Musical Expression
- Year: 2026
- Location: London, United Kingdom
- Track: paper
- Pages: 936–943
- Article Number: 113
- DOI: 10.5281/zenodo.20784352 (Link to paper and supplementary files)
- PDF Link: http://nime.org/proceedings/2026/nime2026_113.pdf
- Presentation/Demo Video: https://youtu.be/T2RhjNKdJi0
Abstract
This paper presents Crossmodal AI alignment, a generative AI framework for expressive sonification of human movement in dance performance. The system connects two variational autoencoders: a Movement VAE encoder, capturing real-time expressive dance movement features, and an Audio VAE (RAVE) decoder, generating corresponding musical textures and responses. A central alignment module links their latent spaces, allowing dynamic adaptation between movement and sound. Unlike fixed or rule-based mapping approaches, the SonicMove alignment system introduces a human-in-the-loop alignment process, where the dancer calibrates the crossmodal relationship through embodied exploration prior to performance. This enables an adaptive and intuitive co-creative dialogue between performer and AI, producing sonifications that respond to subtle variations in movement. Exploratory sessions with invited dancers, centred on the latent space alignment process, suggest how performer-driven calibration shapes the perceived coherence and expressivity of generated sound in relation to movement, offering a possible direction toward more adaptive, multimodal performance systems that integrate movement, sound, and creative interpretation.
Citation
Koray Tahiroğlu, Mikael Hokkanen, and Ariana Marta. 2026. Human-in-the-Loop: Crossmodal AI Alignment between Movement and Audio Latent Spaces for Expressive Sonification in Dance Performance. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.20784352
BibTeX Entry
@inproceedings{nime2026_113,
abstract = {This paper presents Crossmodal AI alignment, a generative AI framework for expressive sonification of human movement in dance performance. The system connects two variational autoencoders: a Movement VAE encoder, capturing real-time expressive dance movement features, and an Audio VAE (RAVE) decoder, generating corresponding musical textures and responses. A central alignment module links their latent spaces, allowing dynamic adaptation between movement and sound. Unlike fixed or rule-based mapping approaches, the SonicMove alignment system introduces a human-in-the-loop alignment process, where the dancer calibrates the crossmodal relationship through embodied exploration prior to performance. This enables an adaptive and intuitive co-creative dialogue between performer and AI, producing sonifications that respond to subtle variations in movement. Exploratory sessions with invited dancers, centred on the latent space alignment process, suggest how performer-driven calibration shapes the perceived coherence and expressivity of generated sound in relation to movement, offering a possible direction toward more adaptive, multimodal performance systems that integrate movement, sound, and creative interpretation.},
address = {London, United Kingdom},
articleno = {113},
author = {Koray Tahiroğlu and Mikael Hokkanen and Ariana Marta},
booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
doi = {10.5281/zenodo.20784352},
editor = {Benedict Gaster and João Tragtenberg and Anna Xambó and Tom Mitchell},
issn = {2220-4806},
month = {June},
note = {},
numpages = {8},
pages = {936--943},
presentation-video = {https://youtu.be/T2RhjNKdJi0},
title = {Human-in-the-Loop: Crossmodal AI Alignment between Movement and Audio Latent Spaces for Expressive Sonification in Dance Performance},
track = {paper},
url = {http://nime.org/proceedings/2026/nime2026_113.pdf},
year = {2026}
}