VoixTenue: Exploring Real-Time Gestural Control of Vocal Synthesis on a Mobile Phone

Adrien Scazzola and Xiao Xiao

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract

We present VoixTenue, an interface exploring how a mobile phone’s on-board sensing can be used to control pitch, dynamics, and intonation in real time for expressive vocal synthesis. VoixTenue supports two main interaction modalities: one in which users draw intonation curves with a fingertip and replay them for real-time pitch control, and another in which the phone’s orientation controls pitch and dynamics through inertial sensing. Using Pink Trombone as its synthesis engine, VoixTenue supports two modes for phonetic content: a vowel mode, in which users select a sustained vowel, and a phrase mode, in which users enter a short English text and control its intonation. We describe the system architecture and gestural mappings, and discuss potential use cases, including expressive performance and language learning.

Citation

Adrien Scazzola and Xiao Xiao. 2026. VoixTenue: Exploring Real-Time Gestural Control of Vocal Synthesis on a Mobile Phone. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.20784501

BibTeX Entry

@inproceedings{nime2026_165,
 abstract = {We present VoixTenue, an interface exploring how a mobile phone’s on-board sensing can be used to control pitch, dynamics, and intonation in real time for expressive vocal synthesis. VoixTenue supports two main interaction modalities: one in which users draw intonation curves with a fingertip and replay them for real-time pitch control, and another in which the phone’s orientation controls pitch and dynamics through inertial sensing. Using Pink Trombone as its synthesis engine, VoixTenue supports two modes for phonetic content: a vowel mode, in which users select a sustained vowel, and a phrase mode, in which users enter a short English text and control its intonation. We describe the system architecture and gestural mappings, and discuss potential use cases, including expressive performance and language learning.},
 address = {London, United Kingdom},
 articleno = {165},
 author = {Adrien Scazzola and Xiao Xiao},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.20784501},
 editor = {Benedict Gaster and João Tragtenberg and Anna Xambó and Tom Mitchell},
 issn = {2220-4806},
 month = {June},
 note = {},
 numpages = {4},
 pages = {1326--1329},
 title = {VoixTenue: Exploring Real-Time Gestural Control of Vocal Synthesis on a Mobile Phone},
 track = {paper},
 url = {http://nime.org/proceedings/2026/nime2026_165.pdf},
 year = {2026}
}