VoixTenue: Exploring Real-Time Gestural Control of Vocal Synthesis on a Mobile Phone
Adrien Scazzola and Xiao Xiao
Proceedings of the International Conference on New Interfaces for Musical Expression
- Year: 2026
- Location: London, United Kingdom
- Track: paper
- Pages: 1326–1329
- Article Number: 165
- DOI: 10.5281/zenodo.20784501 (Link to paper and supplementary files)
- PDF: http://nime.org/proceedings/2026/nime2026_165.pdf
Abstract
We present VoixTenue, an interface exploring how a mobile phone’s on-board sensing can be used to control pitch, dynamics, and intonation in real time for expressive vocal synthesis. VoixTenue supports two main interaction modalities: one where users draw and replay fingertip-drawn intonation curves for real-time pitch control, and another where the phone’s orientation controls pitch and dynamics through inertial sensing. Using Pink Trombone as its synthesis engine, VoixTenue supports two modes for phonetic content: a vowel mode, in which users select a sustained vowel, and a phrase mode, in which a short English text can be entered, whose intonation is controlled by the user. We describe the system architecture and gestural mappings, and discuss potential use cases, including expressive performance and language learning.
Citation
Adrien Scazzola and Xiao Xiao. 2026. VoixTenue: Exploring Real-Time Gestural Control of Vocal Synthesis on a Mobile Phone. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.20784501
BibTeX Entry
@inproceedings{nime2026_165,
abstract = {We present VoixTenue, an interface exploring how a mobile phone’s on-board sensing can be used to control pitch, dynamics, and intonation in real time for expressive vocal synthesis. VoixTenue supports two main interaction modalities: one where users draw and replay fingertip-drawn intonation curves for real-time pitch control, and another where the phone’s orientation controls pitch and dynamics through inertial sensing. Using Pink Trombone as its synthesis engine, VoixTenue supports two modes for phonetic content: a vowel mode, in which users select a sustained vowel, and a phrase mode, in which a short English text can be entered, whose intonation is controlled by the user. We describe the system architecture and gestural mappings, and discuss potential use cases, including expressive performance and language learning.},
address = {London, United Kingdom},
articleno = {165},
author = {Adrien Scazzola and Xiao Xiao},
booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
doi = {10.5281/zenodo.20784501},
editor = {Benedict Gaster and João Tragtenberg and Anna Xambó and Tom Mitchell},
issn = {2220-4806},
month = {June},
note = {},
numpages = {4},
pages = {1326--1329},
title = {VoixTenue: Exploring Real-Time Gestural Control of Vocal Synthesis on a Mobile Phone},
track = {paper},
url = {http://nime.org/proceedings/2026/nime2026_165.pdf},
year = {2026}
}