Beyond Direct Geometry: Spring-Mass Control of Tongue Articulation for Vocal Synthesis

Debasish Mohapatra, Ziyi Xia, and Sidney Fels

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract

Human speech production relies on tightly coupled neuromuscular control of articulators and the aeroacoustic properties of the vocal tract. Vocal synthesizers employing direct geometric control of articulatory positions often struggle to generate smooth nonlinear trajectories between target vowels, as required for diphthong synthesis. We propose a biomechanically inspired control approach using a lightweight spring--mass--damper framework coupled to an acoustic wave solver, in which spring forces are parameterized to generate target tongue shapes. This physics-based interface enables synthesis through an input modality analogous to natural muscle activation. We conducted a pilot study comparing the proposed physics-based controller with a conventional geometry-driven controller on identical trajectory-generation tasks, subsequently coupling both to a vocal synthesizer. The pilot study served to refine the experimental design and verify that the system captures meaningful differences between the two controllers. Results revealed large, observable differences in the ability of each controller to generate nonlinear articulatory trajectories, both quantitatively and qualitatively. These findings support a planned controlled user study with a larger and more diverse participant pool, aimed at providing statistically valid assessments of the proposed controller's effectiveness for smooth trajectory generation.
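As a rough illustration of the control idea summarized above, the following minimal sketch contrasts a damped spring pulling a single point mass toward a target with direct linear interpolation of that point's position. It is not the authors' model: the single control point, the per-axis stiffness values, the critical-damping choice, the time step, and all function names are assumptions made here for illustration, and no acoustic solver is included.

```python
import math


def spring_trajectory(start, target, stiffness, mass=1.0, dt=0.001, steps=2000):
    """Integrate m*p'' = -k*(p - target) - c*p' per axis (semi-implicit Euler).

    `stiffness` holds one hypothetical spring constant per axis; damping is set
    to the critical value c = 2*sqrt(k*m) so each axis settles without ringing.
    """
    pos = list(start)
    vel = [0.0 for _ in start]
    damping = [2.0 * math.sqrt(k * mass) for k in stiffness]
    path = [tuple(pos)]
    for _ in range(steps):
        for axis in range(len(pos)):
            force = (-stiffness[axis] * (pos[axis] - target[axis])
                     - damping[axis] * vel[axis])
            vel[axis] += (force / mass) * dt   # update velocity first
            pos[axis] += vel[axis] * dt        # then position
        path.append(tuple(pos))
    return path


def linear_trajectory(start, target, steps=2000):
    """Direct geometric control: straight-line interpolation between two poses."""
    return [
        tuple(s + (t - s) * i / steps for s, t in zip(start, target))
        for i in range(steps + 1)
    ]


# Move one illustrative control point from (0, 0) to (1, 1).
spring_path = spring_trajectory((0.0, 0.0), (1.0, 1.0), stiffness=(80.0, 20.0))
linear_path = linear_trajectory((0.0, 0.0), (1.0, 1.0))
```

Because the two axes use different stiffness values in this sketch, the spring-driven point follows a curved path to the target, whereas the interpolated point stays on the straight line between the two poses.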

Citation

Debasish Mohapatra, Ziyi Xia, and Sidney Fels. 2026. Beyond Direct Geometry: Spring-Mass Control of Tongue Articulation for Vocal Synthesis. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.20784470

BibTeX Entry

@inproceedings{nime2026_156,
 abstract = {Human speech production relies on tightly coupled neuromuscular control of articulators and the aeroacoustic properties of the vocal tract. Vocal synthesizers employing direct geometric control of articulatory positions often struggle to generate smooth nonlinear trajectories between target vowels, as required for diphthong synthesis. We propose a biomechanically inspired control approach using a lightweight spring--mass--damper framework coupled to an acoustic wave solver, in which spring forces are parameterized to generate target tongue shapes. This physics-based interface enables synthesis through an input modality analogous to natural muscle activation. We conducted a pilot study comparing the proposed physics-based controller with a conventional geometry-driven controller on identical trajectory-generation tasks, subsequently coupling both to a vocal synthesizer. The pilot study served to refine the experimental design and verify that the system captures meaningful differences between the two controllers. Results revealed large, observable differences in the ability of each controller to generate nonlinear articulatory trajectories, both quantitatively and qualitatively. These findings support a planned controlled user study with a larger and more diverse participant pool, aimed at providing statistically valid assessments of the proposed controller's effectiveness for smooth trajectory generation.},
 address = {London, United Kingdom},
 articleno = {156},
 author = {Debasish Mohapatra and Ziyi Xia and Sidney Fels},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.20784470},
 editor = {Benedict Gaster and João Tragtenberg and Anna Xambó and Tom Mitchell},
 issn = {2220-4806},
 month = {June},
 note = {},
 numpages = {4},
 pages = {1263--1266},
 presentation-video = {https://youtu.be/2vOX7R0BHrs},
 title = {Beyond Direct Geometry: Spring-Mass Control of Tongue Articulation for Vocal Synthesis},
 track = {paper},
 url = {http://nime.org/proceedings/2026/nime2026_156.pdf},
 year = {2026}
}