Token Telephone

Hugo Flores Garcia and Stephan Moore

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract

Token Telephone is a co-creative AI sound installation. Participants enter a space equipped with a microphone and a quartet of generative sound neural networks, each represented by a loudspeaker. When a participant vocalizes into the microphone, the utterance is transformed into neural acoustic tokens and played back, initiating a game of telephone between the neural networks. Each network encodes, processes, and reconstructs the sound, distorting the original utterance into new textures guided by the network's training data. The newly reconfigured sound is then passed to the next network/loudspeaker in a clockwise direction, and the process repeats. The sound produced by the fourth network is passed back to the first network in the cycle, creating a feedback loop in which the utterance incrementally loses its original characteristics and disintegrates into textures that reflect the inherent biases of the generative models in play. In time, the resonant properties of the processes are revealed to the participant. Inspired by the popular children's game of telephone, Token Telephone illuminates the gradual formation of hallucinations through the iterative processing and re-processing of audio, reflecting the biases introduced by each model's understanding of sound objects as well as by the data provided to it.
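
The ring-shaped feedback loop described in the abstract can be illustrated with a toy sketch. This is not the installation's software: each "network" here is a hypothetical stand-in that snaps every sample to the nearest entry of a small codebook (a crude analogue of encoding to tokens and decoding again), and the names `make_network`, `token_telephone`, and the codebook values are assumptions made purely for illustration.

```python
from typing import Callable, List

Signal = List[float]
Network = Callable[[Signal], Signal]


def make_network(codebook: List[float]) -> Network:
    """Hypothetical stand-in for one generative model.

    Encoding maps each sample to the index of the nearest codebook entry
    (the "token"); decoding maps each token back to its codebook value.
    The codebook plays the role of the model's learned bias.
    """
    def process(signal: Signal) -> Signal:
        tokens = [
            min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))
            for x in signal
        ]
        return [codebook[t] for t in tokens]

    return process


def token_telephone(utterance: Signal, networks: List[Network], laps: int) -> List[Signal]:
    """Pass the signal around the ring `laps` times.

    Returns every intermediate signal, starting with the raw utterance.
    The output of the last network feeds the first one on the next lap.
    """
    history = [list(utterance)]
    signal = list(utterance)
    for _ in range(laps):
        for network in networks:
            signal = network(signal)
            history.append(signal)
    return history


# Four toy "networks", one per loudspeaker (codebook values are arbitrary).
ring = [
    make_network([-1.0, 0.0, 1.0]),
    make_network([-0.5, 0.25, 0.75]),
    make_network([-0.9, 0.0, 0.7]),
    make_network([-0.7, 0.1, 0.4]),
]

history = token_telephone([0.3, -0.7, 0.9, 0.1], ring, laps=3)
print(history[4])   # after the first lap
print(history[-1])  # after the third lap
```

In this toy setting the signal stops changing after the second lap: what remains is determined by the codebooks rather than by the input, which loosely mirrors how the utterance in the installation dissolves into textures shaped by the models themselves.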

Citation

Hugo Flores Garcia and Stephan Moore. 2024. Token Telephone. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.15027154

BibTeX Entry

@inproceedings{nime2024_installations_3,
 abstract = {Token Telephone is a co-creative AI sound installation. Participants enter a space equipped with a microphone and a quartet of generative sound neural networks, each represented by a loudspeaker. When a participant vocalizes into the microphone, the utterance is transformed into neural acoustic tokens and played back, initiating a game of telephone between the neural networks. Each network encodes, processes, and reconstructs the sound, distorting the original utterance into new textures guided by the network's training data. The newly reconfigured sound is then passed to the next network/loudspeaker in a clockwise direction, and the process repeats. The sound produced by the fourth network is passed back to the first network in the cycle, creating a feedback loop in which the utterance incrementally loses its original characteristics and disintegrates into textures that reflect the inherent biases of the generative models in play. In time, the resonant properties of the processes are revealed to the participant. Inspired by the popular children's game of telephone, Token Telephone illuminates the gradual formation of hallucinations through the iterative processing and re-processing of audio, reflecting the biases introduced by each model's understanding of sound objects as well as by the data provided to it.},
 address = {Utrecht, Netherlands},
 articleno = {3},
 author = {Hugo Flores Garcia and Stephan Moore},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.15027154},
 editor = {Laurel Smith Pardue and Palle Dahlstedt},
 issn = {2220-4806},
 month = {September},
 numpages = {4},
 pages = {10--13},
 presentation-video = {https://youtu.be/vEaYoEgtSUo},
 title = {Token Telephone},
 track = {Installations},
 url = {http://nime.org/proceedings/2024/nime2024_installations_3.pdf},
 year = {2024}
}