Support System for Improvisational Ensemble Based on Long Short-Term Memory Using Smartphone Sensor

Haruya Takase and Shun Shiramatsu

Proceedings of the International Conference on New Interfaces for Musical Expression

Abstract:

Our goal is to develop an improvisational ensemble support system for music beginners who have no knowledge of chord progressions and little experience playing an instrument. We hypothesized that a music beginner cannot determine tonal pitches of melody over a particular chord but can use body movements to specify the pitch contour (i.e., melodic outline) and the attack timings (i.e., rhythm). We aim to realize a performance interface that lets users express an intuitive pitch contour and attack timings through body motion and that outputs pitches that harmonize with the chord progression of the background music. Since the intended users of this system are not limited to people with music experience, we plan to develop a system that uses Android smartphones, which many people have. Our system consists of three modules: a module for specifying attack timing using smartphone sensors, a module for estimating the vertical movement of the smartphone using smartphone sensors, and a module for estimating the pitch from the smartphone's vertical movement and the background chord progression. Each estimation module is built on long short-term memory (LSTM), which is often used to model time-series data. We conducted evaluation experiments for each module. The attack timing estimation produced zero misjudgments, and the mean timing error of the estimated attacks was smaller than the sensor-acquisition interval. The accuracy of the vertical motion estimation was 64%, and that of the pitch estimation was 7.6%. The results indicate that the attack timing estimation is accurate enough, but the vertical motion estimation and the pitch estimation need to be improved for actual use.

Citation:

Haruya Takase and Shun Shiramatsu. 2020. Support System for Improvisational Ensemble Based on Long Short-Term Memory Using Smartphone Sensor. Proceedings of the International Conference on New Interfaces for Musical Expression. DOI: 10.5281/zenodo.4813434

BibTeX Entry:

@inproceedings{NIME20_77,
 abstract = {Our goal is to develop an improvisational ensemble support system for music beginners who have no knowledge of chord progressions and little experience playing an instrument. We hypothesized that a music beginner cannot determine tonal pitches of melody over a particular chord but can use body movements to specify the pitch contour (i.e., melodic outline) and the attack timings (i.e., rhythm). We aim to realize a performance interface that lets users express an intuitive pitch contour and attack timings through body motion and that outputs pitches that harmonize with the chord progression of the background music. Since the intended users of this system are not limited to people with music experience, we plan to develop a system that uses Android smartphones, which many people have. Our system consists of three modules: a module for specifying attack timing using smartphone sensors, a module for estimating the vertical movement of the smartphone using smartphone sensors, and a module for estimating the pitch from the smartphone's vertical movement and the background chord progression. Each estimation module is built on long short-term memory (LSTM), which is often used to model time-series data. We conducted evaluation experiments for each module. The attack timing estimation produced zero misjudgments, and the mean timing error of the estimated attacks was smaller than the sensor-acquisition interval. The accuracy of the vertical motion estimation was 64%, and that of the pitch estimation was 7.6%. The results indicate that the attack timing estimation is accurate enough, but the vertical motion estimation and the pitch estimation need to be improved for actual use.},
 address = {Birmingham, UK},
 author = {Takase, Haruya and Shiramatsu, Shun},
 booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
 doi = {10.5281/zenodo.4813434},
 editor = {Romain Michon and Franziska Schroeder},
 issn = {2220-4806},
 month = {July},
 pages = {394--398},
 presentation-video = {https://youtu.be/WhrGhas9Cvc},
 publisher = {Birmingham City University},
 title = {Support System for Improvisational Ensemble Based on Long Short-Term Memory Using Smartphone Sensor},
 url = {https://www.nime.org/proceedings/2020/nime2020_paper77.pdf},
 year = {2020}
}