Speech interface for building musical score collections


Building machine-readable collections of musical scores is a tedious and time-consuming task. The most common interface for music data entry is a mouse and toolbar system: using the mouse, the user selects a rhythm (note shape) from a toolbar, then drags the note to the correct position on the staff. We compare the usability of a hybrid speech- and mouse-driven interface with that of a traditional mouse-driven one. The speech-enhanced interface allows users to enter note rhythms by voice while still using the mouse to indicate pitches. Although task completion time is nearly the same with both interfaces, users (N = 13) significantly preferred the speech-augmented interface. A second study, with the first two authors of this paper as participants (N = 2), indicates that experienced users can enter music 11% faster with the speech interface. Many users expressed a desire to enter pitches, as well as rhythms, by speech. A third study (N = 10), however, shows that the recognizer is unable to reliably distinguish among the pitch names A, B, C, D, E, F and G.

Journal Title

Proceedings of the ACM International Conference on Digital Libraries