Concatenative Sound Synthesis

Sanna Wager (Ph.D. student, Indiana University SoIC), Liang Chen (Ph.D. candidate, Indiana University SoIC), Minje Kim (Indiana University SoIC), Christopher Raphael (Indiana University SoIC)


This research aims to transfer an expressive gesture performed on one musical instrument to another. The goal is to match the gestures of the target performance as closely as possible while preserving the natural characteristics of the source instrument. The output is assembled by concatenating a sequence of audio samples drawn from a database of performances on the source instrument. Applications of this method include information retrieval using audio queries, source separation by humming/singing, and humming-to-instrument synthesis.
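The idea of choosing a sequence of database samples to follow a target can be illustrated with a generic unit-selection sketch. This is not the algorithm from the paper. It is a minimal toy example that assumes each frame is described by a single hypothetical feature (for instance, pitch in semitones) and that a fixed, made-up `jump_penalty` discourages switching between non-consecutive database frames. The function name `select_units` and all parameters are illustrative only.

```python
from typing import List, Sequence


def select_units(target: Sequence[float],
                 database: Sequence[float],
                 jump_penalty: float = 1.0) -> List[int]:
    """Pick one database frame index per target frame by dynamic programming.

    Assumption: each frame is one number (a hypothetical feature such as
    pitch). The path cost is |target - database| per frame, plus
    jump_penalty whenever the chosen database frame does not directly
    follow the previously chosen one.
    """
    n, m = len(target), len(database)
    if n == 0 or m == 0:
        return []
    # cost[j]: cheapest path cost that ends at database frame j.
    cost = [abs(target[0] - d) for d in database]
    back: List[List[int]] = []
    for t in range(1, n):
        best_any = min(range(m), key=cost.__getitem__)
        new_cost: List[float] = []
        pointers: List[int] = []
        for j in range(m):
            # Option 1: jump here from the cheapest previous frame (penalized).
            prev, c = best_any, cost[best_any] + jump_penalty
            # Option 2: continue from the preceding database frame (free).
            if j > 0 and cost[j - 1] <= c:
                prev, c = j - 1, cost[j - 1]
            new_cost.append(c + abs(target[t] - database[j]))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path backwards from the best final frame.
    j = min(range(m), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy data: the target contour appears in the database at frames 1 to 3.
    print(select_units([60, 62, 64], [55, 60, 62, 64, 70]))
```

In a real system the per-frame feature would be a vector (pitch, loudness, timbre descriptors, and so on), and the selected frames would still need to be joined smoothly, for example by overlap-add. Both of those steps are outside the scope of this sketch.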

Here is the ICASSP 2017 paper.

Violin-to-bassoon synthesis

A violin performance is synthesized on the bassoon with the goal of matching the violin's musical gestures. The example below applies the synthesis program to the opening of the Sibelius Violin Concerto. The bassoon database was built from recordings by a professional bassoonist (many thanks to Kathleen McLean, professor of bassoon at the Indiana University Jacobs School of Music). For comparison, two further recordings are included: the same bassoonist performing the Sibelius excerpt while imitating the violin's expressive gestures, and a GarageBand synthesis of the same melody.

Target violin performance

Synthesized bassoon

Bassoon imitation of violin performance

GarageBand synthesis of bassoon performance