Zathras-1 is the time stretching technology we developed for BpmDj. The name comes from Babylon 5: Zathras is an alien who helps the B5 crew warp time.
Time stretching, or how to change the speed/length of an audio track? If the audio were simply stretched by interpolating missing samples, or shrunk by throwing away samples, then its pitch would change. Slower speeds would result in a lower pitch; higher speeds would result in a higher pitch.
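To make the pitch change concrete, here is a minimal sketch of naive stretching by linear interpolation. The function names, the 8000 Hz rate and the 440 Hz test tone are illustrative assumptions, not part of BpmDj.

```python
import math

def resample_linear(samples, speed):
    """Naively change playback speed by linear interpolation.

    speed > 1 shortens the signal, speed < 1 lengthens it.
    """
    out_len = int(len(samples) / speed)
    out = []
    for i in range(out_len):
        pos = i * speed                      # fractional read position in the input
        k = int(pos)
        frac = pos - k
        nxt = samples[min(k + 1, len(samples) - 1)]
        out.append((1.0 - frac) * samples[k] + frac * nxt)
    return out

def count_zero_crossings(samples):
    """Rough pitch indicator: number of sign changes in the signal."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))

RATE = 8000                                  # assumed sample rate for the demo
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]  # 1 s of 440 Hz
slow = resample_linear(tone, 0.5)            # half speed: twice as many samples

print(len(tone), count_zero_crossings(tone))
print(len(slow), count_zero_crossings(slow))
```

The slowed version contains the same number of cycles spread over twice as many samples, so at the same playback rate its frequency is halved: the tone drops an octave.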
Modifying the length of a track without altering its pitch is a difficult business. Most tools you will find use some form of overlap-and-add. That is, they extract small segments from the audio and place them at the correct position in the output stream. Each segment has the correct pitch, so overlapping several of them should result in a longer or shorter sound that is not pitch shifted. The problem with this is that transients are literally repeated.
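The repeated transients are easy to reproduce. The following is a minimal sketch of plain overlap-and-add, assuming Hann-windowed grains with 50% overlap at the output. The grain size and the name `ola_stretch` are illustrative choices, and real tools usually add some alignment search that is left out here.

```python
import math

def ola_stretch(samples, speed, grain=512):
    """Plain overlap-and-add: copy windowed grains to new output positions.

    speed < 1 lengthens the sound. Assumes len(samples) >= grain.
    """
    hop_out = grain // 2                     # spacing of grains in the output
    hop_in = hop_out * speed                 # spacing of grains in the input
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / grain) for n in range(grain)]
    n_grains = int((len(samples) - grain) / hop_in) + 1
    out_len = (n_grains - 1) * hop_out + grain
    out = [0.0] * out_len
    norm = [0.0] * out_len                   # summed window weights, for normalisation
    for g in range(n_grains):
        src = int(round(g * hop_in))
        dst = g * hop_out
        for n in range(grain):
            out[dst + n] += window[n] * samples[src + n]
            norm[dst + n] += window[n]
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]

# A single click (a transient) in otherwise silent audio.
click = [0.0] * 4096
click[1000] = 1.0

stretched = ola_stretch(click, 0.5)          # half speed
echoes = [i for i, v in enumerate(stretched) if abs(v) > 1e-6]
print(echoes)                                # output positions where the click shows up
```

At half speed every input sample falls into four overlapping grains, and each grain lands at a different output position, so the single click comes out as four separate clicks.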
An approach that does not suffer from this problem is the phase vocoder. By investigating the spectrum of the input sound and realigning the phases so that successive frames piece together correctly, it is possible to lengthen the sound. The problem with this approach is that the result sounds flanged, and that it is difficult to keep the volumes of the various lead sounds intact. Transients will sound smeared.
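For reference, a textbook phase vocoder can be sketched as follows. This is a generic illustration and not the Zathras-1 code. The frame size, the 75% overlap and all function names are assumptions, and the small recursive FFT is only there to keep the example free of dependencies.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    half = n // 2
    tw = [cmath.exp(-2j * math.pi * k / n) * odd[k] for k in range(half)]
    return [even[k] + tw[k] for k in range(half)] + [even[k] - tw[k] for k in range(half)]

def ifft(spec):
    """Inverse FFT via the conjugation trick."""
    n = len(spec)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in spec])]

def phase_vocoder(samples, speed, size=256):
    """Stretch by 1/speed. Assumes (size // 4) * speed >= 1 so frames always advance."""
    hop_out = size // 4                      # synthesis hop (75% overlap)
    hop_in = hop_out * speed                 # analysis hop
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / size) for n in range(size)]
    n_frames = int((len(samples) - size) / hop_in) + 1
    out_len = (n_frames - 1) * hop_out + size
    out, norm = [0.0] * out_len, [0.0] * out_len
    prev_phase, prev_src, acc = None, 0, None
    for f in range(n_frames):
        src = int(round(f * hop_in))
        spec = fft([window[n] * samples[src + n] for n in range(size)])
        phase = [cmath.phase(c) for c in spec]
        if f == 0:
            acc = phase[:]                   # start from the measured phases
        else:
            step = src - prev_src            # actual analysis hop in samples
            for k in range(size):
                expected = 2 * math.pi * k * step / size
                delta = phase[k] - prev_phase[k] - expected
                delta -= 2 * math.pi * round(delta / (2 * math.pi))  # wrap to [-pi, pi]
                # Estimated bin frequency (rad/sample), advanced over the output hop.
                acc[k] += (expected + delta) / step * hop_out
        prev_phase, prev_src = phase, src
        frame = ifft([abs(spec[k]) * cmath.exp(1j * acc[k]) for k in range(size)])
        dst = f * hop_out
        for n in range(size):
            out[dst + n] += window[n] * frame[n].real
            norm[dst + n] += window[n] ** 2
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]

tone = [math.sin(2 * math.pi * 0.1 * n) for n in range(4096)]  # 0.1 cycles per sample
stretched = phase_vocoder(tone, 0.5)         # about twice as long, same pitch
```

On a steady tone like this one the method works well. The flanging and smearing described above appear on real music, where many partials and transients share the same frames.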
To remedy these problems we resorted to sinusoidal modelling. By correctly analyzing the musical content of the input and synthesizing it into the output, we are able to successfully modify the speed of an audio track without altering its pitch.
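The idea of sinusoidal modelling can be illustrated with a deliberately simplified sketch: analyze one frame into a handful of sinusoids (frequency, amplitude, phase) and resynthesize them for as long as desired. This is not the Zathras-1 algorithm, which is described in the paper linked at the end. The names, the frame size and the 1% peak threshold are assumptions, and a real system must also track partials over time and deal with transients and noise.

```python
import cmath
import math

def analyse_frame(frame):
    """Return (cycles per sample, amplitude, phase) for each spectral peak of one frame."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    gain = sum(window) / 2.0                 # a sinusoid of amplitude A peaks at A * gain
    spec = []
    for k in range(n // 2):                  # naive DFT, fine for a single short frame
        spec.append(sum(window[i] * frame[i] * cmath.exp(-2j * math.pi * k * i / n)
                        for i in range(n)))
    mags = [abs(c) for c in spec]
    floor = 0.01 * max(mags)                 # ignore peaks below 1% of the strongest
    peaks = []
    for k in range(1, n // 2 - 1):
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1] and mags[k] > floor:
            # Frequency is quantised to the bin centre here; practical analysers
            # refine it, for example by interpolating between bins.
            peaks.append((k / n, mags[k] / gain, cmath.phase(spec[k])))
    return peaks

def synthesise(peaks, length):
    """Additive resynthesis: one cosine oscillator per analysed sinusoid."""
    return [sum(a * math.cos(2 * math.pi * f * n + p) for f, a, p in peaks)
            for n in range(length)]

N = 512
frame = [math.sin(2 * math.pi * 20 * n / N) + 0.5 * math.sin(2 * math.pi * 50 * n / N)
         for n in range(N)]

peaks = analyse_frame(frame)
longer = synthesise(peaks, 2 * N)            # same two partials, twice the duration
```

Because the output is generated from the model rather than copied from the input, its length is a free parameter and the frequencies of the partials stay where they were.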
The creation of Zathras-1 took about 9 months. Initially I started working on timestretching at Des Pudels Kern (makers of Audiotool). In 2011 we studied a variety of overlap-and-add techniques. To improve the results we even started to create a grain compiler to make timestretching in the time domain reasonably efficient (August 2012). When the company threw everybody out, I kept working on the problem but switched to the spectral domain.
In December 2013, I started looking into phase vocoding by means of an explicit calculation of a standard vocoder (see heterodyning). In February and March 2013 I made a breakthrough with respect to the analysis stage, yet I could not use this breakthrough to actually timestretch audio. In May 2015 I tried again, and again I hit the wall. Finally, in August 2015, I was able to realize a working timestretcher. From then until October 2015 I spent time optimizing the hell out of it.
The optimization stage was interesting because I learned a few things. First of all, C++ is not much faster than Java. At about a 2.7% speed increase, it is often not worth the trouble. And the misery one has with C++ memory allocation/deallocation and unsafe pointers is certainly such that I would advise everybody to keep programming in modern programming languages.
Currently Zathras-1 can timestretch 14.7 stereo tracks (44.1 kHz) simultaneously on a 1.4 GHz computer.
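To put that figure in perspective, here is a back-of-the-envelope calculation of the cycle budget per sample. It assumes a single core that is fully dedicated to timestretching and counts samples at 44.1 kHz per channel; neither assumption is stated above.

```python
TRACKS = 14.7            # simultaneous stereo tracks, from the figure above
CHANNELS = 2             # stereo
SAMPLE_RATE = 44_100     # samples per second per channel
CLOCK_HZ = 1.4e9         # 1.4 GHz, assumed to be one fully used core

samples_per_second = TRACKS * CHANNELS * SAMPLE_RATE
cycles_per_sample = CLOCK_HZ / samples_per_second

print(f"{samples_per_second:,.0f} samples per second")
print(f"{cycles_per_sample:,.0f} clock cycles per sample")
```

Under these assumptions that is roughly 1.3 million samples per second, which leaves on the order of a thousand clock cycles for each sample.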
In the audio demo you hear the following songs in sequence. The speed of the track changes with a period of 10 seconds.
- Voodoo People – The Prodigy
- Memories – Sarah Brightman
- Macarena – Los Del Rio
- Business – Eminem
- Electroplasm – Shpongle
- Asche Zu Asche – Rammstein
- Jotunheim – Therion
- Noises – Astral Projection
- Ask the Mountains – Vangelis
- Not Going Home (Eric Prydz Remix) – Faithless
- One – Metallica
- On The Boardwalk – Gramatik-Beatz&PiecesVol1
- Born Slippy (Nuxx) – Underworld
- Two Stroke Engine – Two Stroke
The research paper that describes the math behind our timestretcher can be found at http://werner.yellowcouch.org/Papers/zathras15/