Creative Control Strategies: Audio-Controlled Synthesis

Converting Audio to MIDI + CV

Eldar T · 07/15/21

When it comes to choosing a way to interact with synthesizers, perhaps the last thing that comes to mind for many of us is a plain old audio signal. After all, with such an abundance of sequencers, keyboards, and idiosyncratic controllers of all shapes and sizes, why would we even want to resort to something as intangible as an audio signal? Well, there may be many reasons—some pragmatically functional, others purely experimental, and a whole lot that fall somewhere in between. For example, let's say you don't play any musical instruments, but have great control over your voice, and sincerely wish to make music with synthesizers. Conversely, perhaps you are an expert flautist or guitarist or bassist, and wish to expand the default sonic palette of your instrument with an endless variety of synthesized tones. One solution to both of these scenarios requires a conversion of the source audio into CV/Gate or MIDI signals—allowing you to control a synthesizer directly from your instrument (or voice!) of choice.

As an instrumentalist who is deeply interested in the world of electronic sound, I've been continuously researching and experimenting with different ways of combining the two. Knowing that I'm not the only one interested in this topic, I've put this article together to provide a glimpse into the devices, strategies, and techniques involved in audio-to-synth control applications. Reconstructing the historical timeline of the technology is certainly useful, and we will attempt to do so in later paragraphs, but for the sake of clarity, let's start by identifying the general concepts involved in converting audio into control signals.

Dissecting an Audio Signal

Any given audio signal bears plenty of useful information about its content that, given the right tools, can be extracted and repurposed for control over synthesizer parameters. Transients, changes in amplitude, frequency, noise amount, and harmonic content (brightness) all have the potential to be distilled and turned into independent control signals. Some of these processes are simpler and more common than others, and of course, the complexity of the audio signal itself plays a vital role in the success of the results. The methods can be implemented in both the analog and digital domains. However, it is worth noting that the increasing power of DSP provides more fruitful ground for accurate signal analysis, and as such, more consistent and reliable outcomes.

Out of all of the possible approaches, real-time detection of pitch and of loudness shape is perhaps the most interesting; after all, these are the basic building blocks of most music and sound-focused artistic practices. Thus, to keep this article short and to the point, we will primarily focus on them. (But don't worry: we'll discuss spectral analysis techniques in the near future as well!)

For detecting loudness, the most widely used tools are envelope followers and comparators. An envelope follower copies the shape of the input signal's amplitude and outputs it as a control voltage curve. A comparator sends out a gate every time a signal crosses a certain pre-defined (or even a changing) threshold, so when paired with an envelope follower it makes a very effective onset detector. In the modular synth world, envelope followers and comparators are often combined into a single circuit for the sake of convenience.
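To make the envelope follower and comparator pairing concrete, here is a minimal Python sketch of the idea in the digital domain. It is an illustration under assumptions rather than a model of any particular module: the function names, the attack and release times, and the two thresholds (used to add a little hysteresis so the gate does not chatter) are all choices of ours.

```python
import math

def envelope_follower(samples, sample_rate, attack_s=0.005, release_s=0.050):
    """Rectify the signal, then smooth it with separate attack/release times."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    env, out = 0.0, []
    for x in samples:
        level = abs(x)  # full-wave rectification
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out

def comparator(envelope, on_threshold=0.3, off_threshold=0.2):
    """Gate goes high above on_threshold and low again below off_threshold."""
    gate, out = False, []
    for e in envelope:
        if not gate and e > on_threshold:
            gate = True
        elif gate and e < off_threshold:
            gate = False
        out.append(gate)
    return out

# Usage: a 220 Hz burst surrounded by silence (an assumed test signal).
sr = 8000
burst = [math.sin(2 * math.pi * 220 * n / sr) if 2000 <= n < 4000 else 0.0
         for n in range(8000)]
env = envelope_follower(burst, sr)
gate = comparator(env)
# An onset is any sample where the gate flips from low to high.
onsets = [n for n in range(1, len(gate)) if gate[n] and not gate[n - 1]]
```

The envelope output plays the role of the control voltage curve, while the gate and the onset list are what you would use to trigger envelopes or advance a sequencer. Longer release times give a smoother envelope at the cost of a slower response.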

Detecting the frequency or pitch of a signal is a more demanding task, and without special hardware customization of the instrument (to be discussed later), it is usually most successful with monophonic instruments, and works significantly better with signals that have simpler harmonic structures. Polyphonic pitch tracking is not out of the realm of possibility, however. Because players of one of the most popular instruments of the 20th century, the electric guitar, created a demand for full control over synthesizers with their instrument, a number of inventions surfaced that tackled this problem, in some cases with quite successful results.
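As a rough illustration of monophonic pitch detection, the sketch below uses autocorrelation, one common textbook technique: it looks for the time shift at which a signal best matches a delayed copy of itself. We are not claiming that any device mentioned in this article works this way; the function name, the frequency range, and the 0.3 confidence threshold are assumptions made for the example.

```python
import math

def detect_pitch(samples, sample_rate, fmin=60.0, fmax=1000.0):
    """Return the estimated fundamental in Hz, or None if nothing periodic is found."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return None
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself delayed by `lag` samples.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # Reject weak matches (noise, silence, very inharmonic material).
    if best_lag is None or best_score / energy < 0.3:
        return None
    return sample_rate / best_lag

# Usage: a 220 Hz tone with an added octave harmonic (an assumed test signal).
sr = 8000
tone = [math.sin(2 * math.pi * 220 * n / sr)
        + 0.5 * math.sin(2 * math.pi * 440 * n / sr)
        for n in range(1024)]
estimate = detect_pitch(tone, sr)
```

Because the lag is a whole number of samples, the estimate is quantized (here it lands within a few Hz of 220), and richer harmonic content makes the correct peak harder to pick out. That is the same reason simpler sources tend to track better.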

Now that we've established the basic concepts, before we go any further with technicalities and techniques, let's take a slight detour to browse through some highlights from the evolution of audio-controlled synthesis.

The Evolution of Audio-Controlled Synthesis

Buchla Sili-Con Cello (image via The Audities Foundation)

There seems to be no way to be certain who first came up with the idea of converting audio into control signals for musical applications, or when, yet some of the earliest commercial devices to tackle this concept emerged in the late '60s and early '70s. Both Donald Buchla and Robert Moog featured envelope follower modules in their early modular systems from the 1960s. As early as February 1972, Buchla finished a more advanced design: the 232 Frequency Detector, a 200-series module designed specifically for converting both the loudness and the frequency of incoming audio signals into control voltages. Because the design of the module was based on a phase-locked loop circuit (we'll look into PLLs later in the article), it also features a pulse wave oscillator output. A few years later, Buchla ventured to create an experimental audio-to-synth control instrument/musical piece for his then-wife, the accomplished Bay Area cellist and mathematician Ami Radunskaya: the Sili-Con Cello (1979). The Sili-Con Cello consisted of five 200-series modules and a special one-of-a-kind "AI binary counter" that would analyze the musician's performance and reinterpret that data to produce musical responses from the synthesizer. (Side note: eventually, c. 2002, the breadboard-based "AI binary counter" was reprised as the 298 Sili Con Cellosax module. Only a couple of units were made, in order to accommodate performances without needing to ship the original instrument.)

Perhaps the first synthesizer to gain wide mainstream success that had an audio-to-control voltage feature built-in was the illustrious Korg MS-20 released in 1978. Its External Signal Processor (ESP) function offered the combination of an envelope follower, gate detector, and frequency to voltage converter. This paired nicely with the semi-modular nature of the instrument, and its overall characterful tone. A user could connect a microphone or an instrument to the MS-20, and flexibly route the extracted control signal to any available parameter. No wonder this synthesizer remains a popular choice for many musicians even 40 years after its original release.

Instruction from the Avatar manual on how to install the proprietary pickup. The position of the hex pickup remains the same for modern guitar synthesizer systems.

Around the same time, ARP released the Avatar, an Odyssey-inspired synthesizer specifically designed to be controlled by an audio signal coming from a guitar. In order to operate the synth, a special hexaphonic pickup had to be installed on the guitar, allowing for an independent audio stream from each string. This was a clever idea to improve tracking and to eliminate the issues of extracting individual notes from a single polyphonic audio stream, yet it didn't perform that well. Moreover, the release of the Avatar was disastrous for the company, and many have cited it as a major contributor to ARP's demise. In the years since, the Avatar has been embraced by many users as a great alternative to the Odyssey, offering essentially the same features at a cheaper price. The Avatar can be heard on a number of historical recordings, including "You Shouldn't-Nuf Bit Fish" by George Clinton and "Rough Boys" by Pete Townshend.

However, the company that really embarked on a journey to revolutionize guitar synthesizer technology was the young Roland Corporation. Released in 1977, GR-500 was the company's first guitar synthesizer system, manufactured in partnership with FujiGen, who developed the GS-500—a specially modified electric guitar controller outfitted with a hexaphonic pickup that connected to the synth module via Roland's proprietary 24-pin interface. The knobs of the guitar controller were used to adjust the levels of different sections: guitar, polyensemble, bass, solo section, and external synth. It was also equipped with a comprehensive infinite sustain system that effectively worked like six independent EBows. GR-500 also featured Roland's first "fundamental generator" chip for pitch-to-voltage applications.

Roland's SPV-355

Then, in 1979, the company released the SPV-355, a dedicated rack-mountable pitch-to-voltage synth. The unit's pitch-tracking capabilities were of varied success, and depended largely on the instrument it was interfaced with. While it was primarily targeted at guitarists, by all accounts extracting control voltage from vibrating strings wasn't its strength, and it worked much better with other sound sources like vocals and a handful of wind instruments. This "minor" shortcoming in performance didn't stop this quirky machine from appearing in many recording studios. What is more important, though, is that the SPV-355 was a major stepping stone for the company in its audio-controlled synthesis endeavours, and in the following years Roland would refine the technology to a point where, in some circles, the brand's name has become almost synonymous with guitar synthesizers.

Throughout the '80s, the GR series underwent a number of revisions, with each new version adapting to the rapid advancement of technology. The GR-300, released in 1979, took the form of a foot-controlled floor unit, yet it still required a dedicated guitar controller. In 1985, the company released the GR-700, which boasted the ability to communicate over the newly developed MIDI protocol, and with the same guts as Roland's MKS-30 and JX-3P, it was the first model in the GR series to use a sound engine based on the structure of a keyboard synth. Then in 1986, Roland introduced the GK-1, a pickup system that could be fitted on any guitar for use with the brand's guitar synthesizers. Also in the mid-'80s, the company introduced a rack-mountable guitar-to-MIDI converter (the GM-70) that delivered surprisingly effective results. And in 1992, Roland introduced the potent combo of the GK-2 pickup and GR-1 synthesizer, featuring onboard effects and a flexible sampling/synthesis engine. The modern GK-3 and GR-55 demonstrate that this line of products is still being updated by the manufacturer, with every new release gaining more features and performance improvements. (Side note: if you want to check out some of the bizarre things you can do with a Roland guitar synth, or guitars in general, we strongly recommend checking out Nick Reinhart of bombastic math rock outfit Tera Melos. This guy is wild and has a remarkable sense of humor built into his playing and sound design.)

Audio-Controlled Synthesis Today

Roland's line of guitar synthesizers remains quite possibly the most stable platform for polyphonic audio-controlled synthesis. There are only a handful of options these days that do a decent job at this, such as Fishman's Triple Play pickup/interface; dedicated guitar synthesizer pedals like the Boss SY-1000 (which is Roland-related, and definitely benefits from a hexaphonic pickup), the Electro-Harmonix POG, Mel9, and B9, and the Meris Enzo; and, on the software side of things, Jam Origin's MIDI Guitar 2.

At this point you may have noticed that pretty much all of the powerful commercially available solutions for polyphonic pitch tracking were, and still are, designed primarily for guitars. Luckily, polyphony is not a prerequisite for having fun and getting great results with audio-controlled synthesis. Using different combinations of sound sources and synths/processors opens up a vista of new sonic territories, and in the modular synth formats specifically, the possibilities are nearly endless, as you get full control over almost every single detail of the synthesizer. So now that we are past the hurdle of identifying the best options for handling polyphonic signals from guitars, we can dive into the variety of modern hardware and software tools that are worth exploring.

If your ears and heart are longing for a state-of-the-art analog audio-controlled synthesizer, then perhaps there is no better option than the UniSyn from Second Sound. It is based on a purpose-built ACO (audio controlled oscillator) design that combines the flavor of an analog VCO with the precision and accuracy of digital pitch tracking. Outfitted with responsive envelope followers, variable waveforms, flavorful filters, and other exciting features, it can be used as a complete synth controlled by your instrument of choice. Additionally, it can interface with external gear via CV/Gate outputs and MIDI, so you could use it to translate control information over to the synthesizer of your choice. And best of all, it works with any old audio signal: it accepts 1/4" and XLR inputs, and even features a phantom power option for condenser microphones.

For the world of Eurorack, Sonicsmith has also recently developed a pair of audio-to-synth modules, the ConVertor E1 and MIDIVertor E1, allowing you to design your own custom audio-controlled synthesizer in modular format. The Expert Sleepers Disting Mk4 and Super Disting EX Plus Alpha also include a pitch tracking algorithm, but it works best for monophonic sounds with simpler harmonic structure, so to use it successfully with something like a guitar, it is important to filter the signal before sending it into the pitch tracker and to remain very mindful of your technique as you play.

Aside from dedicated pitch trackers, varying degrees of success can be achieved with the aforementioned phase-locked loop circuits. Essentially, PLLs are systems that generate an output whose phase is matched to that of an input signal, and as such, the frequencies match as well. A typical phase-locked loop consists of three elements: a phase detector that receives a signal at its input, a low-pass filter, and a VCO that both produces the final output and is fed back into the phase detector. In this configuration, the pitch of the VCO follows, or rather attempts to follow, the frequency of the input signal, and as such its frequency control becomes more like a timbre control. While at certain settings the results of pitch tracking can be surprisingly accurate, when the frequency of the oscillator is set too high or too low, it results in sonically interesting "breaks" and "tears" that can be incredibly fun to experiment with. In Eurorack and other modular formats, PLLs can be patched together using rudimentary modules like ring modulators, VCOs, LPFs and/or slew limiters. However, there are a few modules either dedicated to or capable of PLL functionality, including Doepfer's A-196, Make Noise's Wogglebug, and the now-discontinued (but nonetheless legendary) WMD Synchrodyne. It is also worth mentioning that there are several effects pedals whose design is based entirely on creatively misusing PLLs, among them Mantic's Flex Pro, EarthQuaker Devices' Data Corrupter, and Glou-Glou's Moutarde Extra Forte.
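The three-element loop described above can be simulated in a few lines. The following Python sketch is only a simplified digital model under our own assumptions: a multiplier stands in for the phase detector (much like the ring modulator in a patched PLL), the loop filter is a one-pole low-pass, and the names and default values (`loop_gain_hz`, `filter_cutoff_hz`, and so on) are hypothetical.

```python
import math

def simulate_pll(input_hz, center_hz, sample_rate=8000, seconds=2.0,
                 loop_gain_hz=100.0, filter_cutoff_hz=50.0):
    """Return the VCO frequency (Hz) at each sample while it chases the input."""
    smoothing = 1.0 - math.exp(-2.0 * math.pi * filter_cutoff_hz / sample_rate)
    vco_phase, control = 0.0, 0.0
    vco_freqs = []
    for n in range(int(seconds * sample_rate)):
        signal = math.sin(2.0 * math.pi * input_hz * n / sample_rate)
        # Phase detector: multiply the input by the VCO output.
        error = signal * math.cos(vco_phase)
        # Loop filter: a one-pole low-pass smooths the detector output.
        control += smoothing * (error - control)
        # VCO: the filtered error nudges the frequency around its center setting.
        vco_hz = center_hz + loop_gain_hz * control
        vco_phase += 2.0 * math.pi * vco_hz / sample_rate
        vco_freqs.append(vco_hz)
    return vco_freqs

# Usage: the VCO is set to 200 Hz and fed a nearby input, then a distant one.
locked = simulate_pll(input_hz=220.0, center_hz=200.0)
unlocked = simulate_pll(input_hz=400.0, center_hz=200.0)
locked_average = sum(locked[-8000:]) / 8000      # settles close to 220 Hz
unlocked_average = sum(unlocked[-8000:]) / 8000  # stays near the 200 Hz center
```

With the input close to the oscillator's own setting, the loop pulls the VCO onto the input frequency. Move the input (or the center frequency) too far away and the loop can no longer hold on, which is the digital cousin of the "breaks" and "tears" mentioned above.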

As for envelope followers, some good examples in Eurorack are the XAOC Devices Sewastopol II, Boredbrain Injectr, Doepfer A-119, and Steady State Fate Detect-Rx. Koma Elektronik's Field Kit and Make Noise's Strega and Maths also sport envelope followers in their designs, and all are great overall tools for all kinds of sonic experiments.

As you may imagine, if the world of hardware has such a plethora of options in this field, there must be some equally awesome software counterparts, and indeed there are. Possibly one of the most exciting dedicated audio-controlled software synths is Virta from Madrona Labs. Semi-modular in nature, it offers a multitude of options for creative synthesizer control via audio that go beyond traditional synth leads and wonky bass sounds. A big reason for this is a cleverly assembled collection of modules featuring two VCOs, a formant vocoder with several distinct algorithms, and a pitch-shifter/delay.

Audio programming platforms like Max, Pure Data, and SuperCollider are excellent frameworks for working with audio-to-synth applications, and there are freely available objects and libraries created specifically for audio analysis and for pitch and loudness detection. You will also easily find audio-to-control Max for Live devices if Ableton Live is your preferred music-making platform. Bitwig Studio, with its immensely modulatable interface and even deeper Grid environment, is a great tool for such applications too. On top of that, the virtual modular environment VCV Rack has a wealth of envelope followers, and a very good pitch-to-CV converter inside the Nysthi library. So as you can see, the options are bountiful.

Tips and Tricks

Besides dedicated guitar synthesizers that only work with those special hexaphonic pickups, the majority of the other options mentioned can be used creatively with a variety of audio sources. As stated earlier, the simpler the harmonic content of the sound source, the more accurate the results will be, although in some situations glitches and inaccuracies may be just what you want. For example, the charm of Keith Fullerton Whitman's Playthroughs is in many ways facilitated by mishaps in pitch tracking a guitar via a custom Max/MSP patch. Thus, occasionally the best results are achieved by embracing the errors rather than trying to fix or avoid them.

Embracing glitches in pitch detection

If precision is really important for your applications, there are a few things that can be done to the source signal to improve tracking. Essentially, the narrower the bandwidth of information you provide to pitch trackers, the better they behave. Thus, using a bandpass filter before the signal enters a pitch tracker can dramatically improve the results. In the Korg MS-20 synthesizer, a bandpass filter is implemented in the audio processing circuit for that exact reason. Another interesting approach can be extrapolated from the design of Buchla's 232 Frequency Detector module. In that case, the inventor placed a sample and hold circuit right after the frequency detection stage, triggered by the gate derived from the envelope follower. It is a clever way to prevent noise and other unwanted signals below a certain loudness threshold from passing to the voltage output.
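Both ideas translate readily to code. The sketch below is a loose digital interpretation, not a recreation of the MS-20 or 232 circuits: `bandpass` is a standard second-order (biquad) band-pass filter, and `hold_pitch_while_quiet` is our simplified take on gating a pitch signal, passing the pitch value while the gate is high and freezing the last value while it is low. All names and parameter values are assumptions.

```python
import math

def bandpass(samples, sample_rate, center_hz, q=2.0):
    """Second-order band-pass filter (biquad, unity gain at center_hz)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the b1 coefficient is zero for this response
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def hold_pitch_while_quiet(pitch_cv, gate):
    """Pass the pitch value while the gate is high; freeze it while the gate is low."""
    held, out = 0.0, []
    for cv, is_open in zip(pitch_cv, gate):
        if is_open:
            held = cv
        out.append(held)
    return out

# Usage: a tone at the filter's center passes, a much higher one is attenuated.
sr = 8000
low = [math.sin(2 * math.pi * 220 * n / sr) for n in range(4000)]
high = [math.sin(2 * math.pi * 2200 * n / sr) for n in range(4000)]
low_peak = max(abs(y) for y in bandpass(low, sr, 220.0)[-1000:])
high_peak = max(abs(y) for y in bandpass(high, sr, 220.0)[-1000:])

# Jittery pitch readings during the quiet samples never reach the output.
steady = hold_pitch_while_quiet([1.0, 1.1, 5.0, 7.0, 2.0],
                                [True, True, False, False, True])
```

A higher `q` narrows the band further, which helps the tracker but also means the filter's center frequency has to sit closer to the range you are actually playing in.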

Another worthwhile thing to do is to make sure that the signal level is optimized before it enters the pitch tracking stage. Inconsistencies in dynamics can cause a lot of issues, and using some compression may prove to be very useful for evening out the signal level.

UniSyn + dry signal with a bit of delay for sweetening

Last but not least is the playing technique itself. Often, in order to get the best results from pitch trackers, we need to adjust how we play. The more precise and articulated your playing style is, the better the outcome will be. Playing more distinctly isolated notes may feel counterintuitive, but it really helps when dealing with easily confused pitch trackers.

So far we've looked primarily at linear approaches in audio-to-synth applications, meaning that the mapping of extracted voltages to synth parameters is 1:1. Pitch is routed to an oscillator frequency input, for instance, and the amplitude envelope controls the level of a VCA. Although this already sets the playground for tons of joyous jamming, things really get exciting when you start getting creative with audio-derived control voltages. For example, instead of triggering an envelope with the extracted gate signal, use it to move through the steps of a sequencer. Connect the outputs of that sequencer to control a voice or a combination of voices, and now instead of expecting the synthesizer to repeat everything you play exactly, you can engage in a duet performance with the machine. Setting up a system where what and how you play affects the musical response from the machine can be a very meaningful and engaging approach to both composing and performing music. Jin Hi Kim's work with electric komungo and Sarah Belle Reid's trumpet-controlled experiments such as "Collectible Rectangles" fit here quite well as examples of such interactive systems.
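Here is a small Python sketch of the gate-driven sequencer idea: every rising edge of the extracted gate advances the sequence by one step, regardless of what note was actually played. The function name and the step values (MIDI note numbers for a C major triad) are made up for illustration.

```python
def step_sequencer_on_gates(gate, steps):
    """Advance one step on every rising edge of the gate; output the current step."""
    index, previous, out = -1, False, []
    for g in gate:
        if g and not previous:  # rising edge: a new onset was detected
            index = (index + 1) % len(steps)
        out.append(steps[index] if index >= 0 else None)
        previous = g
    return out

# Usage: three onsets walk through a three-step sequence.
gate = [False, True, True, False, True, False, True, True]
notes = step_sequencer_on_gates(gate, steps=[60, 64, 67])
# notes == [None, 60, 60, 60, 64, 64, 67, 67]
```

Until the first onset arrives the sequencer has nothing to play, which is why the output starts with `None`. In a real patch the gate would come from an onset detector listening to your instrument.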

When an onset is detected, it forces two sequencers to advance—one immediately, and another one delayed. Additionally, each onset triggers four synth voices and samples sequentially

Smashing results can be obtained by using envelope follower / onset detector modules, as well as other trigger-to-MIDI technology, with percussive sources. In such setups, you can program your synthesizer to respond and change behavior based on the rhythmic hits. Depending on the synthesizer patch, this can be used subtly to combine acoustic percussion with electronics, or more elaborately to engage and disengage complex chains of interaction. Some good musical examples of this include Eli Keszler's Stadium, Greg Fox's Gradual Progression, RRUCCULLA's SHuSH, David Rosenboom's Zones of Influence, and Booker Stardrum's Temporary etc.

On the more utilitarian side of applications, envelope followers can also be used for ducking and sidechaining effects. For example, you can send the output of your kick drum into the input of the envelope follower, and another sustained sound into a VCA. Then you invert the output of the envelope follower using an attenuverter/polarizer and route the result into the CV input of the VCA. Voilà: now every time the kick drum hits, the output of the sustained sound will be proportionally and dynamically attenuated.
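In code, the ducking patch boils down to one line of gain math. This sketch assumes the VCA sits fully open at rest (a gain of 1.0) and that the kick envelope is normalized to the range 0 to 1; the function name and the `depth` parameter are hypothetical.

```python
def duck(sustained, kick_envelope, depth=1.0):
    """Attenuate a sustained signal in proportion to the kick's envelope."""
    out = []
    for x, env in zip(sustained, kick_envelope):
        # Inverted envelope added to the VCA's resting gain, clamped at zero.
        gain = max(0.0, 1.0 - depth * env)
        out.append(x * gain)
    return out

# Usage: a constant "pad" level dips as the kick envelope rises.
ducked = duck([1.0, 1.0, 1.0, 1.0], [0.0, 0.5, 1.0, 0.25])
# ducked == [1.0, 0.5, 0.0, 0.75]
```

Lower `depth` values behave like turning down the attenuverter, giving a gentler pump instead of a full mute on every hit.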

Complex patch made with Madrona Labs Virta. The synth acts as a rhythmic accompaniment to guitar

Following the same principles, instead of sending the CV from a pitch tracker directly into the 1V/Oct input of your oscillator, try using it to control some other parameter in the patch, like filter cutoff, delay time, clock rate, reverb amount, or anything else that could be fun to interact with by following the ascending and descending pitches of your instrument.

You can also route your CV signal into a multiple, and then use the copies to control a variety of parameters in your patch—effectively causing a multitude of changes every time you hit a new note. Attenuators and polarizers are also very helpful utilities here, allowing you to create several related but different variations of the control voltage signal.
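The last two ideas can be sketched together. In a 1V/Oct system every doubling of frequency adds one volt, so a detected frequency maps to a voltage through a base-2 logarithm. Which frequency corresponds to 0 V varies between setups, so the 55 Hz reference below is purely an assumption, as are the function names and the 200 Hz base cutoff.

```python
import math

def hz_to_volts(freq_hz, reference_hz=55.0):
    """Convert a frequency to a 1V/Oct control voltage (0 V at reference_hz)."""
    return math.log2(freq_hz / reference_hz)

def attenuvert(cv, amount):
    """Scale a CV; negative amounts also invert it (amount between -1 and 1)."""
    return cv * amount

def cutoff_from_cv(cv, base_hz=200.0):
    """Exponential mapping: each volt doubles the cutoff frequency."""
    return base_hz * 2.0 ** cv

# Usage: one tracked pitch, "multed" into differently scaled copies.
pitch_cv = hz_to_volts(220.0)             # two octaves above 55 Hz -> 2.0 V
filter_cv = attenuvert(pitch_cv, 0.5)     # half-depth copy for the filter
inverted_cv = attenuvert(pitch_cv, -1.0)  # mirrored copy, e.g. for delay time
cutoff_hz = cutoff_from_cv(filter_cv)     # 200 Hz raised by one octave
```

Because each copy is scaled on its own, playing higher can open the filter a little while pushing another parameter the opposite way, all from the same tracked note.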

One of my favorite complements to pitch trackers is the analog shift register. Essentially a chain of sample and hold circuits, an ASR receives a control voltage at its input and passes it along a series of outputs, one stage further with every trigger/gate. This creates an interesting effect of having a variety of related CV signals, all based on the pitch of your instrument, that are never the same simultaneously. Combining pitch trackers with ASRs is an excellent way to indulge in the creation of complex evolving patches for days. Great ASR options in Eurorack are Intellijel's Shifty and the Verbos Electronics Random Sampling module. There's no shortage of interesting ASR techniques out there; I'd suggest checking out our most recent article about synthesizer polyphony for some creative ideas.
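The behavior of an ASR is easy to see in a few lines of Python. This is a generic sketch of the concept rather than a model of either module named above; the class name, the three-stage default, and the example voltages are our own choices.

```python
class AnalogShiftRegister:
    """A chain of sample-and-hold stages that shifts on every clock."""

    def __init__(self, stages=3):
        self.outputs = [0.0] * stages

    def clock(self, cv_in):
        # The new value enters stage 1; every older value moves one stage down.
        self.outputs = [cv_in] + self.outputs[:-1]
        return list(self.outputs)

# Usage: four notes from a pitch tracker, clocked in by four onsets.
asr = AnalogShiftRegister(stages=3)
history = [asr.clock(cv) for cv in [1.0, 2.0, 3.0, 4.0]]
# history[-1] == [4.0, 3.0, 2.0]: three voices playing your last three notes
```

Feed each output to a separate oscillator and you get a canon-like trail of your own playing, with every voice one note behind the previous one.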

As you can see, there is more to be done with audio signals than simply listening to them, and experimenting with how you use these control signals can be very rewarding. While we have by no means reached technological perfection in the realm of audio-controlled synthesis, it is nevertheless a powerful method for interacting with electronic music equipment, and holds a load of creative potential. We hope that this article has inspired you to try your own audio-controlled synth patches, and if you have already been investigating this field, perhaps you've discovered a couple of new ideas here that could improve your results and/or workflow. Stay tuned for more creative control strategies in future articles!