ANATOMY OF A BARE-METAL SYNTH, PART 4
This is the fourth part in a series about a bare-metal synth. It assumes knowledge of pin multiplexing and the Daisy UART driver from the previous posts.
Daisy Synth Code
We’ve now reached the part of the Daisy diagram where we get to do something interesting with the UART MIDI signal described in the last few posts.
We can use the Daisy MIDI library to parse the UART signal as MIDI note-on and note-off messages and generate audio samples to send to the audio hardware based on those messages.
This synth combines a few Daisy examples (specifically PolyOsc and PluckEcho) with some edits to give us polyphony, a nice reverb, and an ADSR envelope / filter on each voice’s oscillator.
The source code for this is pretty small, a couple hundred lines, and lives here.
Here’s what the synth sounds like as well:
There are several high level components that make up the program:
DaisySeed seed: The interface that libDaisy exposes to talk to the crucial parts of the hardware on the Daisy Seed. In this program, we use it to talk to Daisy’s audio codec.
MidiHandler midi: The interface to the UART hardware on the Daisy that listens for UART messages and translates them to MIDI that we can use in our program
VoiceManager<24> voice_handler: A manager of all the voices active in our program. Each voice has an oscillator, and the manager handles passing our MIDI messages to those voices and generating audio that represents all active voices
ReverbSc verb: A Sean Costello reverb module from DaisySP that we run the voice_handler audio through to mix with reverb
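Hypothetically, the top of the program might declare those components something like this (a sketch only; exact types and declarations depend on the libDaisy/DaisySP versions in use, and VoiceManager is the program's own class):

```cpp
#include "daisy_seed.h"
#include "daisysp.h"

using namespace daisy;
using namespace daisysp;

// The four top-level components described above. The VoiceManager
// template argument fixes the polyphony at 24 voices.
static DaisySeed        seed;          // hardware interface (audio codec, etc.)
static MidiHandler      midi;          // UART -> parsed MIDI messages
static VoiceManager<24> voice_handler; // owns and mixes the 24 voices
static ReverbSc         verb;          // Sean Costello reverb from DaisySP
```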
Program Entry Point
We’ll start going over the program by taking a look at its entry point, main. main begins by setting up the seed hardware. The Configure and Init calls set up the core functionality of the Daisy, such as SDRAM, QSPI, the audio codec, and USB. UART and MIDI are handled separately (since not all applications use them), so we initialize the MIDI handler ourselves; internally, it initializes the USART1 peripheral, multiplexes the GPIOs to receive MIDI, and sets up a DMA interrupt to populate the MIDI handler with incoming messages.
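A sketch of that setup, assuming an older libDaisy MidiHandler API (newer versions configure the handler through a Config struct instead):

```cpp
int main(void)
{
    // Core hardware bring-up: clocks, SDRAM, QSPI, audio codec, USB, etc.
    seed.Configure();
    seed.Init();

    // MIDI is initialized separately: this sets up USART1, multiplexes the
    // RX pin for MIDI, and arms the DMA interrupt that feeds the handler.
    midi.Init(MidiHandler::INPUT_MODE_UART1, MidiHandler::OUTPUT_MODE_NONE);

    // ... effects init, audio/MIDI start, and the main loop follow.
}
```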
After the initial hardware setup, we initialize the two main effect components of our program: the reverb and the voice handler. The reverb just requires that we pass in some initial params, and the voice handler allocates and initializes all 24 of its internal voices.
Now that our effects and hardware are initialized, we are ready to ask the seed to start sending us audio interrupts and the midi interface to send us MIDI messages that come in on our GPIO. Note that the seed will call the function we pass in, AudioCallback, whenever an audio interrupt fires, which happens when the audio hardware needs more samples (more on that later).
The last part of our program runs indefinitely until the program is quit. Essentially we ask for any MIDI messages that have come in on the UART driver, and we respond to those messages whenever they arrive. Let’s dive into that MIDI handling.
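Roughly, that start-up and event loop might look like this (HandleMidiMessage is an assumed name for the program's MIDI dispatch function; the midi method names follow an older libDaisy API):

```cpp
    // Hand the audio driver our callback and start MIDI reception.
    seed.StartAudio(AudioCallback);
    midi.StartReceive();

    // Main loop: pump the UART driver and dispatch any complete
    // MIDI messages that have arrived since the last pass.
    while(1)
    {
        midi.Listen();
        while(midi.HasEvents())
        {
            HandleMidiMessage(midi.PopEvent());
        }
    }
```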
AudioCallback is not called explicitly by our code – remember that this is called externally by the audio hardware,
whereas we are the ones explicitly driving the MIDI callback.
Let’s talk about the callback that happens when we receive a MIDI message. All told, this is pretty much the bare minimum MIDI handling for a playable synth. If interested, there is a table of all the MIDI messages here. We only respond to Note On and Note Off messages that come in; everything else is dropped on the floor.
Really the only thing we do is inform the voice handler that a note on or off happened, which note was affected, and what the velocity was. The voice handler uses this to start or stop notes on different voices.
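That handler can be sketched like so (MidiEvent and its accessors are libDaisy types; the OnNoteOn/OnNoteOff method names on voice_handler are assumptions):

```cpp
// Respond only to Note On and Note Off; every other message type is ignored.
void HandleMidiMessage(MidiEvent m)
{
    switch(m.type)
    {
        case NoteOn:
        {
            NoteOnEvent p = m.AsNoteOn();
            voice_handler.OnNoteOn(p.note, p.velocity);
            break;
        }
        case NoteOff:
        {
            NoteOffEvent p = m.AsNoteOff();
            voice_handler.OnNoteOff(p.note, p.velocity);
            break;
        }
        default: break;
    }
}
```

One real-world wrinkle this sketch ignores: some devices send a Note On with velocity 0 instead of a Note Off.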
The voice handler itself does not generate any audio; it just wrangles available voices. The number of voices is set at compile time via a template argument that we pass in. Currently it is set at 24, which means 24 individual notes can be played at once. When the polyphony limit is reached, no new notes will sound until an existing note is silenced via a note off or the release portion of an existing note finishes. In code, that looks like the following:
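The allocation logic can be sketched in isolation. In this simplified stand-in, Voice is reduced to just its bookkeeping fields, and FindFreeVoice/OnNoteOn/OnNoteOff are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

template <size_t max_voices>
struct VoiceManager
{
    struct Voice
    {
        bool    active   = false; // still producing audio (incl. release tail)
        bool    gate     = false; // key currently held down
        uint8_t note     = 0;
        uint8_t velocity = 0;

        void OnNoteOn(uint8_t n, uint8_t v)
        {
            active   = true;
            gate     = true;
            note     = n;
            velocity = v;
        }
        void OnNoteOff() { gate = false; } // release handled in the DSP code
    };

    Voice voices[max_voices];

    // First voice not currently sounding, or nullptr when polyphony is hit.
    Voice *FindFreeVoice()
    {
        for(auto &v : voices)
            if(!v.active)
                return &v;
        return nullptr; // all voices busy: the new note is dropped
    }

    void OnNoteOn(uint8_t note, uint8_t velocity)
    {
        if(Voice *v = FindFreeVoice())
            v->OnNoteOn(note, velocity);
    }

    // Every voice servicing this note gets its gate closed.
    void OnNoteOff(uint8_t note, uint8_t /*velocity*/)
    {
        for(auto &v : voices)
            if(v.active && v.note == note)
                v.OnNoteOff();
    }
};
```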
See that when we get a note on, we look to find if any voices are available; if so, tell that voice to start servicing the note. When we get a note off, we go through all the voices and trigger note off behavior for every voice servicing the current note.
The last interesting part of the handler is its audio-processing function. Again, there is no direct dealing with oscillators, filters, etc.; this is the function exposed to the audio callback, as we'll see later, and it just sums up the audio from each individual voice.
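That summing function might look something like this (Process is an assumed name; it would be called once per output sample from the audio callback):

```cpp
    // Mix: ask every voice for its current sample and sum them.
    float Process()
    {
        float sum = 0.f;
        for(auto &v : voices)
        {
            sum += v.Process();
        }
        return sum;
    }
```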
The next part to dig into is how each individual voice functions.
Each voice is made up of three DSP objects made available by the DaisySP signal processing library, described in the following comments:
Oscillator osc_: The oscillator generates the actual tone of the voice at the given MIDI note. It can generate many types of signals; we chose the polyblep triangle wave configuration.
Svf filt_: The state variable filter controls the harmonic contour of the oscillator, producing outputs for several types of filters. We use the low pass output.
Adsr env_: The envelope controls the amplitude contour of each “note” generated by the oscillator, producing an attack, decay, sustain, release (ADSR) envelope for each note on.
Some more reading here if interested in the implementations of these components:
The only time these components are used directly in our MIDI handling is when a note on comes in: we set the frequency of the oscillator according to the given note (using the mtof convenience function to convert a MIDI note number to a frequency in Hz). The rest of the handling simply sets flags that are used in the process function to drive the other components.
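mtof implements the standard equal-temperament conversion. DaisySP ships its own mtof; the version below is just the underlying formula, written standalone for illustration, with the assumed note-on flag handling shown in comments:

```cpp
#include <cassert>
#include <cmath>

// MIDI note number -> frequency in Hz (12-TET, A4 = MIDI note 69 = 440 Hz).
static float mtof(float midi_note)
{
    return 440.0f * std::pow(2.0f, (midi_note - 69.0f) / 12.0f);
}

// In the voice's note-on handler, it would be used roughly like:
//   osc_.SetFreq(mtof(note));
//   velocity_ = velocity;
//   env_gate_ = true;   // flag picked up later by the process function
```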
The last interesting part to check out here is the process function, where we use those flags we just mentioned.
Not a lot of code! But let’s break down each section a bit. First, we service the envelope:
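That step might look roughly like this, assuming DaisySP's Adsr interface (env_gate_ and active_ are the voice's own flags):

```cpp
    // Advance the ADSR by one sample. While env_gate_ is true we move
    // through attack/decay/sustain; when it goes false, release runs.
    float amp = env_.Process(env_gate_);

    // Once the release has fully died out, stop doing DSP work for this
    // voice and just emit silence until the next note on.
    if(!env_.IsRunning())
    {
        active_ = false;
    }
    if(!active_)
    {
        return 0.0f;
    }
```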
If we received a note on for this voice, env_gate_ will be true. As long as this is true, subsequent calls to Process will work through the attack, decay, and sustain portions of the envelope. When we receive a note off and env_gate_ becomes false, the final "release" part of the envelope will be triggered. If no new note on comes in after the release finishes, we set active_ to false as an optimization, simply passing audio samples of 0.0f until we receive another note on. The envelope's Process itself gives us a coefficient, amp, which we will use to modify the amplitude (volume) of the oscillator for this particular sample.
Next, we service the oscillator:
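This is a single call, assuming DaisySP's Oscillator API:

```cpp
    // Next raw sample of the polyblep triangle wave at the current pitch.
    float sig = osc_.Process();
```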
sig is the next raw sample in the oscillator’s waveform that we are outputting for the current pitch (MIDI note). The oscillator itself
continually generates audio samples; it doesn’t know that we are using it to create notes using the envelope. (In other words, the envelope and
tone generation are decoupled; the oscillator is always running at full volume and we determine the current actual volume later by multiplying by a handful of coefficients.)
Then we run the oscillator sample through our filter:
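Again a single call, this time to DaisySP's Svf:

```cpp
    // Feed the raw oscillator sample through the state variable filter;
    // this updates its internal low/high/band/notch outputs.
    filt_.Process(sig);
```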
We pass that raw sample of the polyblep tone into the SVF (state variable filter), which runs the signal through its filter coefficients and stores the result of each of its internal filter outputs for use later.
Then we get the actual output sample from all those pieces-parts:
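A sketch of that final step, using the variable names from this hypothetical Voice class:

```cpp
    // Low pass output, scaled by note velocity (0-127 -> 0.0-1.0) and by
    // the envelope coefficient computed above.
    return filt_.Low() * (velocity_ / 127.0f) * amp;
```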
We take the low pass output sample that we just generated by running the oscillator's current sample through the filter, and multiply it by two volume coefficients: one is the velocity_ of the current MIDI note, a value between 0 and 127 (usually corresponding to how hard the user hit the key) normalized to the 0.0-1.0 range; the other is the amp coefficient that we generated before, which corresponds to where we are in the current envelope.
Now that we have covered the internals of the synth, let's look at the final piece: the AudioCallback. As mentioned before, the AudioCallback is a function with a special signature that we pass to the audio driver. When the audio driver wants more output samples, it interrupts our program with this call so that we can fill the output buffer.
See the previous post for a discussion about interrupts.
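Putting the pieces together, the callback might look roughly like this (a sketch assuming a non-interleaving callback signature with per-channel buffer pointers; exact signatures differ between libDaisy versions, and the 0.5/0.45 mix levels are illustrative):

```cpp
void AudioCallback(float **in, float **out, size_t size)
{
    // Named for clarity instead of indexing out[0]/out[1] everywhere.
    float *out_left  = out[0];
    float *out_right = out[1];
    float  dry, send, wet_l, wet_r;

    for(size_t i = 0; i < size; i++)
    {
        // Mono sum of all active voices, halved to leave headroom.
        dry = voice_handler.Process() * 0.5f;

        // An attenuated copy of the dry signal feeds the reverb...
        send = dry * 0.45f;
        verb.Process(send, send, &wet_l, &wet_r);

        // ...and dry + wet goes out to the codec on each channel.
        out_left[i]  = dry + wet_l;
        out_right[i] = dry + wet_r;
    }
}
```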
We can break this one down as well. The callback starts by initializing our variables. The left and right channels are named just for convenience/clarity, rather than accessing the buffer’s first and second elements directly throughout. Then we initialize the sample that we will use for the dry (mono) synth signal, the send to the reverb (whose amplitude we want to attenuate to mix with the dry signal), and the stereo wet samples coming out of the reverb that we will mix with the dry signals.
Next we will loop to generate all of the requested samples. This is a standard pattern you’ll see in every audio application; in this case, we are processing both left and right channels at once since we know we will always be stereo, but it’s common to also see an inner loop here to dynamically process a requested number of channels.
In the loop, we first grab the latest input sample from the voice handler. As we discussed before, this is the cumulative sum of all active voices. We cut it in half to avoid clipping, since we will sum this mono dry signal with stereo reverb signals in a moment.
The “send” is just the dry signal, attenuated by a little less than half. We pass that signal into the reverb module, an algorithmic feedback-delay-network (FDN) reverb based on a design by Sean Costello. From that we get a left and right stereo reverb frame.
Finally, we sum the left and right reverb samples with the dry sample and put it in the output buffer that gets sent to the audio codec.
And that’s pretty much it! Next up we will go over where that audio goes after it is generated by our application.