The definitive MIDI controller | This is not rocket science

RPC

On USB latency

Every so often someone mentions the dreaded USB latency. MIDI is MIDI and USB is USB: do we need to mix the two, and can that work reliably?

So let’s say we want to provide a Raspberry Pi with some USB MIDI connectivity, and run a software synth to produce sound. Since you would need some kind of a MIDI-to-USB conversion box in between, that must inherently produce some kind of sluggishness. Right?

Even worse, let’s run a sequencer on the Raspberry Pi, so that it must receive MIDI notes and knob twists played on a keyboard, and as a response send them to some other synthesizer to produce sound. That must be twice as bad, as the MIDI notes go once through the USB conversion to the Pi, and a second time back.

USB MIDI latency diagram

So how bad is that then? We can measure it!

The Arduino works fine as a simple MIDI device. Its serial ports can be easily configured for the 31250 bps bitrate needed by MIDI, and you only need a few extra components for interfacing with other MIDI devices. It’s not too difficult to program the Arduino to first send some MIDI commands to the Pi, and then make it wait for a reply, and at the same time measure the time that transaction takes to finish.

MIDI latency measurement setup

It’s an Arduino Mega 2560 under the Ethernet shield there! This is a good moment to reuse those MIDI boards from before… Here is the schematic of the MIDI interface electronics. Click for a larger picture.

MIDI UART schematic

Use only 4N28 optocouplers or similar with this circuit! 6N138 and PC-900 require a different schematic.

The MIDI0_RXD and MIDI0_TXD wires can be directly connected to the Arduino. The separate driver IC (74AC04 hex inverter suggested here) is not absolutely necessary, so those parts of the schematic are faded out. VCC for this schematic is 3.3V, and for 5V from a standard Arduino you only need to replace the two 56 ohm resistors with 220 ohm instead. If you look carefully, you can see the resistor values in the picture below.

The interface is very simple, and the MIDI interface boards I made earlier look empty with just the mandatory components populated… I soldered wires from the Arduino UART directly to the bottom of the board, on the pins of the through-hole components.

midi_latency_3

The program for the Arduino goes something like this. I left the timer functions out, get the whole source file instead!

bool measuring = false;

void setup() {
  // initialize UARTs
  
  // Arduino serial monitor
  Serial.begin(115200);
  
  // MIDI in/out
  Serial1.begin(31250);
}

void loop() {
  if (!measuring) {
    // not measuring yet, go!
    timer_start();
    measuring = true;
    
    // start sending a MIDI CLOCK byte
    char c = 0xF8;
    Serial1.write(c);
  }
  else {
    if (timer_overflow()) {
      // no reply before timer overflowed
      Serial.println("timeout");
      measuring = false;
    }
  }
}

void printTime(long time)
{
  // convert timer count to 0.1 milliseconds
  time = time * 10000 / timer_ticks_per_sec;
  Serial.print(time / 10);
  Serial.print(".");
  Serial.print(time % 10);
  Serial.println("ms");
}

void serialEvent1() {
  // if measurement was started, finish and report time
  if (measuring) {
    long endtime = timer_count();
    printTime(endtime);
    measuring = false;
  }
  
  // flush all input
  while (Serial1.available()) {
    Serial1.read();
  }
}

In short, the program sends one-byte MIDI clock commands through the MIDI out, waits for a reply, and prints the time in milliseconds from the start of the output to the end of the reception of the reply. It doesn’t check whether it gets the correct data back… You’re welcome to add that part yourself for practice. :-)
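One detail in printTime is worth a second look. On the AVR a long is 32 bits, so time * 10000 wraps once the count exceeds about 214,748 ticks. Whether that matters depends on the timer rate, which is not shown here. Below is a minimal sketch of the same conversion with a 64-bit intermediate; the function name and the 2 MHz rate in the usage note are my assumptions, not taken from the original source file.

```c
#include <stdint.h>

/* Convert a raw timer count to tenths of a millisecond.
 * ticks_per_sec stands in for timer_ticks_per_sec from the sketch
 * above (its real value is not shown in this post).
 * The multiplication is done in 64 bits, so it cannot overflow
 * for any 32-bit tick count. */
uint32_t ticks_to_tenths_ms(uint32_t ticks, uint32_t ticks_per_sec)
{
    return (uint32_t)(((uint64_t)ticks * 10000u) / ticks_per_sec);
}
```

With a hypothetical 2 MHz timer, ticks_to_tenths_ms(800, 2000000) gives 4, which would be printed as 0.4 ms.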

The first test should always be to connect the latency tester to itself in loopback, with a single MIDI cable from output to input. This way I measured about 0.4 milliseconds round-trip time, sometimes 0.3. The MIDI bitrate is 31250 bits per second, and a single transmitted byte includes one start bit and one stop bit, 10 bits total. A single MIDI output port can therefore transmit 31250/10 = 3125 bytes per second, or 1/3125 = 0.00032 seconds = 0.32 milliseconds per byte. Hey, that matches what we measured, since the program always sends just one byte at a time! Looking good. And unplugging the cable gives the message “timeout” as expected.
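The per-byte arithmetic is easy to get wrong by a factor of ten, so here it is as a small C helper. This is only an illustration of the calculation; the function name is made up for this example.

```c
#include <stdint.h>

/* Time needed to send one byte on a UART, in microseconds.
 * bits_per_frame counts the start bit, the data bits and the stop bit,
 * so it is 10 for MIDI (1 + 8 + 1). */
uint32_t uart_byte_time_us(uint32_t bits_per_sec, uint32_t bits_per_frame)
{
    return (bits_per_frame * 1000000u) / bits_per_sec;
}
```

For MIDI, uart_byte_time_us(31250, 10) evaluates to 320 microseconds, i.e. 0.32 ms per byte. A complete three-byte Note On message therefore needs three times that on the wire.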

The Edirol MIDI interface I used here has an OUT/THRU switch for each of its inputs, allowing direct output of whatever it receives on the input. With the switch engaged, I also measured 0.4 ms round-trip time. So the switch probably just connects the input port electrically straight to the output port.

Now it gets more interesting. With the USB MIDI interface plugged in and configured for both input and output, I start JACK on the Pi, using just the internal audio device since I don’t really care about the audio side for this experiment. Then in Patchage I just hooked the MIDI input directly to the MIDI output, in a loopback configuration. The MIDI commands from the Arduino will be received by the Linux software, the Jack2 daemon specifically, and sent back out as quickly as possible.

midi_loopback_patchage

The result: 1.4 ms round-trip time! Since it takes 0.32 ms to transmit a byte, and that must happen in order (first in, then out) the overhead of the data passing through USB drivers, the Linux kernel, and the userspace software is 1.4 – 2*0.32 = 0.76 milliseconds. Not bad! An ideal hardware sequencer could only react 0.76 milliseconds faster than the Pi. There is some variance in the results, sometimes it takes 2.0 to 2.3 ms to get a reply. Maybe it would be a good idea to add some kind of average measurement to the program.
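The averaging idea could look roughly like the sketch below: keep a count, a minimum, a maximum and a running sum, and report the mean. It also includes the overhead subtraction from the paragraph above. All names here are hypothetical, and the code is plain C rather than part of the actual Arduino program.

```c
#include <stdint.h>

/* Running statistics over round-trip samples, in microseconds. */
typedef struct {
    uint32_t count;
    uint32_t min_us;
    uint32_t max_us;
    uint64_t sum_us;
} latency_stats;

void stats_init(latency_stats *s)
{
    s->count = 0;
    s->min_us = UINT32_MAX;
    s->max_us = 0;
    s->sum_us = 0;
}

void stats_add(latency_stats *s, uint32_t sample_us)
{
    if (sample_us < s->min_us) s->min_us = sample_us;
    if (sample_us > s->max_us) s->max_us = sample_us;
    s->sum_us += sample_us;
    s->count++;
}

/* Mean of all samples so far, or 0 if there are none. */
uint32_t stats_mean_us(const latency_stats *s)
{
    return s->count ? (uint32_t)(s->sum_us / s->count) : 0;
}

/* Round-trip time minus the two serial byte transfers (in, then out). */
uint32_t usb_overhead_us(uint32_t round_trip_us, uint32_t byte_time_us)
{
    uint32_t wire_us = 2u * byte_time_us;
    return round_trip_us > wire_us ? round_trip_us - wire_us : 0;
}
```

Feeding it the samples 1400, 1400, 2000 and 2300 microseconds gives a mean of 1775, and usb_overhead_us(1400, 320) gives 760, the same 0.76 ms as computed by hand.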

The software running on the Pi would of course have to process the input it receives, and prepare the output, but an optimized program should not add much overhead on top of these measurements. For reference, I got the same 1.4 ms minimum latency with the exact same setup on my much more powerful desktop Linux PC, but overall the measured time varied much more, often peaking around 5 ms, probably due to Firefox and other programs running in the background.

But any Ethernet traffic or disk I/O immediately degrades the performance on the Raspberry Pi. Connecting to the Pi remotely with SSH and running some simple “ls” and “cd” commands immediately causes the round-trip times to jump up to 10 ms. That’s bad! A real-time kernel might help with the disk I/O, that could be worth investigating further. This page on the Linux-Sound wiki gives another clue: the Ethernet controller is connected to the CPU core over USB. Any Ethernet traffic may steal some bandwidth from all other USB communication. In the end it’s probably easier to just disable all unnecessary peripherals, as instructed on the Linux-Sound wiki.

Still the result is very reassuring: if the Pi is able to respond to MIDI in less than 2 milliseconds, it should be perfectly capable of using standard USB MIDI devices while running some sequencer software.

Next time I’ll measure the round-trip time of the RPC for comparison…


ATmega2560 as an SPI slave

SPI is an inter-IC bus, usually applied to connect MCUs with other peripherals. The usual setup consists of four synchronously controlled wires, with the device driving the bus called “master” and the responding device called “slave”. The protocol is extensively documented.

The SPI hardware implementation on the AVR is rather curious. The way the datasheet puts it, the interface is single buffered in transmit direction and double buffered in receive direction. By “double buffered” they mean that the byte that was just completely received is immediately moved to a second register, and is available to be read while the next byte is already streaming in. The program interested in the data may then retrieve the incoming byte whenever it suits it, as long as it does it before the next byte transfer has finished.

(The bus captures were taken with one of these USBee logic analysers from DX.)

input_ready

But in transmit direction the controller is not double buffered. Writes to the transmit register are immediately reflected in the state of the output, instead of being stored for later use. If the program writes into the transmit register while a byte transfer is ongoing, the bits that actually get sent out on the wire will be partially from the first register value, and the second part from the updated value. Therefore it is necessary for the value in the register to stay constant (and correct) the whole time of the transmission of that byte.

If the device is to transmit a particular stream of bytes, the program code running on the processor must update the register value between the individual bytes on the bus. And in case of a slave device, that means after the previous byte has been finished, before the master starts clocking in/out the next byte. Another interesting detail is that in the slave device, the received bytes are used as default data to output, at least on the ATmega2560. If the program does not update the data register SPDR, it will still contain the previously received byte when the transfer of the next byte begins, and the AVR will simply start transmitting that byte.
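The “received byte becomes the default output” behaviour can be pictured with a tiny byte-level model. This is a deliberate simplification (the real hardware shifts bit by bit), and the type and function names are invented for the illustration.

```c
#include <stdint.h>

/* Byte-level model of an SPI slave whose program never writes SPDR. */
typedef struct {
    uint8_t spdr;   /* stands in for the AVR data register */
} spi_slave_model;

/* One full-duplex byte exchange: the slave shifts out whatever is in
 * its data register, and ends up holding the byte the master sent. */
uint8_t spi_exchange(spi_slave_model *slave, uint8_t master_out)
{
    uint8_t slave_out = slave->spdr;
    slave->spdr = master_out;
    return slave_out;
}
```

If the master sends 0x11, 0x22, 0x33 to a slave that starts with 0 in the register, it reads back 0x00, 0x11, 0x22: an echo delayed by one byte.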

Luckily the master devices often have a short pause between individual bytes, so that there’s some time to set the data register value, and in the case of the Raspberry Pi, this is indeed also true.

output_ready

In a bus master device this controller setup is convenient, as the bus can be clocked (by the master itself) whenever data is ready to be sent out, and double buffering for received data is not strictly necessary. The program can always read the received byte before initiating a new transfer. But when the AVR is used as a slave, the transmit register update timing becomes a bottleneck; the master device cannot know how much time the slave needs to prepare the next data byte. But how bad can that be?

Taking the RPC as an example, the Raspberry Pi is the SPI bus master, and sends bursts of bytes to the Arduino. The bursts contain data that the ATmega MCU on the Arduino buffers and then feeds to the MIDI devices connected to its serial bus UART outputs (contrasted with the SPI bus, at a much slower bit rate). The ATmega also receives bytes from the serial MIDI inputs, processes the incoming MIDI commands, and possibly queues them for transmission to the Raspberry Pi for further processing. Below is a diagram of the setup.

rpc_queues

The whole idea of this structure is to allow the Arduino to run its other tasks while the Raspberry Pi is not talking to it. The sparser and shorter the SPI transmission bursts get, the more time the Arduino has for doing work between them. Ideally you’d then transfer data at the highest clock frequency you possibly can, so that each interruption is over as quickly as possible.

databurst

How does the ATmega program then deal with the SPI hardware? The ATmega datasheet only gives this short piece of code as a clue:

char SPI_SlaveReceive(void)
{
  /* Wait for reception complete */
  while(!(SPSR & (1 << SPIF)))
    ;
  /* Return Data Register */
  return SPDR;
}

(Yes, the Arduino libraries provide much nicer wrappers for this, but I’m going to ignore them here.)

Interestingly enough, the AVR instruction set has the instructions SBIS/SBIC that are meant as a quick way of testing bits in I/O registers, but those can’t be used with SPI on the ATmega2560 in particular. The SPSR register is placed too high in the I/O space to be reachable! Instead of

Wait_Transmit:
  sbis    SPSR, SPIF
  rjmp    Wait_Transmit

something more like this is needed, increasing the number of cycles by one per iteration:

Wait_Transmit:
  in      r0, SPSR
  sbrs    r0, SPIF
  rjmp    Wait_Transmit

More instructions in the loop of course means longer latency before the program gets to react to it. The worst case latency with this loop is 7 clocks (if I got it right), if the execution goes like this:

  in      r0, SPSR      ; SPIF bit is still off
  sbrs    r0, SPIF      ; 1) now SPIF is set, but we already missed it
  rjmp    Wait_Transmit ; 2) jump
Wait_Transmit:
  in      r0, SPSR      ; 1) now we read the new value with SPIF set
  sbrs    r0, SPIF      ; 3) now the bit is seen as set, so the rjmp is skipped
  rjmp    Wait_Transmit ; 0) no jump
                        ; ----
                        ; 7 clocks

Disregarding the loop latency, this approach is surely fine if you know that a byte is incoming, and if you have all the time in the world to spend waiting for it. It won’t work in the RPC: instead of just idly waiting there in the while loop, the RPC actually has work to do. Also, since the Arduino is the SPI slave, it must be available exactly when the master wants to talk to it (i.e. it had better get into that while loop at the right time, or some of the received data could be lost). Worse still, the SPI transmit buffer register has to be set properly on time, or the master will get garbage back. This calls for some preemption.

The AVR SPI controller can be configured to interrupt the CPU the moment a byte transfer is finished, as explained in the AVR151 application note and in much more detail at least on the rocketnumbernine blog. The fun starts when you implement this.

Most examples that I could find to describe the interrupt handlers suggested that the received byte(s) could be processed immediately in the interrupt handler, and nearly all of them received bytes one at a time, returning from the interrupt handler in between bytes. The closest to a proper analysis of a high performance SPI handler I could find was on matuschek.net, and I found another interesting writeup in Avian’s blog, but both were only dealing with a master device, and both suggested a loop to process the data. I got curious about what was really causing the slowness of the interrupt handlers; also I didn’t see a way around having the SPI transmission get initiated by an interrupt in the RPC.

interrupt_latency2

The problems appeared when going above 200 kHz bus clock speeds, and only got worse the higher the transfer rate got. The picture above shows a common issue: a missing bit. The cursor shows where the ‘1’ bit should have been. Sometimes the first bits (MSB) would be incorrectly high, sometimes low, depending on the last received byte.

interrupt_latency

I used an extra pin as an output to find the exact timing of the interrupt handler and the time it took to finish, by pulling the pin up while the handler was active. It turned out that exactly at those moments where the output was corrupted, the interrupt handler started to run too late. But it would be late much more often: in most cases, as in the picture above, the activity debug pin would show that the handler is late even though there is no data to be sent.

The first thing the handler did was to pull the ACT pin high. And already then it was late to react to the SPI byte burst; setting SPDR would no longer help, it had simply missed the first bit! But I did see the interrupt get triggered correctly also, as most of the time it would. Something else must have blocked it. Indeed I had also other interrupts in use, two for each UART, eight in total. So I rewrote those UART interrupt handlers, and that helped somewhat, but still the SPI interrupt would be late every now and then at higher bus clock rates. It also takes some time to set up and read the send queue before the first byte can be transmitted, on top of the interrupt latency.

To make the story short, I found that a very simple solution was to process all of the incoming bytes in the interrupt handler, in an unrolled loop, waiting for the SPIF just like in the polling examples, and to quit the loop and the handler if the SS line went high. Something like this:

ISR (SPI_STC_vect)
{
  // first byte has already been sent
  SPDR = 0; // second byte to transmit
  spibuf_receive.insert(SPDR); // keep received byte
  
  bool didtransmit = false;
  while (1)
  {
    // Wait for transfer to finish
    while (!(SPSR & (1 << SPIF)))
    {
      if (PINB & 1)
      {
        // SS is high
        SPDR = 0;
        return;
      }
    }

    if (didtransmit)
    {
      spibuf_send.popbyte(); // remove sent byte from queue
    }

    // Set output, get input
    SPDR = spibuf_send.peekbyte(); // read but don't pop frontmost byte
    spibuf_receive.insert(SPDR);
    didtransmit = true;
  }
}

The first two bytes were hard to exploit, so I just had them set to zero. The SPI interrupt handler would run with interrupts disabled, and as long as the SPIF-high-detection was fast enough, all the data bytes starting with the third were stable.

Due to the interrupt latency the second byte was not reliable to use, the handler might miss one or two MSBs.

The first byte was also difficult, and could only be reliably transmitted if SPDR was set early enough. Since the SPI slave would simply start sending on the bus whatever was in the SPDR when the master was clocking the bus, it looked like the register would have to be set before the transmission begins. The register could indeed be set at the end of the last interrupt with the value of the next byte from the transmit queue. Or if the queue was empty, it could be set to zero, and updated whenever a byte was added in the queue from the main program… bah, it got really difficult beyond that, with atomic updates and whatnot. It was much easier to just leave the first byte zero as well, by clearing SPDR at the end of each SPI interrupt.

But that wasn’t enough. The compiled code was too slow. It would do unnecessary things and in the wrong order. The compiler always wanted to push all the clobbered registers on the stack before being able to set SPDR initially. It would use unnecessarily many registers, resulting in more push and pop operations than what seemed necessary. It didn’t know how to make use of the fact that my queues were aligned in memory to generate faster code.

So I wrote it in assembly. And it worked fine. But it wasn’t nice. And I found another trick that should have been completely obvious. I could hook the SS line itself: in addition to its role as SS, it also worked fine as PCINT0!

pcint0

The Raspberry Pi leaves a generous amount of time between SS-low and the beginning of the first byte. I could use that time to both set the SPDR to some useful initial value and to prepare for sending the next byte. The next critical moment was only when the first byte had been completely received (and transmitted), and SPDR had to be reloaded. With the new approach there was plenty of time to initialize all the registers and fetch the first data byte.

For the first byte I chose to send out the total number of bytes in queue, to help schedule further requests on the Raspberry side; 8 AVR instructions, total some 10 clocks plus interrupt latency. Then the handler pushes the values in all the rest of the needed registers, gets the first data byte to send out from the queue, and waits for SPIF. On SPIF it immediately (almost…) outputs the next byte and spends some time updating queue indices and storing the received byte, before repeating the loop to wait for the next SPIF. At the 2 MHz bus speed that I aimed for, it would still sometimes take too long for the SPIF to be detected, indicating that the SPIF/SS loop was still too slow. So I unrolled it out asymmetrically, with SS (LSB of PINB) checked less often than SPIF.
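For readers who prefer C to assembly, here is a rough sketch of the kind of queue this scheme relies on: a 256-byte buffer with 8-bit read and write positions that wrap around on their own, so that “bytes in queue” is just write position minus read position. The struct and function names are hypothetical; only the write/read position layout mirrors what the assembly handler uses.

```c
#include <stdint.h>

/* 256-byte FIFO with 8-bit indices; uint8_t arithmetic wraps at 256. */
typedef struct {
    uint8_t write_pos;   /* next slot to write */
    uint8_t read_pos;    /* next slot to read  */
    uint8_t data[256];
} byte_fifo;

/* Number of queued bytes; this is the value sent as the first SPI byte. */
uint8_t fifo_count(const byte_fifo *q)
{
    return (uint8_t)(q->write_pos - q->read_pos);
}

/* Returns 1 on success, 0 if the queue is full (one slot stays unused). */
int fifo_push(byte_fifo *q, uint8_t b)
{
    if (fifo_count(q) == 255) return 0;
    q->data[q->write_pos] = b;
    q->write_pos++;
    return 1;
}

/* Read the frontmost byte without removing it. */
int fifo_peek(const byte_fifo *q, uint8_t *out)
{
    if (fifo_count(q) == 0) return 0;
    *out = q->data[q->read_pos];
    return 1;
}

/* Remove the frontmost byte once it is known to have been sent. */
int fifo_pop(byte_fifo *q)
{
    if (fifo_count(q) == 0) return 0;
    q->read_pos++;
    return 1;
}
```

Keeping peek and pop separate matches the handler's behaviour of advancing the read position only after the transfer of that byte has actually finished.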

Here’s what it looks like now… The unrolling is perhaps slightly excessive.

ISR (PCINT0_vect, ISR_NAKED)
{
	__asm__(
//		"sbi	0x11, 1		$" // set ACT pin

		// prepare work registers
		"push	r24		$"
		"push	r0 		$"
		
		// store SREG
		"in	r24, 0x3f	$"
		"push	r24		$"
		
		// no other registers used yet
		
		"lds	r24, spibuf_send+1	$" // output buffer read pos
		"lds	r0, spibuf_send		$" // output buffer write pos
		
		// 1st byte to transmit is the number of bytes in queue
		"sub	r0, r24		$"
		"out	0x2e, r0	$"
		
		// now there's plenty of time to do the rest of the preparation
		
		// was #SS low at all?
		"in     r0, 0x03	$" // PINB
		"sbrc	r0, 0		$" // (PINB & 1) = #SS
		"rjmp	5f		$" // if (PINB & 1) = #SS is high, quit
		
		// push the rest of the registers
		"push	r22		$"
		"push	r23		$"
		"push	r25		$"
		"push	r26		$"
		"push	r27		$"
		"push	r28		$"
		"push	r29		$"
		"push	r30		$"

		// retrieve fifo pointers
		"lds	r26, spibuf_send+1	$" // output buffer read pos
		"lds	r30, spibuf_send	$" // output buffer write pos
		"ldi	r27, hi8(spisend_b)	$" // load output buffer address
		"lds	r28, spibuf_receive	$" // input buffer write pos
		"ldi	r29, hi8(spireceive_b)	$" // load input buffer address

		// there is no input yet, skip first part of loop
		"jmp	6f			$"

		// main receive loop
		"4:				$"
		"in	r24, 0x2e		$" // get Nth byte in

		// update output buffer read pos: one byte has been consumed
		// r26: buffer read pos after the transmit that is about to start
		// r23: buffer read pos after the transmit that just finished
		"sts	spibuf_send+1, r23	$"

		// keep input byte
		"st	Y, r24			$" // store input byte
		"inc	r28			$" // increment write pos
		"sts	spibuf_receive, r28	$" // input buffer write pos

		"6:"
		// if the xmit finishes, the output buffer read pos must be
		// updated; the xmit means that the previous transmitted
		// byte was transferred correctly, not the one we are preparing
		// here, and therefore the previous value of r26 is kept safe
		// in r23.
		"mov	r23, r26	$" // output buffer read pos
		
		// retrieve next byte to be transmitted
		"clr	r25		$" // zero if no data in queue
		"cp	r30, r26	$" // got data in send queue?
		"breq	1f		$" // jump if not
		"ld	r25, X		$" // read byte to output
		"inc	r26		$" // increment index
		"1:			$"

//		"cbi	0x11, 1		$" // clear ACT pin

		// wait for next byte (SPIF) or end of transmission
		// have to react quickly to SPIF, so interleave
		// #SS checks with multiple SPIF checks
		"1:			$"
	
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2

		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2
		
		"in     r0, 0x03	$" // PINB
		
		"in	r24, 0x2d	$" // SPSR
		"sbrc	r24, 7		$" // (SPSR & SPIF)
		"rjmp	2f		$" // if SPIF is set, goto 2

		"sbrc	r0, 0		$" // (PINB & 1) = #SS
		"rjmp	3f		$" // if #SS high, goto 3

		"in	r24, 0x2d	$" // SPSR
		"sbrs	r24, 7		$" // (SPSR & SPIF)
		"rjmp	1b		$" // if SPIF is cleared, goto 1
		"2:			$"

		// previous xmit is done, do the next one
		"out	0x2e, r25	$" // set Nth byte out
//		"sbi	0x11, 1		$" // set ACT pin
		"jmp	4b		$" // most critical part is done, loop

		// arrive here if SS high
		"3:			$"
//		"sbi	0x11, 1		$" // set ACT pin
		"out	0x2e, r1	$" // clear SPDR just in case
		
		// teardown
		"pop	r30		$"
		"pop	r29		$"
		"pop	r28		$"
		"pop	r27		$"
		"pop	r26		$"
		"pop	r25		$"
		"pop	r23		$"
		"pop	r22		$"

		// arrive here if SS high before first byte
		"5:			$"
		"pop	r0		$" // return old SREG
		"out	0x3f, r0	$" //
		"pop	r0		$"
		"pop	r24		$"
		"sbi	0x1b, 0		$" // ack PCINT0
//		"cbi	0x11, 1		$" // clear ACT pin
		"reti			$"
	       );
}

This code currently works perfectly at 2 MHz. At the moment I don’t think it’s possible to make it fast enough for 4 MHz, but there were also some other issues at that bitrate. It was as if the SPI hardware itself was too slow and was losing bits; perhaps my signal wires are not good enough. But 2 MHz is plenty enough for the mere 16 MHz ATmega2560!
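To put those bus speeds in perspective, here is the cycle budget as a back-of-the-envelope calculation in C. The helper names are made up, and the numbers ignore the pause the Raspberry Pi leaves between bytes, which gives some extra slack in practice.

```c
#include <stdint.h>

/* CPU clock cycles available during one SPI bit. */
uint32_t cpu_cycles_per_spi_bit(uint32_t cpu_hz, uint32_t spi_hz)
{
    return cpu_hz / spi_hz;
}

/* CPU clock cycles available during one 8-bit SPI transfer. */
uint32_t cpu_cycles_per_spi_byte(uint32_t cpu_hz, uint32_t spi_hz)
{
    return 8u * cpu_cycles_per_spi_bit(cpu_hz, spi_hz);
}
```

At 16 MHz CPU clock and 2 MHz SPI clock that is 8 cycles per bit and 64 per byte; at 4 MHz it drops to 4 cycles per bit and 32 per byte. The 7-clock worst case of the simple polling loop shown earlier is already almost a full bit time at 2 MHz, which is why the SPIF checks had to be unrolled.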


The RPC hardware

The first RPC build is finally done!

front

back

Click the pictures for bigger versions. The photo below is a quick shot of the system up and running. The next step of the project is to implement some rudimentary software for the Raspberry Pi. I was thinking of starting off by writing a small arpeggiator first…

powered

Yes, it says “Hello, world!” on the screen. Look closer.


The internals turned out surprisingly clean, given that there’s quite some wiring going on in there. Click for high-resolution pictures.

open_angle

open_top

The Raspberry Pi is placed on the left, and the Arduino Mega 2560 on the right. They are connected by both USB and SPI, but USB is only usable for power supply and reflashing; I’m using all the serial ports of the ATmega for MIDI, so the Arduino USB port cannot be used for communications! The 3.3V/5V SPI bus level conversion and power distribution board is placed in between them (the devices use different operating voltages, remember the architecture?) The yellow stripboards near the back are the MIDI inputs (left) and outputs (right). The single rightmost board is the 5V power supply.

The USB hub is visible at the very bottom on the left, bound to the front panel with a wooden fastener plate. I tried hard to keep the front panel clean, so the only visible elements in addition to the USB ports are the 8 LEDs and the power switch. On the back I used an RJ45 Ethernet coupler to tidy up the cabling, and an HDMI/VGA converter for the VGA display.

Finally, after taking these pictures, I attached similar wooden fastener bars behind the MIDI ports to give them some more strength. Otherwise plugging and unplugging devices would quickly tear off the sockets and break the PCBs.

I made a few small mistakes, for example the MIDI port holes in the back of the enclosure aren’t perfectly aligned with the sockets. The signal ground connections between the boards are far from ideal. The USB cable from the Raspberry Pi to the Arduino is way too long, because I didn’t have a suitable shorter cable at hand. There’s too little space around the Raspberry Pi to mount it properly, and I couldn’t fit one mounting fastener in because of that. But all of these can be easily fixed for the second revision.

As I wrote earlier, the enclosure was designed by myself and manufactured by Protocase, exactly per my specification. And the quality is very nice: the measurements are perfect and there are no sharp corners anywhere. The case is made of steel, and it has a good feel of solidity to it.

All in all, it’s working perfectly! Success!


Unpacking RPC

The RPC enclosure arrived from the factory! Here are some unpacking pictures, more to come…

The quality of the work and finish looks indeed very nice!

IMG_3052

IMG_3054

IMG_3058

IMG_3059


RPC architecture

The RPC is built from the bottom up as a precise and reliable MIDI sequencer. It consists of an autonomous MIDI core capable of processing 4 serial MIDI inputs and outputs and a master SPI bus, controlled by the internal high-frequency timers of the Arduino Mega 2560.

The core can generate a MIDI clock or synchronize with an external clock, and implements a full scheduling facility with MIDI message priority levels, allowing for ideal MIDI message timing even during bus contention. Messages can be routed in real-time directly from any MIDI port or the SPI bus to any other port and filtered against a table of rules. All messages are automatically interleaved with other traffic on the same buses.

The RPC core takes the role of an SPI slave device. The SPI bus protocol is designed upon the idea that the SPI master is very likely running sequencer software, and can thus buffer in song data over SPI slightly ahead of time. The RPC SPI bus protocol is designed with a relatively high level of abstraction, hiding details of internal memory organization and data management.

rpc design

The MIDI core is controlled by the Raspberry Pi sequencer. The sequencer handles all user interaction and also supports a number of different methods of generative MIDI control, such as arpeggiators, chord harmonies and creative transpositions and transformations in diatonic scales. Thanks to the separate autonomous MIDI core, the real-time requirements for the sequencer are much less strict than those of a traditional software sequencer. The choice of (G)UI for the RPC will be the tracker. Songs can be saved on the internal SD card or exported on USB sticks.

The sequencer will most likely be extensible with Python.

The Raspberry Pi hardware has such a great amount of features that care must be taken to make best use of them. For this project only a limited feature set is actually used: the audio interfaces are neglected to provide as good MIDI performance as possible. The Pi is packaged into the RPC and is not meant to be “in the open” for hacking (but of course you can just open the cover of the box and do whatever you like with it.)

The primary goal of this project is to build a unit that is reliable, robust and resistant to physical disturbances such as power loss. The secondary goal is to provide a smooth user experience and to integrate the device as part of a live system, either as the centerpiece of a studio setup (producing MIDI clock for other devices) or as an instrument (receiving MIDI clock and controlling one or more synthesizers).


RPC enclosure design

I spent some time playing with the Protocase Designer, and finished the first sketch design for the RPC rack enclosure!

Just to remind you, there are two projects going on in parallel at the moment that I’ve documented on this site. This enclosure is for the RPC: a MIDI sequencer/tracker built of common parts. The heart of the RPC is the Arduino Mega 2560, pumping data and clock sync through four MIDI inputs and outputs, and alongside it a Raspberry Pi takes care of the graphical user interface and other high-level activities such as feeding song data to the Arduino.

RPC enclosure rev1

The cutouts in the front are for 4 USB ports (mouse, keyboard, Akai MPD controller, USB stick perhaps?) and a power switch, and in the back for the power cable+fuse, 4+4 MIDI ports, VGA connector for the display and Ethernet for whatnot.


Raspberry Pi cross-compilation

In the end it took several hours to set up a working cross-compilation environment for the RPi. The one that worked was the Vagrant/VirtualBox+QEmu setup from https://github.com/nickhutchinson/raspberry-devbox .

The instructions are rather short and self-explanatory, and what you end up with is a Vagrant (VirtualBox) VM with a Scratchbox/QEmu cross-compiler environment inside. Nice! Even though it’s a bit slow at least with the default 380MB memory share it gets, it was able to build Armstrong and the UI projects without any tricks.

Of the other toolchains I tried, the QT cross-compilation environment from http://qt-project.org/wiki/RaspberryPi_Beginners_guide compiled everything, but the binaries simply wouldn’t run on the device. Some problem with the instruction set dialect or floating-point support, I guess. I also tried http://www.bootc.net/archives/2012/05/26/how-to-build-a-cross-compiler-for-your-raspberry-pi/ but couldn’t get the linker paths to work out right. Armstrong depends on several libraries and fbgui on even more; for these experiments I copied those as binaries from the RPi box itself after installing the packages there with apt-get.

With the Vagrant raspberry-devbox all of this was trivial: just boot the Vagrant VM, jump into the scratchbox as root, and apt-get all the necessary packages. And packages you need: this was the final list (plus dependencies I didn’t include here, but apt-get would find automatically) that worked:

libjack-dev libjack0 libportmidi0 libportmidi-dev libboost-dev libsndfile1-dev libsndfile1 libboost1.49-dev liblua5.2-0 liblua5.2-dev libsdl1.2debian libsdl1.2-dev autoconf libsqlite3-dev

The build command-line turned out to be something like this:

CFLAGS="-I../target/include -L../target/lib" CPPFLAGS="-I../target/include -L../target/lib" CXXFLAGS="-I../target/include -L../target/lib" ./configure --prefix=/home/vagrant/rpc-buze/target && make && make install

Nothing out of the ordinary then, as it should be. After pushing the binaries from target/ to the RPi, it just works!

rpc-buze first build