RPiDAC - producing audio without a sound card

Excuse me…?

Yes, you read that right. This post shows how to produce sound with a Raspberry Pi without using its integrated sound card.

This idea was originally created for an IoT course I took; I thought it would amuse a wider audience too, so I decided to write a blog post about it.

How does that ever work?

Picture of the construct

As some may have guessed, this is a fairly typical 5-bit R-2R resistor ladder. The same kind of construct was used, for example, in the Covox Speech Thing. By switching the digital inputs between low and high, a varying output voltage can be generated easily.

For this specific implementation, BCM pins 17, 27, 22, 13, 19 were used, ordered from least significant to most significant.
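
To make the ladder's behaviour concrete, here is a minimal Ruby sketch of the ideal case. It assumes an unloaded, perfectly matched ladder and a 3.3 V high level on the GPIO pins; V_REF, ladder_voltage and pin_levels are names made up for this illustration and are not part of the project code.

```ruby
# Ideal, unloaded 5-bit R-2R ladder: Vout = Vref * code / 2^bits.
# V_REF is an assumption (the Pi's nominal 3.3 V GPIO high level).
V_REF = 3.3
BITS = 5

# Returns the ideal output voltage for a 5-bit code (0..31).
def ladder_voltage(code, v_ref = V_REF, bits = BITS)
  raise ArgumentError, "code out of range" unless (0...(1 << bits)).cover?(code)
  v_ref * code / (1 << bits)
end

# Splits a code into per-pin levels, least significant bit first,
# matching the pin order given above (BCM 17, 27, 22, 13, 19).
def pin_levels(code, bits = BITS)
  (0...bits).map { |i| (code >> i) & 1 }
end

[0, 1, 16, 31].each do |code|
  puts format("code %2d -> pins %s -> %.3f V", code, pin_levels(code).inspect, ladder_voltage(code))
end
```

In practice the real output will deviate from these figures, since the resistors have tolerances and the load (headphones or a buzzer) pulls the voltage down.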

Handwritten schematics

Where did this idea originate?

The idea has a backstory. Some time earlier I had bought some random electronics components and a breadboard, but I had not yet found a use for them. Looking through the components, I stumbled upon a pair of buzzers: an active one with an integrated oscillator and a passive one without.

Whilst testing those with the Pi, a wish was expressed that I’d play something more musical. And that’s how this got started.

Software implementation

Essentially, what is required from the software is that it can read an audio file in a suitable format, and then appropriately switch the proper GPIO pins.

For this, a special 5-bit format was invented. The file is purely raw audio data, but the 5-bit samples are packed back to back across the native 8-bit bytes, so eight samples occupy exactly five bytes; encoding and decoding have to account for samples that straddle a byte boundary. From here on, this format is called 5da.
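
The packing can be sketched as follows. This is an illustrative round trip rather than the project's actual code; pack_5da and unpack_5da are hypothetical helper names, and the bit order follows the layout used by the converter and player in this post: each sample least significant bit first, each byte filled from its most significant bit.

```ruby
# Packs eight 5-bit samples (0..31) into five bytes.
# Each sample is emitted least significant bit first; the resulting
# 40-bit stream fills each byte from its most significant bit down.
def pack_5da(samples)
  raise ArgumentError, "need exactly 8 samples" unless samples.length == 8
  bits = samples.flat_map { |s| (0...5).map { |i| (s >> i) & 1 } }
  bits.each_slice(8).map { |byte_bits| byte_bits.inject(0) { |acc, b| (acc << 1) | b } }
end

# Reverses pack_5da: five bytes back into eight 5-bit samples.
def unpack_5da(bytes)
  raise ArgumentError, "need exactly 5 bytes" unless bytes.length == 5
  bits = bytes.flat_map { |byte| 7.downto(0).map { |i| (byte >> i) & 1 } }
  bits.each_slice(5).map { |s| s.each_with_index.sum { |b, i| b << i } }
end

samples = [1, 8, 0, 2, 16, 0, 4, 0]
packed = pack_5da(samples)
p packed                         # => [128, 128, 128, 128, 128]
p unpack_5da(packed) == samples  # => true
```

The sample values here are the 8-bit inputs [8, 64, 0, 16, 128, 0, 32, 0] divided by 8, which is the same worked example that appears in the converter's comments.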

Raw to 5da converter

# This is a simple 5-bit audio converter tool
# Its only task is to convert the bytes it reads into a 5-bit format, which is both easy to decode and saves space.
# It expects to receive a stream from standard input with the following properties:
#    - Every 8-bit word read is a single unsigned sample in the range 0-255, with 128 being the center/zero point (no current in either direction)
#    - Audio data is properly centered; otherwise the conversion will be distorted.
#
require 'optparse'

class AudioConv
  def initialize(options)
    @options = options
  end

  # Convert from input to output stream
  # @param input Input stream
  # @param output Output stream
  def convert_from_stream(input, output)
    # Due to the file format, we must read inputs in batches of 8; when we have read 8, we can generate 5 full-length (8-bit) words, which include eight 5-bit samples 
    eof = false

    until eof
      data_array = []
      while data_array.length < 8
        begin
          data_array << input.readbyte
        rescue EOFError
          eof = true
          break
        end
      end

      # If the input ended exactly on a block boundary, there is nothing left to write
      break if data_array.empty?

      # Pad a partial final block with zeros so that it holds a full 8 samples
      data_array << 0 while data_array.length < 8

      # Sufficient data exists now
      conversion = convert_8_sample_to_5(data_array)
      conversion.each do |sample|
        output.write([sample].pack("C"))
      end
    end
  end

  def convert_8_sample_to_5(input)
    # Retrieve values first;
    boolean_arr = input.map { |sample| bit_set_for_sample(sample) }.flatten
    # We now have a list containing exactly 40 bits

    # Observe this byte layout:
    # 12345123 45123451 23451234 51234512 34512345
    # 87654321 87654321 87654321 87654321 87654321
    #
    # So, if we want a series where each native byte has value '128', we must convert [8, 64, 0, 16, 128, 0, 32, 0], observing the scaling by 8 (to fit into 5 bits)

    # Interpreting the input so that for each 8-bit byte, the leftmost bit is the most significant bit.
    sample_arr = []
    (0..4).each do |i|
      bool_subset = boolean_arr[(i*8)..(i*8)+7]
      sample_arr << ((bool_subset[0] ? 128 : 0) + (bool_subset[1] ? 64 : 0) + (bool_subset[2] ? 32 : 0) + (bool_subset[3] ? 16 : 0) + (bool_subset[4] ? 8 : 0) + (bool_subset[5] ? 4 : 0) + (bool_subset[6] ? 2 : 0) + (bool_subset[7] ? 1 : 0))
    end

    return sample_arr
  end


  # Returns a 5-element list of booleans (least significant bit first)
  def bit_set_for_sample(sample)
    adjusted_sample =
      if @options[:half_center]
        # If so configured, use the distance from the center point (128), so that the center point does not emit any sound
        (sample - 128).abs
      else
        # Otherwise, let's assume that absolute zero means no audio
        sample
      end
    # Divide by 8 to scale the 8-bit value down to 5 bits.
    scaled = adjusted_sample / 8
    return [scaled & 1, scaled & 2, scaled & 4, scaled & 8, scaled & 16].map {|x| x > 0}
  end
end

if __FILE__==$0
  options = {} # Configure here.

  AudioConv.new(options).convert_from_stream(STDIN, STDOUT)
end

One can operate this tool, for example, with a suitable Bash script; this one uses SoX to convert the audio into a suitable raw format and to compress its dynamic range so that the result is a bit more audible.

#!/bin/bash
set -e
FILES="highvoltage imperialmarch sandstorm youspinmeround"
for f in $FILES
do
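  # Convert to raw 8-bit unsigned mono at 7000 Hz, compressing the dynamic range with compand
  sox "$f.wav" -e unsigned-integer -b 8 -c 1 -r 7000 -t raw - compand 0.3,1 -90,-90,-70,-70,-60,-20,0,0 -10 0 0.2 > "$f.pgraw"
  # Pack the raw samples into the 5da format
  ruby ~/git/rpidac/ht/audioconv/audioconv.rb < "$f.pgraw" > "$f.5da"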
done

Naturally, we also need a method to play the files with. I developed a C program precisely for that purpose.

#include <wiringPi.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// Observe altered numbering with WiringPi; on Adafruit Cobbler, this would be {17, 27, 22, 13, 19}
const int audio_pins[5] = {0,2,3,23,24};

// We will read values in blocks of 5 eight-bit words; each block contains 8 five-bit samples, LSB first. For that, we have a helper matrix of sorts.
// For each sample, we have 5 {byte offset, bit mask} pairs; these allow the program to locate the correct byte to read and interpret, as native words do not line up with our 5-bit words.
const int reading_byte_matrix[8][5][2] = {
 	{{0, 128}, {0, 64}, {0,32}, {0,16}, {0,8}},
	{{0, 4}, {0, 2}, {0,1}, {1, 128}, {1,64}},
	{{1, 32}, {1,16}, {1,8}, {1,4}, {1,2}},
	{{1,1}, {2,128}, {2,64}, {2,32}, {2,16}},
	{{2,8}, {2,4}, {2,2}, {2,1}, {3,128}},
	{{3,64}, {3,32}, {3,16}, {3,8}, {3,4}},
	{{3,2}, {3,1}, {4, 128}, {4,64}, {4,32}},
	{{4,16},{4,8},{4,4},{4,2},{4,1}}
};

void reset_pins() {
	int i;
	for (i = 0; i < 5; i++) digitalWrite(audio_pins[i], 0);
}

int main(int argc, char *argv[])
{
	wiringPiSetup(); // Set up WiringPi
	piHiPri(99); // We require high priority for our audio playing
	atexit(reset_pins); // Ensure that we zero out pins upon exit.
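	// Initialize pins to output mode.
	// Note: sizeof(audio_pins) alone is the size in bytes, so divide by the element size to get the pin count.
	int i;
	for (i = 0; i < (int)(sizeof(audio_pins) / sizeof(audio_pins[0])); i++) {pinMode(audio_pins[i], OUTPUT);}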

	// Let's read the file next
	if (argc < 2) {
		puts("Please enter a filename to play");
		exit(1);
	} else {
		struct stat st;
		// Try to retrieve file stats
		if (stat(argv[1], &st) != 0) {
			puts("Unable to determine file size!");
			exit(2);
		}
        // Ensure that this is a relatively regular file
 		if (!(S_ISREG(st.st_mode))) {
			puts("This is not a normal file! We cannot play this safely just yet");
			exit(3);
		}

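		// OK, all good. Allocate a few spare zeroed bytes so that the final 5-byte block can be read safely even when the file size is not a multiple of 5
		unsigned int buf_size = st.st_size + 6;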
		uint8_t* file_data = (uint8_t*)calloc(buf_size, sizeof(uint8_t));
		if (file_data == NULL) {
			puts("Could not allocate memory!");
			exit(4);
		}
		// Try to ingest the entire file
		FILE* mfile = fopen(argv[1], "rb");
		if (mfile == NULL) {puts ("Unable to open file!"); exit(5);}

		size_t res;
		res = fread(file_data, 1, st.st_size, mfile);
		fclose(mfile);
		if (res != st.st_size) {puts("We were unable to read the entire file in! Exiting.."); exit(6);}

		// OK, whole file is loaded in now.
		// Let's start the play loop
		int read_offset = 0;
		puts("OK, playing..");
		for (read_offset = 0; read_offset < st.st_size; read_offset += 5) {
			// As specified above, we hop in 5-byte steps
			int sample_c;
			for (sample_c = 0; sample_c < 8; sample_c++) {
				// Change GPIO state
				int gpio_i;
				for (gpio_i = 0; gpio_i < 5; gpio_i++) {
					int byte_offset = reading_byte_matrix[sample_c][gpio_i][0];
					int mask = reading_byte_matrix[sample_c][gpio_i][1];
					int gpio_pin = audio_pins[gpio_i];

					// Set the value for a GPIO pin
					digitalWrite(gpio_pin, file_data[read_offset + byte_offset] & mask);
				}

				delayMicroseconds(130); // Roughly 7000 Hz, presumably once the time spent on the GPIO writes is included (130 us alone would be about 7.7 kHz); let's ignore minor discrepancies
			}
		}
	}
	puts("Done!");
	return 0;
}

How about the results?

The results were pretty interesting. It was clear from the start that the GPIO pins are probably not the best instruments for generating sound, but that was no reason to abandon the project.

There was fairly audible background noise; however, it could be turned down with a potentiometer to a level tolerable for the listener. A more substantial limitation was the 7000 Hz sample rate; a higher rate is perhaps possible with more advanced programming, but that was out of scope for this project.
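
To put that sample rate into numbers, here is a small Ruby sketch of the timing arithmetic. The helper names are made up for this illustration, and overhead_us is a hypothetical parameter standing in for the time the GPIO writes take per sample, which was not measured for this project.

```ruby
# Sample period in microseconds for a given sample rate in Hz.
def sample_period_us(rate_hz)
  1_000_000.0 / rate_hz
end

# Effective sample rate when each sample costs a fixed delay plus some
# per-sample overhead (e.g. the five GPIO writes). overhead_us is a
# hypothetical parameter; it has not been measured here.
def effective_rate_hz(delay_us, overhead_us = 0.0)
  1_000_000.0 / (delay_us + overhead_us)
end

# Highest frequency representable at a given sample rate (Nyquist limit).
def nyquist_hz(rate_hz)
  rate_hz / 2.0
end

puts format("period at 7000 Hz: %.1f us", sample_period_us(7000))
puts format("rate with 130 us delay, no overhead: %.0f Hz", effective_rate_hz(130))
puts format("Nyquist limit at 7000 Hz: %.0f Hz", nyquist_hz(7000))
```

At 7000 Hz nothing above 3500 Hz can be represented, which goes some way towards explaining why vocals and busy arrangements suffer more than simple chiptune-style melodies.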

Most likely, the 5-bit precision was also a significant limitation, since 32 output levels cannot capture many of the nuances in the original audio. It is also possible that both the GPIO ports and cheap Chinese resistors produce uneven voltage levels, which would further distort the sound.
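
As a rough way to quantify the precision loss, the sketch below applies the textbook rule of thumb for an ideal converter playing a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB). The real circuit is certainly worse than this ideal figure, and ideal_sqnr_db and requantize are illustrative names rather than project code.

```ruby
# Theoretical signal-to-quantization-noise ratio for an ideal N-bit
# converter driven by a full-scale sine wave: about 6.02 * N + 1.76 dB.
def ideal_sqnr_db(bits)
  6.02 * bits + 1.76
end

# Requantizes an 8-bit sample (0..255) to 5 bits and back, returning the
# value the 5-bit DAC effectively plays, on the original 0..255 scale.
def requantize(sample)
  (sample / 8) * 8
end

[5, 8, 16].each do |bits|
  puts format("%2d bits: %5d levels, ~%.1f dB", bits, 1 << bits, ideal_sqnr_db(bits))
end
puts requantize(200)  # 200 / 8 = 25, 25 * 8 = 200
puts requantize(207)  # also 200: inputs 200..207 all collapse to one level
```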

Nevertheless, it was possible to generate recognizable renditions of various songs. Pieces designed for simple sound generators, like the NES SMB World 1 tune, played particularly well, whereas some other tracks, like Dead or Alive’s You Spin Me Round (Like a Record), fared substantially worse because of their complex background rhythms and singing, but were still somewhat identifiable.

It was proven that the buzzer is an awful instrument - it could not render any sane frequency range properly - and in the end, headphones were used for actual listening tests.

Sadly, I did not have an opportunity to test the output with an oscilloscope. It would have given me significant insight into the characteristics of the output signal.

Conclusion

All in all, this was a fun project; perhaps not the most useful one, but it made for an amusing story and was fun both for me and for the occasional live audience during course sessions. And now anyone can replicate it at will. I’d be interested to hear if someone else tries this and actually manages to improve the sound quality!