ΑΡΙΘΜΟΣ: The Sound of F#

I'm doing a bit of reading and dabbling in the world of sound, with the hope that it will help me tackle an interesting future project. One of the first steps to a successful project is to know more about what the project will actually entail. Right now I'm in exploratory mode. It's a bit like doing drills in sports, or practicing simple songs from a book that don't really interest you except that they help you learn the instrument you want to learn. I'm thinking about learning to "play" the signal processor, but it seemed like a good start to learn to produce signals first. And sound signals are what interest me at the moment.

Sound is pretty complicated, but we'll start with some of the most basic sounds and what it takes to make them on Windows, writing in F#. Our goal in this post is to play sounds asynchronously in F# Interactive.

The first iteration of this comes from Shawn Kovac's response to his own question on Stack Overflow (the link is in the code comment below). Below is my port to F#. I have used the use keyword to let F# handle the IDisposable interfaces on the stream and writer, which takes care of all the closing issues. It's important to note that the memory stream must remain open while the sound is playing, so this routine plays synchronously and blocks until the sound has finished.

    // ported from Shawn Kovac's response to his own question at: 
    // http://stackoverflow.com/questions/19672593/generate-morse-code-or-any-audio-in-net-c-or-vb-net-without-3rd-party-depe
    open System.IO

    let PlaySound frequency msDuration volume =
        let TAU = 2.0 * System.Math.PI
        let formatChunkSize = 16
        let headerSize = 8
        let formatType = int16(1)
        let tracks = int16(1)
        let samplesPerSecond = 44100
        let bitsPerSample = int16(16)
        let frameSize = int16(tracks * ((bitsPerSample + int16(7)) / int16(8)))
        let bytesPerSecond = samplesPerSecond * int(frameSize)
        let waveSize = 4
        let samples = (samplesPerSecond * msDuration) / 1000
        let dataChunkSize = samples * int(frameSize)
        let fileSize = waveSize + headerSize + formatChunkSize + headerSize + dataChunkSize
        
        // the C# original used: var encoding = new System.Text.UTF8Encoding();
        use mStrm = new MemoryStream()
        use writer = new BinaryWriter(mStrm)
        writer.Write(0x46464952) // = encoding.GetBytes("RIFF")
        writer.Write(fileSize)
        writer.Write(0x45564157) // = encoding.GetBytes("WAVE")
        writer.Write(0x20746D66) // = encoding.GetBytes("fmt ")
        writer.Write(formatChunkSize)
        writer.Write(formatType)
        writer.Write(tracks)
        writer.Write(samplesPerSecond)
        writer.Write(bytesPerSecond)
        writer.Write(frameSize)
        writer.Write(bitsPerSample)
        writer.Write(0x61746164) // = encoding.GetBytes("data")
        writer.Write(dataChunkSize)
        
        let theta = frequency * TAU / (float samplesPerSecond)
        // 'volume' is expected in the range 0 thru UInt16.MaxValue ( = 65 535)
        // we need 'amp' to have the range of 0 thru Int16.MaxValue ( = 32 767)
        let amp = volume >>> 1 // shifting right by one bit sets amp = volume / 2

        for step = 0 to samples-1 do
            let s = int16(float(amp) * sin(theta * (float step)))
            writer.Write(s)

        // set ourselves up at the beginning of the file
        mStrm.Seek(int64(0), SeekOrigin.Begin) |> ignore

        // we use a player object which requires its memory stream to exist in order to play
        use player = new System.Media.SoundPlayer(mStrm)
        player.PlaySync()
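The hexadecimal constants in the listing are the four-character chunk tags packed into 32-bit integers in little-endian byte order, which is how BinaryWriter writes them. As a small check of that reading, here is a sketch that writes each constant and decodes the bytes as ASCII. The helper name tagOf is hypothetical and not part of the original port.

```fsharp
open System.IO
open System.Text

// Hypothetical helper (not from the original port): writes a 32-bit integer
// with BinaryWriter, which emits little-endian bytes, and decodes the four
// resulting bytes as ASCII text.
let tagOf (value: int) : string =
    use stream = new MemoryStream()
    use writer = new BinaryWriter(stream)
    writer.Write(value)
    writer.Flush()
    Encoding.ASCII.GetString(stream.ToArray())

// The four constants used in the header above.
let tags = [0x46464952; 0x45564157; 0x20746D66; 0x61746164] |> List.map tagOf
// tags = ["RIFF"; "WAVE"; "fmt "; "data"]
```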

To get an asynchronous sound we can shove the PlaySound function into a background thread and let it go. This looks like the following:
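A minimal sketch of that idea, assuming the PlaySound function above is already loaded. The helper name runInBackground is hypothetical; it simply wraps any blocking action in a background thread, so the synchronous player can run without holding up F# Interactive.

```fsharp
open System.Threading

// Hypothetical helper: runs any blocking action on a background thread
// and hands back the thread so the caller can Join it if needed.
let runInBackground (action: unit -> unit) : Thread =
    let worker = Thread(ThreadStart(fun () -> action ()))
    // A background thread will not keep the process alive on its own.
    worker.IsBackground <- true
    worker.Start()
    worker

// Assuming PlaySound from the listing above is in scope, the asynchronous
// version would be along these lines:
// let PlaySoundAsync frequency msDuration volume =
//     runInBackground (fun () -> PlaySound frequency msDuration volume) |> ignore
```

Because the memory stream lives inside PlaySound, it stays open for as long as the background thread is playing.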

A limitation of the PlaySound() approach given above is that it restricts how you can produce sound. The details of the sound produced are buried inside the function that plays the sound. I don't want to have a bunch of separate routines: one that plays sine waves, one that plays sawtooth waves, one that plays chords with two notes, one that plays chords with three notes, one that includes an experimental distortion guitar, whatever. The sound player shouldn't care about the details; it should just play the sound, whatever it is. (This is not a criticism of Kovac. He had a narrower purpose than I do, and his choice was valid for his purpose.)

I want to decouple the details of the sound to be produced from the formatting and playing of the sound. Here the basic formatting and playing are packaged up in a function that will take any wave form we give it, as a sequence of integer sample values:

    open System.IO
    open System.Threading

    // suggested sample_rate = 44100
    let PlayWave sample_rate wave = 
        let formatChunkSize = 16
        let formatType = int16(1)
        let tracks = int16(1)
        let bitsPerSample = int16(16)
        let frameSize = int16(tracks * ((bitsPerSample + int16(7)) / int16(8)))
        let bytesPerSecond = sample_rate * int(frameSize)

        use mStrm = new MemoryStream()
        use writer = new BinaryWriter(mStrm)
        // we write the initial byte of each item of data in the comment to the right
        // this is so we can come back later and write to two of these locations
        writer.Write(0x46464952)        // = encoding.GetBytes("RIFF"): 0
        writer.Write(0)                 // filesize needs to be calculated and filled in at byte 4
        writer.Write(0x45564157)        // = encoding.GetBytes("WAVE"): 8
        writer.Write(0x20746D66)        // = encoding.GetBytes("fmt "): 12
        writer.Write(formatChunkSize)   //: 16
        writer.Write(formatType)        //: 20
        writer.Write(tracks)            //: 22
        writer.Write(sample_rate)       //: 24
        writer.Write(bytesPerSecond)    //: 28
        writer.Write(frameSize)         //: 32
        writer.Write(bitsPerSample)     //: 34
        writer.Write(0x61746164)        // = encoding.GetBytes("data"): 36
        writer.Write(0)                 // dataChunkSize needs to be calculated and filled in: 40
        
        let mutable samples = 0

        // write each sample as a 16-bit value and count the samples as we go
        wave 
        |> Seq.iter (fun (s:int) -> 
                        writer.Write((int16 s))
                        samples <- samples + 1)

        // fill in the datachunk
        mStrm.Seek(int64(40), SeekOrigin.Begin) |> ignore
        let dataChunkSize = samples * int(frameSize)
        writer.Write(dataChunkSize)

        // fill in the filesize
        let filesize = 4 + (8 + formatChunkSize) + (8 + dataChunkSize)
        mStrm.Seek(int64(4), SeekOrigin.Begin) |> ignore
        writer.Write(filesize)

        // set ourselves up at the beginning of the file to play it
        mStrm.Seek(int64(0), SeekOrigin.Begin) |> ignore

        // we use a player object which requires its memory stream to exist in order to play
        use player = new System.Media.SoundPlayer(mStrm)
        player.PlaySync()

    let PlayWaveAsync sample_rate wave =
        let soundThread = Thread((fun () -> PlayWave sample_rate wave))
        soundThread.IsBackground <- true
        soundThread.Start()

I haven't decoupled this as far as I could. If we wanted to handle multiple tracks, for example, we would need to take this abstraction a little further.

Now we also need some functions for generating wave forms. Sequences are probably a good choice for this because the samples don't need to be in memory until we are ready to write them to the memory stream in the PlayWave function. What exactly happens to sequence elements after they have been iterated over and written to the MemoryStream, I'm not quite clear on. I know memory is allocated by MemoryStream, but whether the sequence elements continue to have an independent life for some time after they are iterated over, I'm not sure. The ideal, of course, would be for them to come into existence at the moment they are to be copied into the memory stream (which they will, since sequences are evaluated lazily) and then fade immediately into the bit bucket, forgotten the moment after they came into being.
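The lazy part, at least, is easy to observe. The sketch below uses a stand-in wave form and a counter (both hypothetical, purely for illustration) to show that a sequence expression computes nothing until it is iterated, and that a plain seq does not cache its elements: a second pass computes them all over again. It says nothing about when the garbage collector reclaims anything.

```fsharp
// Counts how many samples have actually been computed so far.
let produced = ref 0

// A stand-in wave form: nothing in this expression runs until it is iterated.
let wave =
    seq {
        for i in 1 .. 5 do
            produced.Value <- produced.Value + 1
            yield i * 10
    }

let before = produced.Value          // 0: defining the sequence computed nothing
let total = Seq.sum wave             // iterating computes each sample once
let afterFirstPass = produced.Value  // 5
Seq.iter ignore wave                 // a second pass recomputes every sample
let afterSecondPass = produced.Value // 10
```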

Here are a few routines for making chords, with one important caveat: volume control is a bit tricky.

    let ChordEOpen = seq[64; 71; 76; 80; 83; 88]
    let ChordAOpen = seq[69; 76; 81; 85; 88]

    // duration is in seconds
    let sinewave volume freq_hertz sample_rate duration =
        let TWOPI = 2.0 * System.Math.PI
        let timestep = 1.0 / (float sample_rate)
        let steps = int(floor(duration / timestep))
        let theta = freq_hertz * TWOPI * timestep
        seq { for i = 1 to steps do 
                yield volume * sin(theta * float(i)) }

    let frequency note =         
        440.0 * exp(float(note - 69)/12.0 * log(2.0))

    // produce sine wave form for several different notes
    // volume is the volume of each individual note; 
    // it is up to the caller to deal with the possible need for compression on the resultant signal
    // or to keep the volume under control from the start
    let sineChord volume notes sample_rate duration = 
        notes
        |> Seq.map frequency
        |> Seq.map (fun f -> sinewave volume f sample_rate duration)
        |> Seq.reduce (fun a b -> Seq.map2 (fun sa sb -> sa + sb) a b)
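The frequency function above treats its argument as a note number where 69 is the A at 440 Hz and each step of 12 doubles the frequency. The expression exp(x * log(2.0)) is just 2 raised to the power x, so the same mapping can be sketched with the power operator. The name noteToHertz is hypothetical, used here only to check a few values.

```fsharp
// Same mapping as the frequency function above, written with the power
// operator: 440 Hz at note 69, doubling every 12 notes.
let noteToHertz (note: int) : float =
    440.0 * 2.0 ** (float (note - 69) / 12.0)

let a4 = noteToHertz 69   // 440 Hz
let a5 = noteToHertz 81   // one octave up: 880 Hz
let e4 = noteToHertz 64   // lowest note of ChordEOpen, roughly 329.63 Hz
```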

Here is a sample script to run in F# Interactive:
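What follows is a hedged reconstruction rather than the original script. It assumes the 5th-fret chord is the open E shape moved up five semitones and that the per-note volume is a full 16-bit amplitude divided by the number of notes. The names openE, barChord, aChord and perNoteVolume are hypothetical. The playback lines are left as comments because they need sineChord and PlayWaveAsync from the listings above, and a Windows sound device.

```fsharp
// Note numbers of the open E chord, repeated here so the sketch stands alone.
let openE = [64; 71; 76; 80; 83; 88]

// Hypothetical helper: an E-form bar chord at a given fret is the open E
// shape shifted up by that many semitones.
let barChord (fret: int) : int list =
    openE |> List.map (fun note -> note + fret)

// The A chord at the 5th fret.
let aChord = barChord 5

// Share a full-scale 16-bit amplitude among the notes so that their sum
// can never exceed Int16.MaxValue.
let perNoteVolume = float System.Int16.MaxValue / float (List.length aChord)

let sampleRate = 44100

// With the earlier listings loaded, playback would be something like this.
// Seq.map int converts the float samples to the ints PlayWave expects.
// sineChord perNoteVolume aChord sampleRate 2.0
// |> Seq.map int
// |> PlayWaveAsync sampleRate
```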

The above plays a version of an A chord (5th fret E-form bar chord on the guitar).

Note that I have divided the volume by the number of notes in the chord. If we don't watch the volume like this and we let it get away on us, it will change the sound fairly dramatically. At worst you get something like the "snow" from old analog TVs when they weren't getting a signal, but if you go just a little over, you will get a few pops like a slightly degraded vinyl recording.
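One likely source of those pops is the conversion to 16-bit samples in PlayWave. The int16 conversion does not clamp: a value above Int16.MaxValue wraps around to a large negative number, which puts a sudden jump in the wave form. A small sketch with made-up sample values:

```fsharp
// Two notes whose peaks happen to line up (illustrative values only).
let noteA = 20000
let noteB = 20000
let summed = noteA + noteB      // 40000, beyond Int16.MaxValue (32767)

let inRange = int16 30000       // fits in 16 bits, so it is unchanged
let wrapped = int16 summed      // wraps around to -25536
```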

We could include volume control in our wave form production routines. This wouldn't necessarily violate our decoupling ideals. The main concern I have is that I want the volume indicated to represent the volume of one note played and the other notes played should cause some increase in volume. I want the dynamic fidelity of the wave forms to be preserved with different sounds joining together. Even after that, you may want to run a limiter over the data to make sure the net volume of several sounds put together doesn't take you over your limit.
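The crudest possible limiter is hard clipping: clamp every sample to the 16-bit range before it is written. The sketch below is only that, under the assumption that the wave form arrives as floats and leaves as the ints PlayWave expects. The name hardLimit is hypothetical. Clipping flattens the peaks and so adds distortion of its own; it is not the dynamics-preserving approach described above.

```fsharp
// Hypothetical helper: clamps each sample to the signed 16-bit range
// (hard clipping) and converts it to an int by truncation.
let hardLimit (wave: seq<float>) : seq<int> =
    let hi = float System.Int16.MaxValue
    let lo = float System.Int16.MinValue
    wave |> Seq.map (fun s -> int (max lo (min hi s)))

// In-range samples pass through; out-of-range samples are pinned to the limits.
let limited = hardLimit [0.0; 40000.0; -40000.0; 123.7] |> Seq.toList
// limited = [0; 32767; -32768; 123]
```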

However, I need to do more research to achieve these ideals. For now, I will break it off there.


Thursday, December 29, 2016
