Andrew Que

March 07, 2022

Python shared-objects and output redirection

Ran into an interesting artifact with some Python code today.  I've been using the multiprocessing module to run processes in parallel.  These processes needed a large dictionary as part of their operation, but the size of the object was slowing down the script because it was duplicated for each running process.  I found I could create shared objects using multiprocessing.Manager, which allowed the large dictionary to be shared rather than duplicated.  Oddly, it had another side effect.  When using shared objects, the print statements inside the running processes no longer showed up when the output was piped on the command line.  Confused, I narrowed down the problem to the following:

#!/usr/bin/env python3

import sys
import multiprocessing

#------------------------------------------------------------------------------
def _process( name ) :
    print( "Process", name )

#------------------------------------------------------------------------------
if __name__ == "__main__" :

    _PROCESS_COUNT = 4

    # Creating an instance of `manager` data causes prints not to work.
    manager = multiprocessing.Manager()
    shared_data = manager.list()

    work = range( _PROCESS_COUNT )

    print( "Processing..." )

    with multiprocessing.Pool( _PROCESS_COUNT ) as pool :
        pool.map( _process, work )

    print( "Done." )

Now executing this without any output redirection works as expected.

$ ./test.py 
Processing...
Process 0
Process 1
Process 3
Process 2
Done.

However, if the output is redirected, the print statements from the _process function no longer come through.

$ ./test.py | cat -
Processing...
Done.

I was baffled.  Redirection should have no effect on the output.  I decided to reach out to the Stack Overflow community.  In the comments I got some feedback asking if I had tried flushing stdout after printing.  Honestly, I didn't think that was going to work but gave it a test.  Sure enough, it did work.

#!/usr/bin/env python3

import sys
import multiprocessing

#------------------------------------------------------------------------------
def _process( name ) :
    print( "Process", name, flush=True )

#------------------------------------------------------------------------------
if __name__ == "__main__" :

    _PROCESS_COUNT = 4

    # With a `manager` instance present, the worker prints need flush=True
    # (see `_process`) to survive output redirection.
    manager = multiprocessing.Manager()
    shared_data = manager.list()

    work = range( _PROCESS_COUNT )

    print( "Processing..." )

    with multiprocessing.Pool( _PROCESS_COUNT ) as pool :
        pool.map( _process, work )

    print( "Done." )

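Now I'm more perplexed.  Why would one need to flush the output when using shared objects?  If the shared_data line is removed, the output also works.  Buffering is the likely piece of the puzzle: Python line-buffers stdout on a terminal but block-buffers it when piped, so unflushed worker output can be lost if a worker is terminated before its buffer is written.  Why the manager changes that behavior is still unclear to me.  However, I have a working solution.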

After my success in getting my Waveshare CoreH7XXI single-board computer’s SDRAM running it was time to get the sound running. I had previously accomplished streaming sound from a single-board computer using an STM32 F4 micro-controller. It was time to port that to the H7.

While it didn’t take long to get the code ported, it wasn’t working at all. The PWM channel was working fine, but I was unable to get DMA to feed in the waveform. After some tracing I found I was getting a Transfer Error Interrupt from the DMA controller as soon as I launched the DMA request. After some digging, I found out why. There are 5 RAM regions on the H7. By default, data is placed in a 128 KiB region starting at 0x20000000. DMA is unable to access this region, and I had to use an area marked RAM_D2 starting at 0x30000000.

That made sense, and I modified the linker script so I had a way to specify this memory region. However, I found two linker scripts: one for RAM and one for flash. My initial attempt failed because I modified the RAM script, which I assume is meant for running the code out of RAM. I never need to do this, so I added the same lines to the flash script. That did work, and so did the DMA.
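A cheap guard against this kind of mistake is to check a buffer's address before handing it to DMA. The sketch below is only an illustration: the base address and size come from the description above, while the names `NO_DMA_BASE`, `NO_DMA_SIZE`, and `overlapsNoDmaRegion` are made up for this example and are not part of any ST library.

```c
#include <stdbool.h>
#include <stdint.h>

// Default data RAM that the DMA controller cannot reach, as described in
// the text: 128 KiB starting at 0x20000000.  (Hypothetical names.)
#define NO_DMA_BASE  UINT32_C( 0x20000000 )
#define NO_DMA_SIZE  UINT32_C( 0x00020000 )

// Returns true if a buffer of `size` bytes starting at `address` touches
// the DMA-inaccessible region.  64-bit math avoids wrap-around at the top
// of the 32-bit address space.
static inline bool overlapsNoDmaRegion( uint32_t address, uint32_t size )
{
  uint64_t const start     = address;
  uint64_t const end       = start + size;  // One past the last byte.
  uint64_t const regionEnd = (uint64_t)NO_DMA_BASE + NO_DMA_SIZE;

  return ( size > 0 ) && ( start < regionEnd ) && ( end > NO_DMA_BASE );
}

// On the target, the buffer itself would be placed with something like
//   static uint8_t waveform[ 1024 ] __attribute__(( section( ".ram_d2" ) ));
// where ".ram_d2" is an assumed section name that must match whatever the
// flash linker script maps into RAM_D2.
```

Asserting `! overlapsNoDmaRegion( (uint32_t)(uintptr_t)buffer, length )` before starting a transfer would have turned the silent Transfer Error Interrupt into an obvious failure during bring-up.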

This was a longer walk than expected. Typically STM controllers are pretty straightforward, but the H7 is more complex and hence has a couple of gotchas like this. Now that I know, working around it shouldn’t be a problem.

July 05, 2021

C11 Prime Number Counter

Back in February of 2010 I wrote a simple program to count all the prime numbers between 2 and 2^32.  It used pthreads so I could fully utilize the dual-processor Red-Dragon.  The C11 standard introduced threads right into the language.  Several years ago I thought I'd port my prime counting program to use them, because any system that supported the full C11 standard would be able to compile and run it.  Sadly, I discovered that gcc had not implemented C11 threads at that time.  However, as of GCC 9.3.0, C11 threads are present; you just have to link with the pthreads library as it uses pthreads under the hood.

Porting didn't take too long.  However, there was one item I did not have from the start that I would need: counting semaphores.  The program uses a counting semaphore to keep a specified number of threads running, typically one thread for each core of the CPU.  The C11 threads implementation does have mutexes and condition variables, and those can be used to make a counting semaphore.  So I wrote a very simple header file with inline functions for counting semaphores.
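The header itself isn't reproduced in this post, so here is a minimal sketch of how such a counting semaphore can be built from a C11 mutex and condition variable. The function names mirror the calls used in the listing below, but the structure layout, the argument order of `semaphoreInit` (initial count, then maximum), and the return values are assumptions made for illustration.

```c
#include <stdbool.h>
#include <threads.h>

// Counting semaphore built from a mutex and a condition variable.
// (Field names are illustrative, not taken from the original header.)
typedef struct
{
  mtx_t    mutex;
  cnd_t    condition;
  unsigned count;    // Counts currently available.
  unsigned maximum;  // Upper limit for `count`.
} Semaphore;

// Initialize the semaphore.  Returns true if there was an error.
static inline bool semaphoreInit
(
  Semaphore * semaphore,
  unsigned initialCount,
  unsigned maximumCount
)
{
  semaphore->count   = initialCount;
  semaphore->maximum = maximumCount;

  bool isError = ( thrd_success != mtx_init( &semaphore->mutex, mtx_plain ) );
  if ( ! isError )
  {
    isError = ( thrd_success != cnd_init( &semaphore->condition ) );
    if ( isError )
      mtx_destroy( &semaphore->mutex );
  }

  return isError;
}

// Take one count, blocking while none are available.
static inline void semaphoreWait( Semaphore * semaphore )
{
  mtx_lock( &semaphore->mutex );

  // Loop to guard against spurious wake-ups.
  while ( 0 == semaphore->count )
    cnd_wait( &semaphore->condition, &semaphore->mutex );

  --semaphore->count;
  mtx_unlock( &semaphore->mutex );
}

// Give one count back (never exceeding the maximum) and wake one waiter.
static inline void semaphoreRelease( Semaphore * semaphore )
{
  mtx_lock( &semaphore->mutex );

  if ( semaphore->count < semaphore->maximum )
    ++semaphore->count;

  cnd_signal( &semaphore->condition );
  mtx_unlock( &semaphore->mutex );
}

// Release the resources held by the semaphore.
static inline void semaphoreDestroy( Semaphore * semaphore )
{
  cnd_destroy( &semaphore->condition );
  mtx_destroy( &semaphore->mutex );
}
```

With the count initialized to the number of worker threads, the dispatcher can call `semaphoreWait` before starting each worker and every worker calls `semaphoreRelease` when it finishes, which caps the number of workers running at once.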

//-----------------------------------------------------------------------------
// Name: primeCount.c
// Uses: Calculate the number of prime number between some range of numbers.
// Date: 2010-02-02
// Author: Andrew Que (https://www.DrQue.net/)
// Revisions:
//  1.0 - 2010-02-02 - Creation.
//  1.1 - 2014-08-22 - Bug fix.  Prime table needs to be initialized
//    outside of threads or problems could occur.
//  1.2 - 2021-07-05 - Converted to C11 threads.
//
// Build instructions:
//   This software was designed to compile and run with strict C11
//   compliance with threads (pthreads).
//
//   Linux:
//     gcc -Wall -Wextra -Werror -pedantic -std=c11 -O2 primeCount.c -o primeCount -lpthread
//
//                 (C) Copyright 2010,2014,2021 by Andrew Que
//                         Released as public domain.
//-----------------------------------------------------------------------------
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <time.h>
#include <threads.h>
#include "semaphore.h"

// There happen to be 6542 primes between 2 and 65536.
enum { NUMBER_OF_LOOKUP_PRIMES = 6542 };
static unsigned primeNumbers[ NUMBER_OF_LOOKUP_PRIMES ];

// How many numbers to check in a thread.
enum { NUMBERS_PER_THREAD = 0x100000 };

// number of threads used for calculations.
// (Set this to the number of CPU cores available on the target machine).
enum { NUMBER_OF_THREADS = 16 };

// What range of numbers to check.
static uint32_t const START_NUMBER = UINT32_C( 0x2 );
static uint32_t const END_NUMBER   = UINT32_C( 0xFFFFFFFF );

// Semaphore used to dispatch work to threads.
static Semaphore semaphore;

// Structure to hold worker thread information.
typedef struct
{
  uint32_t startNumber;   // Where to start.
  uint32_t numberToCheck; // Numbers to check--usually NUMBERS_PER_THREAD.
  uint32_t numberFound;   // How many primes were found (return value).
} WorkType;

//-----------------------------------------------------------------------------
// Uses:
//   Build a lookup table (primeNumbers) of all prime numbers between 2 and
// 65536 (or the square root of 2^32).  This function doesn't take long
// despite having to check 2^16-2 values.
//-----------------------------------------------------------------------------
static void generatePrimeNumberLookup()
{
  unsigned index;
  unsigned primeNumberCount = 0;

  // First prime number in our list is 2.  We can get all the rest
  // knowing this.
  primeNumbers[ primeNumberCount++ ] = 2;

  // Get all remaining prime numbers between 3 and 2^16.
  // Count by 2 since all even numbers are divisible by two.
  for ( index = 3; index < UINT32_C( 0x10000 ); index += 2 )
  {
    // Assume the number is prime until it is determined to be otherwise.
    bool isPrime = true;

    // Check to see if this number is prime...
    unsigned subIndex = 0;
    while ( ( subIndex < primeNumberCount )
         && ( isPrime ) )
    {
      // Does it divide evenly by this prime number?
      if ( 0 == ( index % primeNumbers[ subIndex ] ) )
      {
        // If so, this number isn't prime.
        isPrime = false;
      }

      ++subIndex;
    }

    // If this number doesn't divide evenly, it is prime...
    if ( isPrime )
      // Add number to our prime list.
      primeNumbers[ primeNumberCount++ ] = index;
  }

} // generatePrimeNumberLookup

//-----------------------------------------------------------------------------
// Uses:
//   Return the integer square root of some unsigned 32-bit value.  This
// function uses a power-of-two bit trick such that the function will always
// have a value in 16 iterations.
//
// Input:
//    argument - The number of which to find the square root.
//
// Output:
//    Integer portion of the square root of "argument".
//-----------------------------------------------------------------------------
static inline uint16_t squareRoot( uint32_t argument )
{
  uint32_t test;
  uint16_t root    = 0;
  uint16_t bitMask = ( 1U << 15 );

  // 16 laps.
  while ( bitMask )
  {
    test = root + bitMask;

    // argument >= test^2?
    if ( argument >= ( test * test ) )
      root = test; // <- Use result.

    bitMask >>= 1;
  }

  return root;

} // squareRoot

//-----------------------------------------------------------------------------
// Uses:
//   Test to see if a number is a prime number.  Works on a 32-bit unsigned
// value by dividing it by all prime numbers up to the square root of the
// number.  This works because of the nature of prime numbers.  A number
// (call it x) is prime if there are no two numbers (call them a and b),
// both greater than 1, such that a * b = x.  All non-prime numbers can be
// expressed as the product of two or more prime numbers.  For example, 125
// can be made from 5 * 25, but 25 can be made from 5 * 5.  So
// 5 * 5 * 5 = 125, and represents the most factored version of 125.  This
// is true of any number.  Since it takes at least two prime factors to make
// a non-prime, we only need to check the primes up to x^1/2 (the square
// root) of the number.  This is because if x = a * a, then x^1/2 = a.  If
// x = a * b, and b is greater than a, then a must be less than x^1/2.
// Thus, we only need to check primes up to x^1/2.
//   Since the input is a 32-bit number, the maximum number that can be
// represented is 2^32-1.  (2^32)^1/2 = 2^16.  So we need to check all primes
// up to 2^16.  To do this, we keep a lookup table of all primes in this range.
//
// Input:
//   number - A unsigned 32-bit value to test.
//
// Output:
//   Returns true if number is prime, false if not.
//-----------------------------------------------------------------------------
static inline bool isPrime( uint32_t number )
{
  // Assume the number is prime until it is determined to be otherwise.
  bool isPrime = true;

  // Is number even (and not the number 2)?
  if ( ( 0 == ( number & 1 ) )
    && ( 2 != number ) )
  {
    // No even numbers (except 2) are prime.
    isPrime = false;
  }
  else
  {
    // We only need to check up to the square root of the number.
    uint16_t root = squareRoot( number );
    unsigned index = 1; // <- Start with 3.

    // While we still have prime numbers to test, the prime is no greater
    // than the square root of the number, and nothing so far has divided
    // evenly...
    while ( ( index < NUMBER_OF_LOOKUP_PRIMES )
         && ( primeNumbers[ index ] <= root )
         && ( isPrime ) )
    {
      // Does this prime divide into the number?
      if ( 0 == ( number % primeNumbers[ index ] ) )
        // Then the number is not prime.
        isPrime = false;

      ++index;
    }
  }

  // Return the results.
  return isPrime;

} // isPrime

//-----------------------------------------------------------------------------
// Uses:
//   Thread used to count the number of primes in a given range of numbers.
// The range checked starts at the `startNumber` field of the WorkType
// pointed to by argumentPointer and covers `numberToCheck` values.
//
// Input:
//   argumentPointer - A pointer to a WorkType structure describing the range
// to check.
//
// Output:
//   The function itself always returns 0.  The `numberFound` field of the
// WorkType is set to the number of primes found.
//-----------------------------------------------------------------------------
static int primeThread( void * argumentPointer )
{
  // Get the work data passed to the thread.
  WorkType * data = (WorkType *)argumentPointer;
  uint32_t number = data->startNumber;
  uint32_t count = 0;
  unsigned index;

  // For all the numbers to check...
  for ( index = 0; index < data->numberToCheck; ++index )
  {
    // Is this number a prime?
    if ( isPrime( number ) )
      // Then count it.
      ++count;

    // Next number.
    ++number;
  }

  // Save results.
  data->numberFound = count;

  // This thread is now complete.  Release one count from the dispatch
  // semaphore.
  semaphoreRelease( &semaphore );

  // End this thread.
  thrd_exit( 0 );

  // Never reached--here for language consistency.
  return 0;

} // primeThread

//-----------------------------------------------------------------------------
// Uses:
//   Program main function.  This function will setup the dispatch semaphore,
// and work threads that will count all the prime number in a range given by
// the unit globals START_NUMBER and END_NUMBER.  The total count is displayed when the
// program completes.
//
// Output:
//   This function (and program as a whole) always returns 0.
//-----------------------------------------------------------------------------
int main()
{
  // Print a header.
  printf( "============================\n" );
  printf( "Prime number count\n" );
  printf( "============================\n" );
  printf
  (
    "Counting the number of primes between %u and %u\n",
    (unsigned)START_NUMBER, (unsigned)END_NUMBER
  );

  generatePrimeNumberLookup();

  // Mark the time this program began.
  time_t startTime = time( NULL );

  // Worker threads.
  thrd_t threads[ NUMBER_OF_THREADS ];

  // data storage for threads.
  WorkType data[ NUMBER_OF_THREADS ];

  // Setup this dispatch semaphore such that it can handle NUMBER_OF_THREADS counts
  // before it blocks the request.
  semaphoreInit( &semaphore, NUMBER_OF_THREADS, NUMBER_OF_THREADS );

  // index for the next available thread.
  unsigned threadIndex;

  // Zero out thread data (used so we can tell if the thread has been used
  // yet or not).
  for ( threadIndex = 0; threadIndex < NUMBER_OF_THREADS; ++threadIndex )
    data[ threadIndex ].numberToCheck = 0;

  // Starting number to pass to the next work thread.
  uint32_t number = START_NUMBER;
  uint32_t numbersLeft = END_NUMBER - START_NUMBER;

  // Total number of primes found so far.
  unsigned numberOfPrimes = 0;

  // Zero thread index.
  threadIndex = 0;

  // Loop until all the numbers have been checked...
  while ( numbersLeft )
  {
    // Display progress.
    printf( "%08X => %u\r", (unsigned)number, (unsigned)numberOfPrimes );
    fflush( stdout ); // <- Make sure the screen is updated.

    // Wait for a free worker thread.
    semaphoreWait( &semaphore );

    // Was this thread running?
    if ( data[ threadIndex ].numberToCheck )
    {
      // Rejoin the thread--should be finished now.
      thrd_join( threads[ threadIndex ], NULL );

      // Accumulate the number of primes found in this thread.
      numberOfPrimes += data[ threadIndex ].numberFound;
    }

    // How many number to check.
    if ( numbersLeft > NUMBERS_PER_THREAD )
      data[ threadIndex ].numberToCheck = NUMBERS_PER_THREAD;
    else
      data[ threadIndex ].numberToCheck = numbersLeft;

    // Create a worker thread to check this number set.
    data[ threadIndex ].startNumber = number;
    thrd_create
    (
      &threads[ threadIndex ],
      primeThread,
      (void *)&data[ threadIndex ]
    );

    // Move to next number set.
    number += data[ threadIndex ].numberToCheck;
    numbersLeft -= data[ threadIndex ].numberToCheck;

    // Advance thread index with wrap around.
    ++threadIndex;
    if ( threadIndex >= NUMBER_OF_THREADS )
      threadIndex = 0;
  }

  // At this point, all work has been dispatched.  We just need to wait for
  // the worker threads to finish.

  // Denote the current state.
  printf( "Finishing...               \r" );
  fflush( stdout ); // <- Make sure the screen is updated.

  // Wait for each thread to finish.
  for ( threadIndex = 0; threadIndex < NUMBER_OF_THREADS; ++threadIndex )
  {
    // Was this thread running?
    if ( data[ threadIndex ].numberToCheck )
    {
      // Wait for thread to finish.
      thrd_join( threads[ threadIndex ], NULL );

      // Accumulate the number of primes found in this thread.
      numberOfPrimes += data[ threadIndex ].numberFound;
    }
  }

  // Let go of dispatch semaphore.
  semaphoreDestroy( &semaphore );

  // Calculate how long the program ran.
  unsigned elapsedTime = (unsigned)difftime( time( NULL ), startTime );

  // Display results.
  printf
  (
    "Found %u primes between %u and %u, %u seconds.\n",
    (unsigned)numberOfPrimes,
    (unsigned)START_NUMBER,
    (unsigned)END_NUMBER,
    elapsedTime
  );

  // Done, exit with no error.
  return 0;

} // main

//--------------------------------------=--------------------------------------

The implementation is nearly identical, just using the C11 thread structures and my semaphore unit.  I don't know the speeds of the original Red Dragon, but I ran a speed test in May of 2017.  My fastest machine at the time, a dual-core 4-thread Intel i7, required 28.26 minutes.  My AMD Ryzen 7 1700, 8-core/16-thread CPU needs 16.58 minutes.
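The listing hard-codes 6542 as the number of primes below 65536. A quick way to cross-check that constant is a sieve of Eratosthenes, which is a different technique from the trial division used in the program. The sketch below is a stand-alone check rather than part of the original source; `SIEVE_LIMIT` and `countPrimesBelowLimit` are names invented for it.

```c
#include <stdbool.h>
#include <string.h>

// Upper bound (exclusive) for the sieve: 2^16, the square root of 2^32.
enum { SIEVE_LIMIT = 0x10000 };

// Count the primes in [2, SIEVE_LIMIT) using a sieve of Eratosthenes.
static unsigned countPrimesBelowLimit( void )
{
  // Static so the 64 KiB table does not live on the stack.
  static bool isComposite[ SIEVE_LIMIT ];
  memset( isComposite, 0, sizeof( isComposite ) );

  unsigned count = 0;
  for ( unsigned number = 2; number < SIEVE_LIMIT; ++number )
  {
    if ( isComposite[ number ] )
      continue;

    // Not crossed off by any smaller prime, so this number is prime.
    ++count;

    // Cross off multiples, starting at number^2 (smaller multiples were
    // already handled by smaller primes).  number^2 fits in 32 bits because
    // number < 2^16.
    for ( unsigned multiple = number * number;
          multiple < SIEVE_LIMIT;
          multiple += number )
    {
      isComposite[ multiple ] = true;
    }
  }

  return count;
}
```

Comparing the return value against `NUMBER_OF_LOOKUP_PRIMES` at start-up would catch a mismatch if the table range were ever changed.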

Prime Counter v1.2

SHA-256: cfa16f3ca5c38d0c99f8181eeeac440c84ac4a942e5b07ad5d2afe39b4e67692

April 30, 2021

Waveform Audio File Format

In previous articles of the series on Amiga MOD files I wrote about implementing a system to read the file format and print the notes. The goal is to be able to render audio output. Rather than send audio directly to a sound device, I decided that writing the output to an audio file would be easier. One of the simplest audio file formats is the uncompressed Waveform Audio File Format, WAVE, or just WAV. It was created in the early 90s by IBM and Microsoft, right around the time I was learning how to write my own software. So along with BMP it became one of the first file formats I reverse engineered, sometime between 1994 and 1996. With ready access to the Internet it is no longer necessary to manually work out the details by trial and error. In this article I want to look at what will take me much longer to write about than it did to implement.

Although my goal was to write WAV files, the first step was to be able to read them. The format supports storing multiple chunks of data, but most of the time there is just a single chunk. This site gave me enough detail on how the header to a WAV file is laid out. To correctly read a WAV file I would really need to account for all of the chunks. But the assumption was we were only going to use a single chunk—so that is all I was concerned about.

Now there is a weird mix of little/big endian. Why anyone would do this I couldn’t say; usually you pick one or the other. Intel has always been little endian, and it turns out the specification only defines the header descriptions in big endian. It is easier just to treat those as 4 characters. Right away I could see that most of the data was 32-bit aligned. There are 16-bit fields, but they come in pairs. Knowing I was writing for a little endian system I could take a shortcut: I could read the header directly into a C structure. C structures can get complicated due to alignment. However, this structure was already aligned so I could use the structure trick. To be sure, I used bitfields, making the 16-bit words actually bit fields in 32-bit words.
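As a side note, the structure shortcut depends on running on a little endian machine. If that assumption ever had to go away, the numeric fields could be assembled one byte at a time instead. This is only a sketch of that alternative, not what the code in this article does, and the helper names `readLe16` and `readLe32` are invented here.

```c
#include <stdint.h>

// Assemble a 16-bit little endian value from two bytes.
static inline uint16_t readLe16( uint8_t const * bytes )
{
  return (uint16_t)( bytes[ 0 ] | ( bytes[ 1 ] << 8 ) );
}

// Assemble a 32-bit little endian value from four bytes.
static inline uint32_t readLe32( uint8_t const * bytes )
{
  return   (uint32_t)bytes[ 0 ]
       | ( (uint32_t)bytes[ 1 ] << 8 )
       | ( (uint32_t)bytes[ 2 ] << 16 )
       | ( (uint32_t)bytes[ 3 ] << 24 );
}
```

For example, the four bytes `44 AC 00 00` decode to a sample rate of 44100 regardless of the host's byte order.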

Let’s look at code that simply reads the WAV file header and prints the fields.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include "fileUtilities.h"

int main( int argumentCount, char * * arguments )
{
  bool isError = ( argumentCount <= 1 );

  if ( isError )
    fprintf( stderr, "Syntax: %s <WAV file>\n", arguments[ 0 ] );

  char const * fileName = arguments[ 1 ];

  FILE * inputFile = NULL;
  if ( ! isError )
  {
    inputFile = fopen( fileName, "rb" );

    isError = ( NULL == inputFile );
    if ( isError )
      fprintf( stderr, "Unable to open `%s`.\n", fileName );
  }

  typedef struct
  {
    uint32_t chunkId;
    uint32_t chunkSize;
    uint32_t chunkFormat;

    uint32_t subChunkId1;
    uint32_t subChunkSize1;

    uint32_t format : 16;
    uint32_t channels : 16;

    uint32_t sampleRate;
    uint32_t byteRate;
    uint32_t blockAlign : 16;
    uint32_t bitesPerSample : 16;

    uint32_t subChunkId2;
    uint32_t subChunkSize2;

  } WaveHeader;

  WaveHeader waveHeader;

  // Only read when the file opened; `fileRead` returns true on error.
  if ( ! isError )
    isError = fileRead( inputFile, &waveHeader, sizeof( waveHeader ) );

  if ( ! isError )
  {
    printf( "chunkId.........: %.4s\n", (char*)&waveHeader.chunkId     );
    printf( "chunkSize.......: %d\n",   waveHeader.chunkSize           );
    printf( "chunkFormat.....: %.4s\n", (char*)&waveHeader.chunkFormat );
    printf( "\n" );
    printf( "subChunkId1.....: %.4s\n", (char*)&waveHeader.subChunkId1 );
    printf( "subChunkSize1...: %d\n",   waveHeader.subChunkSize1       );
    printf( "\n" );
    printf( "format..........: %04X\n", waveHeader.format              );
    printf( "channels........: %d\n",   waveHeader.channels            );
    printf( "\n" );
    printf( "sampleRate......: %d\n",   waveHeader.sampleRate          );
    printf( "byteRate........: %d\n",   waveHeader.byteRate            );
    printf( "blockAlign......: %d\n",   waveHeader.blockAlign          );
    printf( "bitesPerSample..: %d\n",   waveHeader.bitesPerSample      );
    printf( "\n" );
    printf( "subChunkId2.....: %.4s\n", (char*)&waveHeader.subChunkId2 );
    printf( "subChunkSize2...: %d\n",   waveHeader.subChunkSize2       );
  }

  if ( inputFile )
    fclose( inputFile );

  int returnResult = 0;
  if ( isError )
    returnResult = -1;

  return returnResult;
}

This code was created in a very short amount of time so very little thought was given to cleaning it up. What I really wanted now was to create a WAV file. Armed with the header information I wrote a program to produce a chord and write the results to a WAV file.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include "fileUtilities.h"

enum { SAMPLE_RATE = 4000 };
enum { DURATION = 3000 };
enum { SAMPLES = SAMPLE_RATE * DURATION / 1000 };

enum
{
//C   C#   D   D#   E   F   F#   G   G#   A   A#   B
  C0, C_0, D0, D_0, E0, F0, F_0, G0, G_0, A0, A_0, B0,
  C1, C_1, D1, D_1, E1, F1, F_1, G1, G_1, A1, A_1, B1,
  C2, C_2, D2, D_2, E2, F2, F_2, G2, G_2, A2, A_2, B2,
  C3, C_3, D3, D_3, E3, F3, F_3, G3, G_3, A3, A_3, B3,
  C4, C_4, D4, D_4, E4, F4, F_4, G4, G_4, A4, A_4, B4,
  C5, C_5, D5, D_5, E5, F5, F_5, G5, G_5, A5, A_5, B5,
  C6, C_6, D6, D_6, E6, F6, F_6, G6, G_6, A6, A_6, B6,
  C7, C_7, D7, D_7, E7, F7, F_7, G7, G_7, A7, A_7, B7,
  C8, C_8, D8, D_8, E8, F8, F_8, G8, G_8, A8, A_8, B8,

  NUMBER_OF_NOTES
};

typedef struct
{
  char     chunkId[ 4 ];
  uint32_t chunkSize;
  char     chunkFormat[ 4 ];

  char     subChunkId1[ 4 ];
  uint32_t subChunkSize1;

  uint32_t format : 16;
  uint32_t channels : 16;

  uint32_t sampleRate;
  uint32_t byteRate;
  uint32_t blockAlign : 16;
  uint32_t bitesPerSample : 16;

  char     subChunkId2[ 4 ];
  uint32_t subChunkSize2;

} WaveHeader;

static WaveHeader const DEFAULT_HEADER =
{
  { 'R', 'I', 'F', 'F' },
  SAMPLES + 36,
  { 'W', 'A', 'V', 'E' },

  { 'f', 'm', 't', ' ' },
  16,

  0x01,
  0x01,

  SAMPLE_RATE,
  SAMPLE_RATE,
  1,
  8,

  { 'd', 'a', 't', 'a' },
  SAMPLES
};


int main( int argumentCount, char * * arguments )
{
  bool isError = ( argumentCount <= 1 );

  if ( isError )
    fprintf( stderr, "Syntax: %s <out file>\n", arguments[ 0 ] );

  char const * fileName = arguments[ 1 ];

  FILE * outputFile = NULL;
  if ( ! isError )
  {
    outputFile = fopen( fileName, "wb" );

    isError = ( NULL == outputFile );
    if ( isError )
      fprintf( stderr, "Unable to open `%s`.\n", fileName );
  }

  if ( ! isError )
  {
    WaveHeader waveHeader;
    memcpy( &waveHeader, &DEFAULT_HEADER, sizeof( waveHeader ) );
    fwrite( &waveHeader, sizeof( waveHeader ), 1, outputFile );

    // Generate frequencies for all notes.
    float notes[ NUMBER_OF_NOTES ];
    for ( unsigned index = 0; index < NUMBER_OF_NOTES; index += 1 )
      notes[ index ] = pow( 2, ( (float)index - 57 ) / 12.0 ) * 440.0;

    uint8_t samples[ SAMPLES ];
    for ( unsigned index = 0; index < SAMPLES; index += 1 )
    {
      // Linear fade out.
      float volume = 128.0 * ( 1.0 - (float)index / SAMPLES );

      float time = (float)index / SAMPLE_RATE;
      float const TWO_PI = 6.28318530717958647692528676655;
      float radians = TWO_PI * time;

      // C major seventh
      float sample =
        (
          sin( radians * notes[ C3 ] )
        + sin( radians * notes[ E3 ] )
        + sin( radians * notes[ G3 ] )
        + sin( radians * notes[ B3 ] )
        ) * volume / 4.0 + 128.0;

      if ( sample > 255 )
        sample = 255;

      samples[ index ] = sample;
      //printf( "%3.8f %3i %02X ", sample, samples[ index ], samples[ index ] );
    }

    fwrite( samples, sizeof( samples ), 1, outputFile );
  }

  if ( outputFile )
    fclose( outputFile );

  int returnResult = 0;
  if ( isError )
    returnResult = -1;

  return returnResult;
}

Here is a simple file that uses a default header for an 8-bit, mono wave file of a fixed duration. Small modifications such as to duration and sample rate can be made and tested. It simply makes a C major seventh chord with a linear fade out and writes the data to the wave file. Part of this process requires building a table of note frequencies. A typical piano has 88 keys, but I didn’t like the odd range of 7.3 octaves. So I went for an extended note range of 108 keys, a full 9 octaves from C0 to B8. The frequency of each note follows this equation:

$$f = 440 \times 2^{(n - 57) / 12}$$

Where f is frequency, and n is the note from 0 to 107 with 0 being C0 and 107 being B8. This is a slight modification of the equation found for piano key frequencies, offset to account for the larger range. Most of the file is fairly self-explanatory. The note generation happens here:

      float time = (float)index / SAMPLE_RATE;
      float const TWO_PI = 6.28318530717958647692528676655;
      float radians = TWO_PI * time;

      // C major seventh
      float sample =
        (
          sin( radians * notes[ C3 ] )
        + sin( radians * notes[ E3 ] )
        + sin( radians * notes[ G3 ] )
        + sin( radians * notes[ B3 ] )
        ) * volume / 4.0 + 128.0;

      if ( sample > 255 )
        sample = 255;

      samples[ index ] = sample;

First, we find the time in seconds. Then we calculate the angle of this time in radians. Multiplying this by the note frequency and taking the sine gives us the desired note. Add these notes up and we get a chord. The amplitude of each sine wave is 1, so adding four of them together will result in a maximum amplitude of 4. Thus we divide by 4. That produces a number between -1 and 1. For 8-bit PCM we need a value between 0 and 255 with 128 being the zero point. The maximum volume is 128, so multiplying by it changes our range to -128 to +128. Adding 128 will give a range between 0 and 256. 256 is too high so we clip it to 255. We now have the 8-bit PCM value that is stored in the sample buffer.
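The scaling and clipping steps described above can be pulled out into a small helper. This is a sketch for illustration only; the function name `toPcm8` is invented, and the lower clip at 0 is an extra safety check that the original code does not need because its input never goes below -1.

```c
#include <stdint.h>

// Convert a normalized sample (-1.0 to +1.0) to unsigned 8-bit PCM.
//   normalized - Mixed signal, already divided by the number of voices.
//   volume     - Peak amplitude, 0 to 128.
static inline uint8_t toPcm8( float normalized, float volume )
{
  // Scale to the requested volume and shift so that silence sits at 128.
  float sample = normalized * volume + 128.0f;

  // +1.0 at full volume lands on 256, one past what a byte can hold.
  if ( sample > 255.0f )
    sample = 255.0f;

  // Guard against inputs below -1.0.
  if ( sample < 0.0f )
    sample = 0.0f;

  return (uint8_t)sample;
}
```

Silence maps to 128, a full-scale positive peak clips to 255, and a full-scale negative peak maps to 0.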

Just for fun I generated a couple of other chords and picked different sample frequencies and durations. Naturally they are all functional as there isn’t anything special. All of this was to demonstrate I could write a valid WAV file. Now satisfied, it was time to make a library to create WAV files from a sample set.
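For experimenting with other chords, the note table can also be built without calling `pow`. Each semitone is a factor of 2^(1/12) above the previous one, so starting from A4 (note 57, 440 Hz) and multiplying or dividing step by step gives the same values as f = 440 × 2^((n − 57) / 12). This is an alternative sketch, not the method used in the program above; the names `NOTE_COUNT`, `A4_INDEX`, and `buildNoteTable` are invented for it.

```c
// 108 notes from C0 (index 0) to B8 (index 107); A4 is index 57.
enum { NOTE_COUNT = 108, A4_INDEX = 57 };

// Fill `notes` with the frequency in Hz of every note, working outward
// from A4 = 440 Hz one semitone at a time.
static void buildNoteTable( double notes[ NOTE_COUNT ] )
{
  // Frequency ratio between adjacent semitones: 2^(1/12).
  double const SEMITONE = 1.0594630943592953;

  notes[ A4_INDEX ] = 440.0;

  // Notes above A4.
  for ( int index = A4_INDEX + 1; index < NOTE_COUNT; ++index )
    notes[ index ] = notes[ index - 1 ] * SEMITONE;

  // Notes below A4.
  for ( int index = A4_INDEX - 1; index >= 0; --index )
    notes[ index ] = notes[ index + 1 ] / SEMITONE;
}
```

Twelve steps up from A4 should land on 880 Hz (A5) and twelve steps down on 220 Hz (A3), which makes for an easy sanity check on the table.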

My first library was more or less a wrapper of what is shown here. A single function that took a sample frequency and a set of samples and wrote them to a WAV file. That was enough for my first day’s work, but as the project progressed I needed better functions.

The second function I wrote simply appended samples to a WAV file. This allowed me to incrementally add data. The third and most useful function set allows a WAV file to be opened, samples added, and then closed, at which time the header details are completed. This version also included the number of channels so stereo WAV files could be created.
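Once more than one channel is involved, two header fields depend on the channel count: the block alignment is the size of one sample frame (all channels), and the byte rate is that frame size times the sample rate. The helper below sketches that arithmetic for uncompressed PCM; `PcmRates` and `pcmRates` are names made up for this illustration and are not part of the library.

```c
#include <stdint.h>

// Derived PCM header values.
typedef struct
{
  uint32_t byteRate;    // Bytes of audio data per second.
  uint16_t blockAlign;  // Bytes in one sample frame (all channels).
} PcmRates;

// Compute the byte rate and block alignment for uncompressed PCM.
//   sampleRate    - Sample frames per second.
//   channels      - Number of channels (1=mono, 2=stereo).
//   bitsPerSample - Bits in a single channel's sample (8 or 16).
static inline PcmRates pcmRates
(
  uint32_t sampleRate,
  uint16_t channels,
  uint16_t bitsPerSample
)
{
  PcmRates rates;
  rates.blockAlign = (uint16_t)( channels * ( bitsPerSample / 8 ) );
  rates.byteRate   = sampleRate * rates.blockAlign;
  return rates;
}
```

For 8-bit mono both values collapse to the simple case used earlier: the block alignment is 1 and the byte rate equals the sample rate.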

//=============================================================================
// Uses: Export sound samples to a WAVE file.
// Date: 2021-04-16
// Author: Andrew Que <https://www.DrQue.net/>
//=============================================================================
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include "waveExport.h"
#include "fileUtilities.h"

enum { HEADER_SIZE = 36 };

typedef struct
{
  char     chunkId[ 4 ];
  uint32_t chunkSize;
  char     chunkFormat[ 4 ];

  char     subChunkId1[ 4 ];
  uint32_t subChunkSize1;

  uint32_t format : 16;
  uint32_t channels : 16;

  uint32_t sampleRate;
  uint32_t byteRate;
  uint32_t blockAlign : 16;
  uint32_t bitesPerSample : 16;

  char     subChunkId2[ 4 ];
  uint32_t subChunkSize2;

} WaveHeader;

static WaveHeader const DEFAULT_HEADER =
{
  { 'R', 'I', 'F', 'F' },
  0,
  { 'W', 'A', 'V', 'E' },

  { 'f', 'm', 't', ' ' },
  16,

  0x01,
  0x01,

  0,
  0,
  1,
  8,

  { 'd', 'a', 't', 'a' },
  0
};

//-----------------------------------------------------------------------------
// Uses:
//   Export sample to a WAVE file.
// Input:
//   fileName - Name of file to create.
//   sampleRate - Samples/second.
//   samples - 8-bit sample data.
//   sampleSize - Number of samples.
// Output:
//   True if there was an error.
//-----------------------------------------------------------------------------
bool waveExport
(
  char const * fileName,
  uint16_t sampleRate,
  uint8_t const * samples,
  uint32_t sampleSize
)
{
  bool isError = false;

  FILE * outputFile = NULL;
  if ( ! isError )
  {
    outputFile = fopen( fileName, "wb" );
    isError = ( NULL == outputFile );
  }

  if ( ! isError )
  {
    WaveHeader waveHeader;
    memcpy( &waveHeader, &DEFAULT_HEADER, sizeof( waveHeader ) );

    waveHeader.chunkSize     = sampleSize + HEADER_SIZE;
    waveHeader.sampleRate    = sampleRate;
    waveHeader.byteRate      = sampleRate;
    waveHeader.subChunkSize2 = sampleSize;

    isError |= ( 1 != fwrite( &waveHeader, sizeof( waveHeader ), 1, outputFile ) );
    isError |= ( 1 != fwrite( samples, sampleSize, 1, outputFile ) );
  }

  if ( outputFile )
    fclose( outputFile );

  return isError;
}

//-----------------------------------------------------------------------------
// Uses:
//   Append data to wave file.  File is created if it does not exist.
// Input:
//   fileName - Name of file to create.
//   sampleRate - Samples/second.
//   samples - 8-bit sample data.
//   sampleSize - Number of samples.
// Output:
//   True if there was an error.
//-----------------------------------------------------------------------------
bool waveExportAppend
(
  char const * fileName,
  uint16_t sampleRate,
  uint8_t const * samples,
  uint32_t sampleSize
)
{
  bool isError = false;

  bool fileExists = false;
  FILE * inputFile = NULL;
  if ( ! isError )
  {
    inputFile = fopen( fileName, "rb" );
    fileExists = ( NULL != inputFile );
  }

  if ( ! fileExists )
    isError = waveExport( fileName, sampleRate, samples, sampleSize );
  else
  {
    // Read file header.
    WaveHeader waveHeader;
    isError |= fileRead( inputFile, &waveHeader, sizeof( waveHeader ) );

    fclose( inputFile );

    // Reopen file for appending.
    FILE * outputFile = fopen( fileName, "r+b" );
    isError |= ( NULL == outputFile );

    // Move to the end of the existing sample data.
    if ( ! isError )
      isError |= ( 0 != fseek( outputFile, 0, SEEK_END ) );

    // Write new samples.
    if ( ! isError )
      isError |= ( 1 != fwrite( samples, sampleSize, 1, outputFile ) );

    // Write new header.
    if ( ! isError )
    {
      // Update header.
      waveHeader.chunkSize     += sampleSize;
      waveHeader.subChunkSize2 += sampleSize;

      // Write new header at beginning of file.
      rewind( outputFile );
      isError |= ( 1 != fwrite( &waveHeader, sizeof( waveHeader ), 1, outputFile ) );
    }

    // Close the file whether or not the writes succeeded.
    if ( outputFile )
      fclose( outputFile );
  }

  return isError;
}

//-----------------------------------------------------------------------------
// Uses:
//   Start a wave file.
// Input:
//   fileName - Name of file to create.
//   sampleRate - Samples/second.
//   channels - Number of channels. (1=mono, 2=stereo)
// Output:
//   Wave file context.  NULL if there was an error.
//-----------------------------------------------------------------------------
WaveContext * waveStart( char const * fileName, uint16_t sampleRate, uint8_t channels )
{
  bool isError = false;

  FILE * outputFile = NULL;
  if ( ! isError )
  {
    outputFile = fopen( fileName, "w+b" );
    isError = ( NULL == outputFile );
  }

  if ( ! isError )
  {
    WaveHeader waveHeader;
    memcpy( &waveHeader, &DEFAULT_HEADER, sizeof( waveHeader ) );

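    waveHeader.sampleRate = sampleRate;
    // 8-bit samples: one byte per channel in each sample frame.
    waveHeader.byteRate   = (uint32_t)sampleRate * channels;
    waveHeader.channels   = channels;
    waveHeader.blockAlign = channels;

    isError |= ( 1 != fwrite( &waveHeader, sizeof( waveHeader ), 1, outputFile ) );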

    if ( isError )
    {
      fclose( outputFile );
      outputFile = NULL;
    }
  }

  return (WaveContext *)outputFile;
}

//-----------------------------------------------------------------------------
// Uses:
//   Write some samples to open wave file.
// Input:
//   wave - Open wave file.  Use `waveStart` to create this.
//   samples - 8-bit sample data.
//   sampleSize - Number of samples.
// Output:
//   True if there was an error.
//-----------------------------------------------------------------------------
bool waveAddSamples( WaveContext * wave, uint8_t const * samples, uint32_t sampleSize )
{
  FILE * outputFile = (FILE *)wave;
  return ( 1 != fwrite( samples, sampleSize, 1, outputFile ) );
}

//-----------------------------------------------------------------------------
// Uses:
//   Close open wave file.
// Input:
//   wave - Open wave file.  Use `waveStart` to create this.
// Output:
//   True if there was an error.
// Notes:
//   This must be called before wave file is valid.
//-----------------------------------------------------------------------------
bool waveClose( WaveContext * wave )
{
  FILE * outputFile = (FILE *)wave;

  long int sampleSize = ftell( outputFile ) - sizeof( WaveHeader );
  rewind( outputFile );

  // Read file header.
  WaveHeader waveHeader;
  bool isError = fileRead( outputFile, &waveHeader, sizeof( waveHeader ) );

  rewind( outputFile );

  waveHeader.chunkSize     = sampleSize + HEADER_SIZE;
  waveHeader.subChunkSize2 = sampleSize;

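  isError |= ( 1 != fwrite( &waveHeader, sizeof( waveHeader ), 1, outputFile ) );

  // Close the file so the finished header and samples are flushed to disk.
  isError |= ( 0 != fclose( outputFile ) );

  return isError;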
}

If I expand the MOD library to handle other formats I might also add the ability to work with 16-bit data. For now, 8-bit data is sufficient.