Andrew Que Sites list Photos
Projects Contact
Main

December 26, 2018

Recursive Ticket Lock

I wrote about why one would need a ticket lock and how to implement one. Now I’d like to improve on this concept by adding recursion to the ticket lock.

First, what is meant by recursion? A mutex is usually used to isolate a resource so only one thread can use it at a time. Typically a mutex is used to a wrapped critical section. The critical section starts with the mutex being locked and ends when the mutex is released. It is a good practice to keep critical sections short as not to starve other threads that may want the resource.

Now consider that you have a resource used in several smaller functions, but also in larger functions. Think of a scenario where there are functions to print a header, body, and footer as well as a function to print the entire page that calls each of the proceeding functions. The header, body and footer functions all use a mutex to lock the print function. However, the print page function wants to lock the print function so that the entire page is not interrupted.

If the page function calls the individual header, body and footer functions, there is the chance another thread could grab the mutex after the header or body and insert text. This is especially true with a ticket lock where mutex request order is guaranteed. The page function could re-implement the header, body and footer functions and wrap them all with the mutex but that is bad pratice. What would be best is if the page function could simple request the mutex, call each of the sub-functions that also request the mutex, and then release the mutex. What we want is a recursive mutex.

A recursive mutex allows the same thread to make multiple requests for a mutex lock. The most basic mutex could dead lock itself by simply requesting the same mutex twice. The first lock it will get, but the second lock it can never get because it didn’t release the lock it already had. Many mutex implementation allow a thread to obtain a mutex if the it is the same thread already holding the lock. In addition, such implementations also require the same number of unlocks as locks before the mutex is freed by a thread. That is, if a thread requests a mutex 3 times, it must release it 3 times before the mutex actually becomes free. We would like this functionality extended to a ticket lock.

So what additional requirements are there for a recursive ticket lock? The mutex only needs to know the ID of the thread which currently holds the lock, and a count of how many times the current thread has locked the mutex. If the mutex is already locked, but the locking thread ID matches the current thread ID, the lock count is increased and there is no block. When unlocking the mutex, the lock count is decreased. If the count is zero, the mutex is actually released. If the count is not zero, the lock remains.

//=============================================================================
// Uses: Ticket lock with recursion example.
// Date: 2018-12-26
// Author: Andrew Que <https://www.DrQue.net>
// Notes:
//     Uses several threads that all hog a shared resource.  The ticket lock
//   will ensure all threads are able to access the resource equally.
// Compile:
//   gcc -Wall -Wextra -std=c99 -pedantic recursionExample.c -O2 -pthread -o example
//
//                              Andrew Que, 2018
//                                Public domain
//=============================================================================

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <pthread.h>
#include <time.h>

// Define implementation specifics about mutex.
typedef struct
{
  // Mutex and conditional instance.
  pthread_mutex_t lock;
  pthread_cond_t condition;

  // Id of the current thread that possesses the lock.
  pthread_t lockingThread;

  // Thread lock counts.
  // Used to track recursive locks.  Lock is free when count is 0.
  unsigned count;

  // Head/tail of mutex lock.
  // The head is the ticket being served, tail is the next trick number to
  // issue.
  unsigned volatile head;
  unsigned volatile tail;
} TicketLock;

//-----------------------------------------------------------------------------
// Uses:
//   Lock mutex.
// Input:
//   ticketLock - Pointer to context.
//-----------------------------------------------------------------------------
void ticketLockAcquire( TicketLock * ticketLock )
{
  bool skipLock = false;

  pthread_mutex_lock( &ticketLock->lock );

  // If the same thread is requesting the lock, increase the mutex count.
  pthread_t currentThread = pthread_self();
  if ( ( ticketLock->count > 0 )
    && ( currentThread == ticketLock->lockingThread ) )
  {
    ticketLock->count += 1;
    skipLock = true;
  }

  // If the mutex doesn't already belong to us...
  if ( ! skipLock )
  {
    // Get a ticket number.
    unsigned ticket = ticketLock->tail;
    ticketLock->tail += 1;

    // Wait for our ticket number.
    while ( ticketLock->head != ticket )
      pthread_cond_wait( &ticketLock->condition, &ticketLock->lock );

    // Note what thread has the lock (this thread).
    ticketLock->lockingThread = currentThread;

    // Now one waiting lock count.
    ticketLock->count += 1;
  }

  pthread_mutex_unlock( &ticketLock->lock );
}

//-----------------------------------------------------------------------------
// Uses:
//   Unlock mutex.  Must be called for each to `ticketLockAcquire`.
// Input:
//   ticketLock - Pointer to context.
//-----------------------------------------------------------------------------
void ticketLockRelease( TicketLock * ticketLock )
{
  pthread_mutex_lock( &ticketLock->lock );

  // Decrease the mutex count.
  ticketLock->count -= 1;

  // If the mutex is free.
  if ( 0 == ticketLock->count )
  {
    ticketLock->lockingThread = 0;

    // Advance to next mutex in line.
    ticketLock->head += 1;

    // Signal other threads the resource has been released.
    pthread_cond_broadcast( &ticketLock->condition );
  }

  pthread_mutex_unlock( &ticketLock->lock );
}

//-----------------------------------------------------------------------------
// Uses:
//   Setup a ticket mutex.  Must be called before mutex is used.
// Input:
//   ticketLock - Pointer to context to setup.
// Note:
//   Call `ticketLockDestroy` when done with mutex to release resources.
//-----------------------------------------------------------------------------
void ticketLockInitialize( TicketLock * ticketLock )
{
  // Initialize the mutex and conditional wait context.
  pthread_mutex_init( &ticketLock->lock, NULL );
  pthread_cond_init( &ticketLock->condition, NULL );

  ticketLock->lockingThread = 0;
  ticketLock->count = 0;
  ticketLock->head = 0;
  ticketLock->tail = 0;
}

//-----------------------------------------------------------------------------
// Uses:
//   Release mutex resources.  Call when mutex is no longer needed.
// Input:
//   ticketLock - Pointer to context to destory.
//-----------------------------------------------------------------------------
void ticketLockDestroy( TicketLock * ticketLock )
{
  pthread_mutex_destroy( &ticketLock->lock );
  pthread_cond_destroy( &ticketLock->condition );
}

enum { THREADS = 8 };
static pthread_t threads[ THREADS ];
static TicketLock printMutex;

static bool volatile isRunning = true;

//-----------------------------------------------------------------------------
// Uses:
//   Sleep specified number of milliseconds.
// Input:
//   ms - Milliseconds to sleep.  Must be less than 1000.
//-----------------------------------------------------------------------------
static void sleep_ms( unsigned ms )
{
  struct timespec sleepValue = { 00 };
  sleepValue.tv_nsec = ms * 1000000;
  nanosleep( &sleepValue, NULL );
}

//-----------------------------------------------------------------------------
// Uses:
//   Thread to hog print mutex.
// Input:
//   parameter - Thread ID.
// Output:
//   Output is unused and simple returns parameter.
//-----------------------------------------------------------------------------
static void * threadFunction( void * parameter )
{
  // Recast parameter as unsigned.
  // Double-cast needed.
  unsigned threadIndex = (unsigned)(uintptr_t)parameter;

  // Random sleep before thread begins.
  sleep_ms( rand() % 50 );

  while ( isRunning )
  {
    // Acquire lock multiple times prior to printing.
    ticketLockAcquire( &printMutex );
    ticketLockAcquire( &printMutex );
    ticketLockAcquire( &printMutex );

    // Print.
    printf( "Thread: %u\n", threadIndex );

    // Give up one of the locks.
    ticketLockRelease( &printMutex );

    // Waste time while holding the lock.
    sleep_ms( 10 + rand() % 50 );

    // Give up remaining locks.
    ticketLockRelease( &printMutex );
    ticketLockRelease( &printMutex );
  }

  return parameter;
}

//-----------------------------------------------------------------------------
// Uses:
//   Ticket lock example.
// Output:
//   Always returns 0.
//-----------------------------------------------------------------------------
int main()
{
  // Setup random number generator.
  srand( time( NULL ) );

  ticketLockInitialize( &printMutex );

  printf( "Hit a Enter to stop.\n" );

  // Create and start all threads.
  for ( unsigned index = 0; index < THREADS; index += 1 )
    pthread_create( &threads[ index ], NULL, threadFunction, (void*)(uintptr_t)index );

  // Wait for enter to be pressed.
  fgetc( stdin );

  // Signal shutdown.
  isRunning = false;

  // Wait for all threads to stop.
  for ( unsigned index = 0; index < THREADS; index += 1 )
    pthread_join( threads[ index ], NULL );

  ticketLockDestroy( &printMutex );

  return 0;
}

You almost always desire a recursive ticket lock implementation. The only item to consider is that you must have the same number of locks as unlocks or you can create a deadlock. Combine with single-entry single exit practices and starting/ending functions with lock/unlock virtually guarantees there will never be deadlock. Since the mutex is recursive all functions that use the resource can use their own lock/unlock and not have to worry about resource overlap or deadlock.

December 25, 2018

Ticket Lock in pthread

I wrote about the theory of a ticket lock and would now like to explore an example. First we need to demonstrate what can happen without a ticket lock. The implementations will use POSIX threads (pthreads).

//=============================================================================
// Uses: Resource hog example.
// Date: 2018-12-25
// Author: Andrew Que <https://www.DrQue.net>
// Compile:
//   gcc -Wall -Wextra -std=c99 -pedantic hogExample.c -O2 -pthread -o example
//
//                              Andrew Que, 2018
//                                Public domain
//=============================================================================

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <pthread.h>
#include <time.h>

enum { THREADS = 16 };
static pthread_t threads[ THREADS ];
static pthread_mutex_t printMutex;

static bool volatile isRunning = true;

//-----------------------------------------------------------------------------
// Uses:
//   Sleep specified number of milliseconds.
// Input:
//   ms - Milliseconds to sleep.  Must be less than 1000.
//-----------------------------------------------------------------------------
static void sleep_ms( unsigned ms )
{
  struct timespec sleepValue = { 00 };
  sleepValue.tv_nsec = ms * 1000000;
  nanosleep( &sleepValue, NULL );
}

//-----------------------------------------------------------------------------
// Uses:
//   Thread to hog print mutex.
// Input:
//   parameter - Thread ID.
// Output:
//   Output is unused and simple returns parameter.
//-----------------------------------------------------------------------------
static void * threadFunction( void * parameter )
{
  // Recast parameter as unsigned.
  // Double-cast needed.
  unsigned threadIndex = (unsigned)(uintptr_t)parameter;

  // Random sleep before thread begins.
  sleep_ms( rand() % 50 );

  while ( isRunning )
  {
    // Acquire lock prior to printing.
    pthread_mutex_lock( &printMutex );

    // Print.
    printf( "Thread: %u\n", threadIndex );

    // Waste time while holding the lock.
    sleep_ms( 10 + rand() % 50 );

    // Give up the lock.
    pthread_mutex_unlock( &printMutex );
  }

  return parameter;
}

//-----------------------------------------------------------------------------
// Uses:
//   Ticket lock example.
// Output:
//   Always returns 0.
//-----------------------------------------------------------------------------
int main()
{
  // Setup random number generator.
  srand( time( NULL ) );

  pthread_mutex_init( &printMutex, NULL );

  printf( "Hit a Enter to stop.\n" );

  // Create and start all threads.
  for ( unsigned index = 0; index < THREADS; index += 1 )
    pthread_create( &threads[ index ], NULL, threadFunction, (void*)(uintptr_t)index );

  // Wait for enter to be pressed.
  fgetc( stdin );

  // Signal shutdown.
  isRunning = false;

  // Wait for all threads to stop.
  for ( unsigned index = 0; index < THREADS; index += 1 )
    pthread_join( threads[ index ], NULL );

  pthread_mutex_destroy( &printMutex );

  return 0;
}

In this example several threads are created that all want the print resource. The thread hogs the resource by releasing the lock but immediately asking for the lock again. Typically what one will expect to see is a single thread printing its ID over and over, occasionally switching to another. The switch can happen if the operating system does a task switch just after the lock is released. Rare but it will eventually happen. The exact results will depend on the environment.

Now the same example, but with the addition of ticket locks.

//=============================================================================
// Uses: Ticket lock example.
// Date: 2018-12-25
// Author: Andrew Que <https://www.DrQue.net>
// Notes:
//     Uses several threads that all hog a shared resource.  The ticket lock
//   will ensure all threads are able to access the resource equally.
// Compile:
//   gcc -Wall -Wextra -std=c99 -pedantic ticketLockExample.c -O2 -pthread -o example
//
//                              Andrew Que, 2018
//                                Public domain
//=============================================================================

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <pthread.h>
#include <time.h>

// Context of a ticket lock.
typedef struct
{
  // Mutex and conditional instance.
  pthread_mutex_t lock;
  pthread_cond_t condition;

  // Head/tail of ticket.
  // The head is the ticket being served, tail is the next tricked number.
  unsigned volatile head;
  unsigned volatile tail;
} TicketLock;


//-----------------------------------------------------------------------------
// Uses:
//   Unlock mutex.  Must be called for each to `ticketLockAcquire`.
// Input:
//   ticketLock - Pointer to context.
//-----------------------------------------------------------------------------
void ticketLockRelease( TicketLock * ticketLock )
{
  pthread_mutex_lock( &ticketLock->lock );

  // Advance to next mutex in line.
  ticketLock->head += 1;

  // Signal other threads the resource has been released.
  pthread_cond_broadcast( &ticketLock->condition );
  pthread_mutex_unlock( &ticketLock->lock );
}

//-----------------------------------------------------------------------------
// Uses:
//   Lock mutex.
// Input:
//   ticketLock - Pointer to context.
// Note:
//   May unblock if the mutex has been signaled to shutdown.  Returns `true`
//   to denote an abnormal acquisition of resources.
//-----------------------------------------------------------------------------
void ticketLockAcquire( TicketLock * ticketLock )
{
  pthread_mutex_lock( &ticketLock->lock );

  // Get a ticket number.
  unsigned ticket = ticketLock->tail;
  ticketLock->tail += 1;

  // Wait for item to be added to queue.
  while ( ticketLock->head != ticket )
    pthread_cond_wait( &ticketLock->condition, &ticketLock->lock );

  pthread_mutex_unlock( &ticketLock->lock );
}

//-----------------------------------------------------------------------------
// Uses:
//   Setup a ticket mutex.  Must be called before mutex is used.
// Input:
//   ticketLock - Pointer to context to setup.
// Note:
//   Call `ticketLockDestroy` when done with mutex to release resources.
//-----------------------------------------------------------------------------
void ticketLockInitialize( TicketLock * ticketLock )
{
  // Initialize the mutex and conditional wait context.
  pthread_mutex_init( &ticketLock->lock, NULL );
  pthread_cond_init( &ticketLock->condition, NULL );

  ticketLock->head = 0;
  ticketLock->tail = 0;
}

//-----------------------------------------------------------------------------
// Uses:
//   Release mutex resources.  Call when mutex is no longer needed.
// Input:
//   ticketLock - Pointer to context to destory.
//-----------------------------------------------------------------------------
void ticketLockDestroy( TicketLock * ticketLock )
{
  pthread_mutex_destroy( &ticketLock->lock );
  pthread_cond_destroy( &ticketLock->condition );
}

enum { THREADS = 8 };
static pthread_t threads[ THREADS ];
static TicketLock printMutex;

static bool volatile isRunning = true;

//-----------------------------------------------------------------------------
// Uses:
//   Sleep specified number of milliseconds.
// Input:
//   ms - Milliseconds to sleep.  Must be less than 1000.
//-----------------------------------------------------------------------------
static void sleep_ms( unsigned ms )
{
  struct timespec sleepValue = { 00 };
  sleepValue.tv_nsec = ms * 1000000;
  nanosleep( &sleepValue, NULL );
}

//-----------------------------------------------------------------------------
// Uses:
//   Thread to hog print mutex.
// Input:
//   parameter - Thread ID.
// Output:
//   Output is unused and simple returns parameter.
//-----------------------------------------------------------------------------
static void * threadFunction( void * parameter )
{
  // Recast parameter as unsigned.
  // Double-cast needed.
  unsigned threadIndex = (unsigned)(uintptr_t)parameter;

  // Random sleep before thread begins.
  sleep_ms( rand() % 50 );

  while ( isRunning )
  {
    // Acquire lock prior to printing.
    ticketLockAcquire( &printMutex );

    // Print.
    printf( "Thread: %u\n", threadIndex );

    // Waste time while holding the lock.
    sleep_ms( 10 + rand() % 50 );

    // Give up the lock.
    ticketLockRelease( &printMutex );
  }

  return parameter;
}

//-----------------------------------------------------------------------------
// Uses:
//   Ticket lock example.
// Output:
//   Always returns 0.
//-----------------------------------------------------------------------------
int main()
{
  // Setup random number generator.
  srand( time( NULL ) );

  ticketLockInitialize( &printMutex );

  printf( "Hit a Enter to stop.\n" );

  // Create and start all threads.
  for ( unsigned index = 0; index < THREADS; index += 1 )
    pthread_create( &threads[ index ], NULL, threadFunction, (void*)(uintptr_t)index );

  // Wait for enter to be pressed.
  fgetc( stdin );

  // Signal shutdown.
  isRunning = false;

  // Wait for all threads to stop.
  for ( unsigned index = 0; index < THREADS; index += 1 )
    pthread_join( threads[ index ], NULL );

  ticketLockDestroy( &printMutex );

  return 0;
}

In this example, the first string of output can vary depending on the sleep time. However, that sequence will repeat thereafter.

Something to note about pthread conditional variables. There are two signaling functions available, pthread_cond_signal and pthread_cond_broadcast. The broadcast alerts all waiting threads, but the signal will just signal one of the threads. We need to use broadcast because the thread that gets notified in a single signal might not be the next ticket holder and the process will deadlock.

I have found that the hog example is identical to the ticket lock example when compiled under Cygwin. This is likely because the mutex the port of pthreads is using in this environment is already a ticket lock.

Over the summer I first encountered the need from what I learned is called a ticket lock. On a project this week I needed it again. So what is a ticket lock and why would it be needed?

First let’s start with a mutex. A mutex (mutual exclusion) is a mechanism used to limit access to a resource. Think of a setup where there is one pencil and several writers. The pencil can only be operated by one person at a time. It you have two people trying to write with the same pencil you are going to end up with bad results. We need a way to make sure only one person can have the pencil at a time.

In software the pencil could be a function that writes data to the screen, like printf. The people wanting to use the pencil are threads. Several threads could be running at once, but they must take turns writing messages to the screen. A mutex will help with this, and it is quite simple. Before any thread prints, it must acquire the mutex. Once it has the mutex, it can print. After it is done printing, the thread must release the mutex so other threads can use it. Only one thread can get the mutex at a time. If a second thread tries to get the mutex while another thread is writing, the thread is suspended until the mutex becomes available. Very simple.

There is one problem. A mutex requests are not inherently fair. A thread that requests a mutex, releases it, and then requests it again is likely to get the mutex even if there are other threads that have requested the mutex. This is because the operating system won’t check on the other threads waiting on the mutex until the current thread’s time slice has finished. This allows the creation of a mutex hog that can starve the other threads of the desired resource.

The solution is fairly simple new type of mutex called a ticket lock. Like a normal mutex, the first thread to request the a free resource simply acquires the mutex, uses the resource, and then releases it. The difference is in the how threads that have to wait are handled. If the resource is busy, the requesting thread is given a number (ticket) and is suspended. Each time the mutex is released, all the waiting threads are notified. Each thread checks to see if the mutex is serving their ticket number. If so, that thread obtains the lock. If not, the thread goes back into suspension and continues to wait.

The reason this fixes a mutex hog is that if another thread wants the mutex being used by the hog, it is given a number. When the hog releases the mutex and tries to require it again, the hog is issued a number. Although the hog requested the mutex immediately after releasing it, the hog must wait for the other requests before it can again obtain the mutex.

In order to implement this we need not just a mutex but a conditional variable. This is a way to have multiple threads wait for some condition to occur. Some other thread can signal that condition has occurred and one or more thread can resume. This is done with a single mutex, and a normal conditional wait looks like this:

lock( conditionalMutex )

while ( not someCondition )

conditionalWait( conditionContext, conditionalMutex )

...use mutex…

unlock( conditionalMutex )

The conditional wait (conditionalWait) actually unlocks the mutex while waiting, but once the condition has been met the mutex is locked again. In this way, everything between the lock and unlock is mutex protected with the exception of the conditionalWait call. The key item here is condition being waited on (someCondition) is mutex protected so it cannot be modified while being checked or after it is found to be true. Somewhere is a function that will signal the condition, like this:

lock( conditionalMutex )

...do something that effects someCondition ...

conditionalSignal( conditionContext )

unlock( conditionalMutex )

The signaling code needs to do something that changes the condition we are waiting for, and then signal waiting threads a change has taken place.

With conditional variables implementation is quite easy. We need two functions to make a mutex: lock and unlock. For the context we need a mutex, conditional variable context, the next ticket number to be issued (tail), and the current ticket number being served (head).

The lock will look like this:

lock( ticked.mutex )

ourTicketNumber = ticket.tail

ticket.tail += 1

while ( ticket.head != ourTicketNumber )

conditionalWait( ticket.conditionContext, ticket.mutex )

unlock( ticked.mutex )

The while loop is waiting for the ticket number being served to be our ticket number. The head and tail start off equal. Thus the first request for a lock will not have to wait because the tail will equal the ticket number. If the ticket mutex was not unlocked and second request would get the next highest ticket number which would not be equal to the tail, and it would have to wait.

Note that the context mutex (ticket.mutex) is only used to protect changes to the context—specifically head and tail. That is why it is unlocked after the while loop finishes. The ticket mutex is locked, but the context for the ticket mutex is available to other threads.

Now the unlock function:

lock( ticked.mutex )

ticket.head += 1

conditionalSignal( conditionContext )

unlock( conditionalMutex )

The unlock function simply advances the head and signals waiting threads so they can check to see if it is their turn. Those threads that find a mismatch between the new head and their ticket number continue to wait, but the thread that finds the match will be able to run. If there are no threads waiting, the head and tail are again equal just like when the process started.

Multitasking operating system typically implement their own mutex and conditional variable functions, so the exact syntax of this varies. In future articles example will be provided along with considerations for improvements.

June 27, 2018

C compilers and circular shifts

I was evaluating some pseudorando m number generator (PRNG) and started wondering about compiler optimization. Many modern processors have a circular shift instruction, but languages such as C have no direct support for this. As a result, optimization has been designed to recognize circular shifts.

First, a circular shift:

output = ( input << shift ) | ( input >> ( sizeof( input ) * 8 - shift ) )

As noted most C compiler recognize this line of code as a circular shift and are able to implement the circular shift instruction rather than using two shifts and an OR.

I analyzed one of the smallest, fastest, and highest entropy quality PRNGs I tested was Xoroshiro128+. It is able to produce 1 TB of random data in 3 minutes (on a 3 GHz Core i7). The algorithm is quite simple:

result = state0 + state1
state1 = state0 ^ state1
state0 = ( state0 <<< 55 ) ^ state1 ^ ( state1 << 14 )
state1 = ( state1 <<< 36 )

All values are 64-bit.  The triple less than sign here means a circular shift left. The algorithm has a 128-bit state, meaning a period of 2128 (340 undecillion, or 340 trillion trillion trillion). On a native 64-bit machine the assembly is quite small.

64-bit x86 64-bit ARM
mov rax, QWORD PTR [rcx]
mov r9, QWORD PTR 8[rcx]
mov rdx, rax
mov r8, rax
add rax, r9
xor rdx, r9
ror r8, 9
mov r10, rdx
xor r8, rdx
ror rdx, 28
sal r10, 14
mov QWORD PTR 8[rcx], rdx
xor r8, r10
mov QWORD PTR [rcx], r8
ret
mov x3, x0
ldp x2, x0, [x0]
eor x1, x2, x0
add x0, x2, x0
eor x2, x1, x2, ror 9
ror x4, x1, 28
eor x1, x2, x1, lsl 14
stp x1, x4, [x3]
ret

Note that although the ARM assembly is less instructions, the x86 assembly can do several steps in parallel. I found this was an interesting exploration of both an algorithm and of modern C compilers.