Maybe the last day of unseasonably warm temperatures. My ride to breakfast had temperatures around 43°F/6°C, but it had cooled to around 38°F/3°C by my noon ride home. The humidity was high enough that even the strong headwinds didn't make the ride uncomfortable. Last year at this time temperatures were below 0°F/-18°C. The National Weather Service says there is a 90% chance an El Niño (the warm phase of the El Niño–Southern Oscillation) will develop in 2019. So maybe not my last day of warm riding.
So during my Thanksgiving break and Christmas break I worked on a project involving CRCs that I hope to share here. The problem I’ve run into is, where do I start? How much background knowledge should I assume for this project? And what should I try to support?
This all began when I wrote a simple Python script that takes a template file and generates source code to perform any CRC. This is useful because there are a lot of different CRC types and several ways to implement them. I wanted a master template that would be able to generate C code for any CRC I want. That code works and is tested. Deploying it, however, is where the questions begin.
The last modification in my series about ticket locks is about aborting locks. In an ideal world every thread waiting on a mutex will eventually obtain it. However, there are times we would like a blocking function to stop blocking without getting the mutex. The most common example is during application shutdown, when we need all threads to stop blocking and rejoin the main thread.
Our ticket lock implementation uses conditional variables to signal threads waiting for the mutex that the mutex has been released. During an abort we simply need to signal the waiting threads, but with a flag that denotes the mutex is aborting. This information needs to be relayed to the parent function requesting the lock, and that function needs to check this status. If the lock didn't take place, the mutex wasn't obtained, and whatever operations were supposed to happen should be aborted. In C++, throwing an exception would be appropriate. For our C implementation, simply returning a failure code for the mutex lock will suffice.
There are just two changes and one new function. The ticket lock context will get a flag called isRunning. By default this flag is true. When forcibly shutting down, the flag is set to false. A new function called ticketLockShutdown sets the flag to false, and broadcasts a condition change to all blocking threads. The last change is in the while loop used to wait for the desired ticket number. It will additionally stop if isRunning is not set and return an error should that condition arise.
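A minimal sketch of these changes in C with pthreads. The names (TicketLock, ticketLockAcquire, and so on) are illustrative assumptions, not the actual implementation:

```c
#include <pthread.h>
#include <stdbool.h>

/* Illustrative sketch--names and layout are assumptions. */
typedef struct
{
  pthread_mutex_t mutex;
  pthread_cond_t  condition;
  unsigned        head;       /* Ticket number now being served.   */
  unsigned        tail;       /* Next ticket number to hand out.   */
  bool            isRunning;  /* Cleared to abort waiting threads. */
} TicketLock;

void ticketLockInit( TicketLock * lock )
{
  pthread_mutex_init( &lock->mutex, NULL );
  pthread_cond_init( &lock->condition, NULL );
  lock->head = 0;
  lock->tail = 0;
  lock->isRunning = true;
}

/* Returns 0 when the lock is obtained, -1 when aborted by shutdown. */
int ticketLockAcquire( TicketLock * lock )
{
  pthread_mutex_lock( &lock->mutex );
  unsigned ourTicketNumber = lock->tail++;

  /* Wait for our turn, but also stop if the lock is shutting down. */
  while ( lock->isRunning && ( lock->head != ourTicketNumber ) )
    pthread_cond_wait( &lock->condition, &lock->mutex );

  int result = lock->isRunning ? 0 : -1;
  pthread_mutex_unlock( &lock->mutex );
  return result;
}

void ticketLockRelease( TicketLock * lock )
{
  pthread_mutex_lock( &lock->mutex );
  lock->head += 1;
  pthread_cond_broadcast( &lock->condition );
  pthread_mutex_unlock( &lock->mutex );
}

/* The new function: clear the flag and wake every waiting thread. */
void ticketLockShutdown( TicketLock * lock )
{
  pthread_mutex_lock( &lock->mutex );
  lock->isRunning = false;
  pthread_cond_broadcast( &lock->condition );
  pthread_mutex_unlock( &lock->mutex );
}
```

Callers must check the return value of ticketLockAcquire and skip the critical section when it reports failure.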
First, what is meant by recursion? A mutex is usually used to isolate a resource so only one thread can use it at a time. Typically a mutex is used to wrap a critical section. The critical section starts with the mutex being locked and ends when the mutex is released. It is good practice to keep critical sections short so as not to starve other threads that may want the resource.
Now consider that you have a resource used in several smaller functions, but also in larger functions. Think of a scenario where there are functions to print a header, body, and footer, as well as a function to print the entire page that calls each of the preceding functions. The header, body and footer functions all use a mutex to lock the print resource. However, the page function wants to hold that same lock so the entire page is not interrupted.
If the page function calls the individual header, body and footer functions, there is the chance another thread could grab the mutex after the header or body and insert text. This is especially true with a ticket lock, where mutex request order is guaranteed. The page function could re-implement the header, body and footer functions and wrap them all with the mutex, but that is bad practice. What would be best is if the page function could simply request the mutex, call each of the sub-functions that also request the mutex, and then release the mutex. What we want is a recursive mutex.
A recursive mutex allows the same thread to make multiple requests for a mutex lock. With the most basic mutex, a thread could deadlock itself by simply requesting the same mutex twice. The first lock it will get, but the second lock it can never get because it didn't release the lock it already had. Many mutex implementations allow a thread to obtain a mutex if it is the same thread already holding the lock. In addition, such implementations also require the same number of unlocks as locks before the mutex is freed by a thread. That is, if a thread requests a mutex 3 times, it must release it 3 times before the mutex actually becomes free. We would like this functionality extended to a ticket lock.
So what additional requirements are there for a recursive ticket lock? The mutex only needs to know the ID of the thread which currently holds the lock, and a count of how many times the current thread has locked the mutex. If the mutex is already locked, but the locking thread ID matches the current thread ID, the lock count is increased and there is no block. When unlocking the mutex, the lock count is decreased. If the count is zero, the mutex is actually released. If the count is not zero, the lock remains.
You almost always want a recursive ticket lock implementation. The only item to consider is that you must have the same number of unlocks as locks or you can create a deadlock. Combined with single-entry/single-exit practices, starting and ending functions with lock/unlock virtually guarantees there will never be a deadlock. Since the mutex is recursive, all functions that use the resource can do their own lock/unlock and not have to worry about resource overlap or deadlock.
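The additions above can be sketched in C with pthreads. This is a sketch with illustrative names, assuming the basic ticket-lock fields (head, tail) plus the new owner and count:

```c
#include <pthread.h>

/* Illustrative sketch of a recursive ticket lock--names are
   assumptions, not the actual implementation. */
typedef struct
{
  pthread_mutex_t mutex;
  pthread_cond_t  condition;
  unsigned        head;       /* Ticket number now being served.    */
  unsigned        tail;       /* Next ticket number to hand out.    */
  pthread_t       owner;      /* Thread currently holding the lock. */
  unsigned        lockCount;  /* Times the owner has locked it.     */
} RecursiveTicketLock;

void recursiveTicketLockInit( RecursiveTicketLock * lock )
{
  pthread_mutex_init( &lock->mutex, NULL );
  pthread_cond_init( &lock->condition, NULL );
  lock->head = 0;
  lock->tail = 0;
  lock->lockCount = 0;
}

void recursiveTicketLockAcquire( RecursiveTicketLock * lock )
{
  pthread_mutex_lock( &lock->mutex );

  /* Already held by this thread?  Just bump the count--no blocking. */
  if ( ( lock->lockCount > 0 )
    && pthread_equal( lock->owner, pthread_self() ) )
  {
    lock->lockCount += 1;
    pthread_mutex_unlock( &lock->mutex );
    return;
  }

  /* Otherwise take a ticket and wait our turn as usual. */
  unsigned ourTicketNumber = lock->tail++;
  while ( lock->head != ourTicketNumber )
    pthread_cond_wait( &lock->condition, &lock->mutex );

  lock->owner = pthread_self();
  lock->lockCount = 1;
  pthread_mutex_unlock( &lock->mutex );
}

void recursiveTicketLockRelease( RecursiveTicketLock * lock )
{
  pthread_mutex_lock( &lock->mutex );
  lock->lockCount -= 1;

  /* Only the outermost unlock actually frees the lock. */
  if ( lock->lockCount == 0 )
  {
    lock->head += 1;
    pthread_cond_broadcast( &lock->condition );
  }

  pthread_mutex_unlock( &lock->mutex );
}
```

Note the owner field is only examined when lockCount is non-zero, so it never needs to be cleared on release.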
I wrote about the theory of a ticket lock and would now like to explore an example. First we need to demonstrate what can happen without a ticket lock. The implementations will use POSIX threads (pthreads).
In this example several threads are created that all want the print resource. A thread hogs the resource by releasing the lock and immediately asking for it again. Typically what one will see is a single thread printing its ID over and over, occasionally switching to another thread. The switch can happen if the operating system does a task switch just after the lock is released. That is rare, but it will eventually happen. The exact results will depend on the environment.
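The hog demonstration can be sketched like this (a sketch, not the original example: the iteration count is bounded here so the demo terminates, and the thread count and names are illustrative):

```c
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>

enum { THREAD_COUNT = 4, ITERATIONS = 5 };

static pthread_mutex_t printMutex = PTHREAD_MUTEX_INITIALIZER;
static int printCount = 0;

/* Each thread locks the print mutex, prints its ID, unlocks, and
   immediately loops around to ask for the lock again--the hog. */
static void * printThread( void * parameter )
{
  long id = (long)(intptr_t)parameter;

  for ( int count = 0; count < ITERATIONS; count += 1 )
  {
    pthread_mutex_lock( &printMutex );
    printf( "Thread %ld\n", id );
    printCount += 1;
    pthread_mutex_unlock( &printMutex );
  }

  return NULL;
}

/* Start the threads, wait for them all, return the total prints. */
int runHogDemo( void )
{
  pthread_t threads[ THREAD_COUNT ];

  for ( long id = 0; id < THREAD_COUNT; id += 1 )
    pthread_create( &threads[ id ], NULL, printThread,
                    (void *)(intptr_t)id );

  for ( int index = 0; index < THREAD_COUNT; index += 1 )
    pthread_join( threads[ index ], NULL );

  return printCount;
}
```

On most systems the output shows long runs of a single thread ID before switching, because the releasing thread usually wins the race to relock.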
Now the same example, but with the addition of ticket locks.
In this example, the first string of output can vary depending on the sleep time. However, that sequence will repeat thereafter.
Something to note about pthread conditional variables: there are two signaling functions available, pthread_cond_signal and pthread_cond_broadcast. The broadcast alerts all waiting threads, but the signal wakes just one of them. We need to use broadcast because the one thread that gets notified by a signal might not be the next ticket holder, and the process would deadlock.
I have found that the hog example behaves identically to the ticket lock example when compiled under Cygwin. This is likely because the mutex used by the pthreads port in this environment is already a ticket lock.
Over the summer I first encountered the need for what I learned is called a ticket lock. On a project this week I needed it again. So what is a ticket lock and why would it be needed?
First let’s start with a mutex. A mutex (mutual exclusion) is a mechanism used to limit access to a resource. Think of a setup where there is one pencil and several writers. The pencil can only be operated by one person at a time. If you have two people trying to write with the same pencil you are going to end up with bad results. We need a way to make sure only one person can have the pencil at a time.
In software the pencil could be a function that writes data to the screen, like printf. The people wanting to use the pencil are threads. Several threads could be running at once, but they must take turns writing messages to the screen. A mutex will help with this, and it is quite simple. Before any thread prints, it must acquire the mutex. Once it has the mutex, it can print. After it is done printing, the thread must release the mutex so other threads can use it. Only one thread can get the mutex at a time. If a second thread tries to get the mutex while another thread is writing, the thread is suspended until the mutex becomes available. Very simple.
There is one problem: mutex requests are not inherently fair. A thread that requests a mutex, releases it, and then requests it again is likely to get the mutex even if other threads have already requested it. This is because the operating system won’t check on the other threads waiting on the mutex until the current thread’s time slice has finished. This allows the creation of a mutex hog that can starve the other threads of the desired resource.
The solution is a fairly simple new type of mutex called a ticket lock. Like a normal mutex, the first thread to request a free resource simply acquires the mutex, uses the resource, and then releases it. The difference is in how threads that have to wait are handled. If the resource is busy, the requesting thread is given a number (ticket) and is suspended. Each time the mutex is released, all the waiting threads are notified. Each thread checks to see if the mutex is serving its ticket number. If so, that thread obtains the lock. If not, the thread goes back into suspension and continues to wait.
The reason this fixes a mutex hog is that if another thread wants the mutex being used by the hog, it is given a number. When the hog releases the mutex and tries to acquire it again, the hog is issued a number as well. Although the hog requested the mutex immediately after releasing it, the hog must wait for the other requests before it can again obtain the mutex.
In order to implement this we need not just a mutex but a conditional variable. This is a way to have multiple threads wait for some condition to occur. Some other thread can signal that the condition has occurred, and one or more threads can resume. This is done with a single mutex, and a normal conditional wait looks like this:
lock( conditionalMutex )
while ( not someCondition )
    conditionalWait( conditionContext, conditionalMutex )
unlock( conditionalMutex )
The conditional wait (conditionalWait) actually unlocks the mutex while waiting, but once the condition has been met the mutex is locked again. In this way, everything between the lock and unlock is mutex protected, with the exception of the conditionalWait call. The key item here is that the condition being waited on (someCondition) is mutex protected, so it cannot be modified while being checked or after it is found to be true. Somewhere there is a function that will signal the condition, like this:
lock( conditionalMutex )
...do something that affects someCondition ...
conditionalSignal( conditionContext )
unlock( conditionalMutex )
The signaling code needs to do something that changes the condition we are waiting for, and then signal waiting threads a change has taken place.
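In POSIX threads the same pattern looks like the following sketch. The condition here is a simple boolean flag for illustration; the pseudocode names map to pthread_cond_wait and pthread_cond_signal:

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t conditionalMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  conditionContext = PTHREAD_COND_INITIALIZER;
static bool someCondition = false;

/* Block until someCondition becomes true. */
void waitForCondition( void )
{
  pthread_mutex_lock( &conditionalMutex );

  /* pthread_cond_wait releases the mutex while waiting and
     reacquires it before returning, so the check is safe. */
  while ( ! someCondition )
    pthread_cond_wait( &conditionContext, &conditionalMutex );

  pthread_mutex_unlock( &conditionalMutex );
}

/* Change the condition under the mutex, then signal waiting threads. */
void signalCondition( void )
{
  pthread_mutex_lock( &conditionalMutex );
  someCondition = true;  /* ...do something that affects someCondition... */
  pthread_cond_signal( &conditionContext );
  pthread_mutex_unlock( &conditionalMutex );
}
```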
With conditional variables, the implementation is quite easy. We need two functions to make a mutex: lock and unlock. For the context we need a mutex, a conditional variable context, the next ticket number to be issued (tail), and the current ticket number being served (head).
The lock will look like this:
lock( ticket.mutex )
ourTicketNumber = ticket.tail
ticket.tail += 1
while ( ticket.head != ourTicketNumber )
    conditionalWait( ticket.conditionContext, ticket.mutex )
unlock( ticket.mutex )
The while loop is waiting for the ticket number being served to be our ticket number. The head and tail start off equal. Thus the first request for a lock will not have to wait because the head will equal the ticket number. If the ticket lock was not released, a second request would get the next ticket number, which would not equal the head, and it would have to wait.
Note that the context mutex (ticket.mutex) is only used to protect changes to the context, specifically head and tail. That is why it is unlocked after the while loop finishes. The ticket lock itself is held, but the context for the ticket lock is available to other threads.
Now the unlock function:
lock( ticket.mutex )
ticket.head += 1
conditionalBroadcast( ticket.conditionContext )
unlock( ticket.mutex )
The unlock function simply advances the head and signals waiting threads so they can check to see if it is their turn. Those threads that find a mismatch between the new head and their ticket number continue to wait, but the thread that finds the match will be able to run. If there are no threads waiting, the head and tail are again equal just like when the process started.
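The pseudocode translates almost directly to POSIX threads. Here is a sketch with illustrative names (not the actual implementation):

```c
#include <pthread.h>

/* Direct translation of the lock/unlock pseudocode--a sketch. */
typedef struct
{
  pthread_mutex_t mutex;            /* Protects head and tail.  */
  pthread_cond_t  conditionContext;
  unsigned        head;             /* Ticket being served.     */
  unsigned        tail;             /* Next ticket to hand out. */
} Ticket;

void ticketLock( Ticket * ticket )
{
  pthread_mutex_lock( &ticket->mutex );

  unsigned ourTicketNumber = ticket->tail;
  ticket->tail += 1;

  /* Wait until the ticket being served is ours. */
  while ( ticket->head != ourTicketNumber )
    pthread_cond_wait( &ticket->conditionContext, &ticket->mutex );

  pthread_mutex_unlock( &ticket->mutex );
}

void ticketUnlock( Ticket * ticket )
{
  pthread_mutex_lock( &ticket->mutex );

  /* Advance the head and wake every waiting thread so each can
     check whether its ticket is now being served. */
  ticket->head += 1;
  pthread_cond_broadcast( &ticket->conditionContext );

  pthread_mutex_unlock( &ticket->mutex );
}
```

Broadcast is used rather than pthread_cond_signal because the single thread woken by a signal might not hold the next ticket.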
Multitasking operating systems typically implement their own mutex and conditional variable functions, so the exact syntax varies. In future articles examples will be provided along with considerations for improvements.
A few days ago my new 400 GB micro SD card arrived. I formatted it to ext4 and synchronized it to the webpage directory on the Sun Dragon during the week, but didn’t have time until now to do the transition. The transition requires starting a backup server. I have a VM for this, but it turns out that VM hasn’t run since 2016, the last time I updated the SD card. I thought I would try to update it, but the updates made a mess of the VM and I abandoned the idea. Maybe one day I will update the VM, but it’s not worth it for the few minutes it will be running during the changeover.
The Sun Dragon had been running 123 days by this point—a fairly typical runtime for this machine. The backup server actually serves using the network archive of the website since this is synchronized once a day anyway. It only needs to synchronize the database and SSL certificates which takes a few seconds. Then it is ready to run. I did the changeover on Elmwood Gate (our router) and checked to make sure external computers were getting through. For this I used the Emerald Dragon.
The Sun Dragon went down at 4:06 pm. The drive change only took a minute, but I was concerned the machine would not restart. I had set up fstab not to stop if the SD card wasn’t present, but it stopped on another mount this time. It turns out network mounts will also hold up the boot, and in this case a network mount had been lost and a backup copied into the mount location. With a file in the mount location, the mount failed. After I fixed that, the Sun Dragon was back online. A quick change of the UUID for the new SD card and that was mounted. The synchronization failed to get the user IDs correct, so I had to change all of those manually. Afterward, the Sun Dragon was able to serve pages. At 4:28 pm I started the switch back to the Sun Dragon, and by 4:32 pm had verified everything was functional. At 4:34 pm the backup server was shut down. It ran DrQue.net for less than 30 minutes.
Now the Sun Dragon has a lot more free space.