Locking with Semaphores

With multi-thread programming, there's a need to lock or synchronize access (both mean the same, the latter is just a fancier expression).

What does it mean? It means when one thread is doing something on something that can be accessed by another thread, no other threads should not interfere with it.

For instance, we might be writing to a file. If other thread wants to read the file, it might read a mixture or what the first thread has written and the old content. A random mixture.

Another example could be a list of objects in memory. One thread is deleting an object from the list, another thread is reading all elements of the list, for instance, trying to find a person by his last name or some identification code. The second thread might catch the list in some inconsistent state, when the entry is only partially deleted. Or it might just be reading the deleted item. When the second thread wants to move to the next item, results might be catastrophic if the first thread deleted the item meanwhile. Depending on the implementation of the list, we might lose some data or even crash the program.

To summarize, we have three options:
  1. stick to one-thread programming (lame)
  2. prevent other threads accessing a structure that is being changed by a thread (locking)
  3. design shared structures in a way that they are always consistent, even if one thread is changing, it and others are reading it at the same time (lock-free design)
We'll first tackle the approach #2, and then see if we can do something on path #3. Locking means that other threads must wait. The simplest way is the most stupid, for instance we could have functions Lock and Unlock and a variable that represents a lock like this:
// PLEASE DON'T USE THIS.
void Lock(bool &Locked)
{
  while (Locked)
  {
     // just loop
  }
  Locked = true;  
}

void Unlock(bool &Locked)
{
  Locked = false;
}

bool g_lock = false;

void SomeFunc()
{
  // "lock" the list
  Lock(g_lock);
  // do something with it
  // unlock
  Unlock(g_lock);
}

void SomeOtherFunc()
{
  // "lock" the list
  Lock(g_lock);
  // do something else with it
  // unlock
  Unlock(g_lock);
}
What are problems with such Lock and Unlock functions? First, they are just running in a dead loop. That's waste of processor time: something else could be running during that time! But how we can voluntary put a thread on hold and give opportunity to some other thread? There's no standard way, this must be provided by the operating system. For instance, Windows have function Sleep(); we could write:
// PLEASE DON'T USE THIS EITHER.
#include <windows.h>

void Lock(bool &Locked)
{
  while (Locked)
  {
     // wait 10 ms
     Sleep(10)
  }
  Locked = true;  
}
What's wrong with this? There a subtle problem: the loop waits until Locked becomes false, meaning the other thread that was holding it, released the lock (or "unlocked" it). The we immediately set the lock to true, meaning other threads see it's locked. The problem is the word immediately.
It's not immediately, some time passes between we leave the loop and set the variable Locked to true.
During that time, another thread could intervene, execute Loop(), and set Locked to true. We would proceed to set Locked to true on our own, and even worse both threads would think that they "own" the lock. The result is a disaster. This indicates two things: design of locking algorithms is not easy. Surely, operating systems have provided ready-made locks. But how are those locks implemented? There must be some support by the hardware itself, either in forms of temporary banning switching between threads or by providing operations that cannot be interrupted. The simplest way to correctly lock is to use OS-provided objects called semaphores. They are offered by virtually every operating system. To make our code a bit portable, I'll use portability macros. If you don't use Windows, you will have to adapt them for your OS:
#ifdef _WIN32
#include <windows.h>
#define SEM_HANDLE HANDLE
#define SEM_INIT(s) s = CreateSemaphore(NULL, 0, 0x7FFFFFFF, NULL)
#define SEM_PUT(s) ReleaseSemaphore(s, 1, NULL)
#define SEM_WAIT(s) WaitForSingleObject(s, INFINITE)
#define WAIT_OK WAIT_OBJECT_0
#define SEM_DONE(s) CloseHandle(s); s = NULL;
#else
#error Define macros for your OS!
#endif
We'll have 4 functions:
// THIS WORKS BUT IT'S NOT THE SMARTEST WAY.
void Lock(SEM_HANDLE &Lock)
{
  SEM_WAIT(Lock);
}

void Unlock(SEM_HANDLE &Lock)
{
  SEM_PUT(Lock);
}

// call this before you use the lock
void Init(SEM_HANDLE& Lock)
{
  SEM_INIT(Lock);
  SEM_PUT(Lock);
}

SEM_HANDLE g_Lock;
These functions are mere renaming of OS-provided functions to create a semaphore and operate it. Semaphore is an OS object that can contain 0 or more "units". We have operation to add an "unit" and to take an "unit". If there's an unit available, taking of an unit succeeds (now really) immediately. If not, the call blocks until an unit becomes available, and then it's taken -- it cannot be taken by anyone else. If more than one thread is waiting on an "unit", and one becomes available, only one thread will become unblocked and get the "unit" -- after it the semaphore is again empty, and all other threads will be still waiting, possibly forever. A semaphore-lock is free if there's an "unit", and locked if there are no "units" available. That's why we in Init() create a semaphore and immediately put an "unit" into it. And we can use it like before:
void SomeFunc()
{
  // "lock" the list
  Lock(g_Lock);
  // do something with it
  // unlock
  Unlock(g_Lock);
}

void SomeOtherFunc()
{
  // "lock" the list
  Lock(g_Lock);
  // do something else with it
  // unlock
  Unlock(g_Lock);
}
This works. But it can be improved, sometimes. Download C source:

Introduction

I have been programming computers for almost 20 years. I still find programming fascinating.

There are many interesting things in programming. I will try to share my often lame attempts to write something useful. The real value is understanding how and why something works, and if I write something useful that's just a bonus.

All my knowledge is won the hard way, working and learning as I worked, try and error, reading books, trying to implement stuff, failing, making things better, succeeding, then failing again...

I'm mostly working on MS Windows, with free tools like Microsoft Visual Express, in C/C++. However, I'll try to write OS-independent code, extracting all specific parts. I will focus mostly on basic stuff, for example on thread synchronization and lock-free algorithms.

You can use anything I post here for anything you want, free of any charges etc, but it would be nice to acknowledge me somewhere in the documentation.

Also, English is not my native language. So excuse my vocabulary and grammar. If you want to know something about my native language, check my other blog.