Rudiments of Java concurrency control, part 1

Coordination of threads is a key part of implementing software that exhibits significant concurrency. It seems to be particularly difficult in Java. That's not because of any deficiency in Java itself -- the language has a number of first-class features for handling concurrency -- but because of the environment in which Java tends to be used.

Java is a popular choice for middleware systems, in which large numbers of concurrent clients share access to a small number of back-end systems -- typically relational databases. In production applications, it's not unusual for each client interaction to create a new thread, and for there to be thousands of threads competing for the same resources. We use locks and semaphores to control access to these resources, if they are not themselves thread-safe. So understanding these concurrency-control mechanisms is essential for successful programming in this kind of environment.

The way that the JVM handles concurrency has not changed significantly for at least ten years. Even Java 1.1 had an API for thread management. And yet this topic seems to be poorly understood, even now.

This is the first (I hope) in a series of articles that describe Java's concurrency management in some detail. I'll include some investigation of Java thread dumps, so we can see what's going on inside the JVM. An ability to understand thread dumps is crucial to troubleshooting concurrency-related problems, although nothing makes this task straightforward.

Note:
I'm assuming the reader is familiar with Java in general, and knows the basics of creating and running threads. I tested my examples with Java 17 but, frankly, they should work on Java 1.1 as well.

In this first article I'll look at coordination of threads using simple locks on a monitor object.

Locking an object as a monitor

Probably the most fundamental means of controlling concurrency is to use a specific object as a monitor. The object may itself be a subject of concurrency control -- that is, it may be some non-thread-safe object that we must regulate access to -- or it might be an arbitrary object just used as a lock. In this latter case, we'll probably use the monitor object to regulate access to some particular part of code, which is considered to be a non-concurrent section. A non-concurrent section is some piece of code that should not be entered by multiple concurrent threads. In the examples that follow, the 'non-concurrent' operation -- the thing that should not be entered concurrently -- is a method delay(). All this method does is sleep, which doesn't really raise any thread-safety concerns; but this method simulates some real-world operation which we consider to be non-thread-safe.

Note:
A non-concurrent section is not exactly the same as a critical section, although these terms are often used interchangeably. A critical section is one that should not be interrupted for anything, not just a different thread in the same application.

It's common in Java programming to use this as the monitor object -- a practice that can be troublesome if the underlying mechanism is not properly understood.

In the following example I use a simple String as the monitor object, although I could use any object. In many cases using a trivial object is entirely appropriate -- particularly if all you want to do is to define a non-concurrent section.

I've made the monitor object static here, because it's referenced from a static context -- the code won't compile otherwise. However, in a real application you may want to make it static anyway, so that there is only one instance, rather than a separate instance for each instance of the class that owns it. Getting this sort of thing wrong is a common cause of failed synchronization, which I'll come back to later.
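Separately from the Test1 listing further down, here is a minimal sketch of the static-versus-instance point, assuming Java 1.4 or later for Thread.holdsLock(). The class LockScopeDemo and its field names are made up for illustration. It shows that holding one instance's lock says nothing about another instance's lock, whereas a static lock is a single object for the whole class.

```java
// Hypothetical class, not one of the article's examples
class LockScopeDemo
  {
  // One lock for the whole class, whichever instance is in use
  static final Object sharedLock = new Object();

  // One lock per instance: it only excludes threads that are
  // working with the same LockScopeDemo object
  final Object instanceLock = new Object();

  // Takes this object's instance lock, then reports whether the
  // current thread thereby holds the other object's instance lock
  boolean instanceLockAlsoLocks (LockScopeDemo other)
    {
    synchronized (instanceLock)
      {
      return Thread.holdsLock (other.instanceLock);
      }
    }

  public static void main (String[] args)
    {
    LockScopeDemo a = new LockScopeDemo();
    LockScopeDemo b = new LockScopeDemo();
    System.out.println (a.instanceLockAlsoLocks (b)); // prints false
    System.out.println (a.instanceLockAlsoLocks (a)); // prints true
    }
  }
```

If two threads each had their own LockScopeDemo and synchronized on instanceLock, neither would ever wait for the other. Synchronizing on sharedLock would make them take turns.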

Note:
There are some dubious coding practices in this article -- repetition, lack of exception handling, among other things. I've written the code like this to focus on the threading behaviour, which is the relevant part.

public class Test1
  {
  static String s = new String(); // This is the monitor object

  private static void delay()
    {
    try { Thread.sleep (1000); } catch (Exception e){};
    }

  private static void shortDelay()
    {
    try { Thread.sleep (10); } catch (Exception e){};
    }

  public static void main (String[] args)
    {
    // Define and start thread 1
    Runnable r1 = new Runnable()
      {
      public void run ()
        {
        while (true)
          {
          synchronized (s)
            { // Start non-concurrent section
            System.out.println ("r1 in");
            delay();
            System.out.println ("r1 out");
            } // End non-concurrent section
          shortDelay();
          }
        }
      };
    Thread t1 = new Thread (r1);
    t1.start();

    // Define and start thread 2
    Runnable r2 = new Runnable()
      {
      public void run ()
        {
        while (true)
          {
          synchronized (s)
            { // Start non-concurrent section
            System.out.println ("r2 in");
            delay();
            System.out.println ("r2 out");
            } // End non-concurrent section
          shortDelay();
          }
        }
      };
    Thread t2 = new Thread (r2);
    t2.start();
    }
  }

In the example, the main method creates two threads, each of which runs a near-identical endless loop. The only differences between the two threads are the messages printed when entering and leaving the synchronized section. The monitor object s defines the non-concurrent sections, between the "r... in" and "r... out" println calls. If the synchronization is working properly, the output of the program will be something like this:

r1 in
r1 out
r2 in
r2 out
...

What we shouldn't see is the two threads being interleaved, that is, we should never see this:

r1 in
r2 in
r1 out
r2 out
...
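Watching the output scroll past is not a very rigorous test. As a rough sketch, assuming Java 8 or later, we could instead count how many threads are inside the non-concurrent section at once and record the highest figure seen. The class InterleaveCheck and its counters are my own invention for this purpose, and the loop is finite so the program ends.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical checker, not one of the article's examples
class InterleaveCheck
  {
  static final Object lock = new Object(); // The monitor object
  static final AtomicInteger inside = new AtomicInteger();    // Threads in the section now
  static final AtomicInteger maxInside = new AtomicInteger(); // Highest value seen

  static void work()
    {
    for (int i = 0; i < 50; i++)
      {
      synchronized (lock)
        { // Start non-concurrent section
        int n = inside.incrementAndGet();
        maxInside.accumulateAndGet (n, Math::max);
        try { Thread.sleep (1); } catch (InterruptedException e) {}
        inside.decrementAndGet();
        } // End non-concurrent section
      }
    }

  // Runs two threads to completion and returns the largest number
  // of threads that were ever inside the section together
  static int run()
    {
    Thread t1 = new Thread (InterleaveCheck::work);
    Thread t2 = new Thread (InterleaveCheck::work);
    t1.start();
    t2.start();
    try { t1.join(); t2.join(); }
    catch (InterruptedException e) { throw new IllegalStateException (e); }
    return maxInside.get();
    }

  public static void main (String[] args)
    {
    System.out.println ("max threads inside: " + run()); // prints 1
    }
  }
```

With the synchronized block in place the figure is always 1. Take the synchronized block away and it will usually be 2, although a race of this kind is never guaranteed to show itself on any particular run.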

Why do we also need the calls to shortDelay()? Without them one of the two threads is likely to seize the lock and, as soon as it releases it, grab it again. So it's plausible that only one of the two threads will ever get any attention from the CPU. The short delay -- which could be any sequence of operations in practice -- just provides time for the lock to switch between threads. Nothing in the JVM, so far as I know, guarantees a particular level of service for a thread -- we have to be careful not to allow unconstrained locking.

Let's see what this program does in the JVM, by taking a thread dump when it's running (e.g., by running jstack):

"Thread-0" #13 prio=5 os_prio=0 cpu=5.75ms elapsed=15.27s tid=0x00007f5ff81357e0 nid=0xad259 waiting for monitor entry  [0x00007f5fcd2fd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at Test1$1.run(Test1.java:28)
        - waiting to lock <0x000000008e119260> (a java.lang.String)
        at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)

"Thread-1" #14 prio=5 os_prio=0 cpu=5.48ms elapsed=15.27s tid=0x00007f5ff8136920 nid=0xad25a waiting on condition  [0x00007f5fcd1fd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@17.0.9/Native Method)
        at Test1.delay(Test1.java:10)
        at Test1$2.run(Test1.java:48)
        - locked <0x000000008e119260> (a java.lang.String)
        at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)

Note:
There's no way to tell which of Thread-0 and Thread-1 in the thread dump correspond to the threads in my Java code. That's harmless here, because they do the same thing. In practice, I'd want to name my threads, to make the thread dump easier to follow.
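Naming a thread only takes an extra constructor argument. Here is a minimal sketch, assuming Java 8 or later for the lambda in the usage. The helper startNamed() and the names "worker-r1" and "worker-r2" are invented for illustration.

```java
// Hypothetical helper, not one of the article's examples
class NamedThreads
  {
  // Creates and starts a thread with the given name. The name is
  // what appears in quotes at the start of the thread's entry in a
  // thread dump, in place of "Thread-0", "Thread-1", and so on.
  static Thread startNamed (String name, Runnable r)
    {
    Thread t = new Thread (r, name);
    t.start();
    return t;
    }

  public static void main (String[] args)
    {
    Runnable r = () ->
      System.out.println ("running on " + Thread.currentThread().getName());
    startNamed ("worker-r1", r);
    startNamed ("worker-r2", r);
    }
  }
```

The same effect can be had by calling setName() on an existing Thread, which is useful when the thread is created by code we don't control.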

We can see that one thread has locked a java.lang.String -- that's the monitor object -- and is now sleeping. That's because it's executing the call to delay(). In reality, the application will be doing some real work here, although the thread might still be sleeping, if it's waiting for some external system.

The other thread is marked as BLOCKED, and it's waiting to lock a java.lang.String. You can see that the object ID of the monitor object is the same in both threads, which is how we know that the same object is being used for locking.

Note:
The line number in the thread dump, where it says 'waiting to lock...' is the line where execution will continue when the lock is acquired, not the last line executed.

When troubleshooting synchronization problems in production, it's quite common to see a single thread which has locked a specific monitor object, and thousands of other threads waiting to lock it. This might be because, for example, the thread that has the lock is stuck waiting for some external system to respond. Having the object ID of the monitor object in the thread dump makes these kinds of problems comparatively easy to troubleshoot. We won't always be so lucky.
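The same information can be had from inside the application, using the standard java.lang.management API. What follows is only a sketch, assuming Java 8 or later. The class LockOwnerDemo, its methods, and the thread names "holder" and "waiter" are all made up. It sets up one thread that holds a lock and another that is blocked on it, then asks the JVM who owns the lock that the second thread wants. Note that Thread.getId() is deprecated in recent Java releases in favour of threadId(); I use it here because it works on Java 17.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demonstration, not one of the article's examples
class LockOwnerDemo
  {
  static final Object lock = new Object();

  // Returns the name of the thread that owns the monitor that
  // 'waiter' is blocked on, or null if there is no such thread
  static String ownerOfLockBlocking (Thread waiter)
    {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    ThreadInfo info = mx.getThreadInfo (waiter.getId());
    return info == null ? null : info.getLockOwnerName();
    }

  static String demo()
    {
    CountDownLatch locked = new CountDownLatch (1);
    CountDownLatch release = new CountDownLatch (1);

    Thread holder = new Thread (() ->
      {
      synchronized (lock)
        {
        locked.countDown(); // Signal that we have the lock
        try { release.await(); } catch (InterruptedException e) {}
        }
      }, "holder");

    Thread waiter = new Thread (() ->
      {
      synchronized (lock) { } // Blocks until holder lets go
      }, "waiter");

    try
      {
      holder.start();
      locked.await();
      waiter.start();
      // Poll until the waiter is stuck at its synchronized statement
      while (waiter.getState() != Thread.State.BLOCKED) Thread.sleep (5);
      String owner = ownerOfLockBlocking (waiter);
      release.countDown();
      holder.join();
      waiter.join();
      return owner;
      }
    catch (InterruptedException e) { throw new IllegalStateException (e); }
    }

  public static void main (String[] args)
    {
    System.out.println ("lock owned by: " + demo()); // prints holder
    }
  }
```

In a real application something like ownerOfLockBlocking() might be called from a diagnostic endpoint or a watchdog, but that is a design decision beyond the scope of this sketch.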

Using `synchronized(this)` to mark a non-concurrent section

In the previous section I showed the use of a monitor object to create a lock on a non-concurrent section of code -- something that should not be entered on another thread. In Java programming, using synchronized(this) to create a non-concurrent section is idiomatic, although it's not without its problems, as I'll explain later.

I've rewritten the previous example so that now the main method creates an instance of its own class -- Test2 -- and then creates two threads. These threads call the methods do1() and do2() in the instance of Test2 in an endless loop. We couldn't use this in the previous example, because everything was static. In all other particulars, the application behaves in the same way.

public class Test2
  {
  static String s = new String();

  private static void delay()
    {
    try { Thread.sleep (1000); } catch (Exception e){};
    }

  private static void shortDelay()
    {
    try { Thread.sleep (10); } catch (Exception e){};
    }

  public void do1()
    {
    synchronized (this)
      { // Start non-concurrent section
      System.out.println ("r1 in");
      delay();
      System.out.println ("r1 out");
      } // End non-concurrent section
    shortDelay();
    }

  public void do2()
    {
    synchronized (this)
      { // Start non-concurrent section
      System.out.println ("r2 in");
      delay();
      System.out.println ("r2 out");
      } // End non-concurrent section
    shortDelay();
    }

  public static void main (String[] args)
    {
    final Test2 test2 = new Test2(); // final, so older compilers allow its use in the inner classes

    // Define and start thread 1
    Runnable r1 = new Runnable()
      {
      public void run ()
        {
        while (true)
          {
          test2.do1(); // Call do1() repeatedly
          }
        }
      };
    Thread t1 = new Thread (r1);
    t1.start();

    // Define and start thread 2
    Runnable r2 = new Runnable()
      {
      public void run ()
        {
        while (true)
          {
          test2.do2(); // Call do2() repeatedly
          }
        }
      };
    Thread t2 = new Thread (r2);
    t2.start();
    }
  }

The output of this program should be the same as the previous example: we don't expect to see any occasion where the non-concurrent sections in do1() and do2() are interleaved.

The synchronization in this example works in exactly the same way as in the previous one, except that the monitor object is the instance of the class itself. It's less versatile than creating a specific monitor object, but very easy to read. We can see the instance being used as a monitor by looking at a thread dump again:

"Thread-0" #13 prio=5 os_prio=0 cpu=5.56ms elapsed=14.70s tid=0x00007fb06c13d570 nid=0xad6db waiting on condition  [0x00007faff6afd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@17.0.9/Native Method)
        at Test2.delay(Test2.java:10)
        at Test2.do1(Test2.java:23)
        - locked <0x000000008e119b08> (a Test2)
        at Test2$1.run(Test2.java:51)
        at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)

"Thread-1" #14 prio=5 os_prio=0 cpu=5.69ms elapsed=14.70s tid=0x00007fb06c13e630 nid=0xad6dc waiting for monitor entry  [0x00007faff69fd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at Test2.do2(Test2.java:33)
        - waiting to lock <0x000000008e119b08> (a Test2)
        at Test2$2.run(Test2.java:65)
        at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)

Here we see that 'a Test2' is being locked and waited on, in just the same way as the String in the previous example.

Synchronized methods

Java provides a neat way to denote an entire method as a non-concurrent section, using the synchronized keyword:

public synchronized void myCriticalSection(...) {...}

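This is nicely readable, but has the same problems as synchronized(this) does -- in fact, for an instance method like this one, it's implemented in the JVM as a lock on this. (A static synchronized method has no this, and locks the Class object instead.)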
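We can check which object a synchronized method locks by using Thread.holdsLock(), which reports whether the current thread holds the monitor of a given object. This is a minimal sketch; the class SyncMethodDemo and its method names are made up for the purpose.

```java
// Hypothetical class, not one of the article's examples
class SyncMethodDemo
  {
  // Instance method: the lock is taken on 'this'
  public synchronized boolean holdsThis()
    {
    return Thread.holdsLock (this);
    }

  // Static method: there is no 'this', so the lock is taken on
  // the Class object
  public static synchronized boolean holdsClass()
    {
    return Thread.holdsLock (SyncMethodDemo.class);
    }

  // Not synchronized, so no lock is held on entry
  public boolean unsynchronizedHoldsThis()
    {
    return Thread.holdsLock (this);
    }

  public static void main (String[] args)
    {
    SyncMethodDemo d = new SyncMethodDemo();
    System.out.println (d.holdsThis());               // prints true
    System.out.println (holdsClass());                // prints true
    System.out.println (d.unsynchronizedHoldsThis()); // prints false
    }
  }
```

One consequence is that a synchronized instance method and a synchronized static method in the same class do not exclude one another, because they lock different objects.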

Why we have to be careful with `synchronized(this)`

The use of synchronized(this) is so idiomatic that it's easy to forget that it's not magic. Here is an example that fails: the apparent non-concurrent section is not protected from concurrent execution. This example looks very similar to the previous one, and it's not immediately obvious why it fails.

public class Test3
  {
  private static void delay()
    {
    try { Thread.sleep (1000); } catch (Exception e){};
    }

  private static void shortDelay()
    {
    try { Thread.sleep (10); } catch (Exception e){};
    }

  public void run()
    {
    // Define and start thread 1
    Runnable r1 = new Runnable()
      {
      public void run ()
        {
        while (true)
          {
          synchronized (this)
            { // Start of apparent non-concurrent section
            System.out.println ("r1 in");
            delay();
            System.out.println ("r1 out");
            } // End of apparent non-concurrent section
          shortDelay();
          }
        }
      };
    Thread t1 = new Thread (r1);
    t1.start();

    // Define and start thread 2
    Runnable r2 = new Runnable()
      {
      public void run ()
        {
        while (true)
          {
          synchronized (this)
            { // Start of apparent non-concurrent section
            System.out.println ("r2 in");
            delay();
            System.out.println ("r2 out");
            } // End of apparent non-concurrent section
          shortDelay();
          }
        }
      };
    Thread t2 = new Thread (r2);
    t2.start();
    }

  public static void main (String[] args)
    {
    new Test3().run();
    }
  }

A thread dump shows the problem:

"Thread-0" #13 prio=5 os_prio=0 cpu=6.65ms elapsed=11.35s tid=0x00007fa158135460 nid=0xad399 waiting on condition  [0x00007fa13c5fd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@17.0.9/Native Method)
        at Test3.delay(Test3.java:22)
        at Test3$1.run(Test3.java:42)
        - locked <0x000000008e119e00> (a Test3$1)
        at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)

"Thread-1" #14 prio=5 os_prio=0 cpu=6.00ms elapsed=11.35s tid=0x00007fa1581365a0 nid=0xad39a waiting on condition  [0x00007fa13c4fd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@17.0.9/Native Method)
        at Test3.delay(Test3.java:22)
        at Test3$2.run(Test3.java:62)
        - locked <0x000000008e11ac40> (a Test3$2)
        at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)

You can see that both threads have been able to lock their monitor objects (see 'locked...'). Neither thread is waiting on the other. The reason that neither thread blocks is that, in fact, each thread has a different monitor object.

How can this be, when both threads synchronize on this? We can see from the thread dump that for one thread, this is an instance of Test3$1, and for the other it's an instance of Test3$2. These classes with $ in their names are actually the anonymous inner classes created by

  Runnable r1 = new Runnable()
      {
      public void run ()
        {
        ...

The compiler has created numbered classes based on the name of the main class Test3, because we've tacitly created two classes that implement Runnable. Test3 itself is instantiated, by 'new Test3()' in main, but that instance is never used as a monitor: inside each anonymous class, this refers to the anonymous object, not to the enclosing Test3.

Code like this will fail in unpredictable ways, usually under conditions of high load, and such problems are miserable to troubleshoot.

The moral of this story is that, when you use synchronized(this), be sure you understand what this actually refers to.
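One way to repair the failing example, if we really do want the enclosing object as the monitor, is to name it explicitly: inside an inner class, Outer.this refers to the enclosing instance. The sketch below assumes Java 7 or later. The class Test3Fixed, its log list and its helper methods are my own additions, and the loops are finite so that the result can be checked. A shared lock object, as in the first example, would do the job equally well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical corrected version of the failing example
class Test3Fixed
  {
  // Only touched while holding the Test3Fixed instance's monitor
  private final List<String> log = new ArrayList<>();

  private Runnable worker (final String name, final int iterations)
    {
    return new Runnable()
      {
      public void run ()
        {
        for (int i = 0; i < iterations; i++)
          {
          // Test3Fixed.this is the enclosing instance, shared by both
          // workers; plain 'this' would be this Runnable alone
          synchronized (Test3Fixed.this)
            { // Start non-concurrent section
            log.add (name + " in");
            try { Thread.sleep (2); } catch (InterruptedException e) {}
            log.add (name + " out");
            } // End non-concurrent section
          }
        }
      };
    }

  // Runs two workers to completion and returns what they logged
  List<String> run()
    {
    Thread t1 = new Thread (worker ("r1", 20));
    Thread t2 = new Thread (worker ("r2", 20));
    t1.start();
    t2.start();
    try { t1.join(); t2.join(); }
    catch (InterruptedException e) { throw new IllegalStateException (e); }
    return log;
    }

  // True if every "in" is followed at once by the matching "out"
  static boolean neverInterleaved (List<String> log)
    {
    for (int i = 0; i + 1 < log.size(); i += 2)
      {
      String who = log.get (i).split (" ")[0];
      if (!log.get (i).equals (who + " in")) return false;
      if (!log.get (i + 1).equals (who + " out")) return false;
      }
    return log.size() % 2 == 0;
    }

  public static void main (String[] args)
    {
    System.out.println (neverInterleaved (new Test3Fixed().run())); // prints true
    }
  }
```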

There's another subtlety in the use of this which is often discussed -- sometimes heatedly -- but I'm not sure whether it's really a problem in practice. Using this as a monitor object exposes your monitor to other parts of the code. Something else can accidentally synchronize on the same object, and create additional, unwanted locking. If you do use synchronized(this) in a class, you ought to document it, so that you'll know to be careful not to use the class as a monitor object elsewhere in the code.
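The usual way to avoid exposing the monitor is to lock on a private object that nothing outside the class can reach. Here is a minimal sketch of that idea, assuming Java 8 or later; the class PrivateLockCounter and its demo() method are hypothetical. The demonstration has some outside code synchronize on the counter itself, and shows that increment() is not held up by it.

```java
// Hypothetical class, not one of the article's examples
class PrivateLockCounter
  {
  // Private, so no code outside this class can synchronize on it
  private final Object lock = new Object();
  private int count = 0;

  public void increment()
    {
    synchronized (lock) { count++; }
    }

  public int get()
    {
    synchronized (lock) { return count; }
    }

  // Holds the counter's own monitor while another thread calls
  // increment(), then returns the count
  static int demo()
    {
    final PrivateLockCounter c = new PrivateLockCounter();
    Thread t = new Thread (() -> c.increment());
    synchronized (c) // Unrelated code locking the instance
      {
      t.start();
      try { t.join (2000); }
      catch (InterruptedException e) { throw new IllegalStateException (e); }
      }
    return c.get();
    }

  public static void main (String[] args)
    {
    System.out.println (demo()); // prints 1
    }
  }
```

Had increment() used synchronized(this), the second thread would have been blocked until the synchronized (c) block ended, and the join would have timed out with the count still at zero.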

In a sense, using a synchronized method is better in that regard, even though it has the same limitation, because at least it's self-documenting.

Closing remarks

In this article I explained the use of monitor objects for creating locks -- either to control access to the monitor object itself, or just to define non-concurrent sections in the code. The monitor object can be this, but this technique needs to be used with care.

Whatever kind of lock you use, you should be aware of, and document, the boundaries of the lock. It's all too easy to do too much under the control of a lock, and end up with a bottleneck.
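To illustrate what I mean by the boundaries of a lock, here is a sketch that contrasts a broad lock with a narrow one, assuming Java 8 or later. Everything in it is invented for the purpose: the class NarrowLock, the slowCompute() method that stands in for some slow operation touching no shared state, and the results list that stands in for the shared state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, not one of the article's examples
class NarrowLock
  {
  private final Object lock = new Object();
  private final List<String> results = new ArrayList<>(); // Guarded by lock

  // Stand-in for slow work that touches no shared state
  private static String slowCompute (int n)
    {
    try { Thread.sleep (20); } catch (InterruptedException e) {}
    return "result-" + n;
    }

  // Too broad: the slow call runs under the lock, so callers
  // can only proceed one at a time
  void broad (int n)
    {
    synchronized (lock)
      {
      results.add (slowCompute (n));
      }
    }

  // Narrower: only the update of shared state is under the lock
  void narrow (int n)
    {
    String r = slowCompute (n);
    synchronized (lock)
      {
      results.add (r);
      }
    }

  int size()
    {
    synchronized (lock) { return results.size(); }
    }

  // Runs four threads through narrow() and returns the result count
  static int demo()
    {
    final NarrowLock nl = new NarrowLock();
    Thread[] threads = new Thread[4];
    for (int i = 0; i < threads.length; i++)
      {
      final int n = i;
      threads[i] = new Thread (() -> nl.narrow (n));
      threads[i].start();
      }
    try { for (Thread t : threads) t.join(); }
    catch (InterruptedException e) { throw new IllegalStateException (e); }
    return nl.size();
    }

  public static void main (String[] args)
    {
    System.out.println (demo()); // prints 4
    }
  }
```

Both methods produce the same results. The difference is that with broad() the four slow calls happen one after another, while with narrow() they can overlap and only the brief list update is serialized.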

The kind of synchronization in this article is easy to use, and leaves relatively clear traces of itself in a thread dump -- so it's comparatively easy to troubleshoot.

In later articles I'll describe some more sophisticated forms of concurrency management.