Notes on Virtual Threads and Clojure

Have you heard the news? The Virtual Threads implementation landed in JDK 19 as a preview feature! Are you excited? No? You should be! It's an amazing addition to the Java platform.

Note that this article discusses a preview version of the software. Take it as inspiration, not as something set in stone!

Intro to Project Loom and Virtual Threads

Virtual Threads are the most significant feature of the so-called Project Loom.

Project Loom was launched in 2017 by Ron Pressler and his team at Oracle. The main goal of the project was to extend the capabilities of the Java Virtual Machine to address the complexity of writing highly concurrent and scalable software.

There is more to Project Loom than just Virtual Threads. The project wiki specifically mentions delimited continuations and tail-call elimination. But it's fair to say that Virtual Threads are the most significant addition to the Java platform in terms of user experience and productivity.

To stay focused on the most practical matters, I don't want to dive deeper into delimited continuations and tail-call elimination. It's fair to point out, though, that delimited continuations seem to be quite important for the introduction of Virtual Threads to the Java platform.

So what are Virtual Threads, and why are they so groundbreaking that they deserve a post of their own?

Traditionally, JVM threads were built around OS threads. This fact also determines their major properties:

  1. A single JVM thread was mapped to a single OS thread
  2. Blocking (waiting) on a thread effectively wasted that thread, since it could not do other work in the meantime
  3. Managing threads on the JVM was costly. Each thread can easily use megabytes of additional memory, so spawning many of them is not wise.

These limitations are mitigated by introducing Virtual Threads. They no longer map one-to-one to OS threads. A single OS thread can host many thousands of Virtual Threads, or more, without worries about blocking or excessive memory demands. This requires changes to the implementation of the JVM and the standard library to allow effective scheduling of Virtual Threads.
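To get a feel for how cheap blocking becomes, here is a minimal sketch in Clojure. It assumes a JDK where Virtual Threads are available (JDK 19 started with --enable-preview, or a later release). The function name run-sleepers is made up for this illustration.

```clojure
(import '(java.util.concurrent.atomic AtomicInteger))

(defn run-sleepers
  "Starts n Virtual Threads that each block for 100 ms, waits for all of
  them, and returns how many of them finished."
  [n]
  (let [finished (AtomicInteger. 0)
        threads  (mapv (fn [_]
                         (Thread/startVirtualThread
                           (fn []
                             ;; blocking parks only the Virtual Thread
                             (Thread/sleep 100)
                             (.incrementAndGet finished))))
                       (range n))]
    ;; wait for every Virtual Thread to terminate
    (doseq [^Thread t threads]
      (.join t))
    (.get finished)))
```

Calling (run-sleepers 10000) starts ten thousand blocked threads at once, something that would be a questionable idea with one OS thread per task.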

Virtual Threads also improve the situation where the limitations of OS threads were worked around with more or less sophisticated thread pools. Experienced developers know that thread pools (of OS threads) also have significant downsides if not constrained properly.

A Virtual Thread is represented by the class java.lang.VirtualThread, which extends java.lang.Thread. This follows the Liskov substitution principle and allows us to introduce Virtual Threads into existing codebases easily.
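The substitutability can be checked directly from Clojure. This is a small sketch, again assuming a JDK with Virtual Threads enabled; describe-virtual-thread is a hypothetical helper name.

```clojure
(defn describe-virtual-thread
  "Starts a trivial Virtual Thread, waits for it, and reports how it
  relates to java.lang.Thread."
  []
  (let [vt (Thread/startVirtualThread (fn [] nil))]
    (.join vt)
    {:thread?  (instance? Thread vt) ; usable wherever a Thread is expected
     :virtual? (.isVirtual vt)}))
```

Both values in the returned map should be true, which is what lets existing Thread-based code accept Virtual Threads unchanged.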

Clojure and Threads

The Clojure documentation clearly states that Clojure is designed to work well with the Java thread system. Clojure function instances implement java.util.concurrent.Callable and java.lang.Runnable, so they work naturally with the Executor framework.

The most primitive way to run something concurrently is to launch it in a new thread like this:
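As a quick illustration of that interoperability, the following sketch hands a plain Clojure function to a standard Java executor. It works on any JDK and does not need Virtual Threads; call-on-executor is a name invented for this example.

```clojure
(import '(java.util.concurrent Callable Executors ExecutorService Future))

(defn call-on-executor
  "Submits the zero-argument function f to a single-threaded executor
  and returns the value it produces."
  [f]
  (let [^ExecutorService executor (Executors/newSingleThreadExecutor)]
    (try
      ;; the hint selects the submit(Callable) overload
      (let [^Future fut (.submit executor ^Callable f)]
        (.get fut))
      (finally
        (.shutdown executor)))))
```

For example, (call-on-executor (fn [] 42)) returns 42, and (instance? Callable inc) is true for any Clojure function.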

(.start (Thread. #(println "Hello world!")))

Unsurprisingly, there is also an API call for launching a Virtual Thread. On JDK 19 it requires starting the JVM with the --enable-preview flag (or using a Loom build).

(Thread/startVirtualThread #(println "Hello world!"))

Nice! However, this is barely useful. We want concurrent processes to compose and coordinate. Clojure concurrency offers two essential mechanisms:

  • Agents
  • Futures

Let's revisit those in detail and see how we can spice them up with Loom's Virtual Threads.

Agents

Agents manage independent state. Their state can be changed only by submitting an action. Actions are ordinary functions that take the current state as a parameter and return a new state. Actions are dispatched using send, send-off, or send-via, which return immediately without waiting for completion. The action runs asynchronously on a thread-pool thread. Only one action per agent happens at a time.

Agents are nice because they come with the following properties:

  • their state is always available to readers without blocking, via (deref an-agent) or the @an-agent shortcut
  • they can be coordinated using (await an-agent)
  • any dispatches made during the action are held until after the state of the agent has changed
  • agents coordinate with transactions - any dispatches made during a transaction are held until it commits

;; construct a new agent
(def a-counter (agent 0))

;; send it a function
(send a-counter inc)

;; wait until the dispatched action has been applied
(await a-counter)

;; reveal the state
@a-counter
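The "one action at a time" guarantee can be observed with a sketch like the one below, where several threads send to the same agent at once. The function name concurrent-increments and its parameters are invented for this example; it works on any JDK.

```clojure
(defn concurrent-increments
  "Sends inc to a single agent from several threads at once and returns
  the final state of the agent."
  [senders per-sender]
  (let [counter (agent 0)
        workers (mapv (fn [_]
                        (future
                          (dotimes [_ per-sender]
                            (send counter inc))))
                      (range senders))]
    ;; make sure every sender has dispatched all of its actions
    (run! deref workers)
    ;; then wait until the agent has processed them
    (await counter)
    @counter))
```

Because the agent applies actions one by one, (concurrent-increments 4 1000) should yield 4000 with no lost updates. In a script, call (shutdown-agents) at the end so the agent thread pools do not keep the JVM alive.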

Spicing up Agents

The agent dispatching functions send and send-off submit actions to default executors.

These executors are held in static fields of clojure.lang.Agent.

  • send uses clojure.lang.Agent/pooledExecutor
  • send-off uses clojure.lang.Agent/soloExecutor

Both executors work by default with heavy OS threads. Even though they are good defaults, we can sneak in some goodies. Loom comes with a new executor service, which you can easily create using the static method Executors.newThreadPerTaskExecutor. This new executor is represented by the ThreadPerTaskExecutor class. We can replace the default pooledExecutor with this new one.

(ns example
  (:import (java.util.concurrent Executors)))

;; Let's first define a factory that helps with spawning new Virtual Threads.
;; Threads are named <prefix>0, <prefix>1, ... in creation order.
(defn thread-factory [prefix]
  (-> (Thread/ofVirtual)
      (.name prefix 0)
      (.factory)))

;; Let's swap the default executor with the new one
(set-agent-send-executor!
  (Executors/newThreadPerTaskExecutor
    (thread-factory "clojure-agent-send-pool-")))

;; This code is going to be executed using Virtual Threads under the hood
(def a-counter (agent 0))
(send a-counter inc)
(await a-counter)
@a-counter

The same applies to the executor for the send-off dispatching function.

(set-agent-send-off-executor!
  (Executors/newThreadPerTaskExecutor
    (thread-factory "clojure-agent-send-off-pool-")))

If you want to retain more control, just use send-via, where the executor is passed as a parameter:

;; Define an executor which just produces a new virtual thread for every task
(def unbounded-executor (Executors/newThreadPerTaskExecutor (thread-factory "unbounded-pool-")))

(send-via unbounded-executor a-counter dec)
(await a-counter)
@a-counter
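To convince ourselves that an action really runs on a Virtual Thread, we can let the action report on its own thread. This is a sketch that assumes a JDK with Virtual Threads enabled; action-thread-info and the "probe-" prefix are hypothetical names.

```clojure
(import '(java.util.concurrent Executors ExecutorService))

(defn action-thread-info
  "Runs one agent action via a virtual-thread-per-task executor and
  returns the name and kind of the thread the action ran on."
  []
  (let [^ExecutorService executor
        (Executors/newThreadPerTaskExecutor
          (-> (Thread/ofVirtual) (.name "probe-" 0) (.factory)))
        probe (agent nil)]
    (try
      ;; the action replaces the agent state with a description of its thread
      (send-via executor probe
                (fn [_state]
                  (let [t (Thread/currentThread)]
                    {:name (.getName t) :virtual? (.isVirtual t)})))
      (await probe)
      @probe
      (finally
        (.close executor)))))
```

The returned map should have :virtual? set to true and a :name starting with "probe-".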

This is all you need to transparently work with Agents under the new concurrency model. Clojure seems to be well prepared for the future! Futures...

Futures

A future represents a value that is going to be available at an indeterminate time in the future. It can be captured and passed around as you want. In Java, futures are represented by objects implementing the Future<V> interface from the java.util.concurrent package. The evolution of this interface's implementations can be traced through Java's standard library:

  • Java 1.5 introduced FutureTask<V>
  • Java 1.7 introduced ForkJoinTask<V>
  • As of Java 1.8 there is CompletableFuture<V>

Clojure contains a bunch of functions in its core library to work with futures. This is the most basic example that can demonstrate how to utilize futures in Clojure programs:

@(future (println "Before")
         (java.lang.Thread/sleep 2000)
         (println "After 2000 ms")
         2000)

As we can see, Clojure futures are nice. Just dereference them, similarly to agents or atoms, with (deref a-future) or the @a-future shortcut. Dereferencing blocks execution until the future's value is resolved and thus available. Unfortunately, that means that a whole OS thread is blocked.
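The blocking behaviour of deref can be made visible with its timeout arity, as in this small sketch. The name slow-answer and the timings are arbitrary choices for the example.

```clojure
(defn slow-answer
  "Derefs a slow future twice: once with a short timeout, once blocking."
  []
  (let [f     (future (Thread/sleep 200) 42)
        ;; gives up after 10 ms and returns the fallback value instead
        early (deref f 10 :not-yet)
        ;; blocks the calling thread until the value is available
        final @f]
    {:early early :final final :realized? (realized? f)}))
```

The first deref returns :not-yet because the future is still sleeping, while the second one blocks the caller until 42 is available.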

So what can we do to make it cheaper? Of course, Loom has our back with much cheaper Virtual Threads. The future macro uses the future-call function under the hood, and that function submits its work to clojure.lang.Agent/soloExecutor. This means that replacing this executor, as we did for send-off above, is all we need to do.

There is also the Promesa library, whose constructs for dealing with futures go way beyond the simple futures of the Clojure core library. Some Promesa functions have arities that take an executor as a parameter and use it to schedule the computation. Passing a ThreadPerTaskExecutor mitigates the trouble mentioned under the Promesa execution model.
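Put together, the idea could look like the following sketch. It assumes a JDK with Virtual Threads enabled, and note that it changes a global setting for all futures and send-off calls in the process. The function names are invented for this illustration.

```clojure
(import '(java.util.concurrent Executors))

(defn use-virtual-threads-for-futures!
  "Replaces the executor behind future and send-off with one that
  starts a new Virtual Thread per task."
  []
  (set-agent-send-off-executor!
    (Executors/newThreadPerTaskExecutor
      (-> (Thread/ofVirtual)
          (.name "clojure-agent-send-off-pool-" 0)
          (.factory)))))

(defn future-thread-info
  "Reports which thread the body of a future runs on."
  []
  @(future
     (let [t (Thread/currentThread)]
       {:name (.getName t) :virtual? (.isVirtual t)})))
```

After calling (use-virtual-threads-for-futures!), (future-thread-info) should report :virtual? true, and a blocking deref inside such a future no longer ties up an OS thread.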

Introducing Structured Concurrency

Structured concurrency is a concurrency programming model described in the following line:

When a flow of execution splits into multiple concurrent flows, they rejoin in the same code block

That means we have to be able to bind a thread's lifetime to a scope. Such scopes should naturally form parent-child relationships, and there have to be programming constructs around this hierarchy.

Let's examine this simplistic example:

(defn run-concurrently []
  (let [executor (Executors/newThreadPerTaskExecutor (thread-factory "perfectly-scoped-pool-"))]
    (try
      ;; every task gets its own Virtual Thread; the tasks run concurrently
      (.submit executor ^Callable #(identity 2000))
      (.submit executor ^Callable #(prn "Starting a long running operation"))
      (.submit executor ^Callable #(Thread/sleep 1000))
      (.submit executor ^Callable #(prn "Done."))
      ;; return value of the function
      4
      ;; close blocks until all submitted tasks have finished
      (finally (.close executor)))))

(run-concurrently)

Here the scope is a function that defines an executor against which tasks are submitted. None of the Virtual Threads outlives the scope of the function, because the ThreadPerTaskExecutor.close method waits for the submitted tasks to finish and cleans up after them. The caller does not need to know anything about the level of concurrency inside such a function. This also composes recursively (the parent-child relationship), as other functions following the same structure can be called inside the body. It's deterministic and transparent.

Things to Avoid

These are less relevant to Clojure developers, as most of us do not work at such a low level, but I'd like to mention them anyway.
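The same idea can be wrapped into a reusable helper that also collects the results. This is only a sketch under the assumption that Virtual Threads are enabled; run-all is a hypothetical name, and with-open works here because ExecutorService is AutoCloseable on these JDKs.

```clojure
(import '(java.util.concurrent Callable Executors ExecutorService Future))

(defn run-all
  "Runs each zero-argument function in tasks on its own Virtual Thread
  and returns their results in order. No thread outlives this call."
  [tasks]
  (let [futures (with-open [^ExecutorService executor
                            (Executors/newVirtualThreadPerTaskExecutor)]
                  ;; mapv is eager, so every task is submitted before close
                  (mapv (fn [task] (.submit executor ^Callable task)) tasks))]
    ;; the executor is closed here, so every future is already complete
    (mapv (fn [^Future f] (.get f)) futures)))
```

Scopes nest naturally: a call such as (run-all [#(run-all [(fn [] 1) (fn [] 2)]) (fn [] 3)]) evaluates to [[1 2] 3], with the inner scope fully contained in the outer one.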

  1. Avoid ThreadLocal and InheritableThreadLocal. They are supported, but they defeat the cost advantages that come with Virtual Threads
  2. Avoid synchronized methods and blocks around blocking operations. Use java.util.concurrent.locks.ReentrantLock instead
  3. Avoid using thread pools to limit access to expensive resources. Use java.util.concurrent.Semaphore instead
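The third point can be sketched as follows: instead of a small pool, every task gets its own Virtual Thread and a Semaphore limits how many of them touch the expensive resource at once. The function name, the 10 ms of simulated work, and the bookkeeping atoms are all assumptions made for this illustration.

```clojure
(import '(java.util.concurrent Executors ExecutorService Semaphore))

(defn max-observed-concurrency
  "Runs task-count tasks on Virtual Threads, guards the simulated
  resource with a semaphore, and returns the highest number of tasks
  that were inside the guarded section at the same time."
  [permits task-count]
  (let [semaphore (Semaphore. (int permits))
        active    (atom 0)
        peak      (atom 0)]
    (with-open [^ExecutorService executor
                (Executors/newVirtualThreadPerTaskExecutor)]
      (dotimes [_ task-count]
        (.execute executor
                  (fn []
                    (.acquire semaphore)
                    (try
                      ;; record how many tasks currently hold a permit
                      (swap! peak max (swap! active inc))
                      ;; stand-in for work on the expensive resource
                      (Thread/sleep 10)
                      (finally
                        (swap! active dec)
                        (.release semaphore)))))))
    ;; with-open has closed the executor, so all tasks are done
    @peak))
```

With (max-observed-concurrency 3 50), the result never exceeds 3 even though 50 Virtual Threads are started at once.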

Clojure itself contains very few instances of ThreadLocal:

  • Agent.java
  • LockingTransaction.java
  • Var.java
  • Instant.clj

Are they a problem? Probably not. My personal recommendation is to use a structured concurrency approach similar to run-concurrently above, so that Virtual Threads do not live long and unused resources are garbage collected as soon as possible.

At some point the JDK may also receive Scoped Variables, which could substitute for expensive ThreadLocals. But that is still a long way off.

Conclusion

  • Virtual Threads are an important and extremely useful addition to the Java platform
  • Clojure's concurrency mechanisms can be set up to use Virtual Threads effectively today! No modifications to the Clojure codebase appear to be necessary
  • Structured concurrency will become a more important mechanism for dealing with concurrent processes once Virtual Threads are released
  • Not everything is set in stone. Some mechanisms may be revisited or adjusted

I hope this article sparked your intellectual curiosity and provided some interesting information.

References

  1. YouTube - Practical Advice
  2. Ron Pressler - Loom: Bringing Lightweight Threads and Delimited Continuations to the JVM
  3. JEP 425
  4. Twitter thread on JEP 425
  5. GitHub commit - JDK 19 Virtual Threads