Clojure the Essential Reference

The aim of “Clojure: The Essential Reference” by Renzo Borgatti is to provide the most detailed guide possible to the Clojure standard library. As well as describing how each function works, the book covers the function's theoretical and empirical performance and gives a worked example of its use.

Near the start of the book’s development, Renzo asked me if I would like to contribute a chapter and gave me a free choice on the topic I would like to cover. I was happy to be involved in the project.

A key aspect of Clojure is that it is a Lisp that runs on the Java Virtual Machine (‘JVM’). Running on the JVM means that Clojure programs can natively call all the existing production-hardened libraries that have been developed for Java over the last 20 years. Nearly every large organisation runs programs on the JVM in production, and Clojure can tap into all this existing knowledge and infrastructure. At the same time, as a Lisp, Clojure provides the development speed and power for which Lisp is known.

But being a Lisp on the JVM was not in itself what made Clojure so revolutionary. What made it revolutionary was its persistent data structures. A persistent data structure can be "modified", but the modification produces a new version: existing references continue to point to the unmodified version, so from the point of view of those references the data structure is immutable.
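A minimal sketch of this behaviour, using only core Clojure functions (`conj` and `assoc`). The names `v1`, `v2`, `m1` and `m2` are hypothetical and chosen purely for illustration:

```clojure
;; Illustrative sketch only; v1, v2, m1 and m2 are hypothetical names.

(def v1 [1 2 3])          ; a persistent vector
(def v2 (conj v1 4))      ; "modifying" v1 returns a new vector

v1                        ; the existing reference still sees [1 2 3]
v2                        ; the new reference sees [1 2 3 4]

(def m1 {:a 1})           ; a persistent hash map
(def m2 (assoc m1 :b 2))  ; adding a key returns a new map

m1                        ; still contains only :a
m2                        ; contains both :a and :b
```

Any code holding `v1` or `m1` never observes the later additions, which is why no defensive copying or locking is needed to share them.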

A programming language that eliminates side effects, especially mutating data, has long been the holy grail. If data can no longer be mutated, a vast amount of the design and programming effort that goes into protecting a program from the effects of mutation can be eliminated. Further, the development of concurrent programs is greatly simplified.

In Erlang it is not possible to mutate data, but this is achieved at a cost. All Erlang programs have to follow the actor model of lightweight processes communicating by sending messages. Further, prior to Clojure, Erlang did not have support for core data structures such as hash maps.

Haskell also avoids functions having side effects, but achieves this through monadic composition. And although technically functions in Haskell are referentially transparent, monadic composition adds complexity and you can often end up with programs that are similar to imperative programs.

Clojure comes with very practical and efficient implementations of the two most commonly used data structures: a resizable array (called vector in Clojure) and a hash map (called hash-map). These persistent data structures were created by Rich Hickey and build on and extend the work of Phil Bagwell in his paper “Ideal Hash Trees” (1). With these persistent data structures, Clojure has spawned a new programming paradigm called “data-oriented programming”. Versions of Clojure’s persistent data structures have also spread to other programming languages, including Haskell, Scala, Erlang and JavaScript.

So in 2016, I chose to write the chapter on the Clojure vector. It was gratifying to later read Rich Hickey, writing in 2020, refer to creating the Clojure hash-map and vector as “the breakthrough moment for Clojure… only after this did I feel Clojure could be practical.” (2)

In the chapter I make strong claims about the theoretical and empirical performance of the Clojure vector, which I needed to be able to back up. Unfortunately, I had no authoritative source I could reference. For the theoretical performance, I analysed the Clojure source code. For the empirical analysis, however, I ran into problems using the Criterium library and ended up writing the Keirin library.

Clojure: The Essential Reference is available from Manning here.

(1) Ideal Hash Trees by Phil Bagwell. Available from here.

(2) A History of Clojure by Rich Hickey. Proc. ACM Program. Lang., Vol. 4, No. HOPL, Article 71. Publication date: June 2020. Available from here.
