Week 6: Clojure

After the past two posts on different topics, I'm returning to my series on Seven Languages in Seven Weeks by Bruce Tate. The programming language covered in this week's chapter is Clojure, a dialect of Lisp.


Languages such as Clojure that are in the Lisp family have a sort of mystique due to their sparse syntax and proliferation of parentheses. I had a lot of fun starting to get the hang of it in this week's chapter. Here is how Bruce Tate describes this language:

Clojure is a functional language on the [Java virtual machine]. Like Scala and Erlang, this Lisp dialect is a functional language but not a pure functional language. It allows limited side effects. Unlike other Lisp dialects, Clojure adds a little syntax, preferring braces for maps and brackets for vectors. You can use commas for whitespace and omit parentheses in some places.

 

The code is dense but expressive. That's an ideal combination.

There are some other quotes (about lists, prefix notation, and macros) from Seven Languages in Seven Weeks that I'd like to share before getting into my own notes:

Lisp is a language of lists. A function call uses the first list element as the function and the rest as the arguments.
Lisp uses its own data structures to express code.

 

[Prefix notation] requires a developer to comprehend code from the inside out, rather than outside in.

 

Macro expansion is perhaps the most powerful feature of Lisp, and few languages can do it. The secret sauce is the expression of data as code, not just a string.

I'll expand on these concepts in my notes and examples below.

Two key ideas about Clojure are that:

  1. Everything is a list
  2. All lists act as functions

There is some simplification here, but it gives a general outline of the language's structure.

Here are my notes on Clojure (in addition to the book, these come from the first reference link near the end of this post):

  • Comments start with a ;

  • Lists go in parentheses (so everything is encompassed by parentheses) with items separated by whitespace: (item0 item1 ...). Commas can be added for clarity since they are treated as whitespace. Lists/functions can also be broken over multiple lines to improve readability.

  • The initial item in a list is assumed to be a function and the remaining items its arguments. To make a literal list instead, use the list function or precede the open parenthesis with a single quote (i.e. '(...)). eval will evaluate a quoted list as code.

  • = is used to test for equality. Create a variable with (def variablename value). let can also be used for binding values to variables (with destructuring/pattern matching), but only within a temporary context (until the parentheses around the let are closed).

  • str turns its arguments into a string. Strings are surrounded by double-quotes ("). A letter preceded by a backslash is a single character, but within a string the same combination acts as an escape sequence (e.g. \n is the character n, but "\n" is a newline). println prints a line.

  • The use of the first item in a list as the function even applies to math. That is, Clojure uses prefix/Polish notation instead of the more familiar infix notation. -> and ->> (they work slightly differently, but I won't go into detail here) can be used to change the order for convenience/readability, however.

  • To add to lists (or vectors), the functions cons, conj, concat can be used depending on the behaviour you want.

  • first, rest, last, etc. can be used to get elements from a specified position in a list or vector.

  • Lists, vectors, maps, and sets are the main data structures. Collectively, they are known as collections and sequences (see here for a discussion of the differences between these terms). It is customary to use lists for code and vectors for data. But here is a key point about Lisp dialects: Code is data. I've already mentioned that lists are assumed to be functions. Vectors, maps, and sets can also act as functions. They are also items (even if the only item) in a list, since the line they are on will need to be enclosed by parentheses like anything else. Hopefully the next few points will clarify these concepts:

    • Vectors go in square brackets. A vector used as a function takes an index (0 is the first element) as an argument and returns the element at that index.

    • Maps go in curly braces and comprise key-value pairs. Keywords begin with :. A map can be used as a function with a key as the argument (it also works to use the key as the function and the map as the argument) to return the associated value. merge (or merge-with to instruct how to deal with duplicate keys) combines two maps, assoc adds a new key-value pair, and dissoc drops a key from a map.

    • Sets go in curly braces preceded by a hash (i.e. #{ ... }). When used as a function, a set tests whether its argument is a member, returning the argument if it is and nil if it is not. A set can be created from a list or vector by calling the set function on it; this function reduces the list/vector to unique entries. Functions that apply to sets include clojure.set/union, clojure.set/intersection, clojure.set/difference. Unlike most of the other functions I'm discussing here, these are not part of clojure.core; they live in the clojure.set namespace.

  • Predicates are tests that return a boolean true/false result. Such functions are supposed to end with a question mark (e.g. odd?, even?). They can be used with functions such as filter, some, every?.

  • map applies a function to each element in a list or vector. reduce shrinks a list or vector to a single result by applying the specified operation between each element (e.g. reduce + adds up all of the elements).

  • "Destructuring" is a useful feature in Clojure. It involves matching the arguments to a function or a let statement to an expected order (using _ to hold a position whose contents don't matter), enclosed in square brackets. It can also be used to match the value that corresponds to a certain key in a map.

  • Functions can be defined by (defn functionname [parameters] statement). If desired, documentation for the function can be added as a string between the function name and parameters; it can be retrieved by calling doc on the function (which can also be called on built-in functions). fn creates an anonymous function (these can be used in the context of map or filter, for example). Anonymous functions can be defined more succinctly by preceding a list with a hash (i.e. #(...)) and using % as a stand-in for the items passed to the function.

  • One of the really useful properties of Clojure is that sequences are evaluated lazily (i.e. each element is only calculated when it is needed). This means that infinite sequences can be set up without problems. range and repeat create infinite or finite sequences depending on how many arguments they are given, while cycle and iterate always create infinite ones. iterate, for example, is given a function and initial value. When working with infinite sequences, (take n sequence) selects the first n items, (drop n sequence) drops the first n items, and (nth sequence n) returns only the item at index n (counting from 0).

  • Macros are another powerful—but difficult—feature of Clojure. Macros basically write customized lists from the input they're given. Because lists are functions, this allows macros to work as meta-functions. Words or symbols that would normally have an effect need to be quoted (i.e. preceded by a single quote) so that they will appear literally when the new list is returned by the macro. Use defmacro to define a macro and macroexpand to show what a macro will return given certain inputs (this is useful, since it can be complicated to make sure everything is quoted at the right level and expands with the proper configuration of parentheses).
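
A few of the notes above can be checked directly at the REPL. This is only a minimal sketch: the names scores and second-item are my own illustrations rather than anything from the book, and the comments show what I expect each form to evaluate to.

```clojure
;; scores and second-item are illustrative names, not from the book.

;; Collections used as functions
(def scores {:alice 90, :bob 72})   ; commas are treated as whitespace
(scores :alice)            ;; => 90    map as function, key as argument
(:bob scores)              ;; => 72    keyword as function, map as argument
(["a" "b" "c"] 1)          ;; => "b"   vector indexed from 0
(#{1 2 3} 2)               ;; => 2     set returns the member it found
(#{1 2 3} 5)               ;; => nil   not a member

;; conj adds where it is cheapest; cons always prepends and returns a sequence
(conj '(1 2 3) 0)          ;; => (0 1 2 3)
(conj [1 2 3] 4)           ;; => [1 2 3 4]
(cons 0 [1 2 3])           ;; => (0 1 2 3)

;; Destructuring in let: _ skips a position, :keys pulls a value out of a map
(let [[_ second-item] [10 20 30]
      {:keys [alice]} scores]
  (+ second-item alice))   ;; => 110
```

The difference between conj on a list and on a vector is the kind of behaviour the note on cons, conj, and concat is hinting at.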

Here are some examples of Clojure code. Compared to some of the other languages in the book, a lot of them are pretty short because the language is so concise.

I liked this example with for from Seven Languages in Seven Weeks because it demonstrates filtering using :when:

user=> (def colours ["red" "green" "blue"])
#'user/colours
user=> (def toys ["blocks" "cars"])
#'user/toys
user=> (for [x colours] (str "I like " x))
("I like red" "I like green" "I like blue")
user=> (for [x colours, y toys] (str "I like " x " " y))
("I like red blocks" "I like red cars" "I like green blocks" "I like green cars" "I like blue blocks" "I like blue cars")
user=> (for [x colours, y toys, :when (not (= x "green"))] (str "I like " x " " y))
("I like red blocks" "I like red cars" "I like blue blocks" "I like blue cars")

Observe that every line is enclosed in parentheses, the first item in each set of parentheses is a function, and the evaluation starts inside the deepest nested layer (e.g. the :when filter finds values of x equal to "green" then applies a logical not).

Here is a compact example of calculating the sum of squares of a vector using an anonymous function (#(* % %)), which multiplies each element (%) passed to it by itself, then adding up the squared numbers with reduce +:

user=> (reduce + (map #(* % %) [1 2 3 4 5]))
55
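
For comparison, the same calculation can be written with the thread-last macro ->>, which passes each result along as the last argument of the next form so that the steps read top to bottom instead of inside out. This is just a sketch of an alternative layout, not something from the book:

```clojure
;; Equivalent to (reduce + (map #(* % %) [1 2 3 4 5]))
(->> [1 2 3 4 5]
     (map #(* % %))   ; square each element
     (reduce +))      ; add up the squares
;; => 55
```

The thread-first macro -> works the same way except that it inserts the value as the first argument of each form.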

My next example uses lazy evaluation of infinite sequences. It calculates results for the well-known problem of how far a stack of dominoes can be made to hang over an edge (the answer for n dominoes is half of the sum of the first n terms of the harmonic series, the sequence of fractions where the denominator increases by 1 each time):

user=> (defn harmonic [n] (/ 1 n))
#'user/harmonic
user=> (* (/ 1 2) (reduce + (take 5 (map harmonic (iterate inc 1)))))
137/120
user=> (* (/ 1 2) (reduce + (take 8 (map harmonic (iterate inc 1)))))
761/560

First, I define a function for taking the reciprocal. On the next lines, starting from the inside, I use iterate on the inc function (which increments a value by 1 each time) to make an infinite sequence of integers. Next, I use map and the harmonic function I defined to turn each integer into its reciprocal. With take and reduce I add up the first 5 or 8 elements. Finally, I multiply the result by one half. The Wikipedia article on this problem has results for numbers up to 30 in a table and my answers for 5 and 8 check out. Observe that Clojure will actually do calculations with fractions; this is a nice feature that avoids losing precision.
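
Since the only thing that changes between the two calls is the number of dominoes, the expression could be wrapped in a function. Here is a minimal sketch; overhang is a name I made up for illustration, and the string after the function name is the optional documentation mentioned in my notes:

```clojure
(defn harmonic [n] (/ 1 n))   ; reciprocal, as defined above

;; overhang is a hypothetical helper name, not from the book
(defn overhang
  "Half the sum of the first n terms of the harmonic series."
  [n]
  (* 1/2 (reduce + (take n (map harmonic (iterate inc 1))))))

(overhang 5)                ;; => 137/120
(overhang 8)                ;; => 761/560
(map overhang [1 2 3 4])    ;; => (1/2 3/4 11/12 25/24)
```

Because the sequence is lazy, only the first n reciprocals are ever calculated, however large n is.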

My final example is a simple macro that allows calculations to be written in Reverse Polish notation, the opposite order from how Clojure normally works:

user=> (defmacro rpn [[a b c]] (list c a b))
#'user/rpn
user=> (macroexpand '(rpn (7 3 +)))
(+ 7 3)
user=> (rpn (7 3 +))
10

The macro definition relies on destructuring. It looks for an argument containing 3 entries, then writes a list with the third one (c) in the first position. When the macro is expanded, it looks like a normal Clojure calculation, so running the macro gives the correct result.
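
The RPN macro never needs quoting because it only rearranges the items it was given. A macro that introduces symbols of its own does need it. The following is a small sketch of that case; unless is an illustrative name here (for real code, clojure.core already provides when-not):

```clojure
;; unless is an illustrative macro, defined here only to show quoting
(defmacro unless [test body]
  ;; 'if and 'not are quoted so they appear literally in the returned list
  (list 'if (list 'not test) body))

(macroexpand '(unless false "ran"))   ;; => (if (not false) "ran")
(unless false "ran")                  ;; => "ran"
(unless true "ran")                   ;; => nil
```

Without the quotes, Clojure would try to evaluate if and not while building the list instead of placing them in it.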

Finally, here are some links and references about Clojure:

  • This is a great overview (some of my notes came from here)
  • If you want to understand macros better, this is a good place to start (my RPN macro was inspired by their "infix" example).
  • Clojure's concise syntax uses a lot of single characters (like the hash symbol) that are hard to google so this is a handy guide
  • Here is a cheatsheet for the language
  • If you want to try Clojure, here is an online terminal you can use (it has other languages available too)

The final language in Seven Languages in Seven Weeks is Haskell. I plan to cover that chapter in my next post, along with some concluding thoughts.
