
[Translation - Not Recommended] Learning Clojure - Syntax

Original from official website: https://www.clojure.org/guides/learn/syntax

Literals

Below are some examples of literal representations of common primitives in Clojure. All of these literals are valid Clojure expressions.

A semicolon (;) starts a comment that runs to the end of the line. Sometimes multiple semicolons are used to mark header comment sections, but this is just a convention.

Numeric Types

42        ; integer
-1.5      ; floating point
22/7      ; ratio

Integers are read as fixed-precision 64-bit integers when within range, otherwise as arbitrary precision. A trailing N can be used to force arbitrary precision. Clojure also supports Java syntax for octal (prefix 0), hexadecimal (prefix 0x), and arbitrary radix (prefix base then r) integers. Ratios are provided as their own type, combining numerator and denominator.
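As a rough sketch of the integer and ratio notations described above, the following binds each literal to a name and inspects it. The var names are illustrative only and do not come from the original guide.

```clojure
;; Each def binds an illustrative name to one literal form.
(def decimal-answer 42)        ; fixed-precision 64-bit integer
(def big-answer     42N)       ; trailing N forces arbitrary precision
(def octal-answer   052)       ; octal, prefix 0
(def hex-answer     0x2A)      ; hexadecimal, prefix 0x
(def binary-answer  2r101010)  ; arbitrary radix: base, then r, then digits
(def ratio-value    22/7)      ; ratio, kept exact rather than converted to a float

(println (class decimal-answer))  ; java.lang.Long
(println (class big-answer))      ; clojure.lang.BigInt
(println (class ratio-value))     ; clojure.lang.Ratio
(println (= decimal-answer octal-answer hex-answer binary-answer))  ; true
```

All four notations in the last line read as the same number; only the written form differs.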

Floating point values are read as double-precision 64-bit floats, or as arbitrary precision with an M suffix. Exponential notation is also supported. The special symbolic values ##Inf, ##-Inf, and ##NaN represent positive infinity, negative infinity, and "not a number" respectively.
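A minimal sketch of the floating point forms mentioned above, assuming a Clojure version that supports the ## symbolic values (1.9 or later). The var names are made up for illustration.

```clojure
(def plain-double 1.5)    ; double-precision float
(def exact-value  1.5M)   ; M suffix reads as arbitrary precision
(def exponent     1.5e3)  ; exponential notation

(println (class plain-double))                  ; java.lang.Double
(println (class exact-value))                   ; java.math.BigDecimal
(println (= exponent 1500.0))                   ; true
(println (= ##Inf Double/POSITIVE_INFINITY))    ; true
(println (= ##-Inf Double/NEGATIVE_INFINITY))   ; true
(println (Double/isNaN ##NaN))                  ; true
```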

Character Types

"hello"         ; string
\e              ; character
#"[0-9]+"       ; regular expression

Strings are enclosed in double quotes and can span multiple lines. Single characters are written with a leading backslash. There are a few special named characters: \newline, \space, \tab, etc. Unicode characters can be written as \uNNNN, or in octal as \oNNN.

Literal regular expressions are strings with a # prefix. These are compiled into java.util.regex.Pattern objects.
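The string, character, and regular expression literals above can be checked with a few calls like these. This is only a sketch; the name greeting and the sample input string are invented for the example.

```clojure
(def greeting "hello")

(println (class greeting))                  ; java.lang.String
(println (class \e))                        ; java.lang.Character
(println (= \space (first " ")))            ; true: named character
(println (= \u0041 \A))                     ; true: Unicode escape for A
(println (= \o101 \A))                      ; true: octal 101 is 65, the code for A
(println (class #"[0-9]+"))                 ; java.util.regex.Pattern
(println (re-find #"[0-9]+" "abc123def"))   ; 123
```

re-find is one of several core functions that accept a compiled pattern; it returns the first match, here the run of digits.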

Symbols and Identifiers

map             ; symbol
+               ; symbol - most punctuation allowed
clojure.core/+  ; namespaced symbol
nil             ; null value
true false      ; booleans
:alpha          ; keyword
:release/alpha  ; namespaced keyword

Symbols are composed of letters, numbers, and other punctuation, and are used to refer to something else, such as a function, value, or namespace. A symbol may optionally have a namespace, separated from the name by a forward slash.

Three special symbols are read as different types - nil is the null value, true and false are booleans.

Keywords start with a leading colon and always evaluate to themselves. They are often used as enumeration values or attribute names in Clojure.
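The namespace and name parts of symbols and keywords can be inspected directly, as in the sketch below. The leading apostrophe keeps a symbol from being evaluated; quoting is covered near the end of this guide.

```clojure
(println (keyword? :alpha))             ; true
(println (= :alpha :alpha))             ; true: a keyword evaluates to itself
(println (namespace :release/alpha))    ; release
(println (name :release/alpha))         ; alpha
(println (namespace 'clojure.core/+))   ; clojure.core
(println (name 'clojure.core/+))        ; +
(println (symbol? 'map))                ; true: quoted, so it stays a symbol
(println (fn? map))                     ; true: unquoted, it refers to a function
(println (nil? nil))                    ; true
(println (class true))                  ; java.lang.Boolean
```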

Collections

Clojure also includes literal syntax for four collection types:

'(1 2 3)      ; list
[1 2 3]       ; vector
#{1 2 3}      ; set
{:a 1, :b 2}  ; map

We'll discuss these in more detail later - for now, know that these four data structures can be used to create composite data.
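To give a feel for composite data, here is a small sketch that nests all four collection literals in one value. The map named release and its keys are hypothetical, chosen only for illustration.

```clojure
;; A map whose values are a string, a vector, a set, and a quoted list.
(def release
  {:name    "demo"
   :version [1 2 3]
   :tags    #{:alpha :beta}
   :steps   '(build test)})

(println (map? release))                ; true
(println (vector? (:version release)))  ; true
(println (set? (:tags release)))        ; true
(println (list? (:steps release)))      ; true
(println (get release :name))           ; demo
(println (count (:version release)))    ; 3
```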

Evaluation

Next we'll consider how Clojure reads and evaluates expressions.

Traditional Evaluation (Java)

[Figure: Java evaluation]

In Java, source code (.java files) is read as characters by the compiler (javac), producing bytecode (.class files) that can be loaded by the JVM.

Clojure Evaluation

[Figure: Clojure evaluation]

In Clojure, source code is read as characters by the reader. The reader can read source from .clj files or be given a series of expressions interactively. The reader produces Clojure data. Then the Clojure compiler produces bytecode for the JVM.

Two important points here:

  1. The unit of source code is a Clojure expression, not a Clojure source file. Source files are read as a series of expressions, just as if you typed those expressions interactively in the REPL.
  2. Macros are special functions that take code (as data) and emit code (as data). Can you see where a loop for macro expansion could be inserted in the evaluation model?
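Both points can be observed from code. In this sketch, read-string stands in for the reader by turning characters into Clojure data, and macroexpand-1 shows a macro receiving code as data and returning new code as data. The var name form is illustrative.

```clojure
;; Reader step: a string of characters becomes a data structure.
(def form (read-string "(+ 3 4)"))

(println (list? form))          ; true: the source text is now a list
(println (count form))          ; 3
(println (class (first form)))  ; clojure.lang.Symbol

;; Macro step: when is a macro in clojure.core; expanding it once
;; yields a new list built from the original code.
(println (macroexpand-1 '(when true :yes)))  ; (if true (do :yes))
```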

Structure and Semantics

Consider a Clojure expression.

[Figure: Structure and semantics of the expression (+ 3 4)]

This diagram illustrates the difference between syntax in green (Clojure data structures produced by the Reader) and semantics in blue (how the Clojure runtime understands this data).

Most Clojure literal forms evaluate to themselves, except for symbols and lists. Symbols are used to refer to other things and, when evaluated, return what they refer to. Lists (as shown) are evaluated as invocations.

In the diagram, (+ 3 4) is read as a list containing a symbol (+) and two numbers (3 and 4). The first element (where + is found) can be called the "function position", the place to look for the thing to invoke. Functions are the obvious thing to invoke, but there are also a few special operators known to the runtime, macros, and a handful of other invocable things.

Considering the evaluation of the above expression:

  • 3 and 4 evaluate to themselves (longs)
  • + evaluates to a function that implements +
  • Evaluating the list will invoke the + function with arguments 3 and 4
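The three steps above can be traced by hand, roughly as follows. The expression is quoted so that it can be held as data first; the name expr is illustrative.

```clojure
(def expr '(+ 3 4))  ; the list as data, not yet evaluated

(println (class (second expr)))       ; java.lang.Long: 3 evaluates to itself
(println (class (first expr)))        ; clojure.lang.Symbol: + is only a symbol so far
(println (fn? (eval (first expr))))   ; true: evaluating + yields a function
(println (eval expr))                 ; 7: evaluating the list invokes + with 3 and 4
```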

Many languages have statements and expressions, where statements have some stateful effect but don't return a value. In Clojure, everything is an expression that evaluates to a value. Some expressions (but not most) also have side effects.
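A short sketch of what "everything is an expression" means in practice: forms that would be statements in many languages return values here, and even a side-effecting call returns something. The var names are invented for the example.

```clojure
(def size-label (if (> 10 5) "big" "small"))  ; if returns the chosen branch's value
(def total (let [a 3 b 4] (+ a b)))           ; let returns the value of its body
(def print-result (println "side effect"))    ; println prints, then returns nil

(println size-label)           ; big
(println total)                ; 7
(println (nil? print-result))  ; true
```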

Now let's consider how to interactively evaluate expressions in Clojure.

Quoting to Delay Evaluation

Sometimes it's useful to pause evaluation, especially for symbols and lists. Sometimes a symbol should just be a symbol without looking up what it refers to.

user=> 'x
x

And sometimes a list should just be a list of data values (not code to evaluate).

user=> '(1 2 3)
(1 2 3)

You might see a confusing error that results from accidentally trying to evaluate a data list as code.

user=> (1 2 3)
Execution error (ClassCastException) at user/eval156 (REPL:1).
class java.lang.Long cannot be cast to class clojure.lang.IFn

For now, don't worry too much about quoting, but you'll occasionally see it in these examples to avoid evaluating symbols or lists.
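As a small illustration, the apostrophe is shorthand for the quote special form, and a quoted list is the same data you would get by building the list with a function call:

```clojure
(println (= 'x (quote x)))             ; true: 'x reads as (quote x)
(println (= '(1 2 3) (list 1 2 3)))    ; true: quoted list equals a constructed list
(println (symbol? (first '(+ 3 4))))   ; true: inside a quoted list, + stays a symbol
(println (eval '(+ 3 4)))              ; 7: evaluating the quoted list later runs it as code
```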