On Clojure

March 23, 2010

Computing with units and dimensions

Filed under: Libraries — khinsen @ 11:28 am

Many computer programs work with data that represents quantities. Examples are numerous: the age of a person, the weight of a parcel, the duration of a video clip, the distance between two cities, etc. Usually quantities are simply represented by numbers, because numbers are very easy to handle in popular programming languages. However, quantities are not numbers: two years is not the same as the number two. A quantity is defined by a magnitude (which is a number) and a unit. The same quantity can be represented by different magnitude-unit pairs. For example, one minute is the same quantity as sixty seconds. The quality being measured by a unit is called its dimension. Time, length, and weight are examples of dimensions.

There are a few good reasons to represent quantities by magnitude-unit pairs rather than by plain numbers:

  • When quantities are represented by numbers, the units become a matter of convention, written down in a comment (if at all) rather than in the program code. This makes mistakes rather likely, with possibly serious consequences: NASA’s Mars Climate Orbiter was lost in 1999 because two pieces of software involved in calculating its flight trajectory used different units (pound-force seconds in one, newton-seconds in the other).
  • With just numbers, it is not even possible to verify that a quantity passed into a function has the right dimension. With an additional unit, such a check is very easy to do.
  • The unit and dimension information provides additional documentation to the human reader, and aids in debugging.

A number of libraries for various programming languages therefore implement units, dimensions, and quantities, with the associated arithmetic and comparison operators and sometimes also mathematical functions. Clojure recently joined the crowd: the units library is available at Clojars.org and the source code is hosted by Google Code. In this post, I describe how the library works and give a few examples.

First, a simple example for illustration. As in any Clojure script, the first thing to do is to set up the namespace with all the stuff we need:

(clojure.core/use 'nstools.ns)
(ns+ unit-demo
  (:clone nstools.generic-math)
  (:from units dimension? in-units-of)
  (:require [units.si :as si]))

This looks rather complicated, so it deserves some explanation. We will want to be able to calculate with quantities, units, and dimensions, in particular do arithmetic (+ - * /) and comparisons (= < > <= >= min max) on quantities. Clojure’s built-in arithmetic and comparison functions work only on numbers, so they are not useful here. In clojure.contrib.generic, there are generic versions of these operations, meaning that they can be defined for any datatype for which they make sense. To achieve this goal, they are implemented as multimethods, which implies some bookkeeping overhead that reduces performance. In fact, it is for performance reasons that Clojure’s standard arithmetic functions are not generic.
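To make the multimethod idea concrete, here is a minimal sketch in plain Clojure. It is not the clojure.contrib.generic code: the names add and Quantity are made up for this illustration, and addition is only defined for quantities that already share a unit.

```clojure
;; Hypothetical sketch, not the clojure.contrib.generic implementation.
(defrecord Quantity [magnitude unit])

;; Dispatch on the types of both arguments. This extra dispatch step is
;; the bookkeeping overhead that plain clojure.core/+ avoids.
(defmulti add (fn [x y] [(type x) (type y)]))

;; Plain numbers: delegate to clojure.core/+.
(defmethod add [Number Number] [x y]
  (+ x y))

;; Quantities: in this sketch, only defined when the units are identical.
(defmethod add [Quantity Quantity] [x y]
  {:pre [(= (:unit x) (:unit y))]}
  (->Quantity (+ (:magnitude x) (:magnitude y)) (:unit x)))
```

With these definitions, (add 1 2) and (add (->Quantity 2 :m) (->Quantity 3 :m)) go through the same generic function, which is what lets code written for numbers work unchanged on quantities.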

Constructing a nice namespace for generic arithmetic using Clojure’s standard namespace management tools is a bit cumbersome: we’d have to use an explicit :refer-clojure clause in ns in order to exclude the standard arithmetic functions, and then have a lengthy :use clause for adding the generic versions from the various submodules of clojure.contrib.generic. An easier way is to use the nstools library which defines a suitable namespace template that we can simply clone. We then add the dimension-checking predicate dimension? and the conversion function in-units-of from the units library and the shorthand si for referring to the namespace that defines the SI unit system that we will use.

Now we can start doing something useful. The following function calculates the force exerted by a spring of force constant k that has been compressed or extended by a displacement x:

(defn spring
  [k]
  {:pre [(dimension? (/ si/force si/length) k)]}
  (fn [x]
    {:pre [(si/length? x)]}
    (- (* k x))))

The basic code looks just as if we had written it for use with plain numbers. The only difference is the preconditions, which verify that the arguments k and x have the right dimensions: length for x, force constant for k. The test for length is simpler, because for all dimensions that have been assigned a name in the definition of the unit system, there is a direct test predicate, such as si/length?. There is no predefined dimension for “force divided by length”, so we have to use the generic predicate dimension? and construct the dimension arithmetically. The only operations defined on dimensions are multiplication and division; the rest (addition/subtraction, comparison) would not make sense.
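The idea of constructing a dimension arithmetically and then checking against it can be sketched with plain maps. This is only an illustration under my own assumptions: the units library uses dedicated data types, and the names dim-div and dimension-of? are hypothetical.

```clojure
;; Hypothetical representation, not the units library's data types:
;; a dimension is a map from base dimension to integer exponent.
(def length-dim {:length 1})
(def force-dim  {:mass 1 :length 1 :time -2})

(defn dim-div
  "Divide dimension a by dimension b by subtracting exponents;
  base dimensions whose exponent becomes zero are dropped."
  [a b]
  (let [negated (into {} (map (fn [[k v]] [k (- v)]) b))]
    (into {} (remove (comp zero? val) (merge-with + a negated)))))

(defn dimension-of?
  "True if quantity q (a map with a :dim key) has dimension dim."
  [dim q]
  (= dim (:dim q)))

;; A force constant has dimension force/length = mass^1 time^-2.
(def k {:magnitude 5 :dim {:mass 1 :time -2}})

(dimension-of? (dim-div force-dim length-dim) k)   ; true
(dimension-of? length-dim k)                       ; false
```

Multiplication would add exponents in the same way. Addition of dimensions has no counterpart in this representation, in line with the remark above.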

Let’s use our function spring:

(def a-spring (spring (/ (* 5 si/N) si/cm)))
(prn (a-spring (si/cm 1/2)))

The first line defines a spring with a force constant of 5 N/cm. You can see in the expression that calculates it that units can be used like quantities in arithmetic. The unit “Newton” behaves just like the quantity “1 Newton”. However, these two values are represented differently internally, for a good reason that I will explain a bit later. The second line evaluates the force exerted by the spring when elongated by 1/2 cm. It shows another way to construct a quantity from a unit and a magnitude: units can be called as functions, with the magnitude supplied as the argument, returning a quantity.

The last line produces the output

#:force{-5/2 N}

The result thus has the dimension “force”, the magnitude “-5/2” and the unit “Newton”. The dimension can be shown because it is a named dimension defined in the SI unit system. Otherwise the computer could not have guessed the name of the dimension. Let’s see what happens when we print a force constant:

(prn (/ (* 5 si/N) si/cm))

The output is

#:quantity{5 100.kg.s-2}

No dimension name, no unit name: the magnitude is 5, the unit is 100 kg/s^2, and it is expressed in SI base units plus a prefactor.
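Where the prefactor 100 comes from can be reproduced with a small sketch. It assumes a unit is stored as a numerical prefactor plus exponents over the SI base units; the map layout and the name unit-div are my own and not taken from the units library.

```clojure
;; Hypothetical representation: a unit is a numerical prefactor plus a map
;; of exponents over the SI base units.
(def N  {:factor 1     :exponents {:kg 1 :m 1 :s -2}})
(def cm {:factor 1/100 :exponents {:m 1}})

(defn unit-div
  "Divide unit a by unit b: divide the prefactors, subtract the exponents,
  and drop base units whose exponent becomes zero."
  [a b]
  (let [negated  (into {} (map (fn [[k v]] [k (- v)]) (:exponents b)))
        combined (merge-with + (:exponents a) negated)]
    {:factor    (/ (:factor a) (:factor b))
     :exponents (into {} (remove (comp zero? val) combined))}))

;; N/cm: prefactor 1 / (1/100) = 100, exponents kg^1 s^-2 (the m cancels).
(unit-div N cm)
```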

Let’s look at some more examples of unit arithmetic in the following REPL session:

unit-demo> (+ (si/m 1) (si/km 3))
#:length{3001 m}
unit-demo> (+ (si/km 3) (si/m 1))
#:length{3001/1000 km}
unit-demo> (= (+ (si/m 1) (si/km 3)) (+ (si/km 3) (si/m 1)))
true

This shows how units are converted in arithmetic: the result has the unit of the first argument. However, exchanging the arguments still yields a result that is equal to the original one, as indeed “1 km” and “1000 m” are the same quantity.
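The conversion rule can be sketched in a few lines of plain Clojure. This is an illustration only, restricted to length units defined by a prefactor relative to the meter; the names quantity, convert, q+ and q= are hypothetical.

```clojure
;; Hypothetical sketch: a unit is a name plus a prefactor relative to the meter.
(def m  {:name "m"  :factor 1})
(def km {:name "km" :factor 1000})

(defn quantity [magnitude unit]
  {:magnitude magnitude :unit unit})

(defn convert
  "Re-express quantity q in the given unit."
  [q unit]
  (quantity (/ (* (:magnitude q) (:factor (:unit q))) (:factor unit))
            unit))

(defn q+
  "Add two quantities; the result carries the unit of the first argument."
  [a b]
  (quantity (+ (:magnitude a) (:magnitude (convert b (:unit a))))
            (:unit a)))

(defn q=
  "Compare two quantities after converting b to the unit of a."
  [a b]
  (= (:magnitude a) (:magnitude (convert b (:unit a)))))

(q+ (quantity 1 m) (quantity 3 km))   ; magnitude 3001, unit m
(q+ (quantity 3 km) (quantity 1 m))   ; magnitude 3001/1000, unit km
```

Because the magnitudes are integers and ratios, the conversion is exact, so q= on the two sums returns true.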

Next, a more complicated example, the kinetic energy of a car:

unit-demo> (/ (si/km 100) si/h)
#:velocity{100 5/18.m.s-1}
unit-demo> (let [v (/ (si/km 100) si/h)
                 m (si/kg 800)]
             (* 1/2 m v v))
#:energy{4000000 25/324.m2.kg.s-2}
unit-demo> (let [v (/ (si/km 100) si/h)
                 m (si/kg 800)]
             (in-units-of si/J (* 1/2 m v v)))
#:energy{25000000/81 J}

The last expression shows how to convert a quantity to a different unit. Note that the result is always equal to the input quantity; only the representation changes.
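The numbers in this example can be checked by hand with Clojure’s ratio arithmetic on plain numbers, keeping track of the units in comments only. The names v, m and e-kin are just local names for this check.

```clojure
;; 1 km/h = 1000 m / 3600 s = 5/18 m/s, hence the unit "5/18.m.s-1" above.
(def v (* 100 5/18))        ; 100 km/h expressed in m/s: 250/9
(def m 800)                 ; mass in kg
(def e-kin (* 1/2 m v v))   ; kinetic energy in J

;; Both printed forms describe the same value:
;; 4000000 * (5/18)^2 = 4000000 * 25/324 = 25000000/81
(= e-kin (* 4000000 25/324) 25000000/81)   ; true
```

In decimal notation this is roughly 308642 J.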

At some point, one inevitably has to communicate with the number-only world, usually for I/O, or for plotting. So how do we convert a quantity to a number? It should be clear that this operation implies the choice of a unit. The simplest solution is to divide the quantity by the desired unit: the result will be dimensionless and thus a plain number:

unit-demo> (/  (a-spring (si/cm 1/2))  si/mN)
-2500

Another approach would be to convert to the desired units using in-units-of and then extract the magnitude using the function magnitude from the units library:

unit-demo> (units/magnitude (in-units-of si/mN (a-spring (si/cm 1/2))))
-2500

At this point it should be clear that the units library defines three datatypes: dimensions, units, and quantities. It is less obvious that dimensions and units (and thus indirectly quantities) refer to a unit system. Without a unit system, the computer could not know that the quotient of a length and a time is a velocity, for example. Nor could it know that “Newton” is just a convenient name for “m kg/s^2”. A unit system defines base dimensions and base units. The SI system (SI = Système International) that is today used all over the world in science and engineering, as well as in daily life in most countries, defines seven base dimensions and associated units:

  • length (meter, m)
  • mass (kilogram, kg)
  • time (second, s)
  • electric current (ampere, A)
  • temperature (kelvin, K)
  • luminous intensity (candela, cd)
  • amount of substance (mole, mol)

Neither the choice of these particular dimensions nor even the choice of seven base dimensions is obvious. One could very well use the electric charge instead of the electric current, for example. And one could very well not have the dimension “amount of substance” at all. The choices made for the SI system reflect the state of the art in metrology, taking into account what can and what cannot be measured with high accuracy.

All dimensions other than the base dimensions are expressed as products of powers of the base dimensions. For example, velocity is length^1 time^-1, and volume is length^3. The SI system is constructed to make sure that all powers are integers, but this is not true e.g. for the older cgs system, which has fractional powers for dimensions related to electricity. According to the principles of dimensional analysis, a dimension is in fact nothing else but a name for a collection of powers (seven integers for the SI system). Metrological reality is a bit more complicated because there can be multiple dimensions with the same set of exponents. For example, in the SI system, both frequency (measured in cycles per second) and radioactivity (measured in decays per second) are equivalent to time^-1, because neither “cycle” nor “decay” has its own dimension. The units library takes this into account and makes a distinction between frequency, radioactivity, and 1/time. The first two are not compatible with each other, meaning that you can’t add 1 Bq and 5 Hz. However, either one is compatible with 1/s, so you can add 1 Bq and 5/s. This feature requires that dimensions be represented by a specific data type; otherwise a list of exponents would be sufficient.
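One way to picture this compatibility rule is to attach an optional name to a set of exponents. The sketch below is my own guess at the logic, not the data type the units library actually uses; the map layout and the name compatible? are hypothetical.

```clojure
;; Hypothetical representation: exponents plus an optional name.
(def inverse-time  {:exponents {:time -1} :name nil})
(def frequency     {:exponents {:time -1} :name :frequency})
(def radioactivity {:exponents {:time -1} :name :radioactivity})

(defn compatible?
  "Two dimensions are compatible if their exponents agree and their names
  do not conflict; an unnamed dimension conflicts with nothing."
  [a b]
  (and (= (:exponents a) (:exponents b))
       (or (nil? (:name a))
           (nil? (:name b))
           (= (:name a) (:name b)))))

(compatible? frequency radioactivity)   ; false: 1 Bq + 5 Hz is rejected
(compatible? radioactivity inverse-time) ; true: 1 Bq + 5/s is allowed
(compatible? frequency inverse-time)     ; true
```

A bare map of exponents could not express the first case, which is why a dedicated data type is needed.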

Units are handled much like dimensions: each base dimension has a base unit, and each non-base unit is defined as a product of powers of base units, plus a numerical prefactor. Quantities are then made up of a unit and a magnitude, which is typically a number. It is not strictly necessary to make the distinction between units and quantities, as in fact any quantity can be used as a unit. There are libraries around that use a single representation for both. However, there are two advantages to keeping the distinction:

  1. The units library permits magnitudes of quantities to be values of any type that implements generic arithmetic, whereas unit prefactors must be numbers. It is thus possible to use matrices as magnitudes, provided all elements have the same unit. This permits efficient implementations of many algorithms while still profiting from dimension checking and unit conversion.
  2. Without specific unit objects, every quantity would be represented as a prefactor with respect to a product of powers of the base units. The information of what unit the quantity was initially represented in is lost. While this doesn’t matter from the point of view of dimensional analysis, it does matter from a numerical point of view. For example, quantities at the atomic scale would have very small prefactors when expressed in terms of SI base units. With magnitudes expressed as floating-point values, there is thus a risk of underflow in unit arithmetic. It is in general preferable to keep quantities in their original units and apply conversion only when requested or when inevitable (such as in addition of two quantities).

To close this brief description of the design decisions behind the units library, a few words about temperatures. I have decided not to include support for temperature conversion in the initial versions of the library, and I am not sure if I will ever add it. Temperature is special in that the scales we use in daily life (nowadays mostly Celsius and Fahrenheit) have an arbitrarily chosen zero point that does not coincide with the “natural” zero point of temperature, which corresponds to the lowest possible energetic state of a system. Allowing for such units defined with an offset implies enormous complications: a distinction must be made between “differential” and “absolute” units, and arithmetic must be defined carefully to make sure that absolute units can be used only in addition with a differential unit or in subtraction. I don’t think that introducing that amount of complexity is justified, considering that daily-life temperatures are rarely combined in computations with quantities of other dimensions.
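A small numerical illustration shows why an offset does not fit a model in which units differ only by a prefactor. The helper functions km->m and celsius->kelvin are written for this example and are not part of the units library.

```clojure
;; Illustration only; these helpers are not part of the units library.

;; A prefactor-only conversion is linear ...
(defn km->m [x] (* 1000 x))

;; ... whereas a conversion with an offset is not.
(defn celsius->kelvin [c] (+ c 27315/100))

;; For km -> m, converting a sum equals summing the conversions:
(= (km->m (+ 1 2))
   (+ (km->m 1) (km->m 2)))                         ; true

;; For Celsius -> kelvin, the two differ by the offset:
(= (celsius->kelvin (+ 10 10))
   (+ (celsius->kelvin 10) (celsius->kelvin 10)))   ; false
```

Adding 10 °C to 10 °C only gives a meaningful result if one of the two is treated as a temperature difference, which is exactly the absolute/differential distinction mentioned above.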

7 Comments »

  1. Two things.

    The clojars site lists the pom’s version as 0.2.1 but the actual version that is available is 0.2.0.
    Also, trying to use your example, I get this error:

    (ns+ unit-demo
    (:clone nstools.generic-math)
    (:from units dimension? in-units-of)
    (:require [units.si :as si]))
    java.lang.IllegalArgumentException: Unsupported option(s) – :as (units.clj:1917)

    Not sure what’s wrong, I’m able to use (:require [foo :as bar]) elsewhere in my code.

    Comment by Artie — May 22, 2010 @ 6:31 pm

    • The error message looks strange indeed because there is no line 1917 in file units.clj!

      However, I suspect that your problem is due to the Clojure version you use. The units module was written for an early development version of Clojure 1.2. Older Clojure releases (1.0 and 1.1) don’t have the deftype mechanism that is heavily used, and in recent development versions of Clojure 1.2 this mechanism has changed so much that the units module doesn’t work anymore. As a consequence, there are only a few snapshots of the development version that actually work with clj-units.

      I am working on adapting clj-units to the current Clojure 1.2 development versions, but since I don’t have much time right now, this might take another week or two. Please be patient!

      Comment by khinsen — May 25, 2010 @ 9:56 am

      • Please try release 0.2.2, which works with the current version of Clojure 1.2. Make sure that you have a recent version of clojure-contrib as well; I fixed a relevant bug there yesterday!

        Comment by khinsen — May 26, 2010 @ 8:32 am

  2. Works now! Cheers!

    Comment by Artie — May 29, 2010 @ 10:26 pm

  3. If I may comment on this: “I don’t think that introducing that amount of complexity is justified, considering that daily-life temperatures are rarely combined in computations with quantities of other dimensions.”:

    It is true that working with temperatures can be complex, especially if you allow the use of non-0 based scales like Celsius and Fahrenheit. However, if you stick to Kelvin (and/or Rankine), there is little complexity. More to the point, there are plenty of problems that require combining temperatures with other units, especially in chemical engineering and physics problems dealing with heat transport.

    I would really appreciate you reconsidering the above, at least adding support for Kelvin (and/or Rankine)! If I get a chance, and if my Clojure expertise is up to it, I may have a go myself, but I assume it should be trivial for you to do.

    Thanks, in any case, for what looks like a great addition to the Clojure-verse!

    Comment by Eric Fraga — July 8, 2011 @ 6:19 pm

    • Kelvin is in there, of course, it’s an SI unit after all. Adding Rankine is trivial, as is adding most other units: All it takes is a defunit statement containing the conversion factor to some already known unit combination. What’s not possible without adding lots of code is units whose zero is non-standard, which in practice means Celsius and Fahrenheit.

      Comment by khinsen — July 9, 2011 @ 3:04 pm

      • Ah, I see! Sorry I misunderstood your point. I agree completely that adding conversions between Celsius and Fahrenheit is not worth the effort. So long as Kelvin is there, I’m happy!

        Comment by Eric Fraga — July 13, 2011 @ 8:11 pm

