in which a violent metaphor features prominently



This past week I've had a lot of fun with Slamhound, my most recent project. Often times when writing Clojure, your namespace declarations are a pain to keep in order—you've got to specify all the namespaces you depend upon, and once you've got a lot of them in there, it's hard to keep them straight; often duplicates sneak in and things just get really messy. So Slamhound takes a source file, discards the namespace declaration, and tries to determine how to reconstruct it from the code alone. I had a lot of fun putting it together, so here's a little walk-through of how it works.

A big part of why it was so much fun is that the code is very functional. It takes a file as input, throws it into a pipeline, and spits out an answer. The only impure part is the reading of the file.[1] The pipeline is built in the slam.hound/reconstruct function:

(defn reconstruct [filename]
  (-> (io/reader filename)

But first a little back-story: Slamhound is named after the explosive device from William Gibson's novel Count Zero. At the start of the story, the main character Turner has an encounter with one, after which he spends three months in surgery while his skin and organs are regrown before he's stitched back together. Files in Slamhound go through the same process: asplode, re-grow, stitch-up. (I've quoted code selections inline, but for the complete listing a link is given for each phase.)

Phase 1: asplode

(defn asplode [rdr]
  (let [rdr (PushbackReader. rdr)
        ns-map (ns-to-map (read rdr))
        stripped-ns (apply dissoc ns-map [:use :require :import])
        body (take-while #(not= ::done %)
                         (repeatedly #(read rdr false ::done)))]
    [stripped-ns body])) 

Like its counterpart in the book, the first phase is quick, simple, and messy. First we need to construct a PushbackReader in order for clojure.core/read to operate. This gives us the ns form as a list. Then in ns-to-map (not shown) we perform the translation into a map by splitting each clause (:use, :require, and :import) out, since the map form is a lot easier to operate on in during the regrow stage. But often times you have left-over clauses from earlier revisions, so we dissoc them out in order to start from scratch. Then we repeatedly read until EOF to get the body. The stripped ns and body are passed to the next phase.

Phase 2: regrow

(defn regrow
  ([[ns-map body]]
     (force pre-load)
     (regrow [ns-map body] nil))
  ([[ns-map body] last-missing]
     (if-let [{:keys [missing type]} (check-for-failure ns-map body)]
       (if (= last-missing missing)
         (throw (Exception. (str "Couldn't resolve " missing)))
         (recur [(grow-step missing type ns-map) body] missing))

The re-grow phase is where the pain-staking progress is made. The first arity forces the pre-load delay which ensures all the source is loaded for compilation purposes, then hands off to the second arity. There's a recursive loop here: we attempt to compile the body with the current ns-map in check-for-failure (see below.) If it fails, we analyze the problematic missing piece in grow-step, which searches for all possible candidates, sorts in disambiguate, and picks one. Then the compilation is retried by recurring.

Right now if the retry results in the same failure, it gives up, though there's no reason in the future it couldn't continue down the list of candidates. Of course, if compilation succeeds, the re-grow process is finished, and it returns the ns-map with the new clauses for the next phase, stitch-up.

But check-for-failure deserves special attention as it's where the meat of the process happens:

(defn check-for-failure [ns-map body]
  (let [sandbox-ns `slamhound.sandbox#
        ns-form (stitch/ns-from-map (assoc ns-map :name sandbox-ns))]
    (binding [*ns* (create-ns sandbox-ns)]
        (eval `(do ~ns-form ~@body nil))
        (catch Exception e
          (or (failure-details (.getMessage e))
              (throw e)))
         (remove-ns (.name *ns*)))))))

Here we see that a unique sandbox namespace is created using auto-gensym. Next we reach ahead a bit into the next stitch-up phase for the ns-from-map function, which takes our map representation and turns it back into an executable ns declaration. Stepping into this sandbox namespace with binding, we attempt to eval the body. If it fails, we analyze the exception with failure-details. This tells us whether it was a :use, :require, or :import clause that caused the failure as well as the name of the missing class or var. These details are used by candidates to determine the ns-map for the next attempt.

(defmethod candidates :require [type missing]
  (for [n (all-ns)
        :when (= missing (last (.split (name (ns-name n)) "\\.")))]
    [(ns-name n) :as (symbol missing)]))

The candidates multimethod operates on each of the three clause types (:use/:require/:import), but they're all slight variations on the same theme. The :require version is given here: it simply loops through all the available namespaces and returns the ones which match the missing name from the compilation failure.

Phase 3: stitch up

(defn stitch-up [ns-map]
  (-> ns-map

Once we've got a successful compilation, we have what we need. All that's left is to turn it back from a map into code form, which is handled by the stitch phase. Technically only the the ns-from-map function we saw used above is necessary, but the ns declaration looks a lot nicer if you collapse all the clauses down first:

(:use [clojure.test :only [deftest]]
      [clojure.test :only [testing]]
      [clojure.test :only [is]]
      [clojure.test :only [are]])

;; after collapsing becomes:

(:use [clojure.test :only [deftest testing is are]])

Sorting subclauses doesn't hurt either, though it's strictly cosmetic. Finally we turn it back into code form and pretty-print it using clojure.pprint/pprint. And that's it!

1: Technically there's one exception to this; see if you can spot it.

« older | 2011-05-01T06:05:12Z | newer »