This past week I've had a lot of fun with Slamhound, my most recent project. Often times when writing Clojure, your namespace declarations are a pain to keep in order—you've got to specify all the namespaces you depend upon, and once you've got a lot of them in there, it's hard to keep them straight; often duplicates sneak in and things just get really messy. So Slamhound takes a source file, discards the namespace declaration, and tries to determine how to reconstruct it from the code alone. I had a lot of fun putting it together, so here's a little walk-through of how it works.
A big part of why it was so much fun is that the code is very
functional. It takes a file as input, throws it into a pipeline,
and spits out an answer. The only impure part is the reading of
the file.[1] The pipeline is built in
the slam.hound/reconstruct
function:
(defn reconstruct [filename] (-> (io/reader filename) asplode regrow stitch-up))
But first a little back-story: Slamhound is named after the explosive device from William Gibson's novel Count Zero. At the start of the story, the main character Turner has an encounter with one, after which he spends three months in surgery while his skin and organs are regrown before he's stitched back together. Files in Slamhound go through the same process: asplode, re-grow, stitch-up. (I've quoted code selections inline, but for the complete listing a link is given for each phase.)
(defn asplode [rdr] (let [rdr (PushbackReader. rdr) ns-map (ns-to-map (read rdr)) stripped-ns (apply dissoc ns-map [:use :require :import]) body (take-while #(not= ::done %) (repeatedly #(read rdr false ::done)))] [stripped-ns body]))
Like its counterpart in the book,
the first
phase is quick, simple, and messy. First we need to construct
a PushbackReader in order for clojure.core/read
to
operate. This gives us the ns
form as a list. Then in
ns-to-map
(not shown) we perform the translation into
a map by splitting each clause
(:use
, :require
,
and :import
) out, since the map form is a lot easier
to operate on in during the regrow stage. But often times you have
left-over clauses from earlier revisions, so
we dissoc
them out in order to start from
scratch. Then we repeatedly
read
until
EOF to get the body. The stripped ns and body are passed to the
next phase.
(defn regrow ([[ns-map body]] (force pre-load) (regrow [ns-map body] nil)) ([[ns-map body] last-missing] (if-let [{:keys [missing type]} (check-for-failure ns-map body)] (if (= last-missing missing) (throw (Exception. (str "Couldn't resolve " missing))) (recur [(grow-step missing type ns-map) body] missing)) ns-map)))
The re-grow phase is where the pain-staking progress is made. The
first arity forces the pre-load
delay which ensures
all the source is loaded for compilation purposes, then hands off
to the second arity. There's a recursive loop here: we attempt to
compile the body with the current ns-map
in check-for-failure
(see below.) If it fails, we analyze the
problematic missing
piece in grow-step
,
which searches for all possible candidates
, sorts
in disambiguate
, and picks one. Then the compilation
is retried by recur
ring.
Right now if the retry results in the same failure, it gives up,
though there's no reason in the future it couldn't continue down
the list of candidates. Of course, if compilation succeeds, the
re-grow process is finished, and it returns
the ns-map
with the new clauses for the next
phase, stitch-up
.
But check-for-failure
deserves special attention as
it's where the meat of the process happens:
(defn check-for-failure [ns-map body] (let [sandbox-ns slamhound.sandbox# ns-form (stitch/ns-from-map (assoc ns-map :name sandbox-ns))] (binding [*ns* (create-ns sandbox-ns)] (try (eval `(do ~ns-form ~@body nil)) (catch Exception e (or (failure-details (.getMessage e)) (throw e))) (finally (remove-ns (.name *ns*)))))))
Here we see that a unique sandbox namespace is created using
auto-gensym. Next we reach ahead a bit into the
next stitch-up
phase for the ns-from-map
function, which takes our map representation and turns it back
into an executable ns
declaration. Stepping into this
sandbox namespace with binding
, we attempt
to eval
the body. If it fails, we analyze the
exception with failure-details
. This tells us whether
it was a :use
, :require
,
or :import
clause that caused the failure as well as
the name of the missing class or var. These details are used
by candidates
to determine the ns-map for the next
attempt.
(defmethod candidates :require [type missing] (for [n (all-ns) :when (= missing (last (.split (name (ns-name n)) "\\.")))] [(ns-name n) :as (symbol missing)]))
The candidates
multimethod operates on each of the
three clause types
(:use
/:require
/:import
),
but they're all slight variations on the same
theme. The :require
version is given here: it simply
loops through all the available namespaces and returns the ones
which match the missing
name from the compilation
failure.
(defn stitch-up [ns-map] (-> ns-map collapse-clauses sort-subclauses ns-from-map prettify))
Once weve got a successful compilation, we have what we
need. All that's left is to turn it back from a map into code
form, which is handled by the stitch phase. Technically only the
the ns-from-map
function we saw used above is
necessary, but the ns declaration looks a lot nicer if you
collapse all the clauses down first:
(:use [clojure.test :only [deftest]] [clojure.test :only [testing]] [clojure.test :only [is]] [clojure.test :only [are]]) ;; after collapsing becomes: (:use [clojure.test :only [deftest testing is are]])
Sorting subclauses doesn't hurt either, though it's strictly
cosmetic. Finally we turn it back into code form and pretty-print
it using clojure.pprint/pprint
. And that's it!
๛