Clojure is Bliss

The problem

Recently I had a certain problem which I wanted to solve with Clojure: I wanted to sort a set of version tags. The version tags look something like this:

Version_6_00_000
Version_6_00_000_alpha_2
Version_6_00_000_alpha_13
Version_7_00_000
Version_7_00_000_beta_24

The version tags contain numbers and words which are separated by underscores. The tags cannot be sorted alphabetically because of the different lengths (with or without alpha/beta/number suffix). That`s why we need a custom way to sort the version numbers correctly (see the ordering above).

In Java the standard way to do this is by using the Comparator interface (or Comparable). In Clojure it is the same. In Clojure you can implement Java interfaces on the fly using the proxy macro.

First draft

Below is the first version of the Comparator implementation.

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [as (.split a "_")
5		      bs (.split b "_")
6		      [d1 d2 d3 d4 d5 d6] (map compare as bs)]
7		  (if (and d1 (not= 0 d1)) d1
8		      (if (and d2 (not= 0 d2)) d2
9			  (if (and d3 (not= 0 d3)) d3
10			      (if (and d4 (not= 0 d4)) d4
11				  (if (and d5 (not= 0 d5)) d5
12				      (if (and d6 (not= 0 d6)) d6
13					  0))))))))))

In line 1 we define the symbol tag-comparator. The proxy macro (line 2) lets you extend Java interfaces like the Comparator interface. We implement the compare method (line 3) by splitting the two strings using underscore as the separator (line 4, 5). We map the (String-) compare method to all the substrings. In line 6 you see an example of Clojure’s destructuring mechanism in the let block. The map function returns a list of values. The list [d1 d2 d3 d4 d5 d6] takes this list and binds the values to the variables/symbols.

In line 7 to 13 we check for each substring difference if it is equal to zero. If not, we have a difference and know how to order the two strings a and b. The cascade of ifs is not very elegant, that`s why we want to simplify the comparator and make it more general.

There is still an error in the definition above which concerns the ordering of numbers as strings. The numbers are sorted alphabetically, which can produce a wrong ordering („13“ < "2").

Getting functional

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [la (map maybe-to-int (.split a "_"))
5		      lb (map maybe-to-int (.split b "_"))
6		      deltas (map compare la lb)
7		      diff-len (- (count la) (count lb))]
8		  (or (first (filter (partial not= 0) deltas))
9		      diff-len)))))

In this version we have the function maybe-to-int (line 4, 5) that tries to parse a given substring as a Java integer. If this fails it returns the string instead. In line 6 we compare the substrings (and integers) and bind it to deltas. In line 8 we filter the deltas which are not zero and then take the first entry of the resulting list. This is the result of the comparison.

However the deltas may as well be all zero. In this case the filtering returns an empty list and first yields nil. The or function evaluates the next expression which is diff-len and represents the length difference of the two substring lists.

Don`t repeat yourself

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [explode #(map maybe-to-int (.split % "_"))
5		      [la lb] (map explode [a b])
6		      deltas (map compare la lb)
7		      diff-len (reduce - (map count [la lb]))]
8		  (println deltas)
9		  (or (first (filter (partial not= 0) deltas))
10		      diff-len)))))

In this version we extract the common functionality of splitting and mapping and create the function explode (line 4). We map the function to both strings (line 5). The value diff-len is computed in a more functional way (line 7).

Refinement

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [parse-int #(Integer/parseInt %)
5		      parse-fns [str parse-int parse-int parse-int str parse-int]
6		      explode (fn [s] (map #(%1 %2) parse-fns (.split s "_")))
7		      [la lb] (map explode [a b])
8		      deltas (map compare la lb)
9		      diff-len (reduce - (map count [la lb]))]
10;		  (println deltas)
11		  (or (first (filter (partial not= 0) deltas))
12		      diff-len)))))

Schlagwörter:

Schreibe einen Kommentar

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden / Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden / Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden / Ändern )

Google+ Foto

Du kommentierst mit Deinem Google+-Konto. Abmelden / Ändern )

Verbinde mit %s


%d Bloggern gefällt das: