Archive for the ‘IT’ Category

Clojure is Bliss

Juli 30, 2009

The problem

Recently I had a certain problem which I wanted to solve with Clojure: I wanted to sort a set of version tags. The version tags look something like this:

Version_6_00_000
Version_6_00_000_alpha_2
Version_6_00_000_alpha_13
Version_7_00_000
Version_7_00_000_beta_24

The version tags contain numbers and words which are separated by underscores. The tags cannot be sorted alphabetically because of the different lengths (with or without alpha/beta/number suffix). That`s why we need a custom way to sort the version numbers correctly (see the ordering above).

In Java the standard way to do this is by using the Comparator interface (or Comparable). In Clojure it is the same. In Clojure you can implement Java interfaces on the fly using the proxy macro.

First draft

Below is the first version of the Comparator implementation.

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [as (.split a "_")
5		      bs (.split b "_")
6		      [d1 d2 d3 d4 d5 d6] (map compare as bs)]
7		  (if (and d1 (not= 0 d1)) d1
8		      (if (and d2 (not= 0 d2)) d2
9			  (if (and d3 (not= 0 d3)) d3
10			      (if (and d4 (not= 0 d4)) d4
11				  (if (and d5 (not= 0 d5)) d5
12				      (if (and d6 (not= 0 d6)) d6
13					  0))))))))))

In line 1 we define the symbol tag-comparator. The proxy macro (line 2) lets you extend Java interfaces like the Comparator interface. We implement the compare method (line 3) by splitting the two strings using underscore as the separator (line 4, 5). We map the (String-) compare method to all the substrings. In line 6 you see an example of Clojure’s destructuring mechanism in the let block. The map function returns a list of values. The list [d1 d2 d3 d4 d5 d6] takes this list and binds the values to the variables/symbols.

In line 7 to 13 we check for each substring difference if it is equal to zero. If not, we have a difference and know how to order the two strings a and b. The cascade of ifs is not very elegant, that`s why we want to simplify the comparator and make it more general.

There is still an error in the definition above which concerns the ordering of numbers as strings. The numbers are sorted alphabetically, which can produce a wrong ordering („13“ < "2").

Getting functional

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [la (map maybe-to-int (.split a "_"))
5		      lb (map maybe-to-int (.split b "_"))
6		      deltas (map compare la lb)
7		      diff-len (- (count la) (count lb))]
8		  (or (first (filter (partial not= 0) deltas))
9		      diff-len)))))

In this version we have the function maybe-to-int (line 4, 5) that tries to parse a given substring as a Java integer. If this fails it returns the string instead. In line 6 we compare the substrings (and integers) and bind it to deltas. In line 8 we filter the deltas which are not zero and then take the first entry of the resulting list. This is the result of the comparison.

However the deltas may as well be all zero. In this case the filtering returns an empty list and first yields nil. The or function evaluates the next expression which is diff-len and represents the length difference of the two substring lists.

Don`t repeat yourself

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [explode #(map maybe-to-int (.split % "_"))
5		      [la lb] (map explode [a b])
6		      deltas (map compare la lb)
7		      diff-len (reduce - (map count [la lb]))]
8		  (println deltas)
9		  (or (first (filter (partial not= 0) deltas))
10		      diff-len)))))

In this version we extract the common functionality of splitting and mapping and create the function explode (line 4). We map the function to both strings (line 5). The value diff-len is computed in a more functional way (line 7).

Refinement

1(def tag-comparator
2     (proxy [java.util.Comparator][]
3       (compare [a b]
4		(let [parse-int #(Integer/parseInt %)
5		      parse-fns [str parse-int parse-int parse-int str parse-int]
6		      explode (fn [s] (map #(%1 %2) parse-fns (.split s "_")))
7		      [la lb] (map explode [a b])
8		      deltas (map compare la lb)
9		      diff-len (reduce - (map count [la lb]))]
10;		  (println deltas)
11		  (or (first (filter (partial not= 0) deltas))
12		      diff-len)))))

Haskell Heureka

Dezember 23, 2008

Now here is one of the many discoveries I got with the Haskell programming language. Haskell is a functional programming language that might look occult or opaqe to the average programmer. You might wonder how you get anything done in this language. But I recently had some mind-blowing experiences. Now if you are not a haskell programmer be warned: You might not understand what I’m talking about unless you have a basic understanding of the language. To the more enlightened master hacker haskell guys: This might seem trivial to you but here is what I discovered:

I had a task at hand which required to sequence some monadic operations of the form (a -> m a). It was the IO Monad to be specific but it doesn’t matter because it works with any Monad. The function I wrote had the following type signature:
m a -> [a -> m a] -> m a
It did some kind of sequencing of functions in a list so I called it mapseq, which might not be the best name for the function but was close enough. Here is the definition:

mapseq :: Monad m => m a -> [a -> m a] -> m a
mapseq f [] = f
mapseq s (f:fs) = mapseq (s >>= f) fs

In the back of my mind I always knew that it should be somehow transformed into a fold function, but I could not quite define it, maybe because I tried with foldr too much. The solution was simple but it struck me as both cool and elegant:
foldl (>>=)
That’s all. It has the same type and the same semantics as mapseq and it replaces my crude function entirely. Heureka!

LISP Misc

März 1, 2008

Zur Zeit beschäftige ich mich viel mit der Programmiersprache LISP. LISP steht für List Processing und ist eigentlich eine Familie von Programmiersprachen. LISP wurde in den Fünfziger Jahren von John McCarthy entwickelt und ist nach Fortran die zweitälteste höhere Programmiersprache. Die Syntax ist hauptsächlich durch runde Klammern gekennzeichnet. Trotz des Alters wird LISP auch heute noch gerne verwendet und ist besonders für explorative Programmierung und Rapid Prototyping von Software gut geeignet. Lisp wurde in der Vergangenheit häufig im Bereich der Künstlichen Intelligenz eingesetzt.

„Lisp is a programmable programming language.“
John Foderaro, CACM, September 1991

„Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.“
Eric Raymond, „How to Become a Hacker“

Zur Zeit lese ich das Buch Paradigms of Artificial Intelligence Programming von Peter Norvig. Das Buch ist eine sehr gute Einführung in den Bereich Künstliche Intelligenz und auch für die Sprache Lisp, ich kann es sehr empfehlen. An SICP habe ich mich bisher noch nicht herangetraut, jedoch habe ich mir die Video Lectures angeschaut. Diese sollte man als echter Nerd auf keinen Fall verpassen.

Als Entwicklungsumgebung verwende ich Emacs/Slime mit CLISP. Slime ist der Superior Lisp Interaction Mode for Emacs. Zu Slime gibt es ein sehr gutes Video online, welches eindrücklich zeigt, welche Möglichkeiten dem Lisp Entwickler zur Verfügung stehen.