Thursday, January 30, 2014

99 Clojure Problems – 16: Drop Every Nth Element of a List

Example:

(deftest p16-drop-every-nth
  (is (= '( a  b  d  e  g  h  j  k) (drop-every 3  '(a b c d e f g h i j k)))))

Solution:

I have two solutions to offer. The basic idea for both of them is to go through the input list and transform it to a result list, counting the elements as they are cons'ed to the result. But I would only do the cons'ing if the value of the counter was not a multiple of n thereby dropping every nth element as required.

The first one is a recursive solution. I used a let binding to define an inner function with the additional counter argument. Without any further precautions the function would be stack-consuming and a stack overflow would be due quite quickly (on my machine with a standard 'lein repl' it can handle 4221 elements). In order to prevent that I wrapped the meat of the function in a lazy-seq

 +--------+   +--------+  +--------+
 |LazySeq |   |LazySeq |  |LazySeq |
 |--------|   |--------|  |--------|
 |+--+---+|   |+--+---+|  |        |
 ||a | +----> ||b | +---> |(drop..)|
 |+--+---+|   |+--+---+|  |        |
 +--------+   +--------+  +--------+

lazy-seq is just a macro that turns the expressions it is given into a LazySeq object. If you followed the link and were surprised by Java code: LazySeq is part of the Java underpinnings of Clojure. The lazy part in lazy-seq means that it does not immediately realise the given expressions (and in our case use up all the stack through recursion). Instead it returns and does nothing until it is accessed as a sequence (that's the seq part). Once realised it caches its value. See my poor man's illustration above to find two realised seqs (a and b) and one that has not yet computed its value and still contains the function to produce it.

With this arrangement in place, the only limit is the size of your heap. At least if you prevent the garbage collector from freeing up memory by collecting the elements of the list we have already computed but won't need anymore. This will happen if you hold on to the head of the list somewhere.

Solution number two is based on the same idea but makes use of a built-in function called 'keep-indexed'. I discovered it a couple of days after my first implementation. It's a higher-order function that takes a decider as its first argument. The decider function gets called for every element in the input sequence. The decider itself takes two arguments the first being the zero-based index of the current item and the second being the item itself. The function is called 'keep-indexed' because it keeps every element for which the decider returns a non-nil value. Implementing drop-every-nth is thus reduced to a very simple anonymous function that does the division (as explained above)—we end up with a quite elegant one-liner.

Read more about this series.

No comments: