Sunday, December 18, 2011

Where we apply destructuring, pre check and decorate data

Hello again.

I just ended reading Learn you a Haskell for a great good and trying to find some example to set on the blog in order to explore (applicative)functors an monads on the blog. As these things take time, the blog is still empty.
But this kind colleague of mine always insists in the way I should produce thoughts on a more regular base even when they are just remainders or basics. So Adrian, I fulfill your request.

Having switched back to my SICP lecture (stuck on chapter 2) , I did the data symbol part (almost, as exercise 2.58 is still waiting for me). For the first time I bumped into a wall with Scheme, as I don't know the language very well. All things being equal I came back to Clojure, as trying to practice the book with both languages.
I found then myself using some little Clojure forms I was not expecting there, like pattern matching, meta data and pre condition testing. Abelson and Sussman present us a small differentiation program aiming to derive so simple equations. Of course the program is extremely limited, the purpose being to make the student work with symbol data. A typical example of differentiation is :

sicp.data-symbol=> (deriv '(* x y (+ x 3)) 'x)
(+ (* x y) (* y (+ x 3)))

where the first parameter of the deriv function is the expression and the second, the free parameter used in the differentiation process
Although there is room here for macro kata, I adopted the authors approach, not feeling completely at ease with macros.
 The interesting stuff here is that we start with a "symbolic" expression very Clojure idiomatic and get as a result too a symbolic expression. At some point in the book, the reader must extend the differentiation program in order to allow for differentiation of multiplications and additions accepting both multiple parameters.

In my humble opinion the exercise should be achieved using the ideas from the previous paragraphs specifically when dealing with stratified design, ergo the notion that a system should be implemented versus a sequence of layered languages. Each level is built up upon a lower level combining the latter functions so providing a new mean of abstraction. I found my Scheme version violating this rule.

Coming back to Clojure, how are we going to express the symbolic data ? Well, we will use the quote special form. That one simply prevents its arguments from being evaluated. So you can try the following in the REPL:

sicp.data-symbol=> 'a
a
sicp.data-symbol=> '(1 2 3)
(1 2 3)

A symbol is equal to himself, so in the first example, 'a will always mean 'a. The second example exposes the strengh of the quote (') symbol as a powerful way to construct self explicitly lists avoiding the cluttering noise of a list keyword. Following the writers indication I produced a first set of basic functions allowing me to challenge my symbolic expressions. Expressions will be idiomatically expressed as list like structure:

 
'(* x y (+ x 3))

Checking the nature per se of the expressions becomes then very easy, as we will check for the first symbolic item in a list:

(defn variable? [expression]
  (symbol? expression))

(defn sum? [expression]
  (= '+ (first expression)))

(defn product? [expression]
  (= '* (first expression)))

(defn exponentiation? [expression]
  (= '** (first expression)))

(defn same-variable? [expression variable]
  (and (variable? expression) (= expression variable)))

challenged using :

(deftest sum?-should-recognize-input-expression
  (is (= true (sum? '(+ x 3))))
  (is (not= true (sum? '(* x 3)))))

(deftest product?-should-recognize-input-expression
   (is (= true (product? '(* x 3))))
   (is (not= true (product? '(+ x 3)))))

(deftest exponentiation?-should-identify-operation
  (is (= true (exponentiation? '( ** x 3))))
  (is (= false (exponentiation? '( * x 3)))))

Of course I did not challenge Clojure native number? and symbol? forms. A lot of the job was done without using tests but walking side by side with the authors' wishful thinking approach, but I I will provide info when I produced it.

The next level of abstraction includes constructors and selectors, respectively creating the operation expressions, than selecting operands on them. For example in order to create a sum one need a make-sum function as constructor. The selectors aim to extract the addend - the first operand - of the sum expression and the augend, the second operand. Working on the sum, we expect our constructor and selectors to handle all the dirty process of manipulating multiple arguments. The intent is to keep the piggy stuff at a lower level so the main function in charge of the main process might be kept clean and unaware of the lower stuff like structure of a sum , products etc...
Of course in the SICP book, the main function has already been written by wishful thinking. In essence our sum constructor should comply to something like this:

(deftest make-sum-with-simple-arguments-should-produce-expression
  (is (= '(+ x 3) (make-sum 'x 3))))

(deftest make-sum-with-simple-arguments-should-produce-expression
  (is (= '(+ x y z) (make-sum '(x y z)))))

(deftest make-sum-with-one-zero-argument-should-produce-symbol
  (is (= 'x (make-sum 'x 0))))

(deftest make-sum-with-two-zero-argument-should-produce-operation-symbol
  (is (= '(+ x y) (make-sum '(x 0 y 0)))))

(deftest make-sum-with-two-numbers-should-produce-number
  (is (= 3 (make-sum '(1 2)))))

We introduced, like in the book, smooth reduction operations allowing to get rid of the identity operands when possible. The result expression are complex enough so we can try to reduce a little bit of that complexity.
The quality of solution I provided is debatable and as usual I am opened to critics. Not at ease with wishful thinking, I tempted to guard my functions with small contracts on my input functions (stack overflow being very close) and explored function variable arity in order to satisfy both my main function invocations and my selectors'invocations:

(defn make-sum
  ([x y] (make-sum [x y]))
  ([[_ :as expression]]
    (cond
      (= 1 (count expression)) (first expression)
      (all-numbers? expression) (apply + expression)
      (:simplified (meta expression)) (cons '+ expression)
      :else (make-sum
              (with-meta
                (filter (fn [x] (not= 0 x)) expression)
                {:simplified true})))))

(defn addend [expression]
  {:pre [(>= (count expression) 3)]}
  (second expression))

(defn augend [[op _ & cdr :as expression]]
  {:pre [(>= (count expression) 3)]}
  (make-sum cdr))

The SICP book strongly suggests to make of a '(+ x y z) expression a '(+ x (+ y z)) expression while looking for augend (so far, the interpretation is mine).
 So now we are talking.

In less than 20 lines I used a lot Clojure idiomatic stuff. As told earlier, my main function already exists (check at the end of the blog) and expects my make-sum function to expose a two variable signature. While extracting the augend parameter from my sum expression I don't want to clutter the code of the augend function with expression re-composition.
Composing a new sum from an existing one is the purpose of the make-sum constructor. I can isolate the sub set of the list I want my constructor to use in order to create a new sum expression. So my make-sum function will expose two function signatures, one with two parameters, and a second with a sequence.

Isolating the rest of the expression in order to crate a new sum expression can become annoying, specifically when mangling a let form sub level with combinations of first and rest functions applied to the input expression. Then comes destructuring. As noted by one of the readers (see comments), destructuring slightly differs from pattern matching  (one  may find interesting discussions here and there).
Let's say on a first approximation that destructuring allows to bind inner elements of an input sequence or map to input function variables. As so, appying :

[op _ & cdr :as expression]

will
  • bind op to the first parameter
  • ignore the second parameter as idiomatically expressed with the _ wild card 
  • and using the & separator will isolate all the remaining elements in a sequenced referred by the cdr expression
The :as keyword allows for the aliasing of the complete input collection using the expression parameter name.

My intent , aliasing the whole input expression, was to provide meat to a pre condition checking (yes like in design by contract). Pre-condition checking are a bless specifically when they stop you before your blindness fires a stack overflow :).
 Back to the make-sum function variable arity, one can see that I delegate all the work to the version accepting sequences as parameters. I could have not used destructuring into the second signature. I found it an idiomatic way to express that the function was specifically expecting a sequence.

As the tests showed earlier, I had to take into account various scenarii like the identity element for the sum, that can lead to a single element expression, the fact that all the list operands are pure numbers etc...

Filtering identity elements induces a recursion step after the filter step. We should memoize some ...state (?) in order to not indefinitely filter zeros from the input expression.But state is bad, we all know it. Decorating the filtered list with a meta information can be a smooth way to proceed. Once the immutable filtered sequence created, we do not alter its content or use a third method signature. We attach to the data a meta information:

 (with-meta
   (filter (fn [x] (not= 0 x)) expression)
   {:simplified true})

informing the function that no more recursion is necessary:

(:simplified (meta expression)) (cons '+ expression)

and the construction of the sum expression can be achieved.
The set of operations is similar when it comes to the product expression constructors and selectors.

 The whole code is given by:

(defn variable? [expression]
  (symbol? expression))

(defn sum? [expression]
  (= '+ (first expression)))

(defn product? [expression]
  (= '* (first expression)))

(defn exponentiation? [expression]
  (= '** (first expression)))


(defn same-variable? [expression variable]
  (and (variable? expression) (= expression variable)))

(defn all-numbers? [in-list]
  (letfn [
           (iter [still-number remaining]
             (cond
               (not still-number) still-number
               (empty? remaining) true
               :else (recur
                       (and still-number (number? (first remaining)))
                       (rest remaining))))]
    (iter true in-list)))

(defn make-sum
  ([x y] (make-sum [x y]))
  ([[_ :as expression]]
    (cond
      (= 1 (count expression)) (first expression)
      (all-numbers? expression) (apply + expression)
      (:simplified (meta expression)) (cons '+ expression)
      :else (make-sum
              (with-meta
                (filter (fn [x] (not= 0 x)) expression)
                {:simplified true})))))

(defn addend [expression]
  {:pre [(>= (count expression) 3)]}
  (second expression))

(defn augend [[op _ & cdr :as expression]]
  {:pre [(>= (count expression) 3)]}
  (make-sum cdr))

(defn make-product
  ([x y] (make-product [x y]))
  ([[_ :as expression]]
    (cond
      (= 1 (count expression)) (first expression)
      (> (count (filter (fn [x] (= 0 x)) expression)) 0) 0
      (all-numbers? expression) (apply * expression)
      (:simplified (meta expression)) (cons '* expression)
      :else (recur
              (with-meta
                (filter (fn [x] (not (= 1 x))) expression)
                {:simplified true})))))

(defn multiplier [expression]
  {:pre [(>= (count expression) 3)]}
  (second expression))

(defn multiplicand [[op _ & cdr :as expression]]
  {:pre [(>= (count expression) 3)]}
  (make-product cdr))

(defn make-exponentiation [base exponent]
  (cond
    (= 0 exponent) 1
    (= 1 exponent) base
    (and (number? base) (= 1 base)) 1
    (and (number? base) (= 0 base)) 0
    :else (list '** base exponent)))

(defn base [from-exponentiation]
  {:pre [(= 3 (count from-exponentiation))]}
  (second from-exponentiation))

(defn exponent [from-exponentiation]
  {:pre [(= 3 (count from-exponentiation))]}
  (second (rest from-exponentiation)))


(defn deriv [expression variable]
  (println expression " derived by " variable)
  (cond
    (number? expression)
    0
    (variable? expression)
    (if (same-variable? expression variable) 1 0)
    (sum? expression)
    (make-sum
      (deriv (addend expression) variable)
      (deriv (augend expression) variable))
    (product? expression)
    (make-sum
      (make-product
        (multiplier expression)
        (deriv (multiplicand expression) variable))
      (make-product
        (deriv (multiplier expression) variable)
        (multiplicand expression)))
    (exponentiation? expression)
    (do
      (make-product
        (make-product
          (exponent expression)
          (make-exponentiation (base expression) (dec (exponent expression))))
        (deriv (base expression) variable)))
    :else (println "Invalid expression")))

I am working on a Github account, so next time I will give you a link I hope :)

We have been using destructuring, meta data, and function variable arity, all Clojure's idiomatic forms, in one small program, not so unrealistic (shoot me if you never had to write a small interpreter in your real coder's life. If not, you have been lucky)

 PS: For Schemers , why did not I find my solution satisfactory ? Because my first shot is:

(define (make-sum x y) 
  (cond ((=number? x 0) y)
        ((=number? y 0) x)
        ((and (number? x) (number? y)) (+ x y))
        ( else (list '+ x y))))

(define (augend sum-expr) 
  (let ((rest (cddr sum-expr)))
    (if (= 1 (length rest))
        (car rest)
        (cons '+ rest))))

From my own point of view, the augend function should not build a new sum, the make-sum function taking that in charge. But I will find solution using the '.' notation in input parameters.

 Be seeing you !!! :)

3 comments:

mgm7734 said...

FYI, destructuring is not quite pattern matching. Pattern matching can be true or false; destructuring either works or blows up.

There are pattern matching libs for clojure, like matchure.

Globulon said...

Good evening and thank for your comment. Are we talking about the one developed by David Nolen here : http://www.infoq.com/presentations/The-Mapping-Dilemma ?

Globulon said...

I have taken into your comment. Thanks again for noticing that inconsistency

Post a Comment