Wednesday, November 2, 2011

Where are the lines of my Clojure ?

Howdy again.

Once again a very (very) short one, in Clojure this time. As a matter of fact, this is a short one before the longer one to come :).

In order to challenge the code displayed in the upcoming post, I once again faced the problem of reading data from a file before converting it. Thanks to an astute reader of a previous post, I dropped Clojure contrib 1.2 when moving to version 1.3 of Clojure.
As I had not taken the time to explore the new Clojure contributions, I found myself a little bit upset when I needed to load data from a file. The older Clojure contributions library hosted a suitable read-lines function in one of its modules. And I liked it.

Fortunately, Lisp languages always invite you, in a nice way, to develop your own tools and build upon them. Of course I could not resist. At this point the do-not-reinvent-the-wheel conservationists will have left the building.
Thank you to the others for staying :) The following harsh test expresses what I needed:

(def from-test-file "/dev/projects/clojure/leinengen/ml/myfile.txt")

(deftest read-lines-from-file-should-take-back-all
  (is (= 50 (count (read-lines from-test-file)))))

The read-lines function takes the exact file location as input and returns a sequence (lazy if possible). Having explored lazy sequences in a previous post, I came to the following working solution:

(ns tools.file-reader
  (:import [java.io File BufferedReader FileReader]))

(defn read-lines [from-file]
  (let [reader (BufferedReader. (FileReader. (File. from-file)))]
    (letfn [(flush-content [line]
              (if line
                (cons line (flush-content (.readLine reader)))
                ;; end of file: .close returns nil, which terminates the list
                (.close reader)))]
      (flush-content (.readLine reader)))))

One will recognize the invocations of the BufferedReader, FileReader, and File Java constructors by the "." appended to the class name. The classes are imported through the :import form at the top of the file.
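For readers new to Java interop, here is a small sketch of the dot forms used above. The path /tmp/example.txt is a hypothetical value chosen for illustration, and no file is actually opened.

```clojure
(import '[java.io File])

;; (File. path) is shorthand for (new File path): both call the constructor.
(def short-form (File. "/tmp/example.txt"))
(def long-form  (new File "/tmp/example.txt"))

;; A leading dot calls an instance method: (.getName f) is f.getName() in Java.
(def file-name (.getName short-form))

;; java.io.File compares by path, so the two instances are equal.
(def same-file? (= short-form long-form))
```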

The function body hosts a recursive inner function, flush-content, in charge of
  • deciding to stop the recursion when no more lines are found
  • recurring using the readLine method of the BufferedReader.
Each newly found line becomes an element of the sequence through the standard cons form. The end of the sequence is reached when no more lines are found, at which point the reader is closed. Note that cons alone does not defer the recursive call, so the whole file is read eagerly, one stack frame per line. Works nicely for small files.
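In Clojure, cons by itself does not delay the recursive call; the usual way to defer each step is the lazy-seq macro. What follows is only a minimal sketch of that variant, not the code used in this post, and lazy-read-lines is a hypothetical name introduced for the example.

```clojure
(import '[java.io BufferedReader FileReader File])

;; lazy-read-lines is a hypothetical name, not part of the original post.
(defn lazy-read-lines [from-file]
  (let [reader (BufferedReader. (FileReader. (File. from-file)))]
    (letfn [(flush-content []
              ;; lazy-seq defers the body until the next element is requested
              (lazy-seq
                (if-let [line (.readLine reader)]
                  (cons line (flush-content))
                  ;; end of file: .close returns nil, which ends the sequence
                  (.close reader))))]
      (flush-content))))
```

The trade-off is that the reader is closed only once the sequence has been walked to its end, so a partially consumed sequence leaves the file open.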

And that's all folks. The big interesting part is in the next post.

Must write it now, so be seeing you !!! :)


Sean Corfield said...

Depending on what you want to do with the lazy sequence of lines, the following might be useful:

(defn by-line [f from-file]
(with-open [r ( from-file)]
(f (line-seq r))))

(deftest read-lines-from-file-should-take-back-all
(is (= 50 (by-line count from-test-file))))

Instead of getting the sequence of lines back, you need to pass in the function to operate on the sequence. This is the same pattern that would be used for processing SQL result sets (using with-query-results).
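The snippet above does not show which function opens the file. A minimal runnable sketch of the same pattern, assuming the file is opened with clojure.java.io/reader (an assumption; the original comment may have used something else):

```clojure
(require '[clojure.java.io :as io])

;; Assumption: io/reader opens the file; the original snippet omits this call.
(defn by-line [f from-file]
  (with-open [r (io/reader from-file)]
    ;; f must consume the lines it needs before with-open closes the reader
    (f (line-seq r))))
```

With this definition, (by-line count from-test-file) counts the lines and closes the file afterwards. A function that returns a lazy result should be wrapped in doall, since the reader is already closed once by-line returns.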

Globulon said...

Nice one !! Thank you very much for your feedback :)

Martijn Verburg said...

Great post - I wonder what would happen if you used Java 7's try-with-resources with this example, I think you'll like the result!

Globulon said...

Thank you :) I have a hunch it would be a bit surprising... :):):)
