Parsing data with Clojure, interval problem.
Posted by Andrea Di Persio on Stack Overflow
        
        
        
        
        Published on 2010-03-29T13:50:07Z
clojure
Hello! I'm writing a small parser in Clojure as a learning exercise. Basically, it parses a TSV file whose rows need to be put into a database, but I added a complication: the same file contains several intervals, each introduced by a header line. The file looks like this:
###andreadipersio 2010-03-19 16:10:00###                                                                                
USER     COMM               PID  PPID  %CPU %MEM      TIME  
root     launchd              1     0   0.0  0.0   2:46.97  
root     DirectoryService    11     1   0.0  0.2   0:34.59  
root     notifyd             12     1   0.0  0.0   0:20.83  
root     diskarbitrationd    13     1   0.0  0.0   0:02.84
....
###andreadipersio 2010-03-19 16:20:00###                                                                                
USER     COMM               PID  PPID  %CPU %MEM      TIME  
root     launchd              1     0   0.0  0.0   2:46.97  
root     DirectoryService    11     1   0.0  0.2   0:34.59  
root     notifyd             12     1   0.0  0.0   0:20.83  
root     diskarbitrationd    13     1   0.0  0.0   0:02.84
I ended up with this code:
(defn is-header? 
  "Return true if the line is a header."
  [line]
  (> (count (re-find #"^\#{3}" line)) 0))
(defn extract-fields
  "Return regex matches"
  [line pattern]
  (rest (re-find pattern line)))
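(defn process-line
  [line]
  ;; Both branches must sit inside the `if`; otherwise the header
  ;; result is discarded and every line is parsed as data.
  (if (is-header? line)
    (extract-fields line header-pattern)
    (extract-fields line data-pattern)))
;; Defined after process-line so the symbol resolves at compile time.
(defn process-lines
  [lines]
  (map process-line lines))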
My idea is that 'process-line' should merge the interval header with the data, so that I get something like this:
('andreadipersio', '2010-03-19', '16:10:00', 'root', 'launchd', 1, 0, 0.0, 0.0, '2:46.97')
for every row until the next interval header, but I can't figure out how to make this happen.
I tried something like this:
(defn process-line
  [line]
  (if (is-header? line)
    (def header-data (extract-fields line header-pattern)))
  (cons header-data (extract-fields line data-pattern)))
But this doesn't work as expected.
Any hints?
Thanks!
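One possible way to get the merged rows described above is to thread the most recent header through the lines with reduce, rather than storing it in a global with def. The sketch below is only an illustration under several assumptions: header-pattern and data-pattern are not shown in the question, so the regexes here are hypothetical guesses based on the sample file; the command name is assumed to contain no spaces; and all fields are left as strings rather than converted to numbers.

```clojure
;; Hypothetical patterns: the question does not show the originals,
;; so these are guesses based on the sample file.
(def header-pattern #"^###(\S+)\s+(\S+)\s+(\S+)###")
(def data-pattern
  #"^(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+(\S+)")

(defn is-header?
  "Return true if the line starts with ###."
  [line]
  (boolean (re-find #"^###" line)))

(defn extract-fields
  "Return the regex capture groups, or an empty seq if nothing matches."
  [line pattern]
  (rest (re-find pattern line)))

(defn process-lines
  "Return one seq of fields per data row, each prefixed with the
  fields of the most recent header line."
  [lines]
  (:rows
   (reduce
    (fn [{:keys [header] :as state} line]
      (if (is-header? line)
        ;; Remember the new interval header for the rows that follow.
        (assoc state :header (extract-fields line header-pattern))
        ;; Lines that do not match data-pattern (the column titles,
        ;; the "...." filler) yield no fields and are skipped.
        (if-let [fields (seq (extract-fields line data-pattern))]
          (update state :rows conj (concat header fields))
          state)))
    {:header nil :rows []}
    lines)))
```

With lines read from the file (for example via line-seq on a reader), (process-lines lines) would return rows such as ("andreadipersio" "2010-03-19" "16:10:00" "root" "launchd" "1" "0" "0.0" "0.0" "2:46.97"). Keeping the current header in the reduce accumulator avoids mutable global state, which is what the def inside process-line was effectively creating.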