CSV2ARRAY

From Pickwiki
Revision as of 07:23, 24 July 2007 by Stuboy (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

HomePage>>SourceCode>>BasicSource>>CSV2ARRAY

This is a tool to convert a CSV file to a dynamic array.

Background

The main goal of this tool is to provide a good conversion from CSV. This is a Microsoft file format commonly used for flat file data exchange (until people start using XML universally). I have found many different bits and pieces around to convert CSV, most of which don't handle the possibilities of CSV or do the conversion badly.
The program was designed for UV as a function requiring a csv document and input arguments and returns a result value, and sets status() if conversion fails. It means it can be easily called from basic or an I-type subr() statement. See the comments in the code for examples and details of methods and the arguments required.
function CSV2ARRAY(doc,opts)
* function to convert a csv to an array
* 1.0 23-Jul-2007 stuart boydell - Original
*
* doc:  csv document
*       line breaks in cells get converted to subVal marks (@sm)
*       nb. U2 autoconverts char(10) / char(13) to normalised (nix/win) crlf - so best to get your csv raw.
* opts: 'N' returns a normalised dynamic array // eg orthagonal to excel: columns to attributes, cells to values
*

   equ E.PROG.NAME to 'CSV2ARRAY',
       E.VERSION   to '1.0',
       E.COMMA     to ',',
       E.QUOTE     to '"',
       E.WINCRLF   to char(10):char(13)   ;*// CSV is usually produced by a win program - so parse win crlf

   returnValue = oconv('','c')            ;*// set RV to '' and status() function to false
   occurs = 1

   lmx = maximum(counts(doc,E.COMMA))   ;*// <<TODO>> incomplete use of these in normalisation
   lmn = minimum(counts(doc,E.COMMA))

   doc = change(doc,E.WINCRLF,@am)  ;*// crlf to @am
   doc = convert(E.COMMA:char(10),@vm:@sm,doc)   ;*// commas to vm // char(10) to sm (manual cell breaks alt-enter)

   loop while findstr E.QUOTE in doc,occurs setting a,v do
      docLine    = raise(doc<a>)
      newDocLine = ''
      max = dcount(docLine,@am)
      for i = max to 1 step -1
         cell = docLine<i>
         if cell[1] = E.QUOTE then
            *// make sure commas are replaced // should be an even number of quotes per cell
            loop while cell[1,1] # E.QUOTE and mod(count(cell,E.QUOTE),2) and i > 0 do
               i -= 1
               cell = docLine<i>:E.COMMA:cell
            repeat
            if cell[1,1] = E.QUOTE and cell[1] = E.QUOTE then
               cell = cell[2,len(cell)-2]
            end
            cell = change(cell,str(E.QUOTE,2),E.QUOTE)
         end
         if len(newDocLine) then
            ins cell before newDocLine<1>
         end else
            *// deal with inserting '' into empty array.
            if i = max-1 and cell = '' then newDocLine = @am else newDocLine = cell
         end
      next i
      occurs += count(newDocLine,E.QUOTE)
      doc<a>  = lower(newDocLine)
   repeat

   if opts = 'N' then   ;*// do mv normalisation
      max = dcount(doc,@am)
      newDoc = raise(doc<1>)
      if count(newDoc,@am) < lmx then newDoc<lmx+1> = ''
      for i = 2 to max
         newDoc = splice(newDoc,char(222),raise(doc<i>))
      next i
      doc = convert(char(222),@vm,newDoc)
   end
   opts = err
   returnVal = doc
return(returnVal)
end

HomePage>>SourceCode>>BasicSource>>CSV2ARRAY