Row2Col

From Pickwiki
Revision as of 23:48, 26 February 2015 by Conversion script (talk) (link fix)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

HomePage>>SourceCode>>BasicSource>>Row2Col

This utility can transpose a dynamic array item's fields into values, and vice versa. I originally wrote it for AccuTerm GUI programming, where I needed to load records into grid controls. Calling the utility a second time on the same variable will transpose it back to its original "shape"; in order to do this accurately, all fields must contain the same number of values. It's pretty fast on large records too, since it only passes through the original item a few times (instead of extracting and replacing hundreds of fields and values.)

 For example:
 
  0001: apple}banana}cantalope                    0001: apple}111}red}0
  0002: 111}222}333                  becomes      0002: banana}222}green}1
  0003: red}green}blue                            0003: cantalope}333}blue}0
  0004: 0}1}0

i.e. 4 fields with 3 values each becomes 3 fields with 4 values each.

2008-01-18 by Rex Gozar
A discussion on comp.databases.pick suggested some techniques to make this program faster. The new 1.2 revision takes advantage of Universe's internal optimizations to make it 6x faster.

The first optimization was to take advantage of Universe's ability to quickly traverse from field to field within a string. (Note that traversing values is NOT optimized in Universe.) DCOUNT'ing the fields to check the value counts cut a significant amount of time off the processing.

The second optimization uses Universe's REMOVE function to load the values into a dimensioned array. This function traverses both fields and values quickly, and also cut processing time.

The discussion (and benchmarks) leads me to conclude that for small data sets (under 100KB and under 1000 values), simpler home-grown solutions are usually adequate. ROW2COL now has good performance for smaller data sets, and also scales well to large data sets (over 500KB and over 10,000 values).

My benchmark results in milliseconds:
70KB variable with 1,000 fields or values: 22ms (new) vs. 140ms (old)
704KB variable with 10,000 fields or values: 225ms (new) vs. 1428ms (old)

      SUBROUTINE [[ROW2COL]](ITEM)
******
* [[ROW2COL]]  $Revision: 1.2 $
* Transpose table data "rows" (fields) to "columns" (values).
* Copyright (C) 2004 Rex Gozar
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
* http://www.gnu.org/licenses/lgpl.html
*
* Rex Gozar
* [email protected]
******
$OPTIONS DEFAULT
***
* Find the maximum number of fields and the maximum
* number of values per field.  Note that all fields
* must have the same number of values; otherwise,
* additional delimiters would need to be added as
* placeholders and the resulting item could not be
* "flipped" back into its original form.
***
      ITEMLEN = LEN(ITEM)
      FMAX = DCOUNT(ITEM, @FM)
      VMAX = DCOUNT(ITEM<1>, @VM)
      FOR FMC = 2 TO FMAX
         VMC = DCOUNT(ITEM<FMC>, @VM)
         IF VMC NE VMAX THEN
            ABORTM "FIELDS DO NOT HAVE THE SAME VALUE COUNT"
         END
      NEXT FMC
***
* Extract the elements and load them into an array.
* This should be faster than <> extraction on large items.
***
      DIM ARRAY(FMAX,VMAX)
      ITEM = ITEM
      TEXT = ''
      FMC = 1
      VMC = 1
      LOOP
         TEXT := REMOVE(ITEM, MORE)
         DELIM = (IF MORE THEN CHAR(256-MORE) ELSE '')
         BEGIN CASE
            CASE DELIM EQ @VM
               ARRAY(FMC,VMC) = TEXT
               VMC += 1
               TEXT = ''
            CASE DELIM EQ @FM OR DELIM EQ ''
               ARRAY(FMC,VMC) = TEXT
               FMC += 1
               VMC = 1
               TEXT = ''
            CASE @TRUE
               TEXT := DELIM
         END CASE
      WHILE MORE REPEAT
***
* Build the new item so values are fields and fields
* are values.
***
      BUFFER = SPACE(ITEMLEN)
      BUFPOS = 1
      EQU APPEND$TEXT LIT "BUFFER[BUFPOS,LEN(TEXT)] = TEXT ; BUFPOS += LEN(TEXT)"
      FOR VMC = 1 TO VMAX
         IF VMC # 1 THEN
            TEXT = @FM
            APPEND$TEXT
         END
         FOR FMC = 1 TO FMAX
            IF FMC # 1 THEN
               TEXT = @VM
               APPEND$TEXT
            END
            TEXT = ARRAY(FMC,VMC)
            APPEND$TEXT
         NEXT FMC
      NEXT VMC
***
* Done.
***
      ITEM = BUFFER
      RETURN
   END

The older 1.1 revision for comparison. I don't think it contains any Universe-specific functions or statements.

      SUBROUTINE [[ROW2COL]](ITEM)
******
* [[ROW2COL]]  $Revision: 1.1 $
* Transpose table data "rows" (fields) to "columns" (values).
* Copyright (C) 2004 Rex Gozar
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
* http://www.gnu.org/licenses/lgpl.html
*
* Rex Gozar
* [email protected]
******
$OPTIONS DEFAULT
***
* Find the maximum number of fields and the maximum
* number of values per field.  Note that all fields
* must have the same number of values; otherwise,
* additional delimiters would need to be added as
* placeholders and the resulting item could not be
* "flipped" back into its original form.
***
      ITEMLEN = LEN(ITEM)
      FMC = 1
      VMC = 1
      VMAX = 0
      J = 0
      LOOP
         J += 1
         C = ITEM[J,1]
         BEGIN CASE
            CASE C = @VM
               VMC += 1
            CASE C = @FM OR C = ''
               IF FMC = 1 THEN
                  VMAX = VMC
               END ELSE
                  IF VMC # VMAX THEN
                     ABORTM "FIELDS DO NOT HAVE THE SAME VALUE COUNT"
                  END
               END
               IF C = @FM THEN
                  FMC += 1
               END
               VMC = 1
         END CASE
      WHILE C # '' REPEAT
      FMAX = FMC
***
* Extract the elements and load them into an array.
* This should be faster than <> extraction on large items.
***
      DIM ARRAY(FMAX,VMAX)
      MAT ARRAY = ''
      FMC = 1
      VMC = 1
      TEXTPOS = 1
      TEXTLEN = 0
      J = 0
      LOOP
         J += 1
         C = ITEM[J,1]
         BEGIN CASE
            CASE C = @VM
               ARRAY(FMC,VMC) = ITEM[TEXTPOS,TEXTLEN]
               TEXTPOS = J + 1
               TEXTLEN = 0
               VMC += 1
            CASE C = @FM OR C = ''
               ARRAY(FMC,VMC) = ITEM[TEXTPOS,TEXTLEN]
               TEXTPOS = J + 1
               TEXTLEN = 0
               FMC += 1
               VMC = 1
            CASE 1
               TEXTLEN += 1
         END CASE
      WHILE C # '' REPEAT
***
* Build the new item so values are fields and fields
* are values.
***
      BUFFER = SPACE(ITEMLEN)
      BUFPOS = 1
      FOR VMC = 1 TO VMAX
         IF VMC # 1 THEN
            TEXTLEN = 1
            BUFFER[BUFPOS,TEXTLEN] = @FM
            BUFPOS += TEXTLEN
         END
         FOR FMC = 1 TO FMAX
            IF FMC # 1 THEN
               TEXTLEN = 1
               BUFFER[BUFPOS,TEXTLEN] = @VM
               BUFPOS += TEXTLEN
            END
            TEXT = ARRAY(FMC,VMC)
            TEXTLEN = LEN(TEXT)
            BUFFER[BUFPOS,TEXTLEN] = TEXT
            BUFPOS += TEXTLEN
         NEXT FMC
      NEXT VMC
***
* Done.
***
      ITEM = BUFFER
      RETURN
   END

http://www.autopower.com/rgozar/pixel.gif