Looking for ideas to move large data into new table.

  • I have data in a normalized form:

    KEY1 | STRING1

    KEY1 | STRING2

    KEY1 | STRING3

    KEY1 | STRING4

    KEY1 | STRING5

    I would like to format it as:

    KEY1 | STRING1 | STRING2 | STRING3 | STRING4 | STRING5 |

    Anyone have any ideas do it quickly?

    One table has 50 million rows and another tabls has 84 million rows of data.

    Thanks,

    Kay

  • Determine the max number of strings that any given key value can have since that is what will decide the number of columns for you once you convert the rows into columns.

    If you have a finite list, then you can use this approach else you can create a UDF to do this:

    drop table tablea

    create table tablea (ROW_NUM INT IDENTITY(1,1), col1 varchar(10), col2 varchar(10))

    insert into tablea (COL1, COL2) values ('KEY1', 'STRING1')

    insert into tablea (COL1, COL2) values ('KEY1', 'STRING2')

    insert into tablea (COL1, COL2) values ('KEY1', 'STRING3')

    insert into tablea (COL1, COL2) values ('KEY1', 'STRING4')

    insert into tablea (COL1, COL2) values ('KEY1', 'STRING5')

    insert into tablea (COL1, COL2) values ('KEY2', 'STRING1')

    insert into tablea (COL1, COL2) values ('KEY2', 'STRING2')

    GO

    SELECT

    COL1,

    MAX(CASE WHEN RNG_NBR = 1 THEN COL2 ELSE NULL END) AS COL2,

    MAX(CASE WHEN RNG_NBR = 2 THEN COL2 ELSE NULL END) AS COL3,

    MAX(CASE WHEN RNG_NBR = 3 THEN COL2 ELSE NULL END) AS COL4,

    MAX(CASE WHEN RNG_NBR = 4 THEN COL2 ELSE NULL END) AS COL5,

    MAX(CASE WHEN RNG_NBR = 5 THEN COL2 ELSE NULL END) AS COL6

    FROM

    (SELECT

    COL1,

    COL2,

    (SELECT COUNT(1) FROM TABLEA WHERE COL1 = X.COL1 AND ROW_NUM <= X.ROW_NUM) RNG_NBR

    FROM TABLEA X) DT

    GROUP BY COL1

    --output

    COL1       COL2       COL3       COL4       COL5       COL6      

    ---------- ---------- ---------- ---------- ---------- ----------

    KEY1       STRING1    STRING2    STRING3    STRING4    STRING5

    KEY2       STRING1    STRING2    NULL       NULL       NULL

    (2 row(s) affected)

    Please remember that since you are dealing with a very large number of rows, please provide a filter criteria to this innermost SQL so that the access is via indexed columns and the rows to columns conversion is faster, example:

    If your Keys are number keys, you can put a while loop around this and provide values from say 1-1000 as filter:

    WHERE COL1 > 0 and COL1 <= 1000

    So, in this way the row to columns conversion will be done for those 1000 records and that can be stored someplace.  Then, the next loop can deal with the next 1000 and so on.  This way, you will not be scanning the 50+ million record table.

    Hth

     

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply