Create Parent Child Structrure from dataset

  • Hi

    I have a dataset that looks like this

    id Code Text

    001 A Member1

    002 A001 Member2

    003 A001A Member3

    004 A001B Member4

    005 A001B1 Member5

    006 A001C Member6

    007 A002 Member7

    008 A002AA Member8

    009 A002AB Member9

    010 A002AB1 Member10

    011 ADA Member11

    012 ADA01 Member12

    Now this describes a ragged hierarchy, with member 1 being the root.

    It is easy to see that members 2,7,11 has member 1 as parent

    member 3,4,6 has member 2 as parent

    member 5 has member 4 as parent

    etc

    What I want is to generate a parent child structure that corresponds to this

    id Code Text Parent

    001 A Member1 001

    002 A001 Member2 001

    003 A001A Member3 002

    004 A001B Member4 002

    005 A001B1 Member5 004

    006 A001C Member6 002

    007 A002 Member7 001

    008 A002AA Member8 007

    009 A002AB Member9 007

    010 A002AB1 Member10 009

    011 ADA Member11 001

    012 ADA01 Member12 011

    As can be seen the hierarchy is ragged, the codes are ordered in the list according to the ID, so you know that if a member follows another and len(code1)>len(code2) then code1 is parent of code2.

    I have solved this using cursors, building on the knowledge that there are at most 7 levels in the data. By doing this I can solve it by keeping the parent for each level in a variable, I can step through the data and assign the parents.

    It bugs me though, because I just know there must be a solution without cursors, but to the life of me I cannot find one!

    Anyone have any ideas?

    Extra info.

    We have about 10 datasets of this type, varying size form about a 1,000 rows to 100,000 rows.

  • soren kongstad (8/20/2008)


    Hi

    I have a dataset that looks like this

    id Code Text

    001 A Member1

    002 A001 Member2

    003 A001A Member3

    004 A001B Member4

    005 A001B1 Member5

    006 A001C Member6

    007 A002 Member7

    008 A002AA Member8

    009 A002AB Member9

    010 A002AB1 Member10

    011 ADA Member11

    012 ADA01 Member12

    Now this describes a ragged hierarchy, with member 1 being the root.

    It is easy to see that members 2,7,11 has member 1 as parent

    member 3,4,6 has member 2 as parent

    member 5 has member 4 as parent

    etc

    What I want is to generate a parent child structure that corresponds to this

    id Code Text Parent

    001 A Member1 001

    002 A001 Member2 001

    003 A001A Member3 002

    004 A001B Member4 002

    005 A001B1 Member5 004

    006 A001C Member6 002

    007 A002 Member7 001

    008 A002AA Member8 007

    009 A002AB Member9 007

    010 A002AB1 Member10 009

    011 ADA Member11 001

    012 ADA01 Member12 011

    As can be seen the hierarchy is ragged, the codes are ordered in the list according to the ID, so you know that if a member follows another and len(code1)>len(code2) then code1 is parent of code2.

    I have solved this using cursors, building on the knowledge that there are at most 7 levels in the data. By doing this I can solve it by keeping the parent for each level in a variable, I can step through the data and assign the parents.

    It bugs me though, because I just know there must be a solution without cursors, but to the life of me I cannot find one!

    Anyone have any ideas?

    Extra info.

    We have about 10 datasets of this type, varying size form about a 1,000 rows to 100,000 rows.

    USE A CTE.

  • BTW, AFTER LOOKING AT THE DATASET, how do you define a member to be the parent of other?

  • Hi

    The code defines the family relationship. If member m is an ancestor of member n, then code m like code n + '%'.

    The most immediate ancestor is the parent.

    I have thought of CTE's but my mojo is not strong enough to solve the problem.

    /Soren

  • Hi.

    id Code Text Parent

    001 A Member1 001

    011 ADA Member11 001

    Please clarify the two entries.

    How come A has Parent 001 as id is 001 for it. I guess this is the Root Node in the hierarchy.

    The Code ADA is not clear if it has Parent as 001.

    Please clarify the structure.......

    Thanks,

    Amit Khanna

  • The code you are using is confusing us. Please calrify the code like A001B1 -> what the second 1 means here?? id 001

  • The code is made by alphanumeric characters, and describes a hierarchy.

    The root is 'A', and it has itself as parent (just a convention, we could make the parent null)

    A child of the root will have a code consisting of the charachter "A" followed by either a numeric designator, or a character designater.

    In my example, A001, A002, A003

    are examples of children of the root, with numeric designators for the next level.

    ADA is a child of the root with characters.

    It would be nice if the code alternated between numeric and character level designators, but alas, it is not so.

    The data is the output of a depth first algorithm. We have no control of the data, it is a dataset supplied by the government. You can se the datadumper on this site:

    http://medinfo.dk/sks/dump.php

    These are not the only datasets we have, but the format is the same for all datasets.

    The data will change over time, so I am not so keen on making a specific procedure for each dataset.

    /Soren

  • The following is a set based solution but, as it involves a triangular join, a cursor will be quicker on large data sets. What constitutes a large data set will depend on your data so you will need to test it against a cursor solution.

    SELECT

    &nbsp&nbsp&nbsp&nbspC.*

    &nbsp&nbsp&nbsp&nbsp,ISNULL(P.[id], C.[id]) AS ParentID

    FROM YourTable P

    &nbsp&nbsp&nbsp&nbspJOIN

    &nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT T22.Code, MAX(T21.Code) AS ParentCode

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspFROM YourTable T21

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspJOIN YourTable T22

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspON T22.Code LIKE T21.Code + '%'

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspAND T22.Code <> T21.Code

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspGROUP BY T22.Code

    &nbsp&nbsp&nbsp&nbsp) D

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspON P.Code = D.ParentCode

    &nbsp&nbsp&nbsp&nbspRIGHT JOIN YourTable C

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspON D.Code = C.Code

    p.s. This T-SQL specific code may be quicker:

    SELECT *

    &nbsp&nbsp&nbsp&nbsp,ISNULL

    &nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT T.[id]

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspFROM YourTable T

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspWHERE T.Code =

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp(

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspSELECT MAX(P.Code)

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspFROM YourTable P

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspWHERE C.Code LIKE P.Code + '%'

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspAND C.Code <> P.Code

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp)

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp)

    &nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp,C.[id]

    &nbsp&nbsp&nbsp&nbsp) AS ParentID

    FROM YourTable C

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply