looking up in the tree

  • Well I have a two tables lets say they look like as follows:

    Table A

    CompanyID

    Location

    TableB

    CompanyID

    CompanyName

    Date

    The CompanyID column has values that look like this:

    AAA OR

    AAA.BBB OR

    AAA.BBB.CCC OR

    AAA.BBB.CCC.DDD OR

    AAA.BBB.CCC.DDD.EEE OR

    AAA.BBB.CCC.DDD.EEE.FFF

    each line representing the level of the company.

    we can see that CompanyID column exists in both tables. and I would like to find out the CompanyName from table B for each companyID that exists in Table A.

    but the tricky part is that for a specific CompanyID in tableA I might not have a exact match.

    e.g. in A lets say I have AAA.BBB.CCC.DDD.EEE for CompanyID but in table B i might not have AAA.BBB.CCC.DDD.EEE but instead have AAA.BBB or AAA.BBB.CCC or AAA.

    so I have to walk up the tree or these levels to look for a match an get the companyName. the match logic is as such.

    If in table A companyID = AAA.BBB.CCC.DDD.EEE then look for same value in B if NOT FOUND then

    remove the last level from the companyID value from Table A. so new value of CompanyID = AAA.BBB.CCC.DDD

    Now look for this new value AAA.BBB.CCC.DDD in table B. IF NOT FOUND then

    remove another level from ComapnyID in table A so new companyID = AAA.BBB.CCC and look for

    AAA.BBB.CCC in table B. just going up the tree by chopping off a level till I find a match.

    now the question is how to do this in TSQL.... can someone help?

    So what I really want to do is to look for a match between A and B based on companyID.

  • I have created a function that will return the best match for a given id. It will return the [id] from TempTableB based on the criteria you specified or it will return a blank if it does not find any match

    Create sample data:

    CREATE TABLE dbo.TempTableA

    (

    [id] varchar(250),

    [location] varchar(250)

    )

    CREATE TABLE dbo.TempTableB

    (

    [id] varchar(250),

    [compname] varchar(250),

    [compdate] datetime default (getdate())

    )

    INSERT INTO TempTableA([id],[location]) VALUES('AAA','LocA')

    INSERT INTO TempTableA([id],[location]) VALUES('AAA.BBB','LocAB')

    INSERT INTO TempTableA([id],[location]) VALUES('AAA.BBB.CCC','LocABC')

    INSERT INTO TempTableA([id],[location]) VALUES('AAA.BBB.CCC.DDD','LocABCD')

    INSERT INTO TempTableA([id],[location]) VALUES('AAA.BBB.CCC.DDD.EEE','LocABCDE')

    INSERT INTO TempTableA([id],[location]) VALUES('AAA.BBB.CCC.DDD.EEE.FFF','LocABCDEF')

    INSERT INTO TempTableB([id],[compname],[compdate]) VALUES('AAA','CompA',getdate())

    INSERT INTO TempTableB([id],[compname],[compdate]) VALUES('AAA.BBB.CCC.DDD','CompABCD',getdate())

    INSERT INTO TempTableB([id],[compname],[compdate]) VALUES('AAA.BBB.CCC.DDD.EEE.FFF','CompABCDEF',getdate())

    GO

    Create the function that would return the id or its parent:

    CREATE FUNCTION dbo.udf_TestGetVal (@InString varchar(250))

    RETURNS varchar(250)

    AS

    BEGIN

    DECLARE @Out varchar(250)

    DECLARE @Out1 varchar(250)

    IF EXISTS (SELECT [id]

    FROM dbo.TempTableB

    WHERE [id]=@InString)

    BEGIN

    SET @Out = @InString

    END

    ELSE

    BEGIN

    SET @Out1 = LEFT(@InString,LEN(@InString)-CHARINDEX('.',REVERSE(@InString),1))

    IF LEN(@Out1) < LEN(@InString)

    BEGIN

    SET @Out = dbo.udf_TestGetVal(@Out1)

    END

    ELSE

    BEGIN

    SET @Out = ''

    END

    END

    RETURN @Out

    END

    GO

    Test this:

    SELECT *

    FROM TemptableA a

    inner join TemptableB b

    on dbo.udf_TestGetVal(a.[id]) = b.[id]

    Let me know if this answers your problem.

  • i couldn't figure out how to do it with a CTE, but here's a loop that works.

    create table #tableA ( companyId varchar(255) )

    create table #tableB ( companyId varchar(255), companyName varchar(255))

    ;

    insert into #tableA values ('a.1')

    insert into #tableA values ('a.2')

    insert into #tableA values ('a.2.red')

    insert into #tableA values ('a.2.green')

    insert into #tableA values ('b.blue')

    insert into #tableA values ('b.1')

    insert into #tableA values ('bb')

    insert into #tableB values ('a','abc corp')

    insert into #tableB values ('a.1','abc south')

    insert into #tableB values ('a.2','abc north')

    insert into #tableB values ('a.2.red','abc north red')

    insert into #tableB values ('a.2.blue','abc north blue')

    insert into #tableB values ('a.3','abc west')

    insert into #tableB values ('b','bongo co.')

    insert into #tableB values ('b.1','bongo and sons')

    insert into #tableB values ('b.1.green','bongo enterprises')

    insert into #tableB values ('b.2','bongo music')

    insert into #tableB values ('b.2.blue','bongo promotions')

    ;

    drop table #matches

    ;

    declare @affected int

    select A.companyId, B.companyName,

    isnull(B.companyId,

    substring(A.companyId,1, -- drop the trailing token after '.'

    len(A.companyId) - (charindex('.',reverse(A.companyId))))) as match

    into #matches

    from #tableA as A left outer join #tableB as B on A.companyId = B.companyId

    set @affected = @@rowcount

    while @affected > 0

    begin

    update #matches

    set companyName = B.companyName,

    match = isnull(B.companyId,

    substring(A.match,1, len(A.match) - charindex('.',reverse(A.match))))

    from #matches A left outer join #tableB as B on A.match = B.companyId

    where A.companyName is null and A.match != A.companyId

    set @affected = @@rowcount

    end

    select * from #matches

  • I do not have time right now to do an optimal version, but this should be better than an iterative or recursive solution:

    Select A.*, B.*

    From TableA A

    Left Outer join TableB B

    ON B.CompanyID = Left(A.CompanyID

    , (Select MAX(len(B2.CompanyID)) From TableB B2

    Where B2.CompanyID = Left(A.CompanyID, Len(B2.companyID))

    ) )

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (4/3/2008)


    I do not have time right now to do an optimal version, but this should be better than an iterative or recursive solution:

    Select A.*, B.*

    From TableA A

    Left Outer join TableB B

    ON B.CompanyID = Left(A.CompanyID

    , (Select MAX(len(B2.CompanyID)) From TableB B2

    Where B2.CompanyID = Left(A.CompanyID, Len(B2.companyID))

    ) )

    that almost works but doesn't consider the delimiters, so the final row in my test sample is incorrect.

    A.compId B.compId B.companyName

    -----------------------------------

    a.1 a.1 abc south

    a.2 a.2 abc north

    a.2.red a.2.red abc north red

    a.2.green a.2 abc north

    b.blue b bongo co.

    b.1 b.1 bongo and sons

    bb b bongo co.

    bb should remain unmatched.

    compId companyName match

    ----------------------------------

    a.1 abc south a.1

    a.2 abc north a.2

    a.2.red abc north red a.2.red

    a.2.green abc north a.2

    b.blue bongo co. b

    b.1 bongo and sons b.1

    bb NULL bb

  • I agree with you. Whatever doesnt have a match shouldn't be #matches table.

  • rbarryyoung (4/3/2008)


    OK, corrected version:

    Select A.*, B.*

    From TableA A

    Left Outer join TableB B

    ON B.CompanyID = Left(A.CompanyID

    , (Select MAX(len(B2.CompanyID)) From TableB B2

    Where B2.CompanyID = Left(A.CompanyID, Len(B2.companyID))

    And ( B2.CompanyID = A.CompanyID

    Or Left(A.CompanyID, Len(B2.companyID)+1) = '.' )

    ) )

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Set-based solution

    DECLARE@Locations TABLE (ID VARCHAR(250), Location VARCHAR(250))

    INSERT@Locations

    SELECT'AAA', 'LocA' UNION ALL

    SELECT'AAA.BBB', 'LocAB' UNION ALL

    SELECT'AAA.BBB.CCC', 'LocABC' UNION ALL

    SELECT'AAA.BBB.CCC.DDD', 'LocABCD' UNION ALL

    SELECT'AAA.BBB.CCC.DDD.EEE', 'LocABCDE' UNION ALL

    SELECT'AAA.BBB.CCC.DDD.EEE.FFF', 'LocABCDEF'

    DECLARE@Companies TABLE (ID VARCHAR(250), CompName VARCHAR(250), CompDate DATETIME)

    INSERT@Companies

    SELECT'AAA', 'CompA', GETDATE() UNION ALL

    SELECT'AAA.BBB.CCC.DDD', 'CompABCD', GETDATE() UNION ALL

    SELECT'AAA.BBB.CCC.DDD.EEE.FFF', 'CompABCDEF', GETDATE()

    ;WITH Yak (LocationID, LocID, Lvl)

    AS (

    SELECTID,

    ID,

    1

    FROM@Locations

    UNION ALL

    SELECTy.LocationID,

    LEFT(y.LocID, LEN(y.LocID) - CHARINDEX('.', REVERSE(y.LocID))),

    y.Lvl + 1

    FROMYak AS y

    WHEREy.LocID LIKE '%.%'

    )

    SELECTLocationID,

    CompanyID

    FROM(

    SELECTy.LocationID,

    c.ID AS CompanyID,

    ROW_NUMBER() OVER (PARTITION BY y.LocationID ORDER BY y.Lvl) AS RecID

    FROMYak AS y

    INNER JOIN@Companies AS c ON c.ID = y.LocID

    ) AS d

    WHEREd.RecID = 1

    ORDER BYd.LocationID


    N 56°04'39.16"
    E 12°55'05.25"

  • Another approach

    DECLARE@Stage TABLE (RowID INT IDENTITY(1, 1), LocationID NVARCHAR(250), LocID NVARCHAR(250), CompanyID NVARCHAR(250))

    INSERT@Stage

    (

    LocationID,

    LocID

    )

    SELECTl.ID AS LocationID,

    LEFT(l.ID, v.Number - 1) AS LocID

    FROM@Locations AS l

    INNER JOINmaster..spt_values AS v ON v.Type = 'p'

    WHERESUBSTRING(l.ID, v.Number, 1) IN ('.', '')

    AND v.Number > 0

    ORDER BYl.ID,

    v.Number DESC

    UPDATEs

    SETs.CompanyID = c.ID

    FROM@Stage AS s

    INNER JOIN@Companies AS c ON c.ID = s.LocID

    DELETEs

    FROM@Stage AS s

    INNER JOIN(

    SELECTLocationID,

    MIN(RowID) AS theID

    FROM@Stage

    WHERECompanyID > ''

    GROUP BYLocationID

    ) AS x ON x.LocationID = s.LocationID

    WHEREx.theID <> s.RowID

    SELECTLocationID,

    CompanyID

    FROM@Stage

    ORDER BYLocationID


    N 56°04'39.16"
    E 12°55'05.25"

  • rbarryyoung (4/4/2008)


    rbarryyoung (4/3/2008)


    OK, corrected version:

    Select A.*, B.*

    From TableA A

    Left Outer join TableB B

    ON B.CompanyID = Left(A.CompanyID

    , (Select MAX(len(B2.CompanyID)) From TableB B2

    Where B2.CompanyID = Left(A.CompanyID, Len(B2.companyID))

    And ( B2.CompanyID = A.CompanyID

    Or Left(A.CompanyID, Len(B2.companyID)+1) = '.' )

    ) )

    Tres cool barry!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply