Help needed in join

  • Hi,

    Below is my table structure

    Create Table Company(IdCompany int primary key,CompanyName varchar(50),CompanyCode varchar(50))

    Insert into Company values(1000,'IBM','CP100'),(1001,'Informix Corporation','CP101'), (1002,'Rational Software Corporation','CP102'),

    (1003,'Oracle','CP103'),(1004,'DataRaker','CP104'), (1005,'Ksplice','CP105'),

    (1006,'Microsoft','CP106'),(1007,'Hotmail','CP107'), (1008,'Flash Communications','CP108')

    Create table CompanyGroup(IdCompanyGroup int identity(1,1) primary key,IdParentCompany int,IdchildCompany int,

    FOREIGN KEY (IdParentCompany) REFERENCES Company(IdCompany)

    ,FOREIGN KEY (IdchildCompany) REFERENCES Company(IdCompany)

    )

    Insert into CompanyGroup values(1000,1001),(1000,1002),

    (1003,1004),(1003,1005),

    (1006,1007),(1006,1006)

    The logic of this table would be

    Informix Corporation,Rational Software Corporation was aquired by IBM so IBM is parent and others are child in the CompanyGroup table.

    DataRaker, Ksplice was aquired by Oracle, so Oracle is parent and others are child in the CompanyGroup table.

    deto the microsoft example.

    I need to loop the company table based on the idcompany and map it with CompanyGroup Table. If the IdCompany exists in the

    Group table's IdchildCompany column then get the corresponding IdParentCompany and get the companycode of it.

    Exprected Result:

    Any sample query please

  • If this is a fixed hierarchy, then it is straight forward. First get all distinct parent entries and the append the children. There is one error in the data, the last insert into the CompanyGroup should be (1006,1008).

    😎

    SELECT

    Y.IdCompany

    ,Y.CompanyName

    ,Y.CompanyCode

    FROM

    (

    SELECT

    X.IdCompany

    ,X.CompanyName

    ,X.CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS CID_RID

    ,CP.CompanyName

    ,CP.CompanyCode

    ,CP.IdCompany

    FROM dbo.Company CP

    INNER JOIN dbo.CompanyGroup CG

    ON CP.IdCompany = CG.IdParentCompany

    ) AS X WHERE X.CID_RID = 1

    UNION ALL

    SELECT

    CC.IdCompany

    ,CC.CompanyName

    ,CP.CompanyCode

    FROM dbo.Company CP

    INNER JOIN dbo.CompanyGroup CG

    ON CP.IdCompany = CG.IdParentCompany

    INNER JOIN dbo.Company CC

    ON CG.IdchildCompany = CC.IdCompany

    ) AS Y ORDER BY Y.CompanyCode,Y.IdCompany

    Results

    IdCompany CompanyName CompanyCode

    ----------- ------------------------------ ------------

    1000 IBM CP100

    1001 Informix Corporation CP100

    1002 Rational Software Corporation CP100

    1003 Oracle CP103

    1004 DataRaker CP103

    1005 Ksplice CP103

    1006 Microsoft CP106

    1007 Hotmail CP106

    1008 Flash Communications CP106

  • Hi Eirik,

    I am sorry for the error data and Thank you for the reply and query. I understand how to split. Have a question about i have 100*1000 records in both of the table. so if i do this logic will it impact the performance? Any suggestion please.

    Appreciate your time on this.

  • KGJ-Dev (5/11/2014)


    Hi Eirik,

    I am sorry for the error data and Thank you for the reply and query. I understand how to split. Have a question about i have 100*1000 records in both of the table. so if i do this logic will it impact the performance? Any suggestion please.

    Appreciate your time on this.

    This is quite efficient and can be sped up by adding an index or two, don't think that is a concern.

    As I said in the previous post, this method is fine if the hierarchy is fixed, that is only parent-child.

    😎

  • yes the hierarchy is fixed. I believe your words. I will start working with the real time data and will post back.

    Once again thank you so much for your time Gentle Man.

  • KGJ-Dev (5/11/2014)


    yes the hierarchy is fixed. I believe your words. I will start working with the real time data and will post back.

    Once again thank you so much for your time Gentle Man.

    Just as I was closing the code window I noticed there was an opportunity for improvement:-D, this shaves probably around 10% off the query cost

    😎

    SELECT

    Y.IdCompany

    ,Y.CompanyName

    ,Y.CompanyCode

    FROM

    (

    SELECT

    X.IdCompany

    ,X.CompanyName

    ,X.CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS CID_RID

    ,CP.CompanyName

    ,CP.CompanyCode

    ,CP.IdCompany

    FROM dbo.Company CP

    /* Removed the join, simpler execution plan */

    WHERE CP.IdCompany IN (SELECT IdParentCompany FROM dbo.CompanyGroup)

    ) AS X WHERE X.CID_RID = 1

    UNION ALL

    SELECT

    CC.IdCompany

    ,CC.CompanyName

    ,CP.CompanyCode

    FROM dbo.Company CP

    INNER JOIN dbo.CompanyGroup CG

    ON CP.IdCompany = CG.IdParentCompany

    INNER JOIN dbo.Company CC

    ON CG.IdchildCompany = CC.IdCompany

    ) AS Y ORDER BY Y.CompanyCode,Y.IdCompany

  • Awesome. Another great tweak. Thanks.

  • And here is an alternative solution:

    Create Table dbo.Company(

    IdCompany int primary key,

    CompanyName varchar(50),

    CompanyCode varchar(50));

    Insert into Company

    values

    (1000,'IBM','CP100'),

    (1001,'Informix Corporation','CP101'),

    (1002,'Rational Software Corporation','CP102'),

    (1003,'Oracle','CP103'),

    (1004,'DataRaker','CP104'),

    (1005,'Ksplice','CP105'),

    (1006,'Microsoft','CP106'),

    (1007,'Hotmail','CP107'),

    (1008,'Flash Communications','CP108');

    Create table CompanyGroup(

    IdCompanyGroup int identity(1,1) primary key,

    IdParentCompany int,IdchildCompany int,

    FOREIGN KEY (IdParentCompany) REFERENCES Company(IdCompany),

    FOREIGN KEY (IdchildCompany) REFERENCES Company(IdCompany)

    );

    Insert into dbo.CompanyGroup

    values

    (1000,1001),

    (1000,1002),

    (1003,1004),

    (1003,1005),

    (1006,1007),

    (1006,1008);

    go

    select

    c.IdCompany,

    c.CompanyName,

    case when oa.CompanyCode is null then c.CompanyCode else oa.CompanyCode end CompanyCode

    from

    dbo.Company c

    outer apply (select

    c1.CompanyCode

    from

    dbo.Company c1

    inner join dbo.CompanyGroup cg

    on c1.IdCompany = cg.IdParentCompany

    where

    c.IdCompany = cg.IdchildCompany)oa(CompanyCode);

    go

    -- OR

    select

    c.IdCompany,

    c.CompanyName,

    isnull(oa.CompanyCode, c.CompanyCode) CompanyCode

    from

    dbo.Company c

    outer apply (select

    c1.CompanyCode

    from

    dbo.Company c1

    inner join dbo.CompanyGroup cg

    on c1.IdCompany = cg.IdParentCompany

    where

    c.IdCompany = cg.IdchildCompany)oa(CompanyCode);

    go

    drop table dbo.CompanyGroup;

    drop table dbo.Company;

    go

  • Just for comparison I ran the following two queries against the sample data:

    select

    c.IdCompany,

    c.CompanyName,

    isnull(oa.CompanyCode, c.CompanyCode) CompanyCode

    from

    dbo.Company c

    outer apply (select

    c1.CompanyCode

    from

    dbo.Company c1

    inner join dbo.CompanyGroup cg

    on c1.IdCompany = cg.IdParentCompany

    where

    c.IdCompany = cg.IdchildCompany)oa(CompanyCode);

    go

    SELECT

    Y.IdCompany

    ,Y.CompanyName

    ,Y.CompanyCode

    FROM

    (

    SELECT

    X.IdCompany

    ,X.CompanyName

    ,X.CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS CID_RID

    ,CP.CompanyName

    ,CP.CompanyCode

    ,CP.IdCompany

    FROM dbo.Company CP

    /* Removed the join, simpler execution plan */

    WHERE CP.IdCompany IN (SELECT IdParentCompany FROM dbo.CompanyGroup)

    ) AS X WHERE X.CID_RID = 1

    UNION ALL

    SELECT

    CC.IdCompany

    ,CC.CompanyName

    ,CP.CompanyCode

    FROM dbo.Company CP

    INNER JOIN dbo.CompanyGroup CG

    ON CP.IdCompany = CG.IdParentCompany

    INNER JOIN dbo.Company CC

    ON CG.IdchildCompany = CC.IdCompany

    ) AS Y ORDER BY Y.CompanyCode,Y.IdCompany

    go

    Attach is the .sqlplan file with the actual execution plans.

  • Lynn Pettis (5/11/2014)


    And here is an alternative solution:

    And its also a better one:ermm:, probably 20% quicker than my code! Nice!

    😎

  • For comparison, here is a SQL 2012/2014 window function version. Without an POC index, Lynn's code is still faster.

    😎

    SELECT

    IdCompany

    ,CompanyName

    ,CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS COMP_RID

    ,CP.IdCompany

    ,CP.CompanyName

    ,FIRST_VALUE(CP.CompanyCode) OVER

    (

    PARTITION BY CG.IdParentCompany

    ORDER BY (SELECT NULL)

    ) AS CompanyCode

    FROM dbo.CompanyGroup CG

    OUTER APPLY dbo.Company CP

    WHERE (CG.IdParentCompany = CP.IdCompany AND CG.IdchildCompany <> CP.IdCompany)

    OR (CG.IdParentCompany <> CP.IdCompany AND CG.IdchildCompany = CP.IdCompany)

    ) AS X WHERE X.COMP_RID = 1

  • Hi Lynn,

    wonderful. Thanks a lot for tuning. nice Discussion. thanks eirik too.

  • Eirikur Eiriksson (5/11/2014)


    For comparison, here is a SQL 2012/2014 window function version. Without an POC index, Lynn's code is still faster.

    😎

    SELECT

    IdCompany

    ,CompanyName

    ,CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS COMP_RID

    ,CP.IdCompany

    ,CP.CompanyName

    ,FIRST_VALUE(CP.CompanyCode) OVER

    (

    PARTITION BY CG.IdParentCompany

    ORDER BY (SELECT NULL)

    ) AS CompanyCode

    FROM dbo.CompanyGroup CG

    OUTER APPLY dbo.Company CP

    WHERE (CG.IdParentCompany = CP.IdCompany AND CG.IdchildCompany <> CP.IdCompany)

    OR (CG.IdParentCompany <> CP.IdCompany AND CG.IdchildCompany = CP.IdCompany)

    ) AS X WHERE X.COMP_RID = 1

    I have to ask, why all the extra complexity in the code using a windowing function? I really see no need for it.

  • I still haven't setup a million row test, and really should to really test the proposed solutions, but I made a few minor changes to the CompanyGroup table. I made the primary key on this table nonclustered and added a unique clustered index on the IdchildCompany column.

    I then ran the three proposed solutions after also modifying mine to include an ORDER BY clause.

    Here is the updated test script.

    set nocount on;

    go

    Create Table dbo.Company(

    IdCompany int primary key,

    CompanyName varchar(50),

    CompanyCode varchar(50));

    Insert into Company

    values

    (1000,'IBM','CP100'),

    (1001,'Informix Corporation','CP101'),

    (1002,'Rational Software Corporation','CP102'),

    (1003,'Oracle','CP103'),

    (1004,'DataRaker','CP104'),

    (1005,'Ksplice','CP105'),

    (1006,'Microsoft','CP106'),

    (1007,'Hotmail','CP107'),

    (1008,'Flash Communications','CP108');

    Create table CompanyGroup(

    IdCompanyGroup int identity(1,1) primary key nonclustered,

    IdParentCompany int,

    IdchildCompany int unique clustered,

    FOREIGN KEY (IdParentCompany) REFERENCES Company(IdCompany),

    FOREIGN KEY (IdchildCompany) REFERENCES Company(IdCompany)

    );

    Insert into dbo.CompanyGroup

    values

    (1000,1001),

    (1000,1002),

    (1003,1004),

    (1003,1005),

    (1006,1007),

    (1006,1008);

    go

    set nocount off;

    go

    print '********** Simple Outer Apply **********';

    print '';

    go

    set statistics io,time on;

    select

    c.IdCompany,

    c.CompanyName,

    isnull(oa.CompanyCode, c.CompanyCode) CompanyCode

    from

    dbo.Company c

    outer apply (select

    c1.CompanyCode

    from

    dbo.Company c1

    inner join dbo.CompanyGroup cg

    on c1.IdCompany = cg.IdParentCompany

    where

    c.IdCompany = cg.IdchildCompany)oa(CompanyCode)

    order by

    c.IdCompany;

    go

    set statistics io,time off;

    go

    print '';

    print '********** First Window Function **********';

    print '';

    go

    set statistics io,time on;

    SELECT

    Y.IdCompany

    ,Y.CompanyName

    ,Y.CompanyCode

    FROM

    (

    SELECT

    X.IdCompany

    ,X.CompanyName

    ,X.CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS CID_RID

    ,CP.CompanyName

    ,CP.CompanyCode

    ,CP.IdCompany

    FROM dbo.Company CP

    /* Removed the join, simpler execution plan */

    WHERE CP.IdCompany IN (SELECT IdParentCompany FROM dbo.CompanyGroup)

    ) AS X WHERE X.CID_RID = 1

    UNION ALL

    SELECT

    CC.IdCompany

    ,CC.CompanyName

    ,CP.CompanyCode

    FROM dbo.Company CP

    INNER JOIN dbo.CompanyGroup CG

    ON CP.IdCompany = CG.IdParentCompany

    INNER JOIN dbo.Company CC

    ON CG.IdchildCompany = CC.IdCompany

    ) AS Y ORDER BY Y.CompanyCode,Y.IdCompany

    go

    set statistics io,time off;

    go

    print '';

    print '********** Second Window Function 2012/2014 option **********';

    print '';

    go

    set statistics io,time on;

    SELECT

    IdCompany

    ,CompanyName

    ,CompanyCode

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY CP.IdCompany

    ORDER BY (SELECT NULL)

    ) AS COMP_RID

    ,CP.IdCompany

    ,CP.CompanyName

    ,FIRST_VALUE(CP.CompanyCode) OVER

    (

    PARTITION BY CG.IdParentCompany

    ORDER BY (SELECT NULL)

    ) AS CompanyCode

    FROM dbo.CompanyGroup CG

    OUTER APPLY dbo.Company CP

    WHERE (CG.IdParentCompany = CP.IdCompany AND CG.IdchildCompany <> CP.IdCompany)

    OR (CG.IdParentCompany <> CP.IdCompany AND CG.IdchildCompany = CP.IdCompany)

    ) AS X WHERE X.COMP_RID = 1

    go

    set statistics io,time off;

    go

    drop table dbo.CompanyGroup;

    drop table dbo.Company;

    go

    And here is the output from the SET STATISTICS from my dual proc 8 core laptop with 8 GB ram.

    ********** Simple Outer Apply **********

    (9 row(s) affected)

    Table 'Company'. Scan count 1, logical reads 14, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'CompanyGroup'. Scan count 0, logical reads 18, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    ********** First Window Function **********

    (9 row(s) affected)

    Table 'Company'. Scan count 1, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'CompanyGroup'. Scan count 2, logical reads 21, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    ********** Second Window Function 2012/2014 option **********

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 2 ms.

    (9 row(s) affected)

    Table 'Worktable'. Scan count 6, logical reads 43, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'CompanyGroup'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Company'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    Edit: Fix copy/paste issue.

  • Thanks Lynn, can i conclude the moral of this comparison that The first logic takes very little time then other two and also faster.

    Also i will be testing this logic with my 100*1000 records in a day or two.

    Appreciate your wonderful time on this post.

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply