How to Join two tables with conditions and insert into new table

  • Hi,

     

    I have two tables, both the tables have 4 common columns,

    First Column: LinkingID

    Other four columns(Name,Department,City).

    In few instances, we will get LinkingID row values, and few instances we don't have LinkingID column, so we left to use other 3 common columns), I want to see the best way to handle this scenario.

     

    Create table #table1

    (

    table1id int identity(1,1),

    name varchar(40),

    department varchar(30),

    city varchar(30),

    modifieddate date,

    linkingID int

    )

    Create table #table2

    (

    table2id int identity(100,1),

    name varchar(40),

    department varchar(30),

    city varchar(30),

    modifieddate date,

    linkingID int

    )

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Aa','IT',Null,getdate()-2,11001)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values(Null,'IT','City1',getdate()-2,11002)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values(Null,Null,Null,getdate()-2,11003)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Aa','IT',Null,getdate()-2,11001)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values(Null,'IT','City1',getdate()-2,11002)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values(Null,Null,Null,getdate()-2,11003)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-2,Null)

    select * from #table1

    select * from #table2

    Create table #table3

    (

    table3id int identity(1,1),

    table1id int,

    table2id int,

    name varchar(40),

    department varchar(30),

    city varchar(30),

    modifieddate date,

    linkingID int

    )

    Insert into #table3 (name,department,city,modifieddate,linkingID)

    select table1id, table2id, name , department , city , modifieddate,linkingID  from

    (

    SELECT tb1.table1id,

    tb2.table2id,

    tb2.name,

    tb2.department,

    tb2.city,

    tb2.modifieddate,

    tb2.linkingID

    FROM #table2 tb2

    INNER JOIN #table1 tb1 ON tb2.linkingID = tb1.linkingID --When Linking ID is available

    where tb1.table1id not in (select distinct table1id from #table1)

    and tb2.table2id not in (select distinct table2id from #table2)

    Insert into #table3 (name,department,city,modifieddate,linkingID)

    SELECT tb1.table1id,

    tb2.table2id,

    tb2.name,

    tb2.department,

    tb2.city,

    tb2.modifieddate,

    tb2.linkingID

    Row_Number()over(partition by city,department ,city order by modifieddate desc) as RN

    FROM #table2 tb2

    INNER JOIN #table1 tb1 ON tb1.city = tb2.city and tb1.department = tb2.department and tb1.name = tb2.name

    where tb1.linkingID is null or tb2.linkingID is null

    except

    select name, department ,city from #table3

    )sub

    where RN =1

     

    • This topic was modified 4 years, 8 months ago by  Sangeeth Raj.
  • Unless I've missed something, this should work:

    SELECT
    tb1.table1id,
    tb2.table2id,
    tb2.name,
    tb2.department,
    tb2.city,
    tb2.modifieddate,
    tb2.linkingID
    FROM #table2 tb2
    INNER JOIN #table1 tb1
    ON tb2.linkingID = tb1.linkingID
    OR (tb1.city = tb2.city and tb1.department = tb2.department and tb1.name = tb2.name)

    John

  • Hi John,

    The Multiple Join condition returns multiple lines from #table1 for each #Table2. Currently I am using these two joins  query separately, because I can using RowNumber to eliminate multiple lines from the second Join query and union back to the first Join query.

     

     

     

    Thanks

    • This reply was modified 4 years, 8 months ago by  Sangeeth Raj.
  • >>> have two tables, both the tables have 4 common columns,

    First Column: linking_id

    Other four columns(Name, Department, City). <<

    Please explain how you can have a first column, since tables have no ordering by definition. Your other four columns are improperly named for RDBMS. There is no such thing as generic "name"; it has to be the name of something in particular; this is what data modeling calls and attribute property and it has to be of the form <attribute>_name. We also have no idea what particular attribute you're using for the city. I guess we'll just have to look at the code and figure it out since you're not willing to tell us In your DDL.

    But perhaps the most offensive, totally non-relational thing you have here is the idea of "linking_id". The term linking goes back to the days of pointer chains and referred to a one-way pointer to another data element. We through all of that out, and RDBMS. I'm also curious why you think numbering the tables makes any kind of sense and RDBMS. We've tried to get rid of numbered sequences of data elements.

    Things like "modified_date" are metadata and have no place in the base table. This should have been covered in the first week or two of your data modeling class.

    I'm going to try to correct the attempted DDL you posted into something that's actually RDBMS. We need to get rid of IDENTITY columns and get real keys. Again, since each of your two tables must represent completely different kinds of entities to be valid. I'm going to assume you really wanted to put one type of foobar into the first table and another type of foobar in the second table. This is a design fallacy called attribute splitting. I'm going to assume that the findings from the first table are a type I foobar and the other table is a type II foobar. This is about as stupid as putting male employees and female employees in separate tables instead of just having a personnel table with a column for sex code.

    >> In few instances, we will get linking_id [sic] row values, and few instances we don't have linking_id [sic] column, so we left to use other 3 common columns), I want to see the best way to handle this scenario. <<

    Your second table has exactly the same structure as your first table. If you think I'm abusive, then you want to hear what Chris Date or David McGovern would have to say about this huge design flaw. But if you're still thinking in terms of sequential files, WHERE the IDENTITY column [sic: it is a table property, not a column] is really a record number on the tape,

    Each table should model a set of completely different entities. Again, what did you learn your first week RDBMS or basic data modeling classes? Also, you don't seem to know the current syntax for table constructions and insertion statements so you're still writing the way we did over 50 years ago with the first of the SQL products

    INSERT INTO Foobar

    VALUES

    ('Aa', 'IT', NULL , 'type1'),

    ('Aa', 'IT', NULL,'type2'),

    ('Hebbah', 'Marketing', 'WN', 'type1'),

    ('Hebbah', 'Marketing', 'WN', 'type2');

    ('James', 'Finance', 'CN1', 'type1'),

    ('James', 'Finance', 'CN1', 'type2'),

    ('Martini', 'Sales', 'NT', 'type1'),

    ('Martini', 'Sales', 'NT', 'type2'),

    (NULL, 'IT', 'City1', 'type1'),

    (NULL, 'IT', 'City1','type2'),

    (NULL, NULL, NULL, 'type1'),

    (NULL, NULL, NULL, 'type2');

    Please notice that the type column has to be added to the one table that will model your foobars. Of course, it's not possible to have a key, so this can never be a real table. What you really have here is a deck of punch cards or magnetic tape from the 1960s and no idea how the relational model works at all.

    The reason that you materialize your redundant and useless #table3 is that this is how you would have done it with magnetic tapes in the 1960s. This would have been called a tape merge. The flowchart symbol back when we wrote flowcharts would've been something a look sort of like a butterfly.

    The use of "TB_?" is another design flaw called repeated group or a tibble. But let's ignore that. It's one of the many things you never got right.

    When an SQL programmer would've written would have been a view rather than materialize a table. But when your mindset is still doing tape file merges and punchcards, you have to see things materialized.

    Why do you think this is the way to use SQL and RDBMS? I'm desperately looking for something you did right, but I can't find anything!

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Sangeeth Raj wrote:

    Hi,

    I have two tables, both the tables have 4 common columns,

    First Column: LinkingID

    Other four columns(Name,Department,City).

    In few instances, we will get LinkingID row values, and few instances we don't have LinkingID column, so we left to use other 3 common columns), I want to see the best way to handle this scenario.

    Create table #table1

    (

    table1id int identity(1,1),

    name varchar(40),

    department varchar(30),

    city varchar(30),

    modifieddate date,

    linkingID int

    )

    Create table #table2

    (

    table2id int identity(100,1),

    name varchar(40),

    department varchar(30),

    city varchar(30),

    modifieddate date,

    linkingID int

    )

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Aa','IT',Null,getdate()-2,11001)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values(Null,'IT','City1',getdate()-2,11002)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values(Null,Null,Null,getdate()-2,11003)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Aa','IT',Null,getdate()-2,11001)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values(Null,'IT','City1',getdate()-2,11002)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values(Null,Null,Null,getdate()-2,11003)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-2,Null)

    select * from #table1

    select * from #table2

    Create table #table3

    (

    table3id int identity(1,1),

    table1id int,

    table2id int,

    name varchar(40),

    department varchar(30),

    city varchar(30),

    modifieddate date,

    linkingID int

    )

    Insert into #table3 (name,department,city,modifieddate,linkingID)

    select table1id, table2id, name , department , city , modifieddate,linkingID  from

    (

    SELECT tb1.table1id,

    tb2.table2id,

    tb2.name,

    tb2.department,

    tb2.city,

    tb2.modifieddate,

    tb2.linkingID

    FROM #table2 tb2

    INNER JOIN #table1 tb1 ON tb2.linkingID = tb1.linkingID --When Linking ID is available

    where tb1.table1id not in (select distinct table1id from #table1)

    and tb2.table2id not in (select distinct table2id from #table2)

    Insert into #table3 (name,department,city,modifieddate,linkingID)

    SELECT tb1.table1id,

    tb2.table2id,

    tb2.name,

    tb2.department,

    tb2.city,

    tb2.modifieddate,

    tb2.linkingID

    Row_Number()over(partition by city,department ,city order by modifieddate desc) as RN

    FROM #table2 tb2

    INNER JOIN #table1 tb1 ON tb1.city = tb2.city and tb1.department = tb2.department and tb1.name = tb2.name

    where tb1.linkingID is null or tb2.linkingID is null

    except

    select name, department ,city from #table3

    )sub

    where RN =1

    You might want to check the code you pasted in works without errors!

    There are more columns in the selects than there are column listed in the tables in the insert.

    Surely it is not too much effort to just check your code works before you paste it in and expect people to look at it?

  • Hi,

     

    I have amended the code and verified it executed without errors. Yes, I will agree the column name conventions are not appropriate.

     

    CREATE TABLE #table1

    (table1id INT IDENTITY(1, 1),

    name VARCHAR(40),

    department VARCHAR(30),

    city VARCHAR(30),

    modifieddate DATE,

    linkingID INT

    );

    CREATE TABLE #table2

    (table2id INT IDENTITY(100, 1),

    name VARCHAR(40),

    department VARCHAR(30),

    city VARCHAR(30),

    modifieddate DATE,

    linkingID INT

    );

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Aa','IT',Null,getdate()-2,11001)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values(Null,'IT','City1',getdate()-2,11002)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values(Null,Null,Null,getdate()-2,11003)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-2,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-3,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-3,Null)

    Insert into #table1 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate(),Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Aa','IT',Null,getdate()-2,11001)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values(Null,'IT','City1',getdate()-2,11002)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values(Null,Null,Null,getdate()-2,11003)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('James','Finance','CN1',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Martini','Sales','NT',getdate()-2,Null)

    Insert into #table2 (name,department,city,modifieddate,linkingID) values('Hebbah','Marketing','WN',getdate()-2,Null)

    select * from #table1

    select * from #table2

    CREATE TABLE #table3

    (table3id INT IDENTITY(1, 1),

    table1id INT,

    table2id INT,

    name VARCHAR(40),

    department VARCHAR(30),

    city VARCHAR(30),

    modifieddate DATE,

    linkingID INT

    );

    INSERT INTO #table3 --case 1: When LinkingID exists

    (table1id,

    table2id,

    NAME,

    department,

    city,

    modifieddate,

    linkingid)

    SELECT table1id,

    table2id,

    NAME,

    department,

    city,

    modifieddate,

    linkingid

    FROM (SELECT tb1.table1id,

    tb2.table2id,

    tb2.NAME,

    tb2.department,

    tb2.city,

    tb2.modifieddate,

    tb2.linkingid

    FROM #table2 tb2

    INNER JOIN #table1 tb1

    ON tb2.linkingid = tb1.linkingid --When Linking ID is available

    WHERE tb1.table1id NOT IN (SELECT DISTINCT table1id

    FROM #table1)

    AND tb2.table2id NOT IN (SELECT DISTINCT table2id

    FROM #table2)) a;

    INSERT INTO #table3 --case 2: When LinkingID exists

    (NAME,

    department,

    city,

    linkingid)

    SELECT NAME,

    department,

    city,

    linkingid

    FROM (SELECT tb2.NAME,

    tb1.department,

    tb1.city,

    tb1.modifieddate,

    tb2.linkingid,

    Row_number()

    OVER(

    partition BY tb1.city, tb1.department, tb1.city

    ORDER BY tb1.modifieddate DESC) AS RN

    FROM #table2 tb2

    INNER JOIN #table1 tb1

    ON tb1.city = tb2.city

    AND tb1.department = tb2.department

    AND tb1.NAME = tb2.NAME

    WHERE tb1.linkingid IS NULL

    OR tb2.linkingid IS NULL) a

    WHERE rn = 1

    EXCEPT

    SELECT NAME,

    department,

    city,

    linkingid

    FROM #table3;

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply