Error using Inner join!

  • Hi,

    I'm trying to insert date from into table customers_data from 6 different tables by using a join across tables on the customer id.

    this is the structure of the customers_data table

    CREATE TABLE [dbo].[Customers_Data](

    [CustomerKey] [int] IDENTITY(1,1) NOT NULL,

    [customerID] [int] NULL,

    [firstName] [varchar](50) NULL,

    [lastName] [varchar](50) NULL,

    [varchar](50) NULL,

    [username] [varchar](50) NULL,

    [password] [varchar](50) NULL,

    [coverClicksent] [bit] NULL,

    [customerReferenceID] [int] NULL,

    [referenceFirstName] [varchar](50) NULL,

    [referenceLastName] [varchar](50) NULL,

    [referencePhone] [varchar](50) NULL,

    [referenceRelationship] [varchar](50) NULL,

    [customerPhoneID] [int] NULL,

    [HomeNumber] [varchar](50) NULL,

    [MobileNumber] [varchar](50) NULL,

    [customerPersonalID] [int] NULL,

    [mothersMaiden] [varchar](50) NULL,

    [gender] [varchar](50) NULL,

    [issuedIDState] [varchar](50) NULL,

    [maritalStatus] [varchar](50) NULL,

    [UScitizen] [bit] NULL,

    [activeMilitary] [bit] NULL,

    [customerHousingID] [int] NULL,

    [ownHome] [bit] NULL,

    [address] [varchar](max) NULL,

    [city] [varchar](50) NULL,

    [state] [varchar](50) NULL,

    [country] [varchar](50) NULL,

    [zip] [varchar](50) NULL,

    [customerFinancialID] [int] NULL,

    [foreclosure] [bit] NULL,

    [bankruptcyEver] [bit] NULL,

    [bankruptcyNow] [bit] NULL,

    [nonSufficient2Months] [varchar](50) NULL,

    [nonSufficient6Months] [bit] NULL,

    [netMonthlyIncome] [money] NULL,

    [incomeSource] [varchar](50) NULL,

    [net1000] [bit] NULL,

    [cosignerAvailable] [bit] NULL,

    [customerEmploymentID] [int] NULL,

    [occupation] [varchar](50) NULL,

    [selfEmployed] [bit] NULL,

    [workExt] [varchar](50) NULL,

    [middleinitial] [varchar](50) NULL,

    [socialsecurity] [varchar](max) NULL,

    The following is the join i am using to bring all data into the customer_data table

    select c.customerid, c.firstname, c.lastname, c.email, c.username, c.password, cpe.customerpersonalID, cpe.mothersmaiden, cpe.birthyear, cpe.birthmonth, cpe.birthday, cpe.gender, cpe.issuedIDstate, cpe.maritalstatus, cpe.UScitizen, cm.activemilitary, ch.customerhousingID, ch.ownhome, ch.address, ch.city, ch.state, ch.country, ch.zip, ch.monthsatresidence, ce.workext, c.middleinitial, cpe.socialsecurity

    from customers__test as cd

    inner join [dbo].[customers] as c on c.customerid = cd.customerid

    inner join .[dbo].[customerhousing]as ch on ce.customerid = ch.customerid

    inner join .[dbo].[customermilitary] cm on ch.customerid = cm.customerid

    inner join .[dbo].[customerpersonal] cpe on cm.customerid = cpe.customerid

    inner join [dbo].[customerreferences] as cr on cpe.customerid = cr.customerid

    order by cd.customerid

    I do not want to list the structure of each table as it may become too huge to read. But the structure of customers_test is as follows

    CREATE TABLE [dbo].[Customers__Test](

    [Customer Key] [varchar](50) NULL,

    [CustomerID] [int] NULL,

    [firstname] [varchar](50) NULL,

    [lastname] [varchar](50) NULL,

    [varchar](50) NULL,

    [address] [varchar](max) NULL,

    [city] [varchar](50) NULL,

    [state] [varchar](50) NULL,

    [country] [varchar](50) NULL,

    [zip] [varchar](50) NULL,

    [socialsecurity] [varchar](50) NULL,

    ]

    The problem that i face is that...................

    The customers_test table has 2.38 million records. However when i do a join across all the tables i receive only about 1.58 million records. That too each record is replicated.

    I am expecting to insert all 2.38 million records into the Customers_data table.

    Any explicit reason why?, i am trying to rewrite the query etc...

  • I tried using a left join instead of inner join and i receive 3.18 million records now..it may be pulling extra ones

  • I'd suggest a count(customerid) and a count(distinct customerid) from each table to know the counts of each table involved. If these are different values for the same table, the table has multiples of the customerid.

    You can then full outer join each table individually with customer_test to know how many and which ones are extra in every table. That should narrow down your quest to fix the duplicates/missing values.

  • I performed a count and count distinct across all tables, it returned the same number for all tables except the last one where there were duplicates across the reference table.

    Quite definetly this is painful and i am going to update each table one by one and forget about joins.

    Thanks for the suggestion it did help clear out things....

  • When you use joins, you are telling SQL to give you all the customers that have matches in each of the joined tables. Right now, the only way a customer is showing up is when he exists in all 7 of the tables. When you do a left join you are saying to grab all customers from the left most table, regardless of their existance in the right table. This is why you get more records, but not the correct count.

    Try using full outer joins to see if you get the data you so desire.

  • The question I am asking myself is why are you joining the table for the insert. Why didn't you just insert from each table, like:

    insert into Customer_test

    select * from table1;

    insert into Customer_Test

    select * from table2;

    ......

  • I had wanted to use a single query and execution plan to update the cust_test table..........

  • The full outer join will work.

  • just another thought... You can use union all also.

    select * from table1 union all

    select * from table2 union all

    select * from table3 union all

    ....

    select * from table 7

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply