March 26, 2008 at 12:59 pm
Hi,
I'm trying to insert date from into table customers_data from 6 different tables by using a join across tables on the customer id.
this is the structure of the customers_data table
CREATE TABLE [dbo].[Customers_Data](
[CustomerKey] [int] IDENTITY(1,1) NOT NULL,
[customerID] [int] NULL,
[firstName] [varchar](50) NULL,
[lastName] [varchar](50) NULL,
[varchar](50) NULL,
[username] [varchar](50) NULL,
[password] [varchar](50) NULL,
[coverClicksent] [bit] NULL,
[customerReferenceID] [int] NULL,
[referenceFirstName] [varchar](50) NULL,
[referenceLastName] [varchar](50) NULL,
[referencePhone] [varchar](50) NULL,
[referenceRelationship] [varchar](50) NULL,
[customerPhoneID] [int] NULL,
[HomeNumber] [varchar](50) NULL,
[MobileNumber] [varchar](50) NULL,
[customerPersonalID] [int] NULL,
[mothersMaiden] [varchar](50) NULL,
[gender] [varchar](50) NULL,
[issuedIDState] [varchar](50) NULL,
[maritalStatus] [varchar](50) NULL,
[UScitizen] [bit] NULL,
[activeMilitary] [bit] NULL,
[customerHousingID] [int] NULL,
[ownHome] [bit] NULL,
[address] [varchar](max) NULL,
[city] [varchar](50) NULL,
[state] [varchar](50) NULL,
[country] [varchar](50) NULL,
[zip] [varchar](50) NULL,
[customerFinancialID] [int] NULL,
[foreclosure] [bit] NULL,
[bankruptcyEver] [bit] NULL,
[bankruptcyNow] [bit] NULL,
[nonSufficient2Months] [varchar](50) NULL,
[nonSufficient6Months] [bit] NULL,
[netMonthlyIncome] [money] NULL,
[incomeSource] [varchar](50) NULL,
[net1000] [bit] NULL,
[cosignerAvailable] [bit] NULL,
[customerEmploymentID] [int] NULL,
[occupation] [varchar](50) NULL,
[selfEmployed] [bit] NULL,
[workExt] [varchar](50) NULL,
[middleinitial] [varchar](50) NULL,
[socialsecurity] [varchar](max) NULL,
The following is the join i am using to bring all data into the customer_data table
select c.customerid, c.firstname, c.lastname, c.email, c.username, c.password, cpe.customerpersonalID, cpe.mothersmaiden, cpe.birthyear, cpe.birthmonth, cpe.birthday, cpe.gender, cpe.issuedIDstate, cpe.maritalstatus, cpe.UScitizen, cm.activemilitary, ch.customerhousingID, ch.ownhome, ch.address, ch.city, ch.state, ch.country, ch.zip, ch.monthsatresidence, ce.workext, c.middleinitial, cpe.socialsecurity
from customers__test as cd
inner join [dbo].[customers] as c on c.customerid = cd.customerid
inner join .[dbo].[customerhousing]as ch on ce.customerid = ch.customerid
inner join .[dbo].[customermilitary] cm on ch.customerid = cm.customerid
inner join .[dbo].[customerpersonal] cpe on cm.customerid = cpe.customerid
inner join [dbo].[customerreferences] as cr on cpe.customerid = cr.customerid
order by cd.customerid
I do not want to list the structure of each table as it may become too huge to read. But the structure of customers_test is as follows
CREATE TABLE [dbo].[Customers__Test](
[Customer Key] [varchar](50) NULL,
[CustomerID] [int] NULL,
[firstname] [varchar](50) NULL,
[lastname] [varchar](50) NULL,
[varchar](50) NULL,
[address] [varchar](max) NULL,
[city] [varchar](50) NULL,
[state] [varchar](50) NULL,
[country] [varchar](50) NULL,
[zip] [varchar](50) NULL,
[socialsecurity] [varchar](50) NULL,
]
The problem that i face is that...................
The customers_test table has 2.38 million records. However when i do a join across all the tables i receive only about 1.58 million records. That too each record is replicated.
I am expecting to insert all 2.38 million records into the Customers_data table.
Any explicit reason why?, i am trying to rewrite the query etc...
March 26, 2008 at 1:37 pm
I tried using a left join instead of inner join and i receive 3.18 million records now..it may be pulling extra ones
March 26, 2008 at 2:11 pm
I'd suggest a count(customerid) and a count(distinct customerid) from each table to know the counts of each table involved. If these are different values for the same table, the table has multiples of the customerid.
You can then full outer join each table individually with customer_test to know how many and which ones are extra in every table. That should narrow down your quest to fix the duplicates/missing values.
March 26, 2008 at 3:32 pm
I performed a count and count distinct across all tables, it returned the same number for all tables except the last one where there were duplicates across the reference table.
Quite definetly this is painful and i am going to update each table one by one and forget about joins.
Thanks for the suggestion it did help clear out things....
March 26, 2008 at 3:38 pm
When you use joins, you are telling SQL to give you all the customers that have matches in each of the joined tables. Right now, the only way a customer is showing up is when he exists in all 7 of the tables. When you do a left join you are saying to grab all customers from the left most table, regardless of their existance in the right table. This is why you get more records, but not the correct count.
Try using full outer joins to see if you get the data you so desire.
March 26, 2008 at 3:41 pm
The question I am asking myself is why are you joining the table for the insert. Why didn't you just insert from each table, like:
insert into Customer_test
select * from table1;
insert into Customer_Test
select * from table2;
......
March 26, 2008 at 3:46 pm
I had wanted to use a single query and execution plan to update the cust_test table..........
March 26, 2008 at 3:55 pm
The full outer join will work.
March 26, 2008 at 4:37 pm
just another thought... You can use union all also.
select * from table1 union all
select * from table2 union all
select * from table3 union all
....
select * from table 7
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply