December 20, 2016 at 9:54 pm
-- Source
Create table staging
(id int,
name varchar(50),
empid int,
company_id int,
addres varchar(50)
)
Create table Destination
(id int,
name varchar(50),
empid int,
company_id int,
addres varchar(50)
)
insert into staging
select 1, 'amit',NULL,101,'USA'
UNION ALL
Select 1,'amit',10002,'','USA'
UNION ALL
Select 2,'Vijay','',650,'China'
UNION ALL
Select 2,'Vijay','','','China'
UNION ALL
Select 5,'Sanjay',589756,NULL,'India'
UNION ALL
Select 5,'Sanjay',NULL,151215,'India'
UNION ALL
Select 5,'Sanjay',589757,NULL,'India'
Select * from staging
-- Expected result
-- Destination table
-- This is expected result
idnameempidcompany_idaddres
1amit10002101 USA
2Vijay0650 China
5Sanjay589756151215 India
5 Sanjay 589757 NULL India
December 21, 2016 at 2:21 am
Let's tidy your code up a bit and put it in some IFCode brackets:
CREATE TABLE Staging (id INT,
[name] VARCHAR(50),
empid INT,
company_id INT,
[address] VARCHAR(50) --Address has 2 s.
);
GO
CREATE TABLE Destination (id INT,
[name] VARCHAR(50),
empid INT,
company_id INT,
[address] VARCHAR(50) --Address has 2 s.
);
GO
INSERT INTO Staging
SELECT 1, 'amit',NULL,101,'USA'
UNION ALL
SELECT 1,'amit',10002,'','USA'
UNION ALL
SELECT 2,'Vijay','',650,'China'
UNION ALL
SELECT 2,'Vijay','','','China'
UNION ALL
SELECT 5,'Sanjay',589756,NULL,'India'
UNION ALL
SELECT 5,'Sanjay',NULL,151215,'India'
UNION ALL
SELECT 5,'Sanjay',589757,NULL,'India';
GO
SELECT *
FROM Staging;
GO
DROP TABLE Staging;
DROP TABLE Destination;
GO
Secondly, looking at your data, my concern is when looking at Sanjay. What is your logic for giving Sanjay 589756 a company_id of 151215. Why not Sanjay 589757? Am I right in guessing that the employee with the lowest number is assigned the company with the lowest id? What would happen if there was a additional Sanjay with the following details added after all your other inserts
INSERT INTO Staging VALUES (5, 'Sanjay', NULL, 142451, 'India')
Would Sanjay 589759 move to company 142451? There's no other ID/Row numbers, so otherwise the assignment might as well be random with a larger volume of data. An answer of company 151215 is on the next line isn't going to work, as when returning a SELECT statement it would be pure luck. If you have some type of index on it, could your data not return in a different order? For example, look what happens if you had a clustered index on empid:
CREATE CLUSTERED INDEX [empid_IX] ON [dbo].[Staging] ( [empid] ASC) ;
SELECT *
FROM Staging;
I'd also like to know your NULL/0 handling. For example Amit 100002 loses his company_id of 0, but Vijay retains his 0 empid. Can a company_id of 0 be assumed to be a NULL, and thus ignored? Why does Vijay 0 not have a second entry? Is it because it's assumed to be a duplicate?
Although you have supplied your expected result set, we need to logic on how you get there. What you've provided is a very small dataset, and I would suspect that if applying to a much larger dataset, which we don't have, it would not present the same result that we would provide you.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
December 21, 2016 at 2:45 am
; with cte as
(
Select
row_number() over (partition by id order by id ) as rnum
,*
from staging
)
Select Id,name,max(empid) as empid,max(company_id) as company_id,addres
from cte
group by Id,case when rnum%2=0 then rnum else rnum+1 end,name,addres
December 21, 2016 at 3:00 am
patilar (12/21/2016)
; with cte as(
Select
row_number() over (partition by id order by id ) as rnum
,*
from staging
)
Select Id,name,max(empid) as empid,max(company_id) as company_id,addres
from cte
group by Id,case when rnum%2=0 then rnum else rnum+1 end,name,addres
See my comments above. How can you guarantee that ordering? Your ORDER BY in your OVERclause is ordering your partition item. This means that your asking SQL to order the values 5, 5, 5 when at Sanjay. If I were ask you are these numbers is the same order could you honestly say yes? :
5, 5, 5
5, 5, 5
5, 5, 5
They all look the same, but actually, I put the 3th 5 at the start on the 2nd line. And on the 3rd line, I wrote the 5's the other way round. Can you tell? Nope, and neither can I, but I'm saying I did it.
If you were to expand that data out, with empid then you would get: [5, 589756], [5, NULL], [5, 589757]. I I still asked you to order these around the id (5 in this case), what order would you do it? Then the above example becomes more clear:
[5, 589756], [5, NULL], [5, 589757]
[5, 589757], [5, 589756], [5, NULL]
[5, 589757], [5, NULL], [5, 589756]
As I said, the dataset above is very small, and if this is being applied to something larger, this is going to have the ability to give varying results.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply