September 18, 2017 at 3:24 am
Hi All,
A questions which is there with me a long long time. Comparing 2 rows in a table.
How can I compare 2 rows in a table and pin-point and say so and so specific column value don't match.
In my below example, i want to compare 2 rows and should display the ouput as c2 column value is different and so both rows are not same.
Two compare 2 rows, I think I can use checksum,binary_checksum,hashbytes etc.... but the wont tell me which column has the difference.
Correct me if I am wrong !!
For row comparisons, what I am doing is , copy 2 rows into an excel sheet and using transponse converting row to a column and thats how comparing both the rows.
I would like to see if I can do that using TSQL itself.
create table test
(c1 int,
c2 char(10),
c3 varchar(100),
c4 float
)
insert into test
select 1,'A','Adam',9000
union all
select 1,'B','Adam',9000
select * from test
Thanks in advance.
Sam
September 18, 2017 at 3:42 am
This is a bit of a guess, based on your small dataset, but this, at least, might get the ball rolling:SELECT *,
CASE WHEN LAG(c2,1,C2) OVER (PARTITION BY C1, C3, C4 ORDER BY C2) = c2 THEN 'Same' ELSE 'Differs' END AS Compare
FROM test;
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
September 30, 2017 at 5:54 am
Thom A - Monday, September 18, 2017 3:42 AMThis is a bit of a guess, based on your small dataset, but this, at least, might get the ball rolling:SELECT *,
CASE WHEN LAG(c2,1,C2) OVER (PARTITION BY C1, C3, C4 ORDER BY C2) = c2 THEN 'Same' ELSE 'Differs' END AS Compare
FROM test;
Below is the script, probably the extension of above -
drop table test
create table test
(c1 int,
c2 char(10),
c3 varchar(100),
c4 float
)
insert into test
select 1,'A','Adam',9000
union all
select 2,'A','Adam',9000
union all
select 1,'B','Adam',9000
union all
select 1,'A','Jhon',9000
union all
select 2,'A','Adam',9001
SELECT *,
CASE WHEN LAG(c1,1,C1) OVER (PARTITION BY C2, C3, C4 ORDER BY C1) = c1 THEN 'Same' ELSE 'Differs' END AS CompareC1,
CASE WHEN LAG(c2,1,C2) OVER (PARTITION BY C1, C3, C4 ORDER BY C2) = c2 THEN 'Same' ELSE 'Differs' END AS CompareC2,
CASE WHEN LAG(c4,1,C4) OVER (PARTITION BY C1, C2, C3 ORDER BY C4) = c4 THEN 'Same' ELSE 'Differs' END AS CompareC4,
CASE WHEN LAG(c3,1,C3) OVER (PARTITION BY C1, C2, C4 ORDER BY C3) = c3 THEN 'Same' ELSE 'Differs' END AS CompareC3
FROM test
order by c1,c2,c3,c4
First solve the problem then write the code !
October 2, 2017 at 5:13 am
If you want a generic solution then you are going to need to use a cursor, sys.colums and probably some executable SQL to build what is effectively the pivoted data (rowID, ColumnID, ColumnValue) and then compare them. Remember that some columns will change by default (timestamp/rowID) on every change so would probably not need to be included in your results set.
Can you provide more context of why you need to identify the exact column(s) that are different and what you are going to do with the data - this may yield an alternative solution.
October 2, 2017 at 5:39 am
aaron.reese - Monday, October 2, 2017 5:13 AMIf you want a generic solution then you are going to need to use a cursor, sys.colums and probably some executable SQL to build what is effectively the pivoted data (rowID, ColumnID, ColumnValue) and then compare them. Remember that some columns will change by default (timestamp/rowID) on every change so would probably not need to be included in your results set.Can you provide more context of why you need to identify the exact column(s) that are different and what you are going to do with the data - this may yield an alternative solution.
A cursor?! No.... No, no no no no... There are far better options than using a cursor to get dynamic column names. Plus, like I showed before, Window functions can easily be used to compare separate rows if they are in a defined order.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
October 2, 2017 at 6:33 am
so if you wanted to pass a table name and two record IDs to a function/procedure that would spit out a list of fields with differences and the before/after values, how would you do it without a cursor at some point?. you have to convert the individual records into an EAV table so that you can join them together.
I suppose you could use a combination of dynamic SQL to UNPIVOT, build a string of fields using STUFF...FOR XML and Execute @sql dynamic statement but the code would be horrible. At least a cursor would be readable.
Sorry, Thread Hijack!
October 2, 2017 at 7:14 am
aaron.reese - Monday, October 2, 2017 6:33 AMso if you wanted to pass a table name and two record IDs to a function/procedure that would spit out a list of fields with differences and the before/after values, how would you do it without a cursor at some point?. you have to convert the individual records into an EAV table so that you can join them together.I suppose you could use a combination of dynamic SQL to UNPIVOT, build a string of fields using STUFF...FOR XML and Execute @sql dynamic statement but the code would be horrible. At least a cursor would be readable.
Sorry, Thread Hijack!
For the example that the OP gave, they are simply looking at the next record, there's no need to do a repetative task along a whole dataset, when you can easily use a window function to inspect the last line. LAG may not be the right choice for every event, it may be that FIRST_VALUE is the right choice, but that's data dependant.
As for Dynamic SQL (D-SQL), you can just as easily make it readable, but my main point is the importance of efficiency. A cursor may be considered by you to be more "readable" but that doesn't mean it performs well. A poorly written but "well worded" cursor is a far worse option that well written but "poorly worded" D-SQL. Plus, D-SQL is often very easy to trouble shoot my simply changing the EXEC statement to a PRINT. You can see what SQL would be ran then, adjust that statement as needed, and then amend the D-SQL creation part.
A "good" query is almost always going to put efficiency first before readability.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
October 2, 2017 at 7:28 am
Agree with all of your points but I did qualify my statement by saying that there were other considerations if you need a generic solution. (i.e. compare any two records in any table - assuming a single field PK)
October 25, 2017 at 4:42 am
TheCTEGuy - Saturday, September 30, 2017 5:54 AMThom A - Monday, September 18, 2017 3:42 AMThis is a bit of a guess, based on your small dataset, but this, at least, might get the ball rolling:SELECT *,
CASE WHEN LAG(c2,1,C2) OVER (PARTITION BY C1, C3, C4 ORDER BY C2) = c2 THEN 'Same' ELSE 'Differs' END AS Compare
FROM test;Below is the script, probably the extension of above -
drop table test
create table test
(c1 int,
c2 char(10),
c3 varchar(100),
c4 float
)insert into test
select 1,'A','Adam',9000
union all
select 2,'A','Adam',9000
union all
select 1,'B','Adam',9000
union all
select 1,'A','Jhon',9000
union all
select 2,'A','Adam',9001SELECT *,
CASE WHEN LAG(c1,1,C1) OVER (PARTITION BY C2, C3, C4 ORDER BY C1) = c1 THEN 'Same' ELSE 'Differs' END AS CompareC1,
CASE WHEN LAG(c2,1,C2) OVER (PARTITION BY C1, C3, C4 ORDER BY C2) = c2 THEN 'Same' ELSE 'Differs' END AS CompareC2,
CASE WHEN LAG(c4,1,C4) OVER (PARTITION BY C1, C2, C3 ORDER BY C4) = c4 THEN 'Same' ELSE 'Differs' END AS CompareC4,
CASE WHEN LAG(c3,1,C3) OVER (PARTITION BY C1, C2, C4 ORDER BY C3) = c3 THEN 'Same' ELSE 'Differs' END AS CompareC3
FROM test
order by c1,c2,c3,c4
Hi TheCTEGuy,
Thanks for the solution. But it is not working properly for below data. How can I fix it???
drop table test
create table test
(c1 int,
c2 char(10),
c3 varchar(100),
c4 float
)
truncate table test
insert into test
select 1,'A','Adam',9000
union all
select 1,'B','Adam',7000
go
select * from test;
go
SELECT *,
CASE WHEN LAG(c1,1,C1) OVER (PARTITION BY C2, C3, C4 ORDER BY C1) = c1 THEN 'Same' ELSE 'Differs' END AS CompareC1,
CASE WHEN LAG(c2,1,C2) OVER (PARTITION BY C1, C3, C4 ORDER BY C2) = c2 THEN 'Same' ELSE 'Differs' END AS CompareC2,
CASE WHEN LAG(c3,1,C3) OVER (PARTITION BY C1, C2, C4 ORDER BY C3) = c3 THEN 'Same' ELSE 'Differs' END AS CompareC3,
CASE WHEN LAG(c4,1,C4) OVER (PARTITION BY C1, C2, C3 ORDER BY C4) = c4 THEN 'Same' ELSE 'Differs' END AS CompareC4
FROM test
order by c1,c2,c3,c4
October 25, 2017 at 5:20 am
What results were you expecting? Since no matter how you slice it, you have only one row in each partition, and you set the default value (the third parameter to the LAG function) to the current row value, then you're always going to get 'Same'.
John
October 25, 2017 at 5:59 am
John Mitchell-245523 - Wednesday, October 25, 2017 5:20 AMWhat results were you expecting? Since no matter how you slice it, you have only one row in each partition, and you set the default value (the third parameter to the LAG function) to the current row value, then you're always going to get 'Same'.John
Hi John,
Thanks for the response.
Thing is, when I am comparing row-1 to row-2 , then the values for row 2 Compare values should be as follows.
CompareC1 CompareC2 CompareC3 CompareC4
Same Differes Same Differs
Can this made dynamic? and also at any point I am going to compare only 2 rows not more than that.
Thanks,
Sam
October 25, 2017 at 7:09 am
Sam
So you're taking the whole set and comparing the value of each column in each row to the values of each column in the previous row? The trouble is, your set doesn't have any intrinsic order - unless you have an ID or date column or something like that that you haven't shown us? Try this - I've used C4 since the rows as you've shown them happen to be in the same order as the values in that column.
SELECT *,
CASE WHEN LAG(C1,1,C1) OVER (ORDER BY C4) = C1 THEN 'Same' ELSE 'Differs' END AS CompareC1,
CASE WHEN LAG(C2,1,C2) OVER (ORDER BY C4) = C2 THEN 'Same' ELSE 'Differs' END AS CompareC2,
CASE WHEN LAG(C3,1,C3) OVER (ORDER BY C4) = C3 THEN 'Same' ELSE 'Differs' END AS CompareC3,
CASE WHEN LAG(C4,1,C4) OVER (ORDER BY C4) = C4 THEN 'Same' ELSE 'Differs' END AS CompareC4
FROM test
John
October 25, 2017 at 10:40 am
You're probably looking for the following:CREATE TABLE #test (
c1 int,
c2 char(10),
c3 varchar(100),
c4 float
);
INSERT INTO #test (c1, c2, c3, c4)
VALUES (1,'A','Adam',9000),
(1,'B','Adam',9000);
WITH RAW_DATA AS (
SELECT *, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RN
FROM #test
)
SELECT RN, c1, c2, c3, c4,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c1, 1, NULL) OVER (ORDER BY RN) = c1 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c1, 1, NULL) OVER (ORDER BY RN) = c1 THEN 'Same' ELSE 'Differs' END
END AS CompareC1,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c2, 1, NULL) OVER (ORDER BY RN) = c2 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c2, 1, NULL) OVER (ORDER BY RN) = c2 THEN 'Same' ELSE 'Differs' END
END AS CompareC2,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c3, 1, NULL) OVER (ORDER BY RN) = c3 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c3, 1, NULL) OVER (ORDER BY RN) = c3 THEN 'Same' ELSE 'Differs' END
END AS CompareC3,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c4, 1, NULL) OVER (ORDER BY RN) = c4 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c4, 1, NULL) OVER (ORDER BY RN) = c4 THEN 'Same' ELSE 'Differs' END
END AS CompareC4
FROM RAW_DATA
ORDER BY RN;
DROP TABLE #test;
The problem with what's been previous posted is that the use of the LAG function was invoking a 3rd parameter value that says if you can't get a previous value because such a record doesn't exist, just use the current record's value, but that is relying on a previous record, which won't exist for the first row. See if you understand how my code works. It's specifically designed for exactly 2 rows of data.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
October 25, 2017 at 11:14 am
John Mitchell-245523 - Wednesday, October 25, 2017 7:09 AMSamSo you're taking the whole set and comparing the value of each column in each row to the values of each column in the previous row? The trouble is, your set doesn't have any intrinsic order - unless you have an ID or date column or something like that that you haven't shown us? Try this - I've used C4 since the rows as you've shown them happen to be in the same order as the values in that column.
SELECT *,
CASE WHEN LAG(C1,1,C1) OVER (ORDER BY C4) = C1 THEN 'Same' ELSE 'Differs' END AS CompareC1,
CASE WHEN LAG(C2,1,C2) OVER (ORDER BY C4) = C2 THEN 'Same' ELSE 'Differs' END AS CompareC2,
CASE WHEN LAG(C3,1,C3) OVER (ORDER BY C4) = C3 THEN 'Same' ELSE 'Differs' END AS CompareC3,
CASE WHEN LAG(C4,1,C4) OVER (ORDER BY C4) = C4 THEN 'Same' ELSE 'Differs' END AS CompareC4
FROM testJohn
Thanks John.
Yes, I am taking the whole set and comparing the value of each column in each row to the values of each column in the previous row.
October 25, 2017 at 11:17 am
sgmunson - Wednesday, October 25, 2017 10:40 AMYou're probably looking for the following:CREATE TABLE #test (
c1 int,
c2 char(10),
c3 varchar(100),
c4 float
);
INSERT INTO #test (c1, c2, c3, c4)
VALUES (1,'A','Adam',9000),
(1,'B','Adam',9000);WITH RAW_DATA AS (
SELECT *, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RN
FROM #test
)
SELECT RN, c1, c2, c3, c4,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c1, 1, NULL) OVER (ORDER BY RN) = c1 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c1, 1, NULL) OVER (ORDER BY RN) = c1 THEN 'Same' ELSE 'Differs' END
END AS CompareC1,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c2, 1, NULL) OVER (ORDER BY RN) = c2 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c2, 1, NULL) OVER (ORDER BY RN) = c2 THEN 'Same' ELSE 'Differs' END
END AS CompareC2,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c3, 1, NULL) OVER (ORDER BY RN) = c3 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c3, 1, NULL) OVER (ORDER BY RN) = c3 THEN 'Same' ELSE 'Differs' END
END AS CompareC3,
CASE RN
WHEN 1 THEN CASE WHEN LEAD(c4, 1, NULL) OVER (ORDER BY RN) = c4 THEN 'Same' ELSE 'Differs' END
WHEN 2 THEN CASE WHEN LAG(c4, 1, NULL) OVER (ORDER BY RN) = c4 THEN 'Same' ELSE 'Differs' END
END AS CompareC4
FROM RAW_DATA
ORDER BY RN;DROP TABLE #test;
The problem with what's been previous posted is that the use of the LAG function was invoking a 3rd parameter value that says if you can't get a previous value because such a record doesn't exist, just use the current record's value, but that is relying on a previous record, which won't exist for the first row. See if you understand how my code works. It's specifically designed for exactly 2 rows of data.
Thanks Steve. Thats exactly I was looking for. At this point, I want to compare any 2 rows in a table.
Many thanks.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply