May 11, 2009 at 11:19 am
Now I was asked to run a compare on the data. I created a table and ran the cursor. It stored data in table1.
I think created another table and ran my new script. It stores data in table2.
Now I want to compare the tables to insure they are they same data.
Challenges:
1) There are over 12 million rows in each table
2) There are 15 columns in each table
Does anyone know how to compare two tables such as these? I tried the red gate products. However, they only compare tablea in database1 to tablea in database2. My two tables are in the same database on the same server.
I am also using SQL Server 2000 sp4.
Thanks,
Tony
Things will work out. Get back up, change some parameters and recode.
May 11, 2009 at 2:51 pm
How about something like this for a start (I'm using AdventureWorks for SQL Server 2005, but the code is valid for SQL 2000):
-- create a copy of a table in the same database
SELECT * INTO dbo.Department FROM HumanResources.Department AS D
-- FULL OUTER JOIN using CHECKSUM() - should be the same if data is the same.
SELECT
*
FROM
(
SELECT DepartmentID, CHECKSUM(*) AS tableA_CheckSum FROM HumanResources.Department
) AS D FULL OUTER JOIN
(
SELECT DepartmentID, CHECKSUM(*) AS tableB_CheckSum FROM dbo.Department
) AS D2 ON D.tableA_CheckSum = D2.tableB_CheckSum
-- change a row to see if CHECKSUM finds the difference
UPDATE dbo.Department
SET GroupName = 'Test'
WHERE
DepartmentID = 1
-- re-query using CHECKSUM to see the difference.
SELECT
*
FROM
(
SELECT DepartmentID, CHECKSUM(*) AS tableA_CheckSum FROM HumanResources.Department
) AS D FULL OUTER JOIN
(
SELECT DepartmentID, CHECKSUM(*) AS tableB_CheckSum FROM dbo.Department
) AS D2 ON D.tableA_CheckSum = D2.tableB_CheckSum
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 11, 2009 at 3:04 pm
CHECKSUM is not good enough.
Different records may have the same CHECKSUM.
_____________
Code for TallyGenerator
May 11, 2009 at 4:53 pm
Hello,
Thanks for responding.
I need to compare two tables that are in existence. I saw this example of using Checksum after an update statement. I watched the video also.
But I don't see how that tells me how to compare two tables? The example seems to show if a change is made to one table.
Again, I need to compare two existing tables.
Thanks.
Tony
Things will work out. Get back up, change some parameters and recode.
May 11, 2009 at 5:06 pm
Does anyone know how to compare two tables such as these? I tried the red gate products. However, they only compare tablea in database1 to tablea in database2. My two tables are in the same database on the same server.
I am also using SQL Server 2000 sp4.
Thanks,
Tony
Hi,
SQL Data Compare does let you compare differently named tables. Simply Edit the Project and go to the Remap Objects tab. This lets you unmap objects and remap differently named ones.
Hope this helps!
Kind regards,
David Atkinson
Red Gate Software
May 11, 2009 at 5:13 pm
Sergiy (5/11/2009)
CHECKSUM is not good enough.Different records may have the same CHECKSUM.
I agree that the checksum function will produce the same number for different rows.
However, the chance of a CHECKSUM being the same for rows with the same PK and different data is so remote that I wouldn't hesitate to use it.
Here is a sample to compare two tables:--Setup Sampel Data
DECLARE @TableA TABLE (ID INT PRIMARY KEY, Val1 INT, Val2 VARCHAR(50))
DECLARE @TableB TABLE (ID INT PRIMARY KEY, Val1 INT, Val2 VARCHAR(50))
INSERT @TableA
SELECT 1, 1 , 'foo'
UNION ALL SELECT 2, 9, 'bar'
UNION ALL SELECT 3, 9, 'bar'
UNION ALL SELECT 4, 8, 'foo'
UNION ALL SELECT 5, 7, 'jibby'
INSERT @TableB
SELECT 1, 1 , 'foo1' -- Changed
UNION ALL SELECT 2, 9, 'bar'
UNION ALL SELECT 3, 9, 'bar'
UNION ALL SELECT 4, 7, 'foo' -- Changed
UNION ALL SELECT 5, 7, 'jibby'
-- Perform comparison
SELECT *
FROM
(
SELECT ID, CHECKSUM(*) AS CheckA
FROM @TableA
) AS A
FULL OUTER JOIN
(
SELECT ID, CHECKSUM(*) AS CheckB
FROM @TableB
) AS B
ON A.ID = B.ID
WHERE
A.CheckA B.CheckB
Your tables aren't that big, but if you run into performance issues I'd suggest:
1. Create to temp tables with the PK and the Checksum (indexed) and you could join those together.
2. Pull out chunks for comparison (either as a query or to temp tables). Meaning, if you have a PK that is an INT you could just look for PK values 1 - 1Million, then look for 1million - 2 million.. etc..
Cheers!
May 11, 2009 at 11:26 pm
Like this:
-- Find Rows in TableA that are not in TableB
Select * From TableA
Except
Select * From TableB
-- Find Rows in TableB that are not in TableA
Select * From TableB
Except
Select * From TableA
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
May 12, 2009 at 3:47 am
I need to run this for 15 columns. Folks keep posting this same checksum example with one column.
Things will work out. Get back up, change some parameters and recode.
May 12, 2009 at 3:49 am
I really appreciate the responding guys, but I don't think SQL Server 2000 can do the except.
Thanks
Tony
Things will work out. Get back up, change some parameters and recode.
May 12, 2009 at 3:52 am
I tried using the data compare. I unmapped the columns and then mapped the two I needed.
However, when reviewing results, it doesn't give me any. It is trying to do a row by row comparison on the same tables. Not the two I mapped.
I didn't get a comparison.
Thank you for suggesting the unmapping though.
Tony
Things will work out. Get back up, change some parameters and recode.
May 12, 2009 at 4:11 am
WebTechie38 (5/12/2009)
I tried using the data compare. I unmapped the columns and then mapped the two I needed.However, when reviewing results, it doesn't give me any. It is trying to do a row by row comparison on the same tables. Not the two I mapped.
I didn't get a comparison.
Thank you for suggesting the unmapping though.
Tony
use TableDiff.exe which is exist in C:\Program files\Microsoft SQL Server\90\COM.
for more info see SQL BOL.
May 12, 2009 at 5:42 am
WebTechie38 (5/12/2009)
I need to run this for 15 columns. Folks keep posting this same checksum example with one column.
"Select *" is not one column it is all the columns,
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 12, 2009 at 5:54 am
You know I've met some great people on this forum and some I truly look up to as mentors.
Then there are times people post things and you are not sure how to respond.
Yes, I believe select * is more than one column. Thank you.
SELECT
*
FROM
(
SELECT DepartmentID, CHECKSUM(*) AS tableA_CheckSum FROM HumanResources.Department
) AS D FULL OUTER JOIN
(
SELECT DepartmentID, CHECKSUM(*) AS tableB_CheckSum FROM dbo.Department
) AS D2 ON D.tableA_CheckSum = D2.tableB_CheckSum
But if I understand the above query, and granted I am still learning, we are looking at one column (DepartmentID). That my friend is one column.
I was simply stating that I need to compare all the columns. In any event, I appreciate you taking the time to respond to my original question.
Tony
Things will work out. Get back up, change some parameters and recode.
May 12, 2009 at 6:32 am
I included DepartmentID in the output assuming it is a unique key so you could include it in the join criteria because of the possibility, however slight, that duplicate checksums are created. It is highly unlikely that there would be a duplicate checksum for the same id if there is a change in other data. The checksum is across all columns (checksum(*)). The comparison would only be across 1 column if the code was checksum(DepartmentID).
Please note that in the FULL OUTER JOIN the join is on the checksum columns, and if you run the entire test I provided, you see that for the row I modified you get 2 rows, 1 with null columns for tableB and 1 with null columns for tableA as the checksum values are different. Note the bolded portions of the code:
SELECT
*
FROM
(
SELECT DepartmentID, CHECKSUM(*) AS tableA_CheckSum FROM HumanResources.Department
) AS D FULL OUTER JOIN
(
SELECT DepartmentID, CHECKSUM(*) AS tableB_CheckSum FROM dbo.Department
) AS D2 ON D.tableA_CheckSum = D2.tableB_CheckSum
Because the join is on the created Checksum columns the comparison is across all columns. If the join was on DepartmentID then the comparison would be for 1 column.
I wish I had remembered the tablediff utility provided as it is really the tool you should use to do the comparison. Kudos to Kishore for bringing it up.
Edit: Added code with bolding.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 12, 2009 at 7:10 am
WebTechie38 (5/12/2009)
I need to run this for 15 columns. Folks keep posting this same checksum example with one column.
Check out BOL for CHECKSUM, and it should make sense why you would use CHECKSUM(*).
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply