May 24, 2008 at 2:08 pm
Comments posted to this topic are about the item Finding Available Keys When Merging Two Tables
May 26, 2008 at 4:08 am
Hello,
I have to say that while the procedure described leads to correct result, it seems to be lacking certain ingenuity so often found in the articles here. In my opinion, this is how one would do the task without taking too much time to think about various possibilities. Nothing wrong with that, if you just need to do it once, but I wouldn't publish it in an article.
To be more precise, the article does several actions where there could be just one (e.g., insert all, delete those that are in first table, delete those that are in second table). Why not insert just those unused outright in the first place?
When filling the table with data, various loops and even import from Excel are mentioned - but not a hint that there is way to do it in set-based SQL.
The article does not contain INSERT INTO statements, that would enable readers to create the same environment easily. Also, it just mentions how to find available keys, not how to apply the result and get new ID's assigned correctly.
This is how I would go about removing duplicates in a similar case (for simplicity, I used temporary tables instead of your permanent tables in 2 databases). You will observe, that I have used a table "numbers" which is not defined in the script. This is a one-column table, also known as "Tally" table, and you can find a lot about it and its uses here on the forums (also how to create and fill it, therefore I won't repeat that, for the sake of simplicity).
IF object_id('tempdb..#emp_one') IS NOT NULL DROP TABLE #emp_one
IF object_id('tempdb..#emp_two') IS NOT NULL DROP TABLE #emp_two
IF object_id('tempdb..#emp_free_no') IS NOT NULL DROP TABLE #emp_free_no
IF object_id('tempdb..#duplicates') IS NOT NULL DROP TABLE #duplicates
GO
/*create test tables*/
CREATE TABLE #emp_one(emp_no VARCHAR(4))
CREATE TABLE #emp_two(emp_no VARCHAR(4))
/*insert sample data*/
INSERT INTO #emp_one
SELECT '0001'
UNION
SELECT '0003'
UNION
SELECT '0007'
UNION
SELECT '0008'
UNION
SELECT '0009'
INSERT INTO #emp_two
SELECT '0001'
UNION
SELECT '0004'
UNION
SELECT '0005'
UNION
SELECT '0006'
UNION
SELECT '0007'
/*prepare table emp_two for renumbering - add column that will hold the new employee number*/
ALTER TABLE #emp_two ADD new_emp_no VARCHAR(4)
GO
/*table of available (unused) employee numbers*/
CREATE TABLE #emp_free_no(id INT IDENTITY, emp_no VARCHAR(4))
INSERT INTO #emp_free_no (emp_no)
SELECT Q.emp_no
FROM
(SELECT RIGHT('000'+ CAST(number as VARCHAR(4)),4) as emp_no
FROM numbers
WHERE number > 0 and number < 10000) as Q
LEFT JOIN #emp_one e1 ON e1.emp_no=Q.emp_no
LEFT JOIN #emp_two e2 ON e2.emp_no=Q.emp_no
WHERE e1.emp_no IS NULL AND e2.emp_no IS NULL
/*table of employee numbers that need to be replaced - exist in both tables
these numbers will remain in table emp_one and will be changed in table emp_two*/
CREATE TABLE #duplicates (dupid INT IDENTITY, emp_no VARCHAR(4))
INSERT INTO #duplicates
SELECT e2.emp_no
FROM #emp_two e2
JOIN #emp_one e1 ON e1.emp_no=e2.emp_no
/*write the new employee number*/
UPDATE e2
SET new_emp_no=ISNULL(f.emp_no,e2.emp_no)
FROM #emp_two e2
LEFT JOIN #duplicates d ON d.emp_no = e2.emp_no
LEFT JOIN #emp_free_no f ON f.id=d.dupid
/*show result*/
SELECT * FROM #emp_two
You can now move the rows from table emp_two to table emp_one, and use new_emp_no as key when inserting into table emp_one.
May 26, 2008 at 5:55 am
Vladan,
Thanks for taking the time to create a different approach. I will research the "Number" table solution, as I am finding that I am using something similar in my latest scripts.
I also want to point out something very clever in your loading script:
SELECT RIGHT('000'+ CAST(number as VARCHAR(4)),4) as emp_no...
One reason I was using Excel to generate my set of test records was that I could create a formula to preface the 000's on the keys. I will be using the statement you have above from here on out to do the same thing!
You noted that there is no mention of how to apply the new keys. You address this in your summary of inserting into a new table and using the new employee id from table two, which is a good solution. In my case, I was taking the new unused ids and using them in a conversion program written in Unibasic on a Unidata database. The Unidata tables were the source tables, and then DTS was bringing a copy of the tables to SQL Server, so the data had to be corrected in the underlying Unidata tables, but it was easier to identify the unused keys in SQL Server.
May 26, 2008 at 10:17 am
I have to agree with Vladan on this one.
The approach pointed out in the article is rather cumbersome, even for a one time thing.
Whenever one needs to compare data between tables, the first thing that should come to mind is some form of join between the tables in question.
metawizard2 nice reply to Vladan, it's always nice to see people who can take criticism and learn and grow from it.
May 26, 2008 at 10:41 am
I think that this illustrates one of the fundamental problems that procedural / OO programmers have working with SQL: thinking in sets is not intuitive when you are used to thinking in rows.
The approach in the article comes from a very "Row" place ( or as Jeff Moden refers to as "RBAR" - row by agonizing row http://www.sqlservercentral.com/articles/TSQL/62867/ ).
After playing around with Vladan's solution, I can see how it is a set solution, but I am still having problems "thinking in sets".
Does anyone have any advice on making that transition to sets?
May 26, 2008 at 11:33 am
I am primarily a developer but love playing in SQL as well, so I can associate to some degree with you.
For me the more I played in Query Analyzer trying to consolidate my code the more I began thinking in set based terms. I love less code, the more consolidated it is the more I like it. Most of the time consolidating the code makes it more set based ( not all the time though ).
Subscribing to newletters like SSC have helped to immerse myself in the technology. I am still very guilty of making mistakes and writing poor SQL code, but I get better with each mistake.
Learn to write selects, the more complicated the better, that taught me most of the syntax and method of thinking that I needed. Then later when I needed to manipulate the data further I was able to take my basic selects and tweak them to perform updates, deletes, and anything else I needed.
May 26, 2008 at 4:48 pm
Good approach, I think... my question would be, how do you find and change all the references to such a table? Now THAT would make for an interesting article. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
May 26, 2008 at 6:59 pm
It has been a long journey who is nearing its end. I will tell the tale in an upcoming article! 😉
May 26, 2008 at 11:01 pm
Kudos to the author. First because the author found a solution to a problem. See to get a solution sometimes is a problem, but here the author got a solution which may not be the best. Now the author can do research to find a more smarter way of doing it.
Second the author shared with us the solution.
Third the author is very sporting to accept criticism.
Thanks......:)
May 27, 2008 at 1:19 am
metawizard2,
now that you explained your background, I can understand better why you chose the way you described... and I'm glad you started to explore the set-based approach. If you'll work with SQL Server, you will find it opens lots of possibilities how to manage data efficiently. As long as you work with a few thousand rows, the difference is not crucial - but once you have tables with hundreds of thousands or millions of rows, row-by-row processing is absolute disaster. Converting a procedure to set-based logic can cut the execution time to a small fraction of the original time (like from several hours to a few minutes).
Good luck in your learning process, I'm sure you'll find almost everything you need on this site - and if you miss or don't understand something, just post your question and we will be happy to assist you! Looking forward to your articles and posts,
Vladan
May 27, 2008 at 1:24 am
😉 Thanks for all the great feedback. You have made this first article a really great experience!
June 6, 2008 at 6:37 am
Good approach, I think... my question would be, how do you find and change all the references to such a table?
Trust you to think of that one :rolleyes:
Now THAT would make for an interesting article
It will, when you write it Jeff :hehe:
Far away is close at hand in the images of elsewhere.
Anon.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply