September 18, 2013 at 10:55 am
I am trying to clean up some data in one of the tables in our database. The table consists of an ID column and a 3 columns that contains alternative names. Each ID is allowed to have a maximum of 3 alternative names. If there are 3 names present then AlternativeName1, AlternativeName2, AlternativeName3 should be filled. If there are 2 names present then AlternativeName1 and AlternativeName2 should be filled. If 1 name is present then AlternativeName1 should be filled.
Currently the alternative Names fields are filled in haphazardly. For example if the is one alternative name is provided it can be found in the AlternativeName2 or AlternativeName3 column. I would like update the table so that it follows the order presented above. I have provided some data below to illustrate what is required.
--Test Table
CREATE TABLE #TEST
(
[ID] INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
[AlternateName] nvarchar(20) NUll,
[AlternateName1] nvarchar(20) NUll,
[AlternateName2] nvarchar(20) NUll
)
--Load Table with test data
INSERT INTO #TEST
(
[AlternateName]
,[AlternateName1]
,[AlternateName2]
)
SELECT 'Micheal','Mickey','' UNION ALL
SELECT '','James','Jimmy' UNION ALL
SELECT '','','Sarah' UNION ALL
SELECT 'Peter','','Pete' UNION ALL
SELECT 'Andrew','Andy','Andre' UNION ALL
SELECT '','John'
SELECT * FROM #TEST
--The output should be
ID AlternateName AlternateName1 AlternateName2
1 Micheal MIckey
2 James Jimmy
3 Sarah
4 Peter Pete
5 Andrew Andy Andre
6 John
Any Ideas?
September 18, 2013 at 11:23 am
SQL 2012 includes the Immediate If function and could be used for this.
Dan
If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.
September 18, 2013 at 1:00 pm
I did not see a way to do this without multiple update statements. But since this is a clean up job, running 2 updates should not be a big deal!
Try this and see what happens.
update #TEST set
AlternateName1 = AlternateName2,
AlternateName2 = ''
where AlternateName1 = ''
update #TEST set
AlternateName = AlternateName1,
AlternateName1 = AlternateName2,
AlternateName2 = ''
where AlternateName = ''
You might be able to optimize and/or combine these 2 statements using the swap "feature" of the update. (From Joe Celko's book "SQL For Smarties".)
SET
a = b,
b = a
will swap a with b.
I have not given it enough thought about how this could be applied to your problem. The setup looked so similar to what was described in the book, I thought I would just throw it out there.
__________________________________________________________________________________________________________
How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/
September 18, 2013 at 1:04 pm
Since nobody else suggested it I guess I feel that I should. A more long term fix for this would be to normalize your data instead of fighting a denormalized table like this. Then there is no need for this type of maintenance stuff. You just look in the alternate names table and find as many as have been entered. 😉
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 18, 2013 at 3:54 pm
To continue where Sean left off:
Or at least add a CHECK constraint:
CHECK (NOT (len(isnull(AlternateName1, '')) = 0 AND (len(AlternateName2) > 0 OR len(AlternateName3) > 0) OR
len(isnull(AlternateName2, '') > 0 AND len(AlternateName3) > 0)
[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]
September 18, 2013 at 4:35 pm
Apologies for responding so late.
I will give some of the suggestions a try. I agree with the comment about the database structure but to cut a long story short I cannot really change it as it is a back end table from our for one of our applications plus there is little politics at play.
Will let you know how I get on.
September 18, 2013 at 5:25 pm
Sean Lange (9/18/2013)
Since nobody else suggested it I guess I feel that I should. A more long term fix for this would be to normalize your data instead of fighting a denormalized table like this. Then there is no need for this type of maintenance stuff. You just look in the alternate names table and find as many as have been entered. 😉
I'm glad someone beat me to this. +1000!
--Jeff Moden
Change is inevitable... Change for the better is not.
September 18, 2013 at 8:35 pm
Jeff Moden (9/18/2013)
Sean Lange (9/18/2013)
Since nobody else suggested it I guess I feel that I should. A more long term fix for this would be to normalize your data instead of fighting a denormalized table like this. Then there is no need for this type of maintenance stuff. You just look in the alternate names table and find as many as have been entered. 😉I'm glad someone beat me to this. +1000!
+1+1
Of course, you could always resort to something like this:
WITH PivotedNames AS (
SELECT ID
,[AlternateName]=MAX(CASE WHEN rn=1 THEN Name ELSE '' END)
,[AlternateName1]=MAX(CASE WHEN rn=2 THEN Name ELSE '' END)
,[AlternateName2]=MAX(CASE WHEN rn=3 THEN Name ELSE '' END)
FROM (
SELECT *, rn=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY n)
FROM #TEST a
CROSS APPLY (
VALUES (1, [AlternateName]),(2, [AlternateName1]),(3, [AlternateName2])) b(n, Name)
WHERE Name <> ''
) a
GROUP BY ID
)
UPDATE a
SET [AlternateName]=b.[AlternateName]
,[AlternateName1]=b.[AlternateName1]
,[AlternateName2]=b.[AlternateName2]
FROM #TEST a
INNER JOIN PivotedNames b
ON a.ID = b.ID;
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 19, 2013 at 1:11 pm
Lol, CASE -
SELECT
AlternateName = CASE WHEN AlternateName = ' ' AND AlternateName1 <> ' '
THEN AlternateName1 WHEN AlternateName = ' ' AND AlternateName2 <> ' ' THEN AlternateName2 ELSE AlternateName END ,
AlternateName1 = CASE WHEN AlternateName1 = ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN AlternateName1 WHEN AlternateName1 <> ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN AlternateName2 WHEN AlternateName1 = ' ' AND AlternateName <> ' ' AND AlternateName2 <> ' '
THEN AlternateName2 ELSE AlternateName1 END ,
AlternateName2 = CASE WHEN AlternateName2 = ' '
THEN AlternateName2 WHEN AlternateName2 <> ' ' AND AlternateName <> ' ' AND AlternateName1 <> ' '
THEN AlternateName2 WHEN AlternateName1 = ' ' AND AlternateName <> ' ' AND AlternateName2 <> ' '
THEN ' ' WHEN AlternateName1 = ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN ' ' WHEN AlternateName1 <> ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN ' ' ELSE AlternateName2 END
FROM
#TEST
September 19, 2013 at 7:34 pm
ccavaco (9/19/2013)
Lol, CASE -SELECT
AlternateName = CASE WHEN AlternateName = ' ' AND AlternateName1 <> ' '
THEN AlternateName1 WHEN AlternateName = ' ' AND AlternateName2 <> ' ' THEN AlternateName2 ELSE AlternateName END ,
AlternateName1 = CASE WHEN AlternateName1 = ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN AlternateName1 WHEN AlternateName1 <> ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN AlternateName2 WHEN AlternateName1 = ' ' AND AlternateName <> ' ' AND AlternateName2 <> ' '
THEN AlternateName2 ELSE AlternateName1 END ,
AlternateName2 = CASE WHEN AlternateName2 = ' '
THEN AlternateName2 WHEN AlternateName2 <> ' ' AND AlternateName <> ' ' AND AlternateName1 <> ' '
THEN AlternateName2 WHEN AlternateName1 = ' ' AND AlternateName <> ' ' AND AlternateName2 <> ' '
THEN ' ' WHEN AlternateName1 = ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN ' ' WHEN AlternateName1 <> ' ' AND AlternateName = ' ' AND AlternateName2 <> ' '
THEN ' ' ELSE AlternateName2 END
FROM
#TEST
Did the OP say 3 columns or 4 columns of names? :w00t:
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 19, 2013 at 8:11 pm
LinksUp (9/18/2013)
I did not see a way to do this without multiple update statements. But since this is a clean up job, running 2 updates should not be a big deal!Try this and see what happens.
update #TEST set
AlternateName1 = AlternateName2,
AlternateName2 = ''
where AlternateName1 = ''
update #TEST set
AlternateName = AlternateName1,
AlternateName1 = AlternateName2,
AlternateName2 = ''
where AlternateName = ''
I'm a speed-freak and will, many times, use more complex code to gain a decent performance advantage. However, much of the time, simplicity is not only easier to write and troubleshoot, but it's frequently faster, as well. Even if it weren't currently the fastest method posted on this thread so far, I'd likely encourage folks to use the method above because is soooooooo bloody simple and effective. Just my opinion but well done, LinksUp!
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2013 at 12:09 am
I opted for the multiple update option. It is added to a larger clean up job that will run at night. It ran very quickly and gave the desired result. Thanks all for your help.
September 20, 2013 at 7:05 am
eseosaoregie (9/20/2013)
I opted for the multiple update option. It is added to a larger clean up job that will run at night. It ran very quickly and gave the desired result. Thanks all for your help.
Thanks for the feedback. I have to ask though... why is it the data gets messed up every night? You would expect something like this to be used on a staging table for new inputs and the permanent table would suffer virtually no changes.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 20, 2013 at 7:36 am
If it is messed up on a nightly basis I would suggest having a trigger to fix the bad data directly. Unless, fixing the source is within your powers.
[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]
September 20, 2013 at 9:12 am
The schema of the database is from an old AX4 application that is currently been to be upgraded to AX2012. The combination of a poorly designed forms and schemas allowed the data to end up in its current form. Most of the data has now been migrated to AX2012 with new forms etc developed. However a requirement from the business came stating that they needed the alternative name data along with some other information for a one-off analysis before switching to AX 2012. I was given the task of assembling the data. That is when I saw the inconsistency in the database. There are also other tables which were poorly designed due to customization in AX4. As part of the upgrade to 2012 these design issues have been cleared up or so I have been told lead DBA/architect on the upgrade project.
In any case I was looking for a quick way to clean it up once and the present the data to the business analysts. It was then added as a step in a nightly job which produced the results this morning. Going forward AX2012 will be used.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply