October 1, 2012 at 11:21 am
All,
I have read and asked lot of questionss about "Multiple space into one" here. I just thought the below scenario which came suddenly in my mind.
Incase if the user entered wrongly or if the file contain wrong data as below
kaarthi
jefff moden
steevve
gillla
barrry youuung
mattt milller
Joeee celkko
lowweel
maaark
biiit buucket
graant fritchey
phhhill factor
ChriiisM
We all know their names. 🙂
Expected Output:
karthi
jef moden
steve
gila
rbary young
mat miler
Joe celko
lowel
mark
bit bucket
grant fritchey
phil factor
ChrisM
For no, I just removed all the multiple letters into one. But this may also cause some issue if some one has really two letters in their name.
I think we have to handle this one too.
karthik
October 1, 2012 at 11:39 am
Actually, I don't recommend handling this at all. It will break properly spelled names.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 1, 2012 at 1:01 pm
Jeff Moden (10/1/2012)
Actually, I don't recommend handling this at all. It will break properly spelled names.
+1
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
October 1, 2012 at 3:31 pm
Jeff Moden (10/1/2012)
Actually, I don't recommend handling this at all. It will break properly spelled names.
+1
Just because you can do something doesn't mean you should. Doing this sort of data validation within the database will only result in a mess. Even if you have a valid list of names to check against and write the code to "figure out" what they really meant, the chance would remain that two distinct names could be identical after all duplicate letters are removed.
Furthermore, it's not a good architectural philosophy. The application that is the transit between the human who can't spell and the database is the appropriate location for the validation. That is where the 'did you really mean X?' conversation can reliably occur. It is then the database's job to store that data and retrieve it on demand.
October 1, 2012 at 3:33 pm
+1
GIGO:w00t:
October 2, 2012 at 2:02 am
While I agree its not a practical solution, as there are names (Yvonne, Pattinson, Lee, off the top of my head) where consecutive characters can be the same and so you will be destroying potentially valid names
However, I can see its use as a test question in a technical interview for an SQL developer just to see how they think and solve a complex SQL problem.
_________________________________________________________________________
SSC Guide to Posting and Best Practices
October 2, 2012 at 6:59 am
Jason-299789 (10/2/2012)
While I agree its not a practical solution, as there are names (Yvonne, Pattinson, Lee, off the top of my head) where consecutive characters can be the same and so you will be destroying potentially valid namesHowever, I can see its use as a test question in a technical interview for an SQL developer just to see how they think and solve a complex SQL problem.
I try not to ask "oolies" during an interview. Asking improbable questions just ticks off the good ones and makes them think the interviewer is just showing off. Ask practical questions. There are plenty of them.
If you want to break the ice on "the next level" of T-SQL programming, explain how important counting is in T-SQL and then ask them to write a script that will count from 1 to 100. You'll be amazed at how many people still resort to a While Loop or a (gasp!) Recursive CTE.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 2, 2012 at 7:20 am
As always a Fair point Jeff.
_________________________________________________________________________
SSC Guide to Posting and Best Practices
October 2, 2012 at 10:07 pm
I do agree all your points. I just thought this one while reading/replying to my another post. As all of you mentioned, it will break the original name into some useless one. I wanted to know the different opinion on this topic from the sql masters. Thats all the intention 🙂
Jeff,
Coming back to your question, count 1 to 100,
select sum(n) from tally where n < 101
am i right 🙂
karthik
October 2, 2012 at 11:09 pm
karthik M (10/2/2012)
Jeff,
Coming back to your question, count 1 to 100,
select sum(n) from tally where n < 101
am i right 🙂
How about:
SELECT 50*101
instead?
Seriously, I don't think that was the answer to Jeff's question, but if it was mine's probably faster. 😀
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 3, 2012 at 12:59 am
Karthick,
Jeff was making valid point about about how to create the Tally table in the first place, most people will write a while loop, or a recursive CTE rather than using a cross join between two of the sys tables in the db.
I suspect Jeff wants to see the developer do the cross join rather than the CTE or a While loop. 😀
_________________________________________________________________________
SSC Guide to Posting and Best Practices
October 3, 2012 at 3:23 am
yeah..n(n+1)/2..
karthik
October 3, 2012 at 4:45 am
karthik M (10/2/2012)
Coming back to your question, count 1 to 100,select sum(n) from tally where n < 101
am i right 🙂
Apparently you didn't run your own code to test to see if it does the job, Karthik.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 3, 2012 at 4:48 am
Jason-299789 (10/3/2012)
Karthick,Jeff was making valid point about about how to create the Tally table in the first place, most people will write a while loop, or a recursive CTE rather than using a cross join between two of the sys tables in the db.
I suspect Jeff wants to see the developer do the cross join rather than the CTE or a While loop. 😀
Or a cascading cross-joined CTE Itzik Ben-Gan style. Or a SELECT from a Tally table.
Bonus points if the Developer says, "It Depends" and then demonstrates and explains all 3 without further prompting. Even more points for the tangent of using IDENTITY vs ISNULL(ROW_NUMBER()) on the creation of a physical Tally Table.
If you know this stuff, you should be able to talk about it spontaneously including some pros and cons. If you've only memorized it, you might only come up with the cross join and offer no extra information without me having to ask a thousand questions to find out what you know.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 3, 2012 at 5:30 am
I've answered the counting question before. I didn't get the job in the end, but I suspect it was due to experience rather than interview technique (when I asked them if there was anything specific that had stopped them from hiring me, I didn't get an actual answer). Anyway, this is how I answered the counting question: -
--== TALLY ==--
SELECT N
FROM Tally
WHERE N > 0 AND N <= 100;
Which would do an index seek on what would hopefully be an index with a fill factor of 100 since the data in the table should be static.
If there isn't already a numbers or tally table defined, I'd prefer to add one as they can be extremely useful in solving lots of different issues.
If, for one reason or another, adding a tally or numbers table is not possible then we could do it in a few ways.
The most common answer is probably to do a loop. This is also the worst performing solution, as SQL is designed to be set-based. However, in the interests of completeness: -
--== WHILE LOOP ==--
DECLARE @HOLDER TABLE (N INT);
DECLARE @COUNTER INT = 0;
WHILE @COUNTER < 100
BEGIN
SET @COUNTER = @COUNTER + 1;
INSERT INTO @HOLDER
SELECT @COUNTER;
END
SELECT N
FROM @HOLDER;
The next way to do it would be a recursive CTE. I've found these to be as bad as loops for performance scaling, so I wouldn't generally advise that they be used. Again, in the interest of completeness: -
--== RECURSIVE CTE ==--
WITH CTE(N) AS (
SELECT 1
UNION ALL
SELECT N+1
FROM CTE
WHERE N+1 <= 100)
SELECT N
FROM CTE;
Finally, we could use a cross join method to get the cartesian product of a set of 10, then use the row number function. Either we could use the method that has been made popular by Ben-Gan: -
--== CASCADING CROSS JOIN ==--
WITH CTE(N) AS (SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))a(N)),
CTE2(N) AS (SELECT 1 FROM CTE X, CTE Y)
SELECT ROW_NUMBER() OVER(ORDER BY(SELECT NULL))
FROM CTE2;
Or, since we know it's only 100, we could just do a cross join on two sets of 10: -
--== CROSS JOIN ==--
SELECT ROW_NUMBER() OVER(ORDER BY(SELECT NULL))
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))a(N)
CROSS JOIN (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))b(N);
So, my preference would be a numbers or tally table, but if I can't use one then I'd go with a cross join method.
Looking back, perhaps I should've gone into the creation of a tally table. Never mind, if I'd got that job then I wouldn't have my current position which I'm enjoying a lot 😀
Viewing 15 posts - 1 through 15 (of 40 total)
You must be logged in to reply to this topic. Login to reply