Multipe letters into one

Question

Post reply

Multipe letters into one

karthik M

One Orange Chip

Points: 29139
More actions
October 1, 2012 at 11:21 am

#268335

All,
I have read and asked lot of questionss about "Multiple space into one" here. I just thought the below scenario which came suddenly in my mind.
Incase if the user entered wrongly or if the file contain wrong data as below
kaarthi
jefff moden
steevve
gillla
barrry youuung
mattt milller
Joeee celkko
lowweel
maaark
biiit buucket
graant fritchey
phhhill factor
ChriiisM
We all know their names. 🙂
Expected Output:
karthi
jef moden
steve
gila
rbary young
mat miler
Joe celko
lowel
mark
bit bucket
grant fritchey
phil factor
ChrisM
For no, I just removed all the multiple letters into one. But this may also cause some issue if some one has really two letters in their name.
I think we have to handle this one too.
karthik

Viewing 15 posts - 1 through 15 (of 41 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 1

Actually, I don't recommend handling this at all. It will break properly spelled names.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 2

Jeff Moden (10/1/2012)
Actually, I don't recommend handling this at all. It will break properly spelled names.

+1

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Ben Teraberry SSChampion Points: 11845 More actions · Answer 3

Jeff Moden (10/1/2012)
Actually, I don't recommend handling this at all. It will break properly spelled names.

+1

Just because you can do something doesn't mean you should. Doing this sort of data validation within the database will only result in a mess. Even if you have a valid list of names to check against and write the code to "figure out" what they really meant, the chance would remain that two distinct names could be identical after all duplicate letters are removed.

Furthermore, it's not a good architectural philosophy. The application that is the transit between the human who can't spell and the database is the appropriate location for the validation. That is where the 'did you really mean X?' conversation can reliably occur. It is then the database's job to store that data and retrieve it on demand.

└> bt

Forum Etiquette: How to post data/code on a forum to get the best help[/url]

Ray M SSC-Insane Points: 21093 More actions · Answer 4

Ray M

SSC-Insane

Points: 21093

October 1, 2012 at 3:33 pm

#1544168

+1

GIGO:w00t:

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 5

While I agree its not a practical solution, as there are names (Yvonne, Pattinson, Lee, off the top of my head) where consecutive characters can be the same and so you will be destroying potentially valid names

However, I can see its use as a test question in a technical interview for an SQL developer just to see how they think and solve a complex SQL problem.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 6

Jason-299789 (10/2/2012)
While I agree its not a practical solution, as there are names (Yvonne, Pattinson, Lee, off the top of my head) where consecutive characters can be the same and so you will be destroying potentially valid names
However, I can see its use as a test question in a technical interview for an SQL developer just to see how they think and solve a complex SQL problem.

I try not to ask "oolies" during an interview. Asking improbable questions just ticks off the good ones and makes them think the interviewer is just showing off. Ask practical questions. There are plenty of them.

If you want to break the ice on "the next level" of T-SQL programming, explain how important counting is in T-SQL and then ask them to write a script that will count from 1 to 100. You'll be amazed at how many people still resort to a While Loop or a (gasp!) Recursive CTE.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 7

As always a Fair point Jeff.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

karthik M One Orange Chip Points: 29139 More actions · Answer 8

I do agree all your points. I just thought this one while reading/replying to my another post. As all of you mentioned, it will break the original name into some useless one. I wanted to know the different opinion on this topic from the sql masters. Thats all the intention 🙂

Jeff,

Coming back to your question, count 1 to 100,

select sum(n) from tally where n < 101

am i right 🙂

karthik

Dwain Camps SSC Guru Points: 86978 More actions · Answer 9

karthik M (10/2/2012)
Jeff,
Coming back to your question, count 1 to 100,
select sum(n) from tally where n < 101
am i right 🙂

How about:

SELECT 50*101

instead?

Seriously, I don't think that was the answer to Jeff's question, but if it was mine's probably faster. 😀

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 10

Karthick,

Jeff was making valid point about about how to create the Tally table in the first place, most people will write a while loop, or a recursive CTE rather than using a cross join between two of the sys tables in the db.

I suspect Jeff wants to see the developer do the cross join rather than the CTE or a While loop. 😀

_________________________________________________________________________
SSC Guide to Posting and Best Practices

karthik M One Orange Chip Points: 29139 More actions · Answer 11

karthik M

One Orange Chip

Points: 29139

October 3, 2012 at 3:23 am

#1544746

yeah..n(n+1)/2..

karthik

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 12

karthik M (10/2/2012)
Coming back to your question, count 1 to 100,
select sum(n) from tally where n < 101
am i right 🙂

Apparently you didn't run your own code to test to see if it does the job, Karthik.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 13

Jason-299789 (10/3/2012)
Karthick,
Jeff was making valid point about about how to create the Tally table in the first place, most people will write a while loop, or a recursive CTE rather than using a cross join between two of the sys tables in the db.
I suspect Jeff wants to see the developer do the cross join rather than the CTE or a While loop. 😀

Or a cascading cross-joined CTE Itzik Ben-Gan style. Or a SELECT from a Tally table.

Bonus points if the Developer says, "It Depends" and then demonstrates and explains all 3 without further prompting. Even more points for the tangent of using IDENTITY vs ISNULL(ROW_NUMBER()) on the creation of a physical Tally Table.

If you know this stuff, you should be able to talk about it spontaneously including some pros and cons. If you've only memorized it, you might only come up with the cross join and offer no extra information without me having to ask a thousand questions to find out what you know.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Cadavre SSC-Forever Points: 41690 More actions · Answer 14

I've answered the counting question before. I didn't get the job in the end, but I suspect it was due to experience rather than interview technique (when I asked them if there was anything specific that had stopped them from hiring me, I didn't get an actual answer). Anyway, this is how I answered the counting question: -

That depends on whether or not you have a numbers or tally table already defined. If you do then: -

--== TALLY ==--

SELECT N

FROM Tally

WHERE N > 0 AND N <= 100;

Which would do an index seek on what would hopefully be an index with a fill factor of 100 since the data in the table should be static.

If there isn't already a numbers or tally table defined, I'd prefer to add one as they can be extremely useful in solving lots of different issues.

If, for one reason or another, adding a tally or numbers table is not possible then we could do it in a few ways.

The most common answer is probably to do a loop. This is also the worst performing solution, as SQL is designed to be set-based. However, in the interests of completeness: -

--== WHILE LOOP ==--

DECLARE @HOLDER TABLE (N INT);

DECLARE @COUNTER INT = 0;

WHILE @COUNTER < 100

BEGIN

SET @COUNTER = @COUNTER + 1;

INSERT INTO @HOLDER

SELECT @COUNTER;

END

SELECT N

FROM @HOLDER;

The next way to do it would be a recursive CTE. I've found these to be as bad as loops for performance scaling, so I wouldn't generally advise that they be used. Again, in the interest of completeness: -

--== RECURSIVE CTE ==--

WITH CTE(N) AS (

SELECT 1

UNION ALL

SELECT N+1

FROM CTE

WHERE N+1 <= 100)

SELECT N

FROM CTE;

Finally, we could use a cross join method to get the cartesian product of a set of 10, then use the row number function. Either we could use the method that has been made popular by Ben-Gan: -

--== CASCADING CROSS JOIN ==--

WITH CTE(N) AS (SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))a(N)),

CTE2(N) AS (SELECT 1 FROM CTE X, CTE Y)

SELECT ROW_NUMBER() OVER(ORDER BY(SELECT NULL))

FROM CTE2;

Or, since we know it's only 100, we could just do a cross join on two sets of 10: -

--== CROSS JOIN ==--

SELECT ROW_NUMBER() OVER(ORDER BY(SELECT NULL))

FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))a(N)

CROSS JOIN (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))b(N);

So, my preference would be a numbers or tally table, but if I can't use one then I'd go with a cross join method.

Looking back, perhaps I should've gone into the creation of a tally table. Never mind, if I'd got that job then I wouldn't have my current position which I'm enjoying a lot 😀

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/