SQL Code Help

Question

SQL Code Help

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply

Jim Mackenzie Hall of Fame Points: 3728 More actions · Answer 1

Yep I agree with you Drew, that's why I asked how many columns were involved, if it was only a couple it might have been feasible.

What about this way, keeping each case seperate and concatenating the results:

SELECT Emp_No,

HR_ID,

ISNULL(Emp_No_Err,'') + ', ' + ISNULL(HR_ID_Err,'') + ', ' + ISNULL(Emp_Name_Err,'') AS Errors

FROM

(SELECTEmp_No,

HR_ID,

CASE WHEN LEN(Emp_No) < 7 THEN 'Emp_No: Too Short' END AS Emp_No_Err,

CASE WHEN LEN(HR_ID) < 7 THEN 'HR_ID: Too Short' END AS HR_ID_Err,

CASE WHEN Emp_Name = 'Alfred' THEN 'Emp_Name: Alfred' END AS Emp_Name_Err

FROM

(SELECT 12345 AS Emp_No, 'abcdef' AS HR_ID, 'Keith' AS Emp_Name) testData

) errorData

SQL SERVER Central Forum Etiquette[/url]

Jim Mackenzie Hall of Fame Points: 3728 More actions · Answer 2

Sorry, duplicated my post by accident. Will try and delete.

SQL SERVER Central Forum Etiquette[/url]

aarionsql SSCrazy Points: 2392 More actions · Answer 3

Thanks Jim, I will try out your code .. I had a basic CASE statement structure ..

drew.allen SSC Guru Points: 76997 More actions · Answer 4

aarionsql (12/5/2011)
HI Drew,
can I have a join in a CASE statement?

No, but you can have a correlated subquery, a (NOT) IN with a subquery, or a (NOT) EXISTS with a subquery in a CASE function. Depending on your conditions and how often the table is referenced, you may want to use a join or apply in the main query instead anyhow.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

SELECT

Emp_No,

HR_ID,

ErrorChecker.ErrorMsg

FROM

(SELECT 12345 AS Emp_No, 'abcdef' AS HR_ID) testData

CROSS APPLY (

SELECT

d.CommonError,

ErrorMsg =CASE WHEN d.CommonError = 1 THEN 'Emp_No: Too Short, HR_ID: Too Short ' ELSE '' END +

CASE WHEN d.Emp_NoError = 2 THEN 'Emp_No: Too Short ' ELSE '' END +

CASE WHEN d.HR_IDError = 3 THEN 'HR_ID: Too Short ' ELSE '' END

FROM (

SELECT

CommonError = CASE

WHEN LEN(Emp_No) < 7 AND LEN(HR_ID) < 7 THEN 1 ELSE NULL END,

Emp_NoError = CASE

WHEN LEN(Emp_No) < 7 AND LEN(HR_ID) > 6 THEN 2 ELSE NULL END,

HR_IDError = CASE

WHEN LEN(HR_ID) < 7 AND LEN(HR_ID) > 6 THEN 3 ELSE NULL END

) d

) ErrorChecker

Oh, and it's fast, too.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

drew.allen SSC Guru Points: 76997 More actions · Answer 6

ChrisM@Work (12/5/2011)
You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

drew.allen SSC Guru Points: 76997 More actions · Answer 7

CELKO (12/5/2011)
Your approach to SQL is fundamentally wrong.

Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

drew.allen (12/5/2011)
CELKO (12/5/2011)
Your approach to SQL is fundamentally wrong.
Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.
Drew

He's been told that 1000 times already. I guess he's too old or stuburn to learn anything by now.

Too bad for all of us.

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 9

drew.allen (12/5/2011)
CELKO (12/5/2011)
Your approach to SQL is fundamentally wrong.
Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.
Drew

Pffft, it's actually just spam advertising. All he does is make sure his booklist has a higher hitrate for google spiders. Wonder if we should start reporting his posts that way... 'eh, Steve might not love us.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

Evil Kraig F (12/5/2011)
drew.allen (12/5/2011)
CELKO (12/5/2011)
Your approach to SQL is fundamentally wrong.
Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.
Drew
Pffft, it's actually just spam advertising. All he does is make sure his booklist has a higher hitrate for google spiders. Wonder if we should start reporting his posts that way... 'eh, Steve might not love us.

He'd like the feedback and conversation on it.

Of course JC = spam 50% of the time.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 11

drew.allen (12/5/2011)
ChrisM@Work (12/5/2011)
You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.
I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.
Drew

Ouch! Having a bad day, Drew? 😛

Sure it's not the best example, but the point - as stated - was to open up possibilities.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

drew.allen SSC Guru Points: 76997 More actions · Answer 12

ChrisM@Work (12/6/2011)
drew.allen (12/5/2011)
ChrisM@Work (12/5/2011)
You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.
I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.
Drew
Ouch! Having a bad day, Drew? 😛
Sure it's not the best example, but the point - as stated - was to open up possibilities.

It's just inefficient. You have to touch each row twice: once to assign the code and once to assign the actual error message. Sometimes it can be worth trading off efficiency for a gain somewhere else, but I don't see where you're gaining anything here, so you're just sacrificing efficiency.

It also makes your code harder to read, and therefore update, because of the extra layer.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 13

drew.allen (12/6/2011)
ChrisM@Work (12/6/2011)
drew.allen (12/5/2011)
ChrisM@Work (12/5/2011)
You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.
I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.
Drew
Ouch! Having a bad day, Drew? 😛
Sure it's not the best example, but the point - as stated - was to open up possibilities.
It's just inefficient. You have to touch each row twice: once to assign the code and once to assign the actual error message. Sometimes it can be worth trading off efficiency for a gain somewhere else, but I don't see where you're gaining anything here, so you're just sacrificing efficiency.
It also makes your code harder to read, and therefore update, because of the extra layer.
Drew

Point taken.

Now imagine you're writing a query to output cleaned title, firstname, lastname from a table of appallingly bad data. You've set up a shedload of detection and cleaning rules. Some of those cleaning rules will apply to only one column, so a plain CASE WHEN ... THEN ... would be fine. Many however will apply to multiple columns - a single detection rule will dictate changes to both title and firstname, for instance. Why run the expensive detection CASE twice, once for each column?

There's another benefit from this approach too. You can output the detection codes. When you're checking the output for discrepancies you can see which codes are faulty. Trying to figure out which CASE is at fault without this can be extraordinarily difficult. Last but not least - outputting the detection codes allows you to measure which ones are most frequently encountered and put them at the top of the CASE construct, thereby reducing the number of unnecessary checks.

Also - you're not touching the row twice. You're still touching it throughout detection and action.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

drew.allen SSC Guru Points: 76997 More actions · Answer 14

ChrisM@Work (12/6/2011)
Now imagine you're writing a query to output cleaned title, firstname, lastname from a table of appallingly bad data. You've set up a shedload of detection and cleaning rules. Some of those cleaning rules will apply to only one column, so a plain CASE WHEN ... THEN ... would be fine. Many however will apply to multiple columns - a single detection rule will dictate changes to both title and firstname, for instance. Why run the expensive detection CASE twice, once for each column?

Actually, in that situation, I suspect that an OUTER APPLY might actually be more efficient and would almost certainly be easier to read. Something like the following:

SELECT *

FROM YourDirtyDataTable

OUTER APPLY (

SELECT 1 AS ErrorCode1, 'Your specific error' AS ErrorMessage1, 2 AS ErrorCode2

, 'Another error message here' AS ErrorMessage2, <some expression> as Title

, <some other expression> as FirstName

WHERE <your complex detection expression here>

) AS e1

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 15

drew.allen (12/6/2011)
ChrisM@Work (12/6/2011)
Now imagine you're writing a query to output cleaned title, firstname, lastname from a table of appallingly bad data. You've set up a shedload of detection and cleaning rules. Some of those cleaning rules will apply to only one column, so a plain CASE WHEN ... THEN ... would be fine. Many however will apply to multiple columns - a single detection rule will dictate changes to both title and firstname, for instance. Why run the expensive detection CASE twice, once for each column?
Actually, in that situation, I suspect that an OUTER APPLY might actually be more efficient and would almost certainly be easier to read. Something like the following:
SELECT *
FROM YourDirtyDataTable
OUTER APPLY (
SELECT 1 AS ErrorCode1, 'Your specific error' AS ErrorMessage1, 2 AS ErrorCode2
, 'Another error message here' AS ErrorMessage2, <some expression> as Title
, <some other expression> as FirstName
WHERE <your complex detection expression here>
) AS e1
Drew

You're absolutely right - CROSS APPLY is well nice for this - if you exceed SQL Server's expressions limit, you can pop it seamlessly into an iTVF and voila! Error gone, code gone:-D

We'll have to differ on this one Drew. I'm using the method as we speak having started with single CASE expressions, and switching to separate detection/effect has made this project tons easier and far far easier to read.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden