June 14, 2010 at 10:31 am
how on earth you can show in this table that the student did not take any of tests?
Remember, this is a challenge to get the answerer / [students] to think.
Of course you can't tell zero tests were taken from the student test table ... so what do you need to do instead to answer the original request?
Scott Pletcher, SQL Server MVP 2008-2010
June 14, 2010 at 10:34 am
Easy, kick off the job that loads the data during the week-end at 4 PM friday rather than "off" hours. :hehe:
June 14, 2010 at 1:18 pm
Almost right. Required data was only available on Monday morning but senior BA manager required that it should be the Friday report. So, the report is generated on Monday but dated as Friday report. BA's knew the issue from the start, (it was third-party data which arrive on Sunday and that fact could not be changed) and when he was shown "The Firday" report the first question was: "How we managed to get data in it?". Must to say that the "business" was also very impressed, as this BA promised them that, in short future, small modification in e-mail program will be made to send out this report on Firday as well :w00t:.
I left this company before this "modification" was made. I guess they changed the date in the e-mail source header to make it Friday email :-D. And who cares, business always finishes half an hour before EoB on Friday in this nice place. But when they back on Monday, they will see "The Friday" report in their inbox :-D:-D:-D
Army is the place for challenge: They carry on round objects and roll the square ones :hehe:
June 14, 2010 at 3:06 pm
An interesting quote in a (semi-)recent article from Joe Celko:
Here is a characteristic for good SQL programming: do as much work as you can in ONE statement, so the optimizer can have as much information as possible.
[Emphasis on "ONE" not added by me, it was in the original.]
From the article:
http://www.simple-talk.com/sql/t-sql-programming/celkos-sql-stumper-eggs-in-one-basket/
Some more basic info on Celko, from the same article:
"
Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
"
Scott Pletcher, SQL Server MVP 2008-2010
June 14, 2010 at 8:05 pm
scott.pletcher (6/14/2010)
An interesting quote in a (semi-)recent article from Joe Celko:Here is a characteristic for good SQL programming: do as much work as you can in ONE statement, so the optimizer can have as much information as possible.
[Emphasis on "ONE" not added by me, it was in the original.]
From the article:
http://www.simple-talk.com/sql/t-sql-programming/celkos-sql-stumper-eggs-in-one-basket/
Some more basic info on Celko, from the same article:
"
Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
"
Oddly enough, I just proved that dead wrong again today (heh... like most every day) at work. The use of interim-answer temp tables took the duration and cpu usage on a given sproc down from almost 3 seconds to 90 MILLI-seconds and the number of reads down from 244,000 to only 2,000 on a join between 3 tables the largest of which containing 3.3 million rows.
Maybe you can get away with all-in-one queries in Oracle (you can't, really...) but "Divide'n'Conquer" works super fast in SQL Server. Like I said earlier, "Set Based" does NOT mean "All-in-one-query" and neither does high performance code.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 14, 2010 at 8:49 pm
scott.pletcher (6/14/2010)
Hmmm... perhaps I should start a new challenge... I'll write some code and we'll have people figure out how the requirements should have been stated
Sigh. Consistently all you want to offer is pompous air.
Perhaps you could a few more self-important quotes to your signature so that you could acutally take up a whole page with every post.
Heh... all those personal attacks just because I have an opinion and experience contrary to yours. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
June 14, 2010 at 9:35 pm
john-902052 (6/14/2010)
All,I wasn't trying to start a fight. I was just wondering if there was a good hearted individual that would be willing to help me learn to think through things so that I could learn. I am a martial arts instructor and if I were to tell my students to simply read they would never advance but when I have them study and I show them the reason behind the technique they learn very rapidly.
JD
My apologies, John. And you're correct about being shown a technique and having an explanation as to why and how. I believe the same techniques are applicable to SQL Server and just about everything else in life in general. The advantage with SQL Server and this site is that you can have more than one opinion, more than one mentor, more than one technique, and more than one explanation.
For posted problems, sometimes only one person will reply. What you have to understand is that many will see that reply. Silence by the others frequently means that it's a good, high performance solution. Just like many martial arts techniques, if an explanation is lacking it is sometimes best to make the realization of "why" through self study where you not only learn the "why" but you also learn how to find the resources to explain the how. Again, as with martial arts, once you learn the ropes you'll begin to develop your own techniques and possibly surpass your mentor.
For other posted problems, you may get several answers. Heh... again, just like in martial arts, there are those in the school who are ready to teach and help whenever and wherever they can. There are even some friendly rivalries and a lot can be learned when two or more such rivals go at a problem. There are also so those not so friendly rivalries and not so friendly people but much can still be learned even from watching a seeming dog pile... as with martial arts, not everything learned will be just about SQL Server.
With all that being said, my suggestion is to not limit yourself to the opinions of a single mentor even if that mentor is perceived to be very good whether the mentor be in person or by writings. Be it martial arts or SQL Server, if you learn the best of multiple styles, no person knowing a single style will be able to touch you. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
June 15, 2010 at 12:44 am
Hello again, John,
Here’s an example of what I meant in the previous post. Hopefully, you'll understand why it's good to have more than one mentor.
Using the original problem description, many folks would solve it using code similar to the following.
/**********************************************************************************************************************
Original problem:
A studentgrade table has one row per student and test. Each student may have taken anywhere from 0 to 3 (never more)
tests.
studentgrade --> (studid int, test# int, score int)
Write a single query -- no subqueries, joins, CTEs, etc. -- that lists:
studentid
# of tests taken
hi score (if only one test, it shows here)
mid score (if only two tests, 2nd shows here; if only one test, will be null)
low score (will only be non-null if all 3 tests are found for that student)
**********************************************************************************************************************/
--===== Create and populate the test table.
-- Note that the student hasn't been told to include the test data in their answer so the student wouldn't turn
-- in the following code to the instructor.
CREATE TABLE studentgrade (studid INT, test# INT, score INT)
INSERT INTO studentgrade (studid, test#, score) VALUES (1, 1, 50)
INSERT INTO studentgrade (studid, test#, score) VALUES (2, 1, 25)
INSERT INTO studentgrade (studid, test#, score) VALUES (2, 2, 75)
INSERT INTO studentgrade (studid, test#, score) VALUES (3, 1, 30)
INSERT INTO studentgrade (studid, test#, score) VALUES (3, 2, 60)
INSERT INTO studentgrade (studid, test#, score) VALUES (3, 3, 90)
--===== Solve the problem
SELECT studid,
COUNT(score) AS [# of tests taken],
MAX(score) AS [hi score],
CASE COUNT(score)
WHEN 3 THEN SUM(score)-MIN(score)-MAX(score)
WHEN 2 THEN MIN(score)
END AS [mid score],
CASE COUNT(score)
WHEN 3 THEN MIN(score)
END AS [low score]
FROM studentgrade
GROUP BY studid
ORDER BY studid
;That would produce the following supposedly “correct” result, the student would get an “A” for the assignment, and everyone would be happy. They’d all be wrong, but they’d all be happy. 😉
studid # of tests taken hi score mid score low score
1 1 50 NULL NULL
2 2 75 25 NULL
3 3 90 60 30
(3 row(s) affected)
Soooooo… what’s the problem with that? As well intentioned as the problem is, why do I say it’s not a correct answer?
First, the students are left up to their own devices as to how to make the data and they’re not required to turn in any data creation code. As a result, the instructor never gets to see the data creation code and never gets the opportunity to explain why using such a method (INSERT/VALUES) may be wrong in so many ways. It’s RBAR, which will make it comparatively slow and shouldn't be used in the real world. It also doesn’t contain all the “tests” identified in the original problem which states that “Each student may have taken anywhere from 0 to 3 (never more) tests.” with emphasis on the “0”. Since the student didn’t test for the “0” requirement, the instructor also doesn’t get to see what happens if you do have data for students that have no scores especially since the table allows it (columns are not NOT NULL).
For example, if we use the following data, we’ll see a nice little warning come up.
--===== Create and populate the test table.
CREATE TABLE studentgrade (studid INT, test# INT, score INT)
INSERT INTO studentgrade
(studid, test#, score)
SELECT 1, 1, 50 UNION ALL
SELECT 2, 1, 25 UNION ALL
SELECT 2, 2, 75 UNION ALL
SELECT 3, 1, 30 UNION ALL
SELECT 3, 2, 60 UNION ALL
SELECT 3, 3, 90 UNION ALL
SELECT 4, 1, 50 UNION ALL
SELECT 4, NULL, NULL UNION ALL
SELECT 4, NULL, NULL UNION ALL
SELECT 5, 1, 25 UNION ALL
SELECT 5, 2, 75 UNION ALL
SELECT 5, NULL, NULL UNION ALL
SELECT 6, NULL, NULL UNION ALL
SELECT 7, NULL, NULL UNION ALL
SELECT 7, NULL, NULL UNION ALL
SELECT 8, NULL, NULL UNION ALL
SELECT 8, NULL, NULL UNION ALL
SELECT 8, NULL, NULL UNION ALL
SELECT 9, NULL, NULL UNION ALL
SELECT 9, 2, NULL UNION ALL
SELECT 9, NULL, 50
;
Here are the results. Notice the warning...
Studid # of tests taken hi score mid score low score
1 1 50 NULL NULL
2 2 75 25 NULL
3 3 90 60 30
4 1 50 NULL NULL
5 2 75 25 NULL
6 0 NULL NULL NULL
7 0 NULL NULL NULL
8 0 NULL NULL NULL
9 1 50 NULL NULL
Warning: Null value is eliminated by an aggregate or other SET operation.
(9 row(s) affected)
Big deal, right? It’s just a little warning. But, that warning could possibly put the screws to a GUI or a file output. The GUI would see it as a separate result set and possibly fire off some error code because it was expecting a single return instead of two. Unbeknownst to many, the same goes for the row count that’s displayed. It’s a separate return and a GUI could mistake it for an error and it’s not likely that a file spec will include such a thing, either. Because the requirement is to do it all in “a single query -- no subqueries, joins, CTEs, etc”, the opportunity is missed to teach that the devil is in the data and that every script or stored procedure should probably have SET NOCOUNT ON as the very first thing to do.
The opportunity to teach about proper table construction (the NOT NULL thing in this case) was missed and so was the opportunity to talk about the pro’s and con’s of suppressing ANSI_WARNINGS especially if in a distributed environment (which is VERY real world).
Because of the do it all in “a single query -- no subqueries, joins, CTEs, etc” requirement, several other real world opportunities are missed. One of the opportunities is the opportunity to suppress the warning about NULLs being eliminated without actually using SET ANSI_WARNINGS OFF which would play havoc in a distributed environment. The other missed opportunity is one of having a bit of forethought to prevent “time bombs” in the code. For example, WHAT IF the requirements were to change from 3 tests to 5 tests next year? Because the original code was very specifically written to only meet the spec of never having more than 3 exams, all hell will break loose because the code isn’t easily salvageable to handle 5 tests. I’m not talking about “adding scope” to the problem. I’m talking about writing good, easy-to-modify code. For example, if imagination weren’t suppressed by the “single query” requirement, someone might write some code like the following.
SET NOCOUNT ON
;
WITH
cteGradeRank AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY studid ORDER BY studid ,score DESC) AS GradeRank,
studid,
Score AS Score
FROM studentgrade
)
SELECT studid,
SUM(CASE WHEN Score >= 0 THEN 1 ELSE 0 END) AS [# of tests taken],
MAX(CASE WHEN GradeRank = 1 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) [hi score],
MAX(CASE WHEN GradeRank = 2 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) [mid score],
MAX(CASE WHEN GradeRank = 3 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) [low score]
FROM cteGradeRank
GROUP BY studid
;
The result of the above code using the previous test data looks like this…
studid # of tests taken hi score mid score low score
----------- ---------------- ---------- ---------- ----------
1 1 50
2 2 75 25
3 3 90 60 30
4 1 50
5 2 75 25
6 0
7 0
8 0
9 1 50
Notice, no NULL warnings and no row counts are present to confuse a GUI or mar a file output. Let’s see how easy it is to modify that code for 5 tests.
SET NOCOUNT ON
;
--===== Create and populate the test table but with up to 5 tests.
CREATE TABLE studentgrade (studid INT, test# INT, score INT)
INSERT INTO studentgrade
(studid, test#, score)
SELECT 1, 1, 50 UNION ALL
SELECT 2, 1, 25 UNION ALL
SELECT 2, 2, 75 UNION ALL
SELECT 3, 1, 30 UNION ALL
SELECT 3, 2, 60 UNION ALL
SELECT 3, 3, 90 UNION ALL
SELECT 4, 1, 50 UNION ALL
SELECT 4, NULL, NULL UNION ALL
SELECT 4, NULL, NULL UNION ALL
SELECT 5, 1, 25 UNION ALL
SELECT 5, 2, 75 UNION ALL
SELECT 5, NULL, NULL UNION ALL
SELECT 6, NULL, NULL UNION ALL
SELECT 7, NULL, NULL UNION ALL
SELECT 7, NULL, NULL UNION ALL
SELECT 8, NULL, NULL UNION ALL
SELECT 8, NULL, NULL UNION ALL
SELECT 8, NULL, NULL UNION ALL
SELECT 9, NULL, NULL UNION ALL
SELECT 9, 2, NULL UNION ALL
SELECT 9, NULL, 50 UNION ALL
SELECT 10, 2, 10 UNION ALL
SELECT 10, 4, 30 UNION ALL
SELECT 10, 8, 60 UNION ALL
SELECT 10, 9, 80 UNION ALL
SELECT 11, 1, 10 UNION ALL
SELECT 11, 2, 30 UNION ALL
SELECT 11, 5, 60 UNION ALL
SELECT 11, 3, 80 UNION ALL
SELECT 11, 8, 80
;
WITH
cteGradeRank AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY studid ORDER BY studid ,score DESC) AS GradeRank,
studid,
Score AS Score
FROM studentgrade
)
SELECT studid,
SUM(CASE WHEN Score >= 0 THEN 1 ELSE 0 END) AS [# of tests taken],
MAX(CASE WHEN GradeRank = 1 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) Score01,
MAX(CASE WHEN GradeRank = 2 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) Score02,
MAX(CASE WHEN GradeRank = 3 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) Score03,
MAX(CASE WHEN GradeRank = 4 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) Score04,
MAX(CASE WHEN GradeRank = 5 THEN ISNULL(CAST(Score AS VARCHAR(10)),'') ELSE '' END) Score05
FROM cteGradeRank
GROUP BY studid
;
Here’s the output from the code above.
studid # of tests taken Score01 Score02 Score03 Score04 Score05
----------- ---------------- ---------- ---------- ---------- ---------- ----------
1 1 50
2 2 75 25
3 3 90 60 30
4 1 50
5 2 75 25
6 0
7 0
8 0
9 1 50
10 4 80 60 30 10
11 5 80 80 60 30 10
All the necessary modifications were done by adding two lines of code (using copy and paste), modifying 5 column titles, and modifying the operand of two WHEN’s. No changes in logic needed to be made. Try making the modification from the original 3 column code that was forced by the “single query” requirement. The very “real world” opportunity to make easily modifiable, easy to support code was missed because of the “single query” requirement. Yes, I know the original requirement may have been designed to force a student into using CASE. I think there are better ways to teach CASE than forcing an non-real world example and that’s why I don’t like such “challenges”… they aren’t “real world” and they don’t give people the opportunity to become creative and, possibly, surpass their mentor.
There are a lot of other missed opportunities in the problem, as well. For example, the name of the given table is “studentgrade”. I suppose that part of the previous teachings could have been to follow best practices and the student is supposed to know enough to name the table using a proper 2 part naming convention such as “dbo.studentgrade” but such challenges on the internet or in books frequently don’t teach that good practice and they don’t demonstrate it by example in the definition of the challenge.
Another missed opportunity… whether it be a challenge on the internet for some sort of prize or a homework problem for a student, column names such as “hi score” in the problem definition pose a real problem. If it’s for a prize, can you name the column according to best practices without a space or will you have to give up on best practices just to meet the exact requirements? The same goes for students. Will the instructor take away points for not following instructions to the “T”? I know that in real life, I’d challenge the designer of the requirements on best practices but will a neophyte be so bold? Lesson lost.
Finally, a huge missed opportunity. Two actually. First, what about performance of code? Has the student or prize competitor been shown a way to create more than a handful of test data in a timely fashion? Likely, not. Lesson lost. As a result, the student will only create a handful of data and the biggest lesson of them all will be lost… that of performance of code because there’s simply not going to be enough data to test with.
Now, assuming that we’re not trying to ram CASE statements down a student’s throat (I’d use a different method than forcing a “single query” requirement which teaches the wrong lesson IMHO anyway), assuming that we do want students to become creative in their code, and assuming that we do want students to learn more than what is taught, “challenges” can still be a great tool but only if they provide a problem that doesn’t require violation of best practices and only if they’re worded correctly.
BWAA-HAAA!!! My word… look at the time! I guess that’s enough for tonight. 😉 I’ll try to get back to this after work tomorrow… errr, today.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 15, 2010 at 1:29 am
Jeff Moden (6/11/2010)
I don't like those challenges because they teach the wrong thing. There are many times where the resolution of a problem using a single query is just flat out the wrong thing to do for scalability and/or performance.
I assume you are talking about Jacob Sebastians here http://beyondrelational.com/blogs/tc/default.aspx.
You are spot on that the single query method can be wrong for a production environment. Period , no arguments against that.
However for the challenge environment it can make things interesting. It forces you to be 'unnatural' , and challenge one or two of your own assumptions on the way.
IMHO it does give good practice in designing a query and mostly importantly , reading and understanding the execution plan. Thats where a challenge can really help your day job.
The amount of time ive spent attempting to remove just that one extra sort operation 🙂
scott.pletcher (6/14/2010)
An interesting quote in a (semi-)recent article from Joe Celko:Here is a characteristic for good SQL programming: do as much work as you can in ONE statement, so the optimizer can have as much information as possible.
[Emphasis on "ONE" not added by me, it was in the original.]
Joe is obviously quite an authority on SQL.
However that doesn't mean that his word is gospel.
Just look at his opposition to surrogate keys or insistence that using the format of YYYY-MM-DD is the correct way to pass dates.
June 15, 2010 at 1:38 am
I guess I'll have to agree to disagree with you on the challenge thing, Dave. And, no... just to be sure, I'm not picking on Jacob's good forum per se. I just don't like the concept because, as you well know (I've seen you do it very well), you can get a whole lot better performance in many cases by using "Divide'n'Conquer" methods.
What I'd really like to see more of are things like the Speed Phreak challenges where there are no limits other than using T-SQL. You can still figure out how to remove that extra sort operation there, too. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
June 15, 2010 at 3:55 am
I agree with Jeff. This sort of challenges, instead of teaching right tecniques and deep understanding of their mechanisms, might lead to the habit of writing code noodles.
I would never write the code for Scott challenge in the way it asked. Actually, I would never expect, in a real life, to be asked to write code in a particular way. Most likely, you will receive requirements from BA and it is your responisbility as professional devloper/consultant to find the optimal solution. That is what you paid for in a real life.
Of cause challenges, like the given by Scott, are good for the brain development, but it is mostly applicable in the school basics: math, science etc. I guess, when you are learning practical subjects (eg. SQL), it is more appropriate to learn how things work in real life, why they work in certain way, what the impacts of practical application of the different tecniques and best practices.
I believe that the given challenge provide nothing more than entertainment - no real benefits in learning SQL - it is my, reserved, opinion.
John, this forum is the best source for learning SQL and its advanced tecniques. Here you will find practical solutions to many different standard and non-standard problems, sometimes with very good discussions and explanations of how and why SQL behaves in a way it does.
Truth Is Born Of Arguments.
June 15, 2010 at 4:02 am
Well, to agree with most of you, those test aren't a real "teacher" IMHO.
A mentor to me, would be a group of resources, not one person.
Use websites like SSC, SQLtips etc, attend regular user group meetings, get involved with the other people there, maybe take on non-paying projects where you assist others...the possibilities are endless!
One single mentor would mean you are going to pay one person alot of money to dedicate their time to your learning path, teaching you everything he/she knows, while there are a whole world of knowledge from different sources at your disposal.
Most important, practice your skills! doing a difficult task once, does not mean you'll get it perfect the next time, plus there is always room for improvement, it's like a Never-Ending SQL Story! 😀
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
This thing is addressing problems that dont exist. Its solution-ism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle
June 15, 2010 at 7:41 am
Hey John:
Speaking as one who has been "schooled" at SSC (by Jeff, but also by many others), the many mentors provided online by SSC is not to be scoffed at. If you think its tough finding a mentor in YOUR area, try working in a small town in Mississippi sometime. Becoming a part of the community that is SSC has given me a chance to share in the knowledge of people from all over the U.S. (and the UK, and S.A., and N.Z., and Italy, and Hyderabad... .) The community is important, because lets face it, people who really understand or even care about database work are few and far between.
What I love most about SSC is the discussions that wind up running long after the author of the question has gotten an answer to his problem and moved on. They can turn into mini-seminars on the nuances of SQL. There are a lot of intelligent people on here, and often they have different perspectives. It's to your benefit to understand those different perspectives, and to learn how they approach problems.
If you have a question, just post it up. If you think it's an elementary question, there are forums just for newbies. I encourage you to join in here, because it is a most excellent resource and a lot of harmless fun.
Good luck to you.
Bob
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
June 15, 2010 at 8:05 am
Jeff, you can over-engineer, I'll give you that. Huge amounts of wasted I/O to no purpose. But you feel good about it, and obviously for you that's all that counts.
Scott Pletcher, SQL Server MVP 2008-2010
June 15, 2010 at 8:37 am
scott.pletcher (6/15/2010)
Jeff, you can over-engineer, I'll give you that. Huge amounts of wasted I/O to no purpose. But you feel good about it, and obviously for you that's all that counts.
Hey Scott, thanks for hanging out here and providing us with your knowledge.
I'm always happy to hear comments and ideas, especially when they provide value to the community (which you can obviously provide as you are an MVP).
Please concentrate your efforts in that direction rather than continuing this peeing contest.
Viewing 15 posts - 31 through 45 (of 54 total)
You must be logged in to reply to this topic. Login to reply