December 19, 2019 at 8:44 pm
This is a bit of an odd one for me..
I have a query that returns:
Response_ID UNIQUEIDENTIFIER,
Alias VARCHAR(MAX),
Answer_Raw VARCHAR(MAX)
This query is quick.. returning in less than a second.
If I try and run this:
SELECT
Response_ID,
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
FROM
(SELECT Response_ID, Alias, Answer_Raw FROM Survey_Answers WHERE Survey_ID = 2) a
PIVOT
(
MAX(Answer_Raw)
FOR Alias IN (
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
)
) AS PivotTable;
The result can take anywhere from 2-3 minutes.
However, if I put the results from Survey_Answers into a temporary table variable, the result comes back in less than a second again. So while I have a workaround, I'm totally unsure as to WHY, and any hints in the right direction would be appreciated.
DECLARE @tbl TABLE (
Response_ID UNIQUEIDENTIFIER,
Alias VARCHAR(MAX),
Answer_Raw VARCHAR(MAX)
)
INSERT INTO @tbl
SELECT
Response_ID,
Alias,
Answer_Raw
FROM
Survey_Answers
WHERE
Survey_ID = 2
SELECT
Response_ID,
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
FROM
@tbl AS SourceTable
PIVOT
(
MAX(Answer_Raw)
FOR Alias IN (
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
)
) AS PivotTable;
--------------------------
I long for a job where my databases dont have any pesky users accessing them 🙂
December 20, 2019 at 8:44 am
Your original query refers to Survey_Answers table and fetches only 3 columns and has a WHERE clause too. This has been done as part of subquery. Since you have not shared the execution plan for comparison, hence we cannot comment what really is causing such a huge difference in execution time? At first instance it seems the subquery is an accused but without proof can't declare it as convict. There is a nice article A SUB-QUERY DOES NOT HURT PERFORMANCE by Grant Fritchey.
I have few questions for you :
If you can also share the DDL and some dummy data for us to test then we can test it and share our observations.
December 20, 2019 at 1:39 pm
Post the plans for both queries and it might be possible to suss out what's going on.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 20, 2019 at 1:51 pm
What do you get when you write it with straight criteria instead of selecting from the sub-query?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 20, 2019 at 2:24 pm
Thanks all - I thought there might just be some super simple reason like PIVOT just works better with in-memory data etc.
This example of demographics data takes < 1 second with the temp table var and approximately 16 seconds without it. Other, larger sets of data take 2-3 seconds and 2-3 minutes respectively.
A little more detail. Survey_Answers is a view that sits in another database on the same SQL instance. The user pulling the data has full access to both databases. Survey_Answers pulls from 12 tables and with full disclosure includes probably far more information than is really needed for this particular PIVOT query. However, it still seems super weird to me that inserting into a table variable from the view AND doing the pivot takes less than a second while just doing the pivot directly on the view takes 16 seconds.
I've attached 4 files
I appreciate any direction you guys can give me!
Thanks
--------------------------
I long for a job where my databases dont have any pesky users accessing them 🙂
December 20, 2019 at 3:03 pm
Thanks all - I thought there might just be some super simple reason like PIVOT just works better with in-memory data etc.
There might be!!! Did you try what I suggested in my previous post?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 20, 2019 at 3:16 pm
Hi Jeff,
Sorry, I might be dim here - I can't use it without the subquery because I need the where clause on Survey_Answers (limited to Survey_ID = 2)
The WHERE clause before the PIVOT doesn't work.. adding the WHERE clause after the PIVOT ran for over a minute before I killed the query.
Though if you can point out where I'm doing it wrong, that'd be great, I'd love to be able to learn something new 🙂
--------------------------
I long for a job where my databases dont have any pesky users accessing them 🙂
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply