PIVOT taking forever unless temp table var used

Question

Post reply

PIVOT taking forever unless temp table var used

torpkev

SSCarpal Tunnel

Points: 4819

December 19, 2019 at 8:44 pm

#3709255

This is a bit of an odd one for me..

I have a query that returns:

Response_ID UNIQUEIDENTIFIER,

Alias VARCHAR(MAX),

Answer_Raw VARCHAR(MAX)

This query is quick.. returning in less than a second.

If I try and run this:

SELECT  
Response_ID,
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
FROM 
(SELECT Response_ID, Alias, Answer_Raw FROM Survey_Answers WHERE Survey_ID = 2) a
PIVOT 
( 
MAX(Answer_Raw)
FOR Alias IN (
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
) 
) AS PivotTable;

The result can take anywhere from 2-3 minutes.

However, if I put the results from Survey_Answers into a temporary table variable, the result comes back in less than a second again. So while I have a workaround, I'm totally unsure as to WHY, and any hints in the right direction would be appreciated.

DECLARE @tbl TABLE (
Response_ID UNIQUEIDENTIFIER,
Alias VARCHAR(MAX),
Answer_Raw VARCHAR(MAX)
)

INSERT INTO @tbl
SELECT
Response_ID,
Alias,
Answer_Raw
FROM
Survey_Answers
WHERE
Survey_ID = 2

SELECT  
Response_ID,
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
FROM 
@tbl AS SourceTable 
PIVOT 
( 
MAX(Answer_Raw)
FOR Alias IN (
[Gender],
[YoB],
[AgeAtEnroll],
[Ethnicity],
[Race_americanindian],
[Race_asian],
[Race_black],
[Race_nativehawaiian],
[Race_white],
[Race_unknown],
[Race_notreported],
[Handedness],
[Employment],
[EnglishPrimaryLang],
[LearnedEnglishAge],
[EducationLevel],
[MotherEducationLevel],
[SzHistory],
[NumFamilySz],
[FamilySinastrality],
[LearningDisability],
[RepeatGrade],
[SpecialAccomCoursework]
) 
) AS PivotTable;

--------------------------

I long for a job where my databases dont have any pesky users accessing them 🙂

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Brahmanand Shukla SSC Eights! Points: 990 More actions · Answer 1

Your original query refers to Survey_Answers table and fetches only 3 columns and has a WHERE clause too. This has been done as part of subquery. Since you have not shared the execution plan for comparison, hence we cannot comment what really is causing such a huge difference in execution time? At first instance it seems the subquery is an accused but without proof can't declare it as convict. There is a nice article A SUB-QUERY DOES NOT HURT PERFORMANCE by Grant Fritchey.

I have few questions for you :

In the 2nd run of the modified query with the Table Variable, have you also included the time taken for insert into table variable when you got result within a second? In other words, have you executed the INSERT INTO table variable and SELECT with pivot separately or together?
How many times you executed this query? Is it part of any Stored Procedure?

If you can also share the DDL and some dummy data for us to test then we can test it and share our observations.

SQL Server Carpenter - Consultant & Trainer
Author of Learn T-SQL from Scratch

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 2

Post the plans for both queries and it might be possible to suss out what's going on.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 3

What do you get when you write it with straight criteria instead of selecting from the sub-query?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

torpkev SSCarpal Tunnel Points: 4819 More actions · Answer 4

Thanks all - I thought there might just be some super simple reason like PIVOT just works better with in-memory data etc.

This example of demographics data takes < 1 second with the temp table var and approximately 16 seconds without it. Other, larger sets of data take 2-3 seconds and 2-3 minutes respectively.

A little more detail. Survey_Answers is a view that sits in another database on the same SQL instance. The user pulling the data has full access to both databases. Survey_Answers pulls from 12 tables and with full disclosure includes probably far more information than is really needed for this particular PIVOT query. However, it still seems super weird to me that inserting into a table variable from the view AND doing the pivot takes less than a second while just doing the pivot directly on the view takes 16 seconds.

I've attached 4 files

without_temp_tbl_var.sql - This is the query where I don't use the @tbl var. It takes 16 seconds
without_temp_tbl_var.sqlplan - The execution plan for the above
with_temp_tbl_var.sql - The query not using the @tbl var. It takes less than a second
with_temp_tbl_var.sqlplan - The execution plan for the above

I appreciate any direction you guys can give me!

Thanks

--------------------------

I long for a job where my databases dont have any pesky users accessing them 🙂

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 5

torpkev wrote:

Thanks all - I thought there might just be some super simple reason like PIVOT just works better with in-memory data etc.

There might be!!! Did you try what I suggested in my previous post?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

torpkev SSCarpal Tunnel Points: 4819 More actions · Answer 6

Hi Jeff,

Sorry, I might be dim here - I can't use it without the subquery because I need the where clause on Survey_Answers (limited to Survey_ID = 2)

The WHERE clause before the PIVOT doesn't work.. adding the WHERE clause after the PIVOT ran for over a minute before I killed the query.

Though if you can point out where I'm doing it wrong, that'd be great, I'd love to be able to learn something new 🙂

--------------------------

I long for a job where my databases dont have any pesky users accessing them 🙂