View vs Ad-hoc T-SQL

Question

Post reply

View vs Ad-hoc T-SQL

uncle_boris

SSCommitted

Points: 1588
More actions
August 22, 2011 at 10:18 pm

#246623

Hello, I have a view that basically joins two tables of > 1million rows.
Example Only:
create VIEW TESTVIEW as select a.*, b.* from table1 a join table2 on a.ID = b.ID
clustered index on the ID field
When I do Select * from TESTVIEW where ID = 123 it takes ~44 seconds.
When I just run the ad-hoc T-SQL code in the VIEW it take < 1 second.
Looking at the execution plans I see the VIEW has two high % steps at the beginning using a clustered index scan on both tables and returning a large number of rows before filtering in one of the final steps.
The ad-hoc query used a clustered index seek at the beginning and only returns one row at this point.
So to me anyway it looks like the VIEW returns ALL rows (hence the index scan) first and then does the WHERE clause and returns one row where as the ad-hoc T_SQL does the filter early on??
Is this normal and can this be prevented so the view runs with the same or similar execution plan as the ad-hoc code as I thought a view might have the intelligence to work out the WHERE clause is selective so process that first...but obviously not.
I know a view is not necessarily going to be faster but I am surprised by the difference.
I would be interested to know the logic behind this if anybody know....thanks

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

spaghettidba SSC Guru Points: 105732 More actions · Answer 1

It highly depends on the view definition vs T-SQL code you're running.

Can you post the actual code and actual execution plans?

However, a couple of pointers should apply anyway:

* Are you selecting the same number of columns?

* Are filters identical?

Hope this helps

Gianluca

-- Gianluca Sartori

uncle_boris SSCommitted Points: 1588 More actions · Answer 2

I was copying the exact code from inside the view and just added the "WHERE ID = 123" bit at the end, compared to running the view "select * from TESTVIEW where ID = 123

The view code has a left join between two tables as below.

As mentioned if i add "and ID = 123" on the end of this and run it as a T-SQL statement it filters first and uses an index seek thus finishing in 1 second

If I run the view all rows are returned and then the WHERE filter is applied. And with a lot of fields and CASE statements etc being returned for every row it takes over 40 seconds.

thanks

SELECT [CUSTOMER_ID]

,[Name_Local]

,[Name_Eng]

,[TITLE]

,[FIRST_NAME]

,[MIDDLE_NAME]

,[LAST_NAME]

,[SALUTATION]

,[GENDER]

,[DOB]

,[Marital_Status]

,[BUILDING]

,[BLOCK]

,[STREET_NO]

,[STREET]

,[SUBURB]

,[CITY]

,[FLOOR]

,[POST_CODE]

,[STATE]

,[COUNTRY]

,[Nationality]

,[Occupation]

,[Annual_Income]

,

,[Home_Phone]

,[Work_Phone]

,[Cell_Phone]

,[FAX]

,[Enrolled_Store_ID]

,[Most_Visited_Store_ID]

,[First_Transaction_Date]

,[Privacy]

,[Customer_Membership]

, isnull([Member_Status],'1-Never Used Card') as [Member_Status]

,[FIRST_DATA_ENTRY]

,cast(floor(cast([FIRST_DATA_ENTRY_DATE] as float)) as datetime) [FIRST_DATA_ENTRY_DATE]

,[FIRST_DATA_ENTRY_BY]

,[Contactable_By_Mail]

,[Contactable_By_Email]

,[Contactable_By_SMS]

,[Contactable_By_Phone]

,[Total_Spend_at_Enrolled]

,[Total_TXN_at_Enrolled]

,[Total_Spend_at_Other]

,[Total_TXN_at_Other]

,[Last_Transaction_Date]

,[Last_Transaction_Store_ID]

,[Last_Customer_Level]

,[Last_YTDSpend]

,[Last_Cardnumber]

,[Last_ISO_Cardnumber]

,[Last_Name_On_Card]

,[Last_Current_Points_Balance]

,[Last_Current_Savings_Balance]

,[Last_Current_Debit_Balance]

,[Total_Spend]

,rank()over(order by Total_Spend desc) as Rank_PTD_Spend

,[Total_Spend_Ex_GST]

,[Total_TXN]

,[Total_Credits]

,[Total_Points_Deducted]

,[Total_Savings_Added]

,[Total_Savings_Deducted]

,[Total_Debit_Added]

,[Total_Debit_Deducted]

,[Total_Visits]

,[Points_to_next_Voucher]

,[No_of_Interim_Awards_Issued]

,null as [Migration_flag]

,Enrol_Age

,[Enrol_Age_Group] = CASE

WHEN Enrol_Age <14 THEN '< 14 yrs'

WHEN Enrol_Age between 14 and 17 THEN '14 - 17 yrs'

WHEN Enrol_Age between 18 and 24 THEN '18 - 24 yrs'

WHEN Enrol_Age between 25 and 34 THEN '25 - 34 yrs'

WHEN Enrol_Age between 35 and 49 THEN '35 - 49 yrs'

WHEN Enrol_Age between 50 and 64 THEN '50 - 64 yrs'

WHEN Enrol_Age between 65 and 110 THEN '65 + yrs'

ELSE 'UNKNOWN'

END

,[Current_AGE]

,[CURRENT_AGE_GROUP] = case when [Current_AGE] is not null then

(CASE

WHEN [Current_AGE] <14 THEN '< 14 yrs'

WHEN [Current_AGE] between 14 and 17 THEN '14 - 17 yrs'

WHEN [Current_AGE] between 18 and 24 THEN '18 - 24 yrs'

WHEN [Current_AGE] between 25 and 34 THEN '25 - 34 yrs'

WHEN [Current_AGE] between 35 and 49 THEN '35 - 49 yrs'

WHEN [Current_AGE] between 50 and 64 THEN '50 - 64 yrs'

WHEN [Current_AGE] between 65 and 110 THEN '65 + yrs'

ELSE 'UNKNOWN'

END )

else 'UNKNOWN'

end

,[DNS]

,[DNS_DATE]

,[LifeStage_ID]

, Total_Days_since_Enrol

, Total_Days_Since_Last_Transaction

, Total_Days_Since_Last_visit

,[CUST_SUNDRY_INT_1]

,[CUST_SUNDRY_INT_2]

,isnull([CUST_SUNDRY_VARCHAR_1],0) [CUST_SUNDRY_VARCHAR_1]

,[CUST_SUNDRY_VARCHAR_2]

,[DELETED]

,C2.[Batch_Nbr]

, Batch_no

FROM dbo.[CUSTOMERS_DW] C1

LEFT JOIN dbo.[Customers_PTD_KPI] C2 ON C1.[CUSTOMER_ID] = C2.[CUSTOMER_ID]

where c1.deleted = 0

spaghettidba SSC Guru Points: 105732 More actions · Answer 3

Can you attach the actual execution plans for both statements?

-- Gianluca Sartori

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

Something's not right here... you should be getting an error about not being able to resolve the Customer_ID column because it lives in both tables and you don't have a table alias on it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

uncle_boris SSCommitted Points: 1588 More actions · Answer 5

The last line on the ad-hoc statement (code inside the view run from SSMS) would normally be "where c1.deleted = 0 and C1.CUSTOMER_ID = 123" and when running the VIEW "Select * from TESTVIEW where customer_id = 123"

Unfortunately I do not know the technique or protocol to display two execution plans in this forum so they are viewable.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 6

I'm talking about the first line of the SELECT list.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

uncle_boris SSCommitted Points: 1588 More actions · Answer 7

it should be and was "Select C1.[CUSTOMER_ID]" -- a typo on my behalf here.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 8

uncle_boris (8/23/2011)
Unfortunately I do not know the technique or protocol to display two execution plans in this forum so they are viewable.

Run the code with the "Actual Execution Plan" turned on. Right click the execution plan and save it.

Start a new post here. Before you save the post, scroll down and find the "Edit Attachements" button.

Follow your nose from there. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Or... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

David Lester Hall of Fame Points: 3647 More actions · Answer 10

"When I do Select * from TESTVIEW where ID = 123 it takes ~44 seconds.

When I just run the ad-hoc T-SQL code in the VIEW it take < 1 second"

Boris, when you are selecting from the view, is the where inside the view, or outside? I would imagine that might alter the execution plan if the where was outside the view.

(I am positive someone here with better internals knowledge than I will correct me if I am wrong)

uncle_boris SSCommitted Points: 1588 More actions · Answer 11

Hello, thank you for all the replies.

I have found the issue and like a lot of problems the answer is simple and obvious.

The cause of the difference between running the VIEW and running the VIEW code ad-hoc is caused by the RANK function.

It looks like when running the ad-hoc code the WHEREclause is run first and then the RANK is done on a record set of one -- hence the < 1 second run time.

When running the view and passing in the WHERE clause the RANK is done on all the records (> 1000000) and then the WHERE clause retrieves the one row.

so the respective RANK values end up as:

Ad-hoc

Rank: 1

VIEW

Rank: 9548

even though the queries appears to be the same.

This makes sense to me anyway so an interesting lesson learnt.

Again, thank you for the replies.

spaghettidba SSC Guru Points: 105732 More actions · Answer 12

Glad to hear you solved your problem!

You could have seen it clearly from the execution plans, by the way.

This is why it's so important to always include actual execution plans when posting performance problems.

Regards,

Gianluca

-- Gianluca Sartori

royce.bacon SSC Eights! Points: 990 More actions · Answer 13

royce.bacon

SSC Eights!

Points: 990

August 30, 2011 at 11:03 am

#1375137

.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

royce.bacon (8/30/2011)
I’m not positive on this but I believe what is happening is that when you execute SELECT * FROM TESTVIEW WHERE customer_id = 123 that it is first running the view which has no limits/filters in it therefore it returns all records and then the WHERE is applied to the results from the output of the view. When you execute the dynamic SQL with the WHERE in the SQL it knows to filter on the one customer id immediately.

SQL is not that stupid. The predicate applied to the view will be pushed as far down as possible. Depending on the view, that may be right down to the tables, or it may not, but it depends on what the view is defined as.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass