View vs Ad-hoc T-SQL

  • Hello, I have a view that basically joins two tables of > 1million rows.

    Example Only:

    create VIEW TESTVIEW as select a.*, b.* from table1 a join table2 on a.ID = b.ID

    clustered index on the ID field

    When I do Select * from TESTVIEW where ID = 123 it takes ~44 seconds.

    When I just run the ad-hoc T-SQL code in the VIEW it take < 1 second.

    Looking at the execution plans I see the VIEW has two high % steps at the beginning using a clustered index scan on both tables and returning a large number of rows before filtering in one of the final steps.

    The ad-hoc query used a clustered index seek at the beginning and only returns one row at this point.

    So to me anyway it looks like the VIEW returns ALL rows (hence the index scan) first and then does the WHERE clause and returns one row where as the ad-hoc T_SQL does the filter early on??

    Is this normal and can this be prevented so the view runs with the same or similar execution plan as the ad-hoc code as I thought a view might have the intelligence to work out the WHERE clause is selective so process that first...but obviously not.

    I know a view is not necessarily going to be faster but I am surprised by the difference.

    I would be interested to know the logic behind this if anybody know....thanks

  • It highly depends on the view definition vs T-SQL code you're running.

    Can you post the actual code and actual execution plans?

    However, a couple of pointers should apply anyway:

    * Are you selecting the same number of columns?

    * Are filters identical?

    Hope this helps

    Gianluca

    -- Gianluca Sartori

  • I was copying the exact code from inside the view and just added the "WHERE ID = 123" bit at the end, compared to running the view "select * from TESTVIEW where ID = 123

    The view code has a left join between two tables as below.

    As mentioned if i add "and ID = 123" on the end of this and run it as a T-SQL statement it filters first and uses an index seek thus finishing in 1 second

    If I run the view all rows are returned and then the WHERE filter is applied. And with a lot of fields and CASE statements etc being returned for every row it takes over 40 seconds.

    thanks

    SELECT [CUSTOMER_ID]

    ,[Name_Local]

    ,[Name_Eng]

    ,[TITLE]

    ,[FIRST_NAME]

    ,[MIDDLE_NAME]

    ,[LAST_NAME]

    ,[SALUTATION]

    ,[GENDER]

    ,[DOB]

    ,[Marital_Status]

    ,[BUILDING]

    ,[BLOCK]

    ,[STREET_NO]

    ,[STREET]

    ,[SUBURB]

    ,[CITY]

    ,[FLOOR]

    ,[POST_CODE]

    ,[STATE]

    ,[COUNTRY]

    ,[Nationality]

    ,[Occupation]

    ,[Annual_Income]

    ,

    ,[Home_Phone]

    ,[Work_Phone]

    ,[Cell_Phone]

    ,[FAX]

    ,[Enrolled_Store_ID]

    ,[Most_Visited_Store_ID]

    ,[First_Transaction_Date]

    ,[Privacy]

    ,[Customer_Membership]

    , isnull([Member_Status],'1-Never Used Card') as [Member_Status]

    ,[FIRST_DATA_ENTRY]

    ,cast(floor(cast([FIRST_DATA_ENTRY_DATE] as float)) as datetime) [FIRST_DATA_ENTRY_DATE]

    ,[FIRST_DATA_ENTRY_BY]

    ,[Contactable_By_Mail]

    ,[Contactable_By_Email]

    ,[Contactable_By_SMS]

    ,[Contactable_By_Phone]

    ,[Total_Spend_at_Enrolled]

    ,[Total_TXN_at_Enrolled]

    ,[Total_Spend_at_Other]

    ,[Total_TXN_at_Other]

    ,[Last_Transaction_Date]

    ,[Last_Transaction_Store_ID]

    ,[Last_Customer_Level]

    ,[Last_YTDSpend]

    ,[Last_Cardnumber]

    ,[Last_ISO_Cardnumber]

    ,[Last_Name_On_Card]

    ,[Last_Current_Points_Balance]

    ,[Last_Current_Savings_Balance]

    ,[Last_Current_Debit_Balance]

    ,[Total_Spend]

    ,rank()over(order by Total_Spend desc) as Rank_PTD_Spend

    ,[Total_Spend_Ex_GST]

    ,[Total_TXN]

    ,[Total_Credits]

    ,[Total_Points_Deducted]

    ,[Total_Savings_Added]

    ,[Total_Savings_Deducted]

    ,[Total_Debit_Added]

    ,[Total_Debit_Deducted]

    ,[Total_Visits]

    ,[Points_to_next_Voucher]

    ,[No_of_Interim_Awards_Issued]

    ,null as [Migration_flag]

    ,Enrol_Age

    ,[Enrol_Age_Group] = CASE

    WHEN Enrol_Age <14 THEN '< 14 yrs'

    WHEN Enrol_Age between 14 and 17 THEN '14 - 17 yrs'

    WHEN Enrol_Age between 18 and 24 THEN '18 - 24 yrs'

    WHEN Enrol_Age between 25 and 34 THEN '25 - 34 yrs'

    WHEN Enrol_Age between 35 and 49 THEN '35 - 49 yrs'

    WHEN Enrol_Age between 50 and 64 THEN '50 - 64 yrs'

    WHEN Enrol_Age between 65 and 110 THEN '65 + yrs'

    ELSE 'UNKNOWN'

    END

    ,[Current_AGE]

    ,[CURRENT_AGE_GROUP] = case when [Current_AGE] is not null then

    (CASE

    WHEN [Current_AGE] <14 THEN '< 14 yrs'

    WHEN [Current_AGE] between 14 and 17 THEN '14 - 17 yrs'

    WHEN [Current_AGE] between 18 and 24 THEN '18 - 24 yrs'

    WHEN [Current_AGE] between 25 and 34 THEN '25 - 34 yrs'

    WHEN [Current_AGE] between 35 and 49 THEN '35 - 49 yrs'

    WHEN [Current_AGE] between 50 and 64 THEN '50 - 64 yrs'

    WHEN [Current_AGE] between 65 and 110 THEN '65 + yrs'

    ELSE 'UNKNOWN'

    END )

    else 'UNKNOWN'

    end

    ,[DNS]

    ,[DNS_DATE]

    ,[LifeStage_ID]

    , Total_Days_since_Enrol

    , Total_Days_Since_Last_Transaction

    , Total_Days_Since_Last_visit

    ,[CUST_SUNDRY_INT_1]

    ,[CUST_SUNDRY_INT_2]

    ,isnull([CUST_SUNDRY_VARCHAR_1],0) [CUST_SUNDRY_VARCHAR_1]

    ,[CUST_SUNDRY_VARCHAR_2]

    ,[DELETED]

    ,C2.[Batch_Nbr]

    , Batch_no

    FROM dbo.[CUSTOMERS_DW] C1

    LEFT JOIN dbo.[Customers_PTD_KPI] C2 ON C1.[CUSTOMER_ID] = C2.[CUSTOMER_ID]

    where c1.deleted = 0

  • Can you attach the actual execution plans for both statements?

    -- Gianluca Sartori

  • Something's not right here... you should be getting an error about not being able to resolve the Customer_ID column because it lives in both tables and you don't have a table alias on it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The last line on the ad-hoc statement (code inside the view run from SSMS) would normally be "where c1.deleted = 0 and C1.CUSTOMER_ID = 123" and when running the VIEW "Select * from TESTVIEW where customer_id = 123"

    Unfortunately I do not know the technique or protocol to display two execution plans in this forum so they are viewable.

  • I'm talking about the first line of the SELECT list.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • it should be and was "Select C1.[CUSTOMER_ID]" -- a typo on my behalf here.

  • uncle_boris (8/23/2011)


    Unfortunately I do not know the technique or protocol to display two execution plans in this forum so they are viewable.

    Run the code with the "Actual Execution Plan" turned on. Right click the execution plan and save it.

    Start a new post here. Before you save the post, scroll down and find the "Edit Attachements" button.

    Follow your nose from there. šŸ˜‰

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Or... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • "When I do Select * from TESTVIEW where ID = 123 it takes ~44 seconds.

    When I just run the ad-hoc T-SQL code in the VIEW it take < 1 second"

    Boris, when you are selecting from the view, is the where inside the view, or outside? I would imagine that might alter the execution plan if the where was outside the view.

    (I am positive someone here with better internals knowledge than I will correct me if I am wrong)

  • Hello, thank you for all the replies.

    I have found the issue and like a lot of problems the answer is simple and obvious.

    The cause of the difference between running the VIEW and running the VIEW code ad-hoc is caused by the RANK function.

    It looks like when running the ad-hoc code the WHEREclause is run first and then the RANK is done on a record set of one -- hence the < 1 second run time.

    When running the view and passing in the WHERE clause the RANK is done on all the records (> 1000000) and then the WHERE clause retrieves the one row.

    so the respective RANK values end up as:

    Ad-hoc

    Rank: 1

    VIEW

    Rank: 9548

    even though the queries appears to be the same.

    This makes sense to me anyway so an interesting lesson learnt.

    Again, thank you for the replies.

  • Glad to hear you solved your problem!

    You could have seen it clearly from the execution plans, by the way.

    This is why it's so important to always include actual execution plans when posting performance problems.

    Regards,

    Gianluca

    -- Gianluca Sartori

  • .

  • royce.bacon (8/30/2011)


    Iā€™m not positive on this but I believe what is happening is that when you execute SELECT * FROM TESTVIEW WHERE customer_id = 123 that it is first running the view which has no limits/filters in it therefore it returns all records and then the WHERE is applied to the results from the output of the view. When you execute the dynamic SQL with the WHERE in the SQL it knows to filter on the one customer id immediately.

    SQL is not that stupid. The predicate applied to the view will be pushed as far down as possible. Depending on the view, that may be right down to the tables, or it may not, but it depends on what the view is defined as.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply