Why does where clause cause query to slow down so much?

  • Hi,

    I have the following query that I'm using to extract some demo report data.

    SELECT

    HOURS.EMPID,

    HOURS.WORKDATE,

    HOURS.MINS,

    CODES.CORRCODEID,

    CODES.CORRCODE,

    CODES.NAME AS [CORRCODE NAME],

    CODES.CWTYPE,

    DEFAULT_STORE.[STORE NAME] AS [DEFAULT STORE NAME],

    DEFAULT_STORE.[STORE ID] AS [DEFAULT STORE ID],

    DEFAULT_STORE.[STORE SQUARE FOOTAGE] AS [DEFAULT STORE SQUARE FOOTAGE],

    ACTUAL_STORE.[STORE NAME] AS [ACTUAL STORE NAME],

    ACTUAL_STORE.[STORE ID] AS [ACTUAL STORE ID],

    ACTUAL_STORE.[STORE SQUARE FOOTAGE] AS [ACTUAL STORE SQUARE FOOTAGE],

    WORKED_DEPT_REC.PARENTID [WORKED_DEPT_PARENTID],

    WORKED_DEPT_REC.NAME [WORKED_DEPT_NAME],

    WORKED_DEPT_REC.COSTCENTRE [WORKED_DEPT_COSTCENTRE],

    WORKED_DEPT_REC.DEPTID [WORKED_DEPT_DEPTID],

    DEFAULT_DEPT_REC.PARENTID [DEFAULT_DEPT_PARENTID],

    DEFAULT_DEPT_REC.NAME [DEFAULTDEPT_DEPT_NAME],

    DEFAULT_DEPT_REC.COSTCENTRE [DEFAULT_DEPT_COSTCENTRE],

    DEFAULT_DEPT_REC.DEPTID [DEFAULT_DEPT_DEPTID],

    DEFAULT_SHIFT.SHIFTID [DEFAULT_SHIFT_SHIFTID],

    DEFAULT_SHIFT.TARGET [DEFAULT_SHIFT_TARGET_MINS],

    DEFAULT_SHIFT.NAME [DEFAULT_SHIFT_NAME],

    ACTUAL_SHIFT.SHIFTID [ACTUAL_SHIFT_SHIFTID],

    ACTUAL_SHIFT.TARGET [ACTUAL_SHIFT_TARGET_MINS],

    ACTUAL_SHIFT.NAME [ACTUAL_SHIFT_NAME]

    --INTO

    --REPORT_DATA

    FROM

    CORRECTI HOURS

    -- GET THE CORRECTIONS CODE DETAILS

    INNER JOIN

    CORRCODE CODES

    ON

    HOURS.CORRCODE = CODES.CORRCODEID

    -- JOIN ON THE EMPLOYEE RECORD

    INNER JOIN

    EMPLOYEE E

    ON

    HOURS.EMPID = E.EMPID

    -- GET THE WORKED DEPT RECORD FROM COMPANY STRUCTURE TABLE

    LEFT OUTER JOIN

    COMPSTRU WORKED_DEPT_REC

    ON

    WORKED_DEPT_REC.DEPTID = ISNULL(HOURS.DEPTID, E.DEPTID)

    -- GET THE DEFAULT DEPT ID FROM THE EMPLOYEE HISTORY

    INNER JOIN

    EMP_HIST DEFAULT_DEPT

    ON

    DEFAULT_DEPT.FINDEX = '-z' AND

    DEFAULT_DEPT.EMPID = HOURS.EMPID AND

    HOURS.WORKDATE BETWEEN DEFAULT_DEPT.EFFECTIVE AND DEFAULT_DEPT.EFFECTIVEEND

    -- GET THE DEFAULT DEPT RECORD FROM COMPANY STRUCTURE TABLE

    LEFT OUTER JOIN

    COMPSTRU DEFAULT_DEPT_REC

    ON

    DEFAULT_DEPT_REC.DEPTID = ISNULL(DEFAULT_DEPT.NEWVALUE, E.DEPTID)

    -- GET THE ROSTER ID FROM THE EMPLOYEE HISTORY

    INNER JOIN

    EMP_HIST DEFAULT_ROSTER

    ON

    DEFAULT_ROSTER.FINDEX = '-x' AND

    DEFAULT_ROSTER.EMPID = HOURS.EMPID AND

    HOURS.WORKDATE BETWEEN DEFAULT_ROSTER.EFFECTIVE AND DEFAULT_ROSTER.EFFECTIVEEND

    -- GET THE DEFAULT SHIFT ID FROM ROSTDAYS

    LEFT OUTER JOIN

    ROSTDAYS RD

    ON

    DEFAULT_ROSTER.NEWVALUE = ISNULL(RD.ROSTERID, E.CLK_ROSTER) AND

    RD.WORKDATE = HOURS.WORKDATE

    LEFT OUTER JOIN

    SHIFTS DEFAULT_SHIFT

    ON

    DEFAULT_SHIFT.SHIFTID = RD.SHIFTID

    -- GET THE ACTUAL SHIFT RECORD FROM BASIC

    LEFT OUTER JOIN

    BASIC B

    ON

    B.WORKDATE = HOURS.WORKDATE AND

    B.EMPID = HOURS.EMPID AND

    B.SUBALLOC = '0'

    LEFT OUTER JOIN

    SHIFTS ACTUAL_SHIFT

    ON

    ACTUAL_SHIFT.SHIFTID = ISNULL(B.SHIFTID, RD.SHIFTID)

    LEFT OUTER JOIN

    (

    SELECT

    STORE.NAME [STORE NAME],

    STORE.DEPTID [STORE ID],

    STORE.OTHER [STORE SQUARE FOOTAGE],

    DEPTS.DEPTID

    FROM

    COMPSTRU STORE

    OUTER APPLY

    dbo.Tesco_udf_Dept_Tree(STORE.DEPTID,1) AS DEPTS

    WHERE

    STORE.PARENTID = '!#'

    ) DEFAULT_STORE

    ON

    DEFAULT_DEPT_REC.DEPTID = DEFAULT_STORE.DEPTID

    LEFT OUTER JOIN

    (

    SELECT

    STORE.NAME [STORE NAME],

    STORE.DEPTID [STORE ID],

    STORE.OTHER [STORE SQUARE FOOTAGE],

    DEPTS.DEPTID

    FROM

    COMPSTRU STORE

    OUTER APPLY

    dbo.Tesco_udf_Dept_Tree(STORE.DEPTID,1) AS DEPTS

    WHERE

    STORE.PARENTID = '!#'

    ) ACTUAL_STORE

    ON

    WORKED_DEPT_REC.DEPTID = ACTUAL_STORE.DEPTID

    It runs quickly enough for my needs. It's just that when I add the following where clause:

    (WHERE DEFAULT_STORE.[STORE ID] <> ACTUAL_STORE.[STORE ID])

    the query ends up going from taking 18 seconds to run to failing after an hour with the tempdb expanding to 50GB and running out of disk space.

    Can someone enlighten me as to why the where clause has such an effect as I'm only filtering on columns already included in the query without the where clause? Also, the actual execution plan changes significantly with the addition of the WHERE.

    Thanks

    Mick

  • It is called SARGability. When you add that where clause it causes a full scan. It has to evaluate every single row to determine if it belongs in the result set.

    You could try splitting this logic into two pieces.

    WHERE DEFAULT_STORE.[STORE ID] > ACTUAL_STORE.[STORE ID]

    OR

    DEFAULT_STORE.[STORE ID] < ACTUAL_STORE.[STORE ID]

    Do a quick google search on SARGability and you will find lots of articles discussing what is happening with your query. Makes sense once you understand it.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply