Query performance slows drastically with an Index

  • I'm using SQL2005 x64 (SP2/CTP) dev ed on dual-dual 3ghz machine with 8gb ram. The data files a nicely spread out on raid system on their own spindles. I'm the only user.

    I have a query that is joining rows from a table with 45 million records to a table with with out 1000 rows and uses a single column predicate for left table that results in about 140k rows.

    If I run the query without an Index it does a table scan of the left table and completes the query in about 2 mins and for most of the time use 100% of CPU time avialable.

    If I build a non-clustered index of the left table based on the predicate column and the joining colum, and then include all the other columns covered in the query it uses an index-seek and it take about 30 mins and uses about 13% of CPU.

    The odd thing is is that the execution plan looks as expected for both scenarios: in the no-index version the table scan is at 93%; in the index version the index-seek is about 4% and all the other processes are more evenly distrubuted.

    If anybody has any clues where to start looking I'd be grateful.

  • If you could include the Queries, table schema, and indexes etc. it would make solving this issue much easier.

    Feel free to rename objects and columns if it would be a security risk.  All that matters is the datatypes on the tables.

    Col1 VARCHAR(30),

    Col2 SMALLINT,

    etc.

  • OK, I will do this. It is then end of my working day and I've been flummoxed with this problems for two days now. I just thought I'd ask the question first put I'll add all the DDLs tommorrow.

    Thanks

  • OK. I have slightly progressed on this and appears not to be the index, but a join.

    In query 1, I am joining two tables - one has a list of business addresses, the other a postcode boundary file and the join is on the postcode. The query is selecting just those business that are with the areas defined by the boundary file. This works in less that 1sec. So the indexing seems fine.

    In query 2, I am do the same as query 1, only I am also joining the postcode boundary file a 2nd time but on the businesses previous postcode. The query is then attempting to select only those businesses that have moved into the area. This takes several minutes even with the index.

    QUERY 1

    /* Declare Table variable for the joining table,

    and populate with postal code data from the boundary table [v2_lookup].[dbo].[special_geog_class] */

    DECLARE @table TABLE (

    lkup_postcode VARCHAR(8),

    lkup_area VARCHAR(128))

    INSERT INTO @table

    SELECT postcode,

    area

    FROM V2_LOOKUP.DBO.SPECIAL_GEOG_CLASS

    WHERE SG = 'SGMO1PAT'

    /*

    Select from the core table [BETA]

    where records were not new

    join to the boundary table

    */

    SELECT '02-movesin' ROWNAME,

    LC.lkup_area,

    CL.B_YP_GRP_CODE,

    CL.PRV_YP,

    CL.B_SOURCE,

    CL.CHG_NEW,

    CL.CHG_OLD,

    COUNT(* ) CNT,

    SUM(CL.B_ACTUAL_SIZE) EMP,

    SUM(CL.PRV_ACTUAL_SIZE) PRV_EMP

    FROM __V2_BETA_DEV_R2.DBO.BETA CL

    JOIN @table LC

    ON LC.lkup_postcode = CL.B_POSTCODE

    WHERE CL.CHG_NEW = 'n'

    GROUP BY LC.lkup_area ,--isnull(lc2.lkup_area ,prv_nuts4),

    CL.B_YP_GRP_CODE,CL.PRV_YP,CL.B_SOURCE,CL.CHG_NEW,CL.CHG_OLD

    QUERY 2

    /* Declare Table variable for the joining table,

    and populate with postal code data from the boundary table [v2_lookup].[dbo].[special_geog_class] */

    DECLARE @table TABLE (

    lkup_postcode VARCHAR(8),

    lkup_area VARCHAR(128))

    INSERT INTO @table

    SELECT postcode,

    area

    FROM V2_LOOKUP.DBO.SPECIAL_GEOG_CLASS

    WHERE SG = 'SGMO1PAT'

    /*

    Select from the core table [BETA]

    where records were not new

    join to the boundary table

    */

    SELECT '02-movesin' ROWNAME,

    LC.lkup_area,

    CL.B_YP_GRP_CODE,

    CL.PRV_YP,

    CL.B_SOURCE,

    CL.CHG_NEW,

    CL.CHG_OLD,

    COUNT(* ) CNT,

    SUM(CL.B_ACTUAL_SIZE) EMP,

    SUM(CL.PRV_ACTUAL_SIZE) PRV_EMP

    FROM __V2_BETA_DEV_R2.DBO.BETA CL

    JOIN @table LC

    ON LC.lkup_postcode = CL.B_POSTCODE

    left outer JOIN @table LC2

    ON LC2.lkup_postcode = CL.PRV_PCODE

    WHERE LC.lkup_area != ISNULL(LC2.lkup_area ,PRV_NUTS4) AND

    CL.CHG_NEW = 'n'

    GROUP BY LC.lkup_area ,isnull(lc2.lkup_area ,prv_nuts4),

    CL.B_YP_GRP_CODE,CL.PRV_YP,CL.B_SOURCE,CL.CHG_NEW,CL.CHG_OLD

    DDL for [BETA]

    CREATE TABLE [dbo].[beta](

    [b_file] [varchar](4) NOT NULL,

    [b_name] [varchar](64) NULL,

    [BM_urn] [varchar](16) NULL,

    [BL_urn] [varchar](16) NULL,

    [b_source] [varchar](16) NULL,

    [b_ADDED_date] [varchar](32) NULL,

    [b_AMENDED_date] [varchar](32) NULL,

    [b_actual_size] [int] NULL,

    [b_yp_grp_code] [varchar](8) NULL,

    [b_postcode] [varchar](8) NULL,

    [b_NUTS5] [varchar](10) NULL,

    [b_nuts4] [varchar](7) NULL,

    [b_nuts3] [varchar](5) NULL,

    [b_nuts2] [varchar](4) NULL,

    [b_nuts1] [varchar](3) NULL,

    [b_SOA] [varchar](9) NULL,

    [b_country] [varchar](2) NULL,

    [prv_actual_size] [int] NULL,

    [chg_actual_size] [varchar](1) NULL,

    [prv_YP] [varchar](5) NULL,

    [chg_YP] [varchar](1) NULL,

    [prv_pcode] [varchar](8) NULL,

    [prv_NUTS5] [varchar](10) NULL,

    [prv_NUTS4] [varchar](7) NULL,

    [prv_NUTS3] [varchar](5) NULL,

    [prv_NUTS2] [varchar](4) NULL,

    [prv_NUTS1] [varchar](3) NULL,

    [chg_NUTS5] [varchar](1) NULL,

    [prv_SOA] [varchar](9) NULL,

    [chg_SOA] [varchar](1) NULL,

    [chg_new] [varchar](1) NULL,

    [chg_old] [varchar](1) NULL

    ) ON [PRIMARY]

    DDL for Index of [BETA]

    CREATE NONCLUSTERED INDEX [pcodex2] ON [dbo].[beta]

    (

    [chg_new] ASC,

    [b_postcode] ASC,

    [prv_pcode] ASC,

    [prv_NUTS4] ASC,

    [b_file] ASC,

    [b_actual_size] ASC,

    [prv_actual_size] ASC

    )

    INCLUDE ( [b_source],

    [prv_YP],

    [chg_YP],

    [chg_SOA],

    [chg_old],

    [b_yp_grp_code])

    WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    DDL for Index of [Special_geog_class]

    CREATE NONCLUSTERED INDEX [speg] ON [dbo].[Special_geog_class]

    (

    [sg] ASC,

    ASC,

    [postcode] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF) ON [PRIMARY]

  • For this query, you are likely to get better performance by indexing LC.lkup_postcode, and creating an index on CL.B_POSTCODE and an index on CL.PRV_PCODE.  Adding lkup_area and possibly prv_nuts4 to one or both of these indexes may help.  You cannot explicitly add indexes to table variables, so use a #temp table instead.  http://databases.aspfaq.com/database/should-i-use-a-temp-table-or-a-table-variable.html

    If you get acceptable performance with postcode, you may not want to add the other index if space is a concern.

    Good luck,

    Bill

  • Would getting rid of the temp table altogether help things?  Then you could just put the index on the table if it doesn't already exist.

  • If your __V2_BETA_DEV_R2.DBO.BETA CL retruns less rows when "CL.CHG_NEW = 'n'" then following should be faster...

    Check it out...

    --QUERY 1

    /* Declare Table variable for the joining table,

    and populate with postal code data from the boundary table [v2_lookup].[dbo].[special_geog_class] */

    DECLARE

    @table TABLE (

    lkup_postcode

    VARCHAR(8),

    lkup_area

    VARCHAR(128))

    INSERT

    INTO @table

    SELECT

    postcode,

    area

    FROM

    V2_LOOKUP.DBO.SPECIAL_GEOG_CLASS

    WHERE

    SG = 'SGMO1PAT'

    /*

    Select from the core table [BETA]

    where records were not new

    join to the boundary table

    */

    select

    * into #test FROM __V2_BETA_DEV_R2.DBO.BETA CL

    where

    CL.CHG_NEW = 'n'

    SELECT

    '02-movesin' ROWNAME,

    LC

    .lkup_area,

    CL

    .B_YP_GRP_CODE,

    CL

    .PRV_YP,

    CL

    .B_SOURCE,

    CL

    .CHG_NEW,

    CL

    .CHG_OLD,

    COUNT

    (* ) CNT,

    SUM

    (CL.B_ACTUAL_SIZE) EMP,

    SUM

    (CL.PRV_ACTUAL_SIZE) PRV_EMP

    FROM

    #test CL

    JOIN

    @table LC

    ON

    LC.lkup_postcode = CL.B_POSTCODE

    GROUP

    BY LC.lkup_area ,--isnull(lc2.lkup_area ,prv_nuts4),

    CL

    .B_YP_GRP_CODE,CL.PRV_YP,CL.B_SOURCE,CL.CHG_NEW,CL.CHG_OLD

    --QUERY 2

    /* Declare Table variable for the joining table,

    and populate with postal code data from the boundary table [v2_lookup].[dbo].[special_geog_class] */

    DECLARE

    @table TABLE (

    lkup_postcode

    VARCHAR(8),

    lkup_area

    VARCHAR(128))

    INSERT

    INTO @table

    SELECT

    postcode,

    area

    FROM

    V2_LOOKUP.DBO.SPECIAL_GEOG_CLASS

    WHERE

    SG = 'SGMO1PAT'

    select

    * into #test FROM __V2_BETA_DEV_R2.DBO.BETA CL

    where

    CL.CHG_NEW = 'n'

    /*

    Select from the core table [BETA]

    where records were not new

    join to the boundary table

    */

    SELECT

    '02-movesin' ROWNAME,

    LC

    .lkup_area,

    CL

    .B_YP_GRP_CODE,

    CL

    .PRV_YP,

    CL

    .B_SOURCE,

    CL

    .CHG_NEW,

    CL

    .CHG_OLD,

    COUNT

    (* ) CNT,

    SUM

    (CL.B_ACTUAL_SIZE) EMP,

    SUM

    (CL.PRV_ACTUAL_SIZE) PRV_EMP

    FROM

    #test CL

    JOIN

    @table LC

    ON

    LC.lkup_postcode = CL.B_POSTCODE

    left

    outer JOIN @table LC2

    ON

    LC2.lkup_postcode = CL.PRV_PCODE

    WHERE

    LC.lkup_area != ISNULL(LC2.lkup_area ,PRV_NUTS4)

    GROUP

    BY LC.lkup_area ,isnull(lc2.lkup_area ,prv_nuts4),

    CL

    .B_YP_GRP_CODE,CL.PRV_YP,CL.B_SOURCE,CL.CHG_NEW,CL.CHG_OLD

     

    MohammedU
    Microsoft SQL Server MVP

  • thanks for all of your replies...

    Bill, I'll try your solution later

    Sam, removing the table variable and using the tables direct adds time to the query performance as the initital scan using the predicate is required to be implemented twice, or at least that's my explanation for the increase

    Mohammed, the use of #temp table is marginal because the blockage is occuring with the the left join

    but I'll try and take some of your idea through this weekened and see where we get to

    thanks again and if any body else has any suggestions please lets have them

    Garry

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply