Can this query be tuned up?! It's going to kill a server

  • Hi All,

    I have this query which is misbehaving badly. At the moment it takes around 3 hours to run, where it used to take 40 minutes.

    When I first ran it, it took 3 hours, so I checked the execution plan, which showed a table scan flagged in red saying some statistics were missing. I ran a simple UPDATE STATISTICS statement on the table and re-ran the query.

    This time it ran for 1.5 hours before I was forced to stop it, as it was grinding everything to a halt.

    SELECT PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103) Startdate,
    COUNT(DISTINCT(PERSONURN)) AS 'BILLABLE CDRS',
    COUNT(*) - COUNT(DISTINCT(PERSONURN)) AS 'COUNTS',
    SUM(CASE WHEN EVENTTYPE & 4 = 4 THEN 1 ELSE 0 END) AS 'CUSTOMER',
    SUM(CASE WHEN EVENTTYPE & 8 = 8 THEN 1 ELSE 0 END) AS 'LAPSED',
    SUM(CASE WHEN EVENTTYPE & 8 <> 8 AND EVENTTYPE & 4 <> 4 AND EVENTTYPE > 0 THEN 1 ELSE 0 END) AS 'OTHER',
    SUM(CASE WHEN EVENTTYPE = 0 THEN 1 ELSE 0 END) AS 'ACTIVE',
    COUNT(*) AS 'TOTAL CDRS'
    FROM PERSONPROFILE20060922 (nolock)
    GROUP BY PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103)

    Index on this table:

     CREATE  UNIQUE  INDEX [index_name] ON [dbo].[PERSONPROFILE20060922]( [EVENTDATE],[PERSONURN], [EventType])

     

    The total number of rows in the table is around 10 million.

    PERSONURN nvarchar

    EventType int

    EVENTDATE datetime

    I have also attached the execution plan from the first run, which took 3 hours. The second time, after I updated the statistics, it wasn't worth letting it complete: it had already run for 1.5 hours without finishing.

    Execution plan for the first run, which took over 3 hours:


    Kindest Regards,

    John Burchel (Trainee Developer)

  •  

    Here are my suggestions.

    SELECT PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103) Startdate,
    COUNT(DISTINCT(PERSONURN)) AS 'BILLABLE CDRS',
    COUNT(*) - COUNT(DISTINCT(PERSONURN)) AS 'COUNTS',
    SUM(CASE WHEN EVENTTYPE & 4 = 4 THEN 1 ELSE 0 END) AS 'CUSTOMER',
    SUM(CASE WHEN EVENTTYPE & 8 = 8 THEN 1 ELSE 0 END) AS 'LAPSED',
    SUM(CASE WHEN EVENTTYPE & 8 <> 8 AND EVENTTYPE & 4 <> 4 AND EVENTTYPE > 0 THEN 1 ELSE 0 END) AS 'OTHER',
    SUM(CASE WHEN EVENTTYPE = 0 THEN 1 ELSE 0 END) AS 'ACTIVE',
    COUNT(*) AS 'TOTAL CDRS'
    FROM PERSONPROFILE20060922 (nolock)
    GROUP BY PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103)

    CONVERT(VARCHAR(30),EVENTDATE,103): don't use a CONVERT in the GROUP BY! To group on the converted value, the ENTIRE table must be scanned, stuffed into tempdb with the modified column, and then returned. Just group by EVENTDATE, and change the format when you print it in the client application. This is your #1 issue to fix in this query.

    A general rule: never use a function in a WHERE clause or GROUP BY clause unless you absolutely have to. WHERE X = '3' is what they call SARGable, but CONVERT(VARCHAR, X) = '3' is not, and will almost always get you a nice slow table scan for your trouble. This is true for any function, because each function is resolved at the field level, so every row must be touched.
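    To illustrate the rule above (the table and column names here are hypothetical, not from the original post):

    ```sql
    -- Non-SARGable: the function must be evaluated for every row,
    -- so an index on OrderDate cannot be used for a seek.
    SELECT * FROM Orders
    WHERE CONVERT(VARCHAR(10), OrderDate, 103) = '22/09/2006'

    -- SARGable: the bare column is compared against constants,
    -- so the optimizer can seek an index on OrderDate.
    SELECT * FROM Orders
    WHERE OrderDate >= '20060922' AND OrderDate < '20060923'
    ```

    Both queries return the same rows, but only the second can use an index seek on OrderDate.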

    PRODUCTCODE, EVENTDATE: because you are grouping by these two columns, they should either have an index or be included in the existing index.

    EVENTTYPE: because you are doing calculations on this column, I would include it in the index, so that the query plan can use it more effectively.

    WHERE X = Y: do you really need all 10 million rows reflected in the GROUP BY? Can you filter the results with a WHERE clause to limit the number of rows? I have to assume you do need everything, but if you can limit the rows, do it.

    So my final suggestions would be:

    CREATE  INDEX [ix_PRODCODE_EVENTDATE_EVENTTYPE] ON [dbo].[PERSONPROFILE20060922]( [PRODUCTCODE],[EVENTDATE],[PERSONURN], [EventType])

    SELECT PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103) Startdate,
    COUNT(DISTINCT(PERSONURN)) AS 'BILLABLE CDRS',
    COUNT(*) - COUNT(DISTINCT(PERSONURN)) AS 'COUNTS',
    SUM(CASE WHEN EVENTTYPE & 4 = 4 THEN 1 ELSE 0 END) AS 'CUSTOMER',
    SUM(CASE WHEN EVENTTYPE & 8 = 8 THEN 1 ELSE 0 END) AS 'LAPSED',
    SUM(CASE WHEN EVENTTYPE & 8 <> 8 AND EVENTTYPE & 4 <> 4 AND EVENTTYPE > 0 THEN 1 ELSE 0 END) AS 'OTHER',
    SUM(CASE WHEN EVENTTYPE = 0 THEN 1 ELSE 0 END) AS 'ACTIVE',
    COUNT(*) AS 'TOTAL CDRS'
    FROM PERSONPROFILE20060922 (nolock)
    GROUP BY PRODUCTCODE, EVENTDATE

    Lowell


    --Help us help you! If you post a question, include a CREATE TABLE... statement and an INSERT INTO... statement with representative data along with your description of the problem, and the volunteers here can provide a tested, verifiable solution. Asking the question the right way gets you a tested answer the fastest way possible!

  • Hi

     

    I assume you are using CONVERT to get a round date rather than the full datetime. Since you are examining every record, indexing is relatively unimportant; you will scan anyway. You could try adding a computed column for the converted datetime value; with a little luck and care you might be able to index it. Alternatively you could use separate day, month, and year columns and index on those.
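    A rough, untested sketch of the computed-column idea, using the table name from the thread (the column name EventDay and the index name are made up; style 112 is used because conversions with styles of 100 or greater are deterministic and therefore indexable):

    ```sql
    -- Add a computed column holding only the date part of EVENTDATE.
    -- Style 112 (yyyymmdd) is a deterministic conversion, so it can be indexed.
    ALTER TABLE dbo.PERSONPROFILE20060922
        ADD EventDay AS CONVERT(CHAR(8), EVENTDATE, 112)

    CREATE INDEX ix_PERSONPROFILE_EventDay
        ON dbo.PERSONPROFILE20060922 (PRODUCTCODE, EventDay)

    -- The query could then GROUP BY PRODUCTCODE, EventDay
    -- instead of converting inside the GROUP BY.
    ```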

    However, I would investigate why it is running differently now. Has the table grown significantly? Is it heavily fragmented (DBCC SHOWCONTIG)? What are the clustered index and primary key? They seem to be missing from your post.

  • Thanks all

     

    With regard to the date column: it's a datetime column, so I need the CONVERT to roll it up. But I will find a way around that.

    Also, it's heavily fragmented:

    - Pages Scanned................................: 231229

    - Extents Scanned..............................: 29385

    - Extent Switches..............................: 227761

    - Avg. Pages per Extent........................: 7.9

    - Scan Density [Best Count:Actual Count].......: 12.69% [28904:227762]

    - Logical Scan Fragmentation ..................: 51.86%

    - Extent Scan Fragmentation ...................: 98.75%

    - Avg. Bytes Free per Page.....................: 2878.1

    - Avg. Page Density (full).....................: 64.44%

    I don't have a clustered index, as this table is subject to a lot of inserts, and having a clustered key there wouldn't do me any good.

     


    Kindest Regards,

    John Burchel (Trainee Developer)

    It's probably good to have a clustered index anyway; otherwise you simply have a heap structure. With that many records it would help.

    You need a primary key, and it needs to be specified as such, but it doesn't have to be clustered.

    Try defragging the table, but also try setting a primary key and choosing a clustered index. It may not help, but I think that's unlikely.
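    One possible way to act on both suggestions, in SQL Server 2000 syntax to match the DBCC SHOWCONTIG output above (the surrogate key column RowID and constraint name are hypothetical):

    ```sql
    -- Rebuild all indexes on the table to remove fragmentation.
    DBCC DBREINDEX ('dbo.PERSONPROFILE20060922')

    -- Give the table a primary key and a clustered structure.
    -- A narrow IDENTITY column keeps insert cost low on a heavily inserted table.
    ALTER TABLE dbo.PERSONPROFILE20060922
        ADD RowID INT IDENTITY(1,1) NOT NULL

    ALTER TABLE dbo.PERSONPROFILE20060922
        ADD CONSTRAINT PK_PERSONPROFILE20060922 PRIMARY KEY CLUSTERED (RowID)
    ```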

  • - Scan Density [Best Count:Actual Count].......: 12.69% [28904:227762]

    - Logical Scan Fragmentation ..................: 51.86%

    - Extent Scan Fragmentation ...................: 98.75%

    Dude, your table is screaming for defragmentation. If you still don't want to keep a clustered index, create one and then drop it, but you will have to do that operation regularly.
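    The create-then-drop trick could look something like this (index name and key choice are illustrative): building the clustered index rewrites the heap in key order, compacting the pages, and dropping it afterwards keeps the compacted layout until new inserts scatter it again.

    ```sql
    -- Build a clustered index to rewrite and compact the heap...
    CREATE CLUSTERED INDEX ix_defrag
        ON dbo.PERSONPROFILE20060922 (EVENTDATE)

    -- ...then drop it, keeping the defragmented layout (SQL 2000 DROP INDEX syntax).
    DROP INDEX PERSONPROFILE20060922.ix_defrag
    ```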

    Cheers,


    * Noel
