October 3, 2006 at 4:50 am
Hi All,
I have a query that is misbehaving badly; at the moment it takes around 3 hours to run, where it used to take 40 minutes.
When I first ran it, it took 3 hours, so I checked the execution plan, which showed a table scan highlighted in red saying some statistics were missing. I ran a simple UPDATE STATISTICS statement on the table and re-ran the query.
This time it ran for 1.5 hours before I was forced to stop it, as it was grinding everything to a halt.
SELECT PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103) Startdate,
COUNT(DISTINCT(PERSONURN)) AS 'BILLABLE CDRS',
COUNT(*) - COUNT(DISTINCT(PERSONURN)) AS 'COUNTS',
SUM(CASE WHEN EVENTTYPE & 4 = 4 THEN 1 ELSE 0 END) AS 'CUSTOMER',
SUM(CASE WHEN EVENTTYPE & 8 = 8 THEN 1 ELSE 0 END) AS 'LAPSED',
SUM(CASE WHEN EVENTTYPE & 8 <> 8 AND EVENTTYPE & 4 <> 4 AND EVENTTYPE > 0 THEN 1 ELSE 0 END) AS 'OTHER',
SUM(CASE WHEN EVENTTYPE = 0 THEN 1 ELSE 0 END) AS 'ACTIVE',
COUNT(*) AS 'TOTAL CDRS'
FROM PERSONPROFILE20060922 (nolock)
GROUP BY PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103)
Index on this table:
CREATE UNIQUE INDEX [index_name] ON [dbo].[PERSONPROFILE20060922]( [EVENTDATE],[PERSONURN], [EventType])
The total number of rows in the table is around 10 million.
PERSONURN nvarchar
EventType int
EVENTDATE datetime
I have also attached the execution plan from when the first query took 3 hours. The second time, after I updated the statistics, it wasn't worth letting it complete: it had already run for 1.5 hours without finishing.
October 3, 2006 at 6:02 am
Here are my suggestions.
SELECT PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103) Startdate,
COUNT(DISTINCT(PERSONURN)) AS 'BILLABLE CDRS',
COUNT(*) - COUNT(DISTINCT(PERSONURN)) AS 'COUNTS',
SUM(CASE WHEN EVENTTYPE & 4 = 4 THEN 1 ELSE 0 END) AS 'CUSTOMER',
SUM(CASE WHEN EVENTTYPE & 8 = 8 THEN 1 ELSE 0 END) AS 'LAPSED',
SUM(CASE WHEN EVENTTYPE & 8 <> 8 AND EVENTTYPE & 4 <> 4 AND EVENTTYPE > 0 THEN 1 ELSE 0 END) AS 'OTHER',
SUM(CASE WHEN EVENTTYPE = 0 THEN 1 ELSE 0 END) AS 'ACTIVE',
COUNT(*) AS 'TOTAL CDRS'
FROM PERSONPROFILE20060922 (nolock)
GROUP BY PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103)
CONVERT(VARCHAR(30),EVENTDATE,103) -- don't use a CONVERT in the GROUP BY! To group by the converted value, the ENTIRE table must be scanned, stuffed into tempdb with the modified column, and then returned. Just group by EVENTDATE, and change the format when you print it in the client application. This is the #1 issue to fix in this query.
A general rule: never use a function in a WHERE clause or GROUP BY clause unless you absolutely have to. WHERE X = '3' is what they call SARGable, but CONVERT(VARCHAR, X) = '3' is not, and will almost always get you a nice slow table scan for your trouble. This is true for any function, because each function is resolved at the field level, so every row must be touched.
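To illustrate the SARGability point (the date values below are made up for the example): wrapping the column in a function hides it from the optimizer, while an equivalent range predicate on the bare column can use an index seek.

```sql
-- Non-SARGable: the CONVERT must be evaluated for every row, forcing a scan.
WHERE CONVERT(VARCHAR(10), EVENTDATE, 103) = '03/10/2006'

-- SARGable: a range on the bare column lets the optimizer seek on an EVENTDATE index.
WHERE EVENTDATE >= '20061003' AND EVENTDATE < '20061004'
```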
PRODUCTCODE, EVENTDATE -- because you are grouping by these two columns, they should either have an index of their own or be included in the existing index.
EVENTTYPE -- because you are doing calculations on this column, I would include it in the index, so that the query plan can use it more effectively.
WHERE X = Y -- do you really need the entire 10 million rows reflected in the GROUP BY? Can you filter the results with a WHERE clause to limit the number of rows? I have to assume that you do need everything, but if you can limit the rows, do it.
So my final suggestions would be:
CREATE INDEX [ix_PRODCODE_EVENTDATE_EVENTTYPE] ON [dbo].[PERSONPROFILE20060922]( [PRODUCTCODE],[EVENTDATE],[PERSONURN], [EventType])
SELECT PRODUCTCODE, CONVERT(VARCHAR(30),EVENTDATE,103) Startdate,
COUNT(DISTINCT(PERSONURN)) AS 'BILLABLE CDRS',
COUNT(*) - COUNT(DISTINCT(PERSONURN)) AS 'COUNTS',
SUM(CASE WHEN EVENTTYPE & 4 = 4 THEN 1 ELSE 0 END) AS 'CUSTOMER',
SUM(CASE WHEN EVENTTYPE & 8 = 8 THEN 1 ELSE 0 END) AS 'LAPSED',
SUM(CASE WHEN EVENTTYPE & 8 <> 8 AND EVENTTYPE & 4 <> 4 AND EVENTTYPE > 0 THEN 1 ELSE 0 END) AS 'OTHER',
SUM(CASE WHEN EVENTTYPE = 0 THEN 1 ELSE 0 END) AS 'ACTIVE',
COUNT(*) AS 'TOTAL CDRS'
FROM PERSONPROFILE20060922 (nolock)
GROUP BY PRODUCTCODE, EVENTDATE
Lowell
October 3, 2006 at 6:29 am
Hi
I assume you are using CONVERT to get a round date rather than the full datetime. I rather think that, since you are examining every record, indexing is relatively unimportant: you will scan anyway. You could play with adding a computed column for the converted datetime value; if you are a little lucky and careful, you might be able to index it. Alternatively, you could use separate day, month, and year columns and index on those.
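A sketch of the computed-column idea (the column and index names here are made up; indexing a computed column also requires the expression to be deterministic and the right SET options to be in effect when the table and index are created):

```sql
-- Add a computed column holding just the day portion of EVENTDATE.
ALTER TABLE dbo.PERSONPROFILE20060922
    ADD EventDay AS CONVERT(CHAR(10), EVENTDATE, 103);

-- Style 103 is a deterministic conversion style, so the
-- computed column can be indexed and then grouped on directly.
CREATE INDEX ix_PERSONPROFILE_EventDay
    ON dbo.PERSONPROFILE20060922 (EventDay);
```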
However, I would investigate why it is running differently now. Has the table grown significantly? Is it heavily fragmented (DBCC SHOWCONTIG)? What are the clustered index and primary key? They seem to be missing from your post.
October 3, 2006 at 7:59 am
Thanks all
With regards to the date column: it's a datetime column, so I need the CONVERT to roll it up to the day. But I will find a way around that.
Also, it's heavily fragmented:
- Pages Scanned................................: 231229
- Extents Scanned..............................: 29385
- Extent Switches..............................: 227761
- Avg. Pages per Extent........................: 7.9
- Scan Density [Best Count:Actual Count].......: 12.69% [28904:227762]
- Logical Scan Fragmentation ..................: 51.86%
- Extent Scan Fragmentation ...................: 98.75%
- Avg. Bytes Free per Page.....................: 2878.1
- Avg. Page Density (full).....................: 64.44%
I don't have a clustered index, as this table is subject to a lot of inserts, and having a clustered key there wouldn't do me any good.
October 3, 2006 at 9:01 am
It's probably good to have a clustered index anyway; otherwise you simply have a heap structure. With that many records it would help.
You need a primary key, and it needs to be specified as such, but it doesn't have to be clustered.
Try defragging the table, but also try setting a primary key and choosing a clustered index. It may not help, but I think that's unlikely.
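As a sketch of what that could look like (the constraint and index names are invented, and the key columns are only a guess based on the unique index posted earlier -- adjust to whatever actually identifies a row):

```sql
-- Declare a primary key; it does not have to be the clustered index.
ALTER TABLE dbo.PERSONPROFILE20060922
    ADD CONSTRAINT PK_PERSONPROFILE20060922
    PRIMARY KEY NONCLUSTERED (EVENTDATE, PERSONURN, EventType);

-- Cluster on the column the query groups by; building the clustered
-- index also rewrites the table, which removes the fragmentation.
CREATE CLUSTERED INDEX cix_PERSONPROFILE_EVENTDATE
    ON dbo.PERSONPROFILE20060922 (EVENTDATE);
```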
October 3, 2006 at 3:25 pm
- Scan Density [Best Count:Actual Count].......: 12.69% [28904:227762]
- Logical Scan Fragmentation ..................: 51.86%
- Extent Scan Fragmentation ...................: 98.75%
Dude, your table is screaming for defragmentation. If you still don't want to keep a clustered index, create one and then drop it, but you will have to do that operation regularly.
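The create-and-drop trick might look like this (the index name is hypothetical; the build rewrites the heap in key order, and the drop leaves the compacted pages behind):

```sql
-- Building a clustered index rewrites and compacts the whole table...
CREATE CLUSTERED INDEX ix_defrag_tmp
    ON dbo.PERSONPROFILE20060922 (EVENTDATE);

-- ...and dropping it reverts the table to a heap, now defragmented.
DROP INDEX PERSONPROFILE20060922.ix_defrag_tmp;  -- SQL Server 2000 DROP INDEX syntax
```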
Cheers,
* Noel