February 20, 2010 at 5:04 am
CREATE TABLE SampleData
(
ID INTEGER PRIMARY KEY,
UID INTEGER NOT NULL,
Stage INTEGER NOT NULL,
Status INTEGER NOT NULL,
Date DATETIME NOT NULL
);
GO
INSERT SampleData
(ID,UID,Stage,Status,Date)
VALUES (01, 1, 1, 2, '12 Jan 2010'),
(02, 2, 2, 2, '13 Jan 2010'),
(03, 1, 1, 3, '13 Jan 2010'),
(04, 3, 2, 1, '20 Jan 2010'),
(05, 1, 2, 2, '25 Jan 2010'),
(06, 2, 1, 1, '25 Jan 2010'),
(07, 1, 2, 2, '02 Feb 2010'),
(08, 1, 1, 1, '02 Feb 2010'),
(09, 2, 2, 3, '12 Feb 2010'),
(10, 3, 2, 2, '12 Feb 2010'),
(11, 2, 1, 1, '12 Feb 2010'),
(12, 1, 2, 3, '12 Feb 2010'),
(13, 2, 1, 2, '12 Feb 2010'),
(14, 2, 2, 3, '22 Feb 2010');
GO
From table I want for given date range each user wise Stage & Status performance Where I want count for 1. stage 1 & status 3 2. stage 2 & status 2 3. stage 2 & status 3 4. stage 3 & status 3
only right now my query is
Select UID,
sum( (case Stage when 1 then (case Status when 3 then 1 else 0 end ) else 0 end )) as Stage13,
sum( (case Stage when 2 then (case Status when 2 then 1 else 0 end ) else 0 end ) ) as Stage22,
sum( (case Stage when 2 then (case Status when 3 then 1 else 0 end ) else 0 end ) ) as Stage23,
sum( (case Stage when 3 then (case Status when 3 then 1 else 0 end ) else 0 end ) ) as Stage33
from SampleData where date between # and #
group by UID
i have morethen 1,00,00,000 datarow where performance is very slow any other method??
February 20, 2010 at 5:54 am
You know, the people that help out here are all un-paid volunteers. Providing the DDL scripts (CREATE TABLE, CREATE INDEX, etc.) for the tables affected, and INSERT statements to put some test data into those tables that shows your problem will go a long way in getting people to look at your issue and help you out. Please include code for what you have already tried. Don't forget to include what your expected results should be, based on the sample data provided. As a bonus to you, you will get tested code back. For more details on how to get all of this into your post, please look at the first link in my signature.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 20, 2010 at 7:14 am
Indexed view.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 20, 2010 at 7:59 am
To expand on my previously reply slightly...
USE tempdb;
GO
IF OBJECT_ID(N'dbo.PreAggregatedSampleData', N'V') IS NOT NULL DROP VIEW dbo.PreAggregatedSampleData;
IF OBJECT_ID(N'dbo.SampleData', N'U') IS NOT NULL DROP TABLE dbo.SampleData;
GO
CREATE TABLE dbo.SampleData
(
id INTEGER PRIMARY KEY,
[user_id] INTEGER NOT NULL,
stage INTEGER NOT NULL,
[status] INTEGER NOT NULL,
[date] DATETIME2 NOT NULL
);
GO
INSERT dbo.SampleData
(id ,[user_id], [stage], [status], [date])
VALUES (01, 1, 1, 2, '12 Jan 2010'),
(02, 2, 2, 2, '13 Jan 2010'),
(03, 1, 1, 3, '13 Jan 2010'),
(04, 3, 2, 1, '20 Jan 2010'),
(05, 1, 2, 2, '25 Jan 2010'),
(06, 2, 1, 1, '25 Jan 2010'),
(07, 1, 2, 2, '02 Feb 2010'),
(08, 1, 1, 1, '02 Feb 2010'),
(09, 2, 2, 3, '12 Feb 2010'),
(10, 3, 2, 2, '12 Feb 2010'),
(11, 2, 1, 1, '12 Feb 2010'),
(12, 1, 2, 3, '12 Feb 2010'),
(13, 2, 1, 2, '12 Feb 2010'),
(14, 2, 2, 3, '22 Feb 2010');
GO
CREATE VIEW dbo.PreAggregatedSampleData
WITH SCHEMABINDING
AS
SELECT [date], [user_id], stage, [status], COUNT_BIG(*) AS group_row_count
FROM dbo.SampleData
GROUP BY
[date], [user_id], stage, [status]
GO
CREATE UNIQUE CLUSTERED INDEX c
ON dbo.PreAggregatedSampleData ([date], [user_id], [stage], [status]);
GO
SELECT Pivoted.[user_id],
Pivoted.Stage11,
Pivoted.Stage12,
Pivoted.Stage13,
Pivoted.Stage21,
Pivoted.Stage22,
Pivoted.Stage23
FROM (
-- Pre-aggregated data from the indexed view
SELECT [user_id],
column_name = 'Stage' + CONVERT(VARCHAR(2), [stage] * 10 + [status]),
row_count = COUNT_BIG(*)
FROM dbo.PreAggregatedSampleData SDB
WHERE date BETWEEN '1 Jan 2010' and '1 Mar 2010'
GROUP BY
[user_id], [stage], [status]
) T
PIVOT (
-- Reformat into the desired columns
SUM(row_count)
FOR column_name IN
(
[Stage11],
[Stage12],
[Stage13],
[Stage21],
[Stage22],
[Stage23]
)
) Pivoted;
GO
IF OBJECT_ID(N'dbo.PreAggregatedSampleData', N'V') IS NOT NULL DROP VIEW dbo.PreAggregatedSampleData;
IF OBJECT_ID(N'dbo.SampleData', N'U') IS NOT NULL DROP TABLE dbo.SampleData;
GO
-- END SCRIPT
Output:
user_idStage11Stage12Stage13Stage21Stage22Stage23
1111NULL21
221NULLNULL12
3NULLNULLNULL11NULL
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 21, 2010 at 1:08 am
thanks Wayne,
I corrected my mistake..
February 21, 2010 at 1:10 am
Hi paul
If i create index on same table is there any thing wrong.. insted of view index.
Regards,
Pritesh
February 21, 2010 at 1:21 am
pritesh-807803 (2/21/2010)
If i create index on same table is there any thing wrong.. insted of view index.
Pritesh,
The point of an indexed view is that it automatically keeps partial aggregations up to date. The idea is that summing partial aggregations is potentially much faster than adding all the rows individually. Of course, maintaining the partial aggregations is not for free, so you need to factor that in to your considerations.
Adding an index to the base table might not buy you much...you still need to add the rows up individually. I had rather assumed that you had already created an optimal index on the base table.
If you are unsure what the purpose of an indexed view is compared to a normal view, start your research with the following Books Online entry: Types of Views
This is a fairly advanced topic, so you need to do some investigation to make sure you understand the issues before making a decision.
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 21, 2010 at 2:45 am
Rather than aggregating and analysing your data in a single step, run a full-speed preaggregate, then analyse the results. All that CASEing and SUMing one row at a time will slow things down. You can run the aggregate result into a temp table or you can use a derived table like this:
SELECT UID,
SUM( (case Stage when 1 then (case [Status] when 3 then NumRows else 0 end ) else 0 end ) ) as Stage13,
SUM( (case Stage when 2 then (case [Status] when 2 then NumRows else 0 end ) else 0 end ) ) as Stage22,
SUM( (case Stage when 2 then (case [Status] when 3 then NumRows else 0 end ) else 0 end ) ) as Stage23,
SUM( (case Stage when 3 then (case [Status] when 3 then NumRows else 0 end ) else 0 end ) ) as Stage33
FROM (
SELECT UID, Stage, [Status], COUNT(*) AS NumRows
FROM #SampleData
WHERE [date] between # and #
GROUP BY UID, Stage, [Status]
) d
GROUP BY UID
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
February 21, 2010 at 3:19 am
That's nice code Chris. It represents a nice compromise between the benefits of an indexed view, and the drawbacks related to the maintenance of aggregates for that view. It ultimately won't be as fast as the indexed view approach, but it may be fast enough, and simpler, for this requirement. I like it.
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 21, 2010 at 3:55 am
Paul White (2/21/2010)
That's nice code Chris. It represents a nice compromise between the benefits of an indexed view, and the drawbacks related to the maintenance of aggregates for that view. It ultimately won't be as fast as the indexed view approach, but it may be fast enough, and simpler, for this requirement. I like it.Paul
Thanks Paul, that's very kind.
There's a very similar scenario at work, with three aggregated columns instead of two over a table of around 60M rows. It takes about a minute for an unrestricted aggregate (then about 5 seconds for the analysis) that only happens once-ish a year, the rest of the time it's say 50k rows preaggregated, processed & returned in an instant. It runs like a three-legged dog as a single-step process, taking many many times longer than the combined time of both steps run separately.
We may well have to move to an indexed view as the table increases in size.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply