Quering data from same table

Question

Quering data from same table

pritesh-807803

Mr or Mrs. 500

Points: 564
More actions
February 20, 2010 at 5:04 am

#139028

CREATE TABLE SampleData
(
ID INTEGER PRIMARY KEY,
UID INTEGER NOT NULL,
Stage INTEGER NOT NULL,
Status INTEGER NOT NULL,
Date DATETIME NOT NULL
);
GO
INSERT SampleData
(ID,UID,Stage,Status,Date)
VALUES (01, 1, 1, 2, '12 Jan 2010'),
(02, 2, 2, 2, '13 Jan 2010'),
(03, 1, 1, 3, '13 Jan 2010'),
(04, 3, 2, 1, '20 Jan 2010'),
(05, 1, 2, 2, '25 Jan 2010'),
(06, 2, 1, 1, '25 Jan 2010'),
(07, 1, 2, 2, '02 Feb 2010'),
(08, 1, 1, 1, '02 Feb 2010'),
(09, 2, 2, 3, '12 Feb 2010'),
(10, 3, 2, 2, '12 Feb 2010'),
(11, 2, 1, 1, '12 Feb 2010'),
(12, 1, 2, 3, '12 Feb 2010'),
(13, 2, 1, 2, '12 Feb 2010'),
(14, 2, 2, 3, '22 Feb 2010');
GO
From table I want for given date range each user wise Stage & Status performance Where I want count for 1. stage 1 & status 3 2. stage 2 & status 2 3. stage 2 & status 3 4. stage 3 & status 3
only right now my query is
Select UID,
sum( (case Stage when 1 then (case Status when 3 then 1 else 0 end ) else 0 end )) as Stage13,
sum( (case Stage when 2 then (case Status when 2 then 1 else 0 end ) else 0 end ) ) as Stage22,
sum( (case Stage when 2 then (case Status when 3 then 1 else 0 end ) else 0 end ) ) as Stage23,
sum( (case Stage when 3 then (case Status when 3 then 1 else 0 end ) else 0 end ) ) as Stage33
from SampleData where date between # and #
group by UID
i have morethen 1,00,00,000 datarow where performance is very slow any other method??

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

WayneS SSC Guru Points: 95461 More actions · Answer 1

You know, the people that help out here are all un-paid volunteers. Providing the DDL scripts (CREATE TABLE, CREATE INDEX, etc.) for the tables affected, and INSERT statements to put some test data into those tables that shows your problem will go a long way in getting people to look at your issue and help you out. Please include code for what you have already tried. Don't forget to include what your expected results should be, based on the sample data provided. As a bonus to you, you will get tested code back. For more details on how to get all of this into your post, please look at the first link in my signature.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Paul White SSC Guru Points: 150468 More actions · Answer 2

Indexed view.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Paul White SSC Guru Points: 150468 More actions · Answer 3

To expand on my previously reply slightly...

USE tempdb;

GO

IF OBJECT_ID(N'dbo.PreAggregatedSampleData', N'V') IS NOT NULL DROP VIEW dbo.PreAggregatedSampleData;

IF OBJECT_ID(N'dbo.SampleData', N'U') IS NOT NULL DROP TABLE dbo.SampleData;

GO

CREATE TABLE dbo.SampleData

(

id INTEGER PRIMARY KEY,

[user_id] INTEGER NOT NULL,

stage INTEGER NOT NULL,

[status] INTEGER NOT NULL,

[date] DATETIME2 NOT NULL

);

GO

INSERT dbo.SampleData

(id ,[user_id], [stage], [status], [date])

VALUES (01, 1, 1, 2, '12 Jan 2010'),

(02, 2, 2, 2, '13 Jan 2010'),

(03, 1, 1, 3, '13 Jan 2010'),

(04, 3, 2, 1, '20 Jan 2010'),

(05, 1, 2, 2, '25 Jan 2010'),

(06, 2, 1, 1, '25 Jan 2010'),

(07, 1, 2, 2, '02 Feb 2010'),

(08, 1, 1, 1, '02 Feb 2010'),

(09, 2, 2, 3, '12 Feb 2010'),

(10, 3, 2, 2, '12 Feb 2010'),

(11, 2, 1, 1, '12 Feb 2010'),

(12, 1, 2, 3, '12 Feb 2010'),

(13, 2, 1, 2, '12 Feb 2010'),

(14, 2, 2, 3, '22 Feb 2010');

GO

CREATE VIEW dbo.PreAggregatedSampleData

WITH SCHEMABINDING

AS

SELECT [date], [user_id], stage, [status], COUNT_BIG(*) AS group_row_count

FROM dbo.SampleData

GROUP BY

[date], [user_id], stage, [status]

GO

CREATE UNIQUE CLUSTERED INDEX c

ON dbo.PreAggregatedSampleData ([date], [user_id], [stage], [status]);

GO

SELECT Pivoted.[user_id],

Pivoted.Stage11,

Pivoted.Stage12,

Pivoted.Stage13,

Pivoted.Stage21,

Pivoted.Stage22,

Pivoted.Stage23

FROM (

-- Pre-aggregated data from the indexed view

SELECT [user_id],

column_name = 'Stage' + CONVERT(VARCHAR(2), [stage] * 10 + [status]),

row_count = COUNT_BIG(*)

FROM dbo.PreAggregatedSampleData SDB

WHERE date BETWEEN '1 Jan 2010' and '1 Mar 2010'

GROUP BY

[user_id], [stage], [status]

) T

PIVOT (

-- Reformat into the desired columns

SUM(row_count)

FOR column_name IN

(

[Stage11],

[Stage12],

[Stage13],

[Stage21],

[Stage22],

[Stage23]

)

) Pivoted;

GO

IF OBJECT_ID(N'dbo.PreAggregatedSampleData', N'V') IS NOT NULL DROP VIEW dbo.PreAggregatedSampleData;

IF OBJECT_ID(N'dbo.SampleData', N'U') IS NOT NULL DROP TABLE dbo.SampleData;

GO

-- END SCRIPT

Output:

user_idStage11Stage12Stage13Stage21Stage22Stage23

1111NULL21

221NULLNULL12

3NULLNULLNULL11NULL

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

pritesh-807803 Mr or Mrs. 500 Points: 564 More actions · Answer 4

pritesh-807803

Mr or Mrs. 500

Points: 564

February 21, 2010 at 1:08 am

#1122451

thanks Wayne,

I corrected my mistake..

pritesh-807803 Mr or Mrs. 500 Points: 564 More actions · Answer 5

Hi paul

If i create index on same table is there any thing wrong.. insted of view index.

Regards,

Pritesh

Paul White SSC Guru Points: 150468 More actions · Answer 6

pritesh-807803 (2/21/2010)
If i create index on same table is there any thing wrong.. insted of view index.

Pritesh,

The point of an indexed view is that it automatically keeps partial aggregations up to date. The idea is that summing partial aggregations is potentially much faster than adding all the rows individually. Of course, maintaining the partial aggregations is not for free, so you need to factor that in to your considerations.

Adding an index to the base table might not buy you much...you still need to add the rows up individually. I had rather assumed that you had already created an optimal index on the base table.

If you are unsure what the purpose of an indexed view is compared to a normal view, start your research with the following Books Online entry: Types of Views

This is a fairly advanced topic, so you need to do some investigation to make sure you understand the issues before making a decision.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 7

Rather than aggregating and analysing your data in a single step, run a full-speed preaggregate, then analyse the results. All that CASEing and SUMing one row at a time will slow things down. You can run the aggregate result into a temp table or you can use a derived table like this:

SELECT UID,

SUM( (case Stage when 1 then (case [Status] when 3 then NumRows else 0 end ) else 0 end ) ) as Stage13,

SUM( (case Stage when 2 then (case [Status] when 2 then NumRows else 0 end ) else 0 end ) ) as Stage22,

SUM( (case Stage when 2 then (case [Status] when 3 then NumRows else 0 end ) else 0 end ) ) as Stage23,

SUM( (case Stage when 3 then (case [Status] when 3 then NumRows else 0 end ) else 0 end ) ) as Stage33

FROM (

SELECT UID, Stage, [Status], COUNT(*) AS NumRows

FROM #SampleData

WHERE [date] between # and #

GROUP BY UID, Stage, [Status]

) d

GROUP BY UID

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 8

That's nice code Chris. It represents a nice compromise between the benefits of an indexed view, and the drawbacks related to the maintenance of aggregates for that view. It ultimately won't be as fast as the indexed view approach, but it may be fast enough, and simpler, for this requirement. I like it.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 9

Paul White (2/21/2010)
That's nice code Chris. It represents a nice compromise between the benefits of an indexed view, and the drawbacks related to the maintenance of aggregates for that view. It ultimately won't be as fast as the indexed view approach, but it may be fast enough, and simpler, for this requirement. I like it.
Paul

Thanks Paul, that's very kind.

There's a very similar scenario at work, with three aggregated columns instead of two over a table of around 60M rows. It takes about a minute for an unrestricted aggregate (then about 5 seconds for the analysis) that only happens once-ish a year, the rest of the time it's say 50k rows preaggregated, processed & returned in an instant. It runs like a three-legged dog as a single-step process, taking many many times longer than the combined time of both steps run separately.

We may well have to move to an indexed view as the table increases in size.

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]