Help Required Pleased: I don't even know how to pharase the question to search for an answer myself :-(...

Question

Help Required Pleased: I don't even know how to pharase the question to search for an answer myself :-(...

BCLynch

Old Hand

Points: 376
More actions
June 21, 2011 at 11:02 am

#240244

Hi All
Imagine i have a query returning two columns (EventID, EventType) like this (I've added a 3rd commentary column to try and clarify the meaning of the returned rows. Its not actually returned!)
EVENTID EVTYPE My Comment
EVENT-01START X TYPE 1: Has a "Start X" AND a "END X" record
EVENT-01END X TYPE 1: Has a "Start X" AND a "END X" record
EVENT-02START X Type 2: Has a "Start X" BUT HAS NO 'END X' record
EVENT-03START X TYPE 1: Has a "Start X" AND an "END X" record
EVENT-03END X TYPE 1: Has a "Start X" AND a "END X" record
EVENT-04START X TYPE 1: Has a "Start X" AND an "END X" record
EVENT-04END X TYPE 1: Has a "Start X" AND a "END X" record
What I wnt to do is to return a count from this subset of the records which are TYPE1 (i.e. that have a "STARTX" AND an "ENDX" record for a given "EventID" value
and
a count of the records which are TYPE2 (i.e. that have only a "STARTX" OR "ENDX" record for a given EventID
In the subset there are 4 unique EVENTIDs (01,02,03,04) of which 3 EVENTIDs are of Type1 (01,03,04) and 1 EVENTID which is of TYPE 2 (03). I know this to be true because I can look at the next record and see if it closes the set ("StartX", "ENDX")for the given EventID but I'm damned If I can work out a way to do the aggregates
I'd be really grateful for any pointers on how to most efficiently (as the actual subset is very large)
Thanks
Kinch

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Ron McCullough SSC Guru Points: 63877 More actions · Answer 1

You are more likely to have someone assist you if you post your table definition, some sample data (which you have but not in a readily useable format).

May I suggest clicking on the first link in my signature block and follow to learn how to do this quickly. The article also contains the T-SQL code to post typical data quickly and easily.

If everything seems to be going well, you have obviously overlooked something.

Ron

Please help us, help you -before posting a question please read[/url]
Before posting a performance problem please read[/url]

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 2

The code for this can get pretty complex, so if you can provide consumable sample data like you'll find in the first link of my signature, that'd help us present you finalized code.

In general, you're looking at using a ROW_NUMBER() to assign an ordered value to the records, then self-join on that result between that row and the previous one. From there, with both rows now on a single row, you can compare/contrast to find the necessary aggregation.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

BCLynch Old Hand Points: 376 More actions · Answer 3

Hi there

Excellent - a definite clue!

I shall attempt to do it myself first (am a masochist) and failing that I'll post up some scripts

Many thanks

B

BCLynch Old Hand Points: 376 More actions · Answer 4

Ok I haven't given up yet I've posted the table and insert script below since there are currently people around willing to help right now 🙂

Sorry for not posting it properly in the first place - I considered it a generic problem but thats's no excuse.

As I mentioned before, for this simple generic example table I want to write a query that returns the following answer

TYPE COUNT

TypeA 3

TypeB 1

I will then try and apply the lessons learned from the solution to the many problems I'm going to need this for over the next couple of weeks

Many Thanks guys - seriously

B

-- 01. Create Sample Table START --

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE TABLE [dbo].[tblEVENT](

[EventID] [nvarchar](10) NOT NULL,

[EventType] [nvarchar](50) NOT NULL,

[MyComment] [nvarchar](150) NULL,

CONSTRAINT [PK_tblEVENT] PRIMARY KEY CLUSTERED

(

[EventID] ASC,

[EventType] DESC

)WITH (PAD_INDEX = OFF,

STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,

ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

GO

-- 01. Create Sample Table END --

-- 02. Insert Samnple Data START--

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-01 ', N'START X ', N'This is TYPE01 - has a START and an END record for the EventID value')

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-01 ', N'END X ', N'This is TYPE01 - has a START and an END record for the EventID value')

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-02 ', N'START X ', N'** This is TYPE02 - has only a START or an END record for the EventID value')

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-03 ', N'START X ', N'This is TYPE01 - has a START and an END record for the EventID value')

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-03 ', N'END X ', N'This is TYPE01 - has a START and an END record for the EventID value')

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-04', N'START X', N'This is TYPE01 - has a START and an END record for the EventID value')

INSERT [dbo].[tblEVENT] ([EventID], [EventType], [MyComment]) VALUES (N'EVENT-04', N'END X', N'This is TYPE01 - has a START and an END record for the EventID value')

-- 02. Insert Samnple Data End--

BCLynch Old Hand Points: 376 More actions · Answer 5

And one other question... what does one call (if anything) such a problem? 🙂

BCLynch Old Hand Points: 376 More actions · Answer 6

Am I on the right track at all here?

SELECT DISTINCT

A.EventID,

A.EventType,

B.EventType

FROM

(select

* ,

row_number() OVER (order by EventID, EventTYpe asc) as RowNoA

from dbo.tblEVENT

) as A

inner join

(

select

* ,

row_number() OVER (order by EventID, EventTYpe asc) as RowNoB

from dbo.tblEVENT

) as B

ON

A.RowNoA=B.RowNoB+1

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 7

BCLynch (6/21/2011)
And one other question... what does one call (if anything) such a problem? 🙂

Annoying. 😉

Right Track: Kinda, use a single cte and join it twice. Also, EventType needs to be DESC otherwise END comes before START. Working out the code between a couple of hot topics at work.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 8

I went jamming down the wrong road, sorry. Had another methodology in my head, but it's really overkill.

Here's the typical way to do it:

;WITH LeftMethod AS

(

SELECT

eid.EventID,

CASE WHEN eSt.EventID IS NULL THEN 'Type 2'

WHEN eEn.EventID IS NULL THEN 'Type 2'

ELSE 'Type 1'

END AS CountType

FROM

(SELECT DISTINCT EventID FROM tblEvent) AS eid

LEFT JOIN

tblEvent AS eSt

ONeid.EventID = eSt.EventID

AND eSt.EventType = 'START X'

LEFT JOIN

tblEvent AS eEn

ONeid.EventID = eEn.EventID

AND eEn.EventType = 'END X'

)

SELECT CountType, Count(*) FROM LeftMethod GROUP BY CountType

The way I mentioned can work (using Row_number()), but it's a lot more coding.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

BCLynch Old Hand Points: 376 More actions · Answer 9

Craig, excellent stuff - thanks a lot for taking the time. You're a cool guy.

Cheers

B

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 10

BCLynch (6/21/2011)
Craig, excellent stuff - thanks a lot for taking the time. You're a cool guy.
Cheers
B

Thanks. :blush: Glad to help. Let me know if anything in there seems confusing.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

tfifield SSCrazy Eights Points: 9655 More actions · Answer 11

There's another option to this. If the table is large and you have to scan it anyway, it might be more optimum to just make one pass at it and do it in an aggregate like this:

; WITH CTEEvents AS

(SELECT

EventID

, SUM(CASE WHEN EventType = 'START X' THEN 1 ELSE 0 END) AS HasStart

, SUM(CASE WHEN EventType = 'END X' THEN 1 ELSE 0 END) AS HasEnd

FROM tblEVENT

GROUP BY EventID

)

SELECT

SUM(CASE WHEN HasStart > 0 AND HasEnd > 0 THEN 1 ELSE 0 END) AS Type1

, SUM(CASE WHEN (HasStart + HasEnd) > 0

AND (HasStart = 0 OR HasEnd = 0) THEN 1 ELSE 0 END) AS Type2

FROM CTEEvents

Todd Fifield

BCLynch Old Hand Points: 376 More actions · Answer 12

That's also a really neat solution and, funnily enough, just last night I was pondering how one would handle adding (for example) a "a DO X" in between my "BEGIN X" and "END X" dummy example, that is, testing for completenes of 2,3 or even n records. This way makes that pretty clear I think. I'm sure how to do it the the other way yet . Is it n+1 self-joins for n states I wonder? I'll find out soon enough 🙂

B

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 13

BCLynch (6/24/2011)
That's also a really neat solution and, funnily enough, just last night I was pondering how one would handle adding (for example) a "a DO X" in between my "BEGIN X" and "END X" dummy example, that is, testing for completenes of 2,3 or even n records. This way makes that pretty clear I think. I'm sure how to do it the the other way yet . Is it n+1 self-joins for n states I wonder? I'll find out soon enough 🙂
B

I gave you the quick and dirty way to achieve your goal. If you're going to have multiple states, do it in a single pass as Todd recommended. Multiple self-joins can chew on memory (or tempdb, depending on the optimization path) pretty quick.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

BCLynch Old Hand Points: 376 More actions · Answer 14

Craig, I completely appreciate that the solution for 2 states is not the necessarily the optimal solution for n states and that you solved the stated problem.

If my last post reads as critical or ungrateful then I'm sorry as certainly it wasn't so intended. I was really delighted with the code you posted and addapted it to immediate effect for my real 2 state problem

Thanks again

B