October 4, 2013 at 10:22 am
Ok, this will get you started with sample data:
create table #groupings (
ID int,
EventName varchar(255),
Venue varchar(255),
EventDate datetime
)
insert into #groupings
values (1,'Football','Arena','2013-10-04 09:58:42.670'),
(2,'Football','Arena','2013-10-04 09:58:42.670'),
(3,'Football','Arena','2013-10-04 10:45:42.670'),
(4,'Football','Arena','2013-10-04 09:58:42.670'),
(1,'Soccer','Field','2013-10-04 09:58:42.670'),
(2,'Soccer','Field','2013-10-04 09:38:42.670'),
(3,'Soccer','Field','2013-10-04 09:58:42.670'),
(4,'Soccer','Field','2013-10-04 09:28:42.670')
select min(EventName) EventName, Venue, EventDate,
stuff((select ',' + cast(ID as varchar) from #groupings where Venue = g.venue and Eventdate = g.eventdate for xml path ('')),1,1,'') IDs
from #groupings g
group by Venue, EventDate
order by venue, eventdate
drop table #groupings
This gives me:
EventName VenueEventDate IDs
FootballArena 2013-10-04 09:58:42.6701,2,4
FootballArena 2013-10-04 10:45:42.6703
SoccerField 2013-10-04 09:28:42.6704
SoccerField 2013-10-04 09:38:42.6702
SoccerField 2013-10-04 09:58:42.6701,3
and what I need is:
EventName VenueEventDate IDs
FootballArena 2013-10-04 09:58:42.6701,2,3,4
SoccerField 2013-10-04 09:58:42.6701,2,3,4
I know, group on EventName and Venue instead of time but that's not possible as the Event names in my real life data are all different and I have no option but to group on Venue and DateTime (thus the min(EventName) to homogenize them).
What I don't know how to do is how to do something like:
group by Venue, fuzzy-time(EventDate +/- an hour). Oh, and if I really want the data to be correct I need the incorrect times to yield to the majority. If there are two identical times and one unique time I need the result to be the time the "majority" agrees on.
Ideas?
Erin
October 4, 2013 at 10:26 am
Sorry could you please correct your SQL which populates the table?
October 4, 2013 at 10:42 am
(8 row(s) affected)
Works just fine on my machine. Anyone else having difficulties with the SQL?
October 4, 2013 at 10:52 am
I had to change few things but try this:
create table #groupings (
ID int,
EventName varchar(255),
Venue varchar(255),
EventDate datetime
)
insert into #groupings
select 1,'Football','Arena','2013-10-04 09:58:42.670'
UNion
select 2,'Football','Arena','2013-10-04 09:58:42.670'
union
select 3,'Football','Arena','2013-10-04 10:45:42.670'
union
select 4,'Football','Arena','2013-10-04 09:58:42.670'
union
select 1,'Soccer','Field','2013-10-04 09:58:42.670'
union
select 2,'Soccer','Field','2013-10-04 09:38:42.670'
union
select 3,'Soccer','Field','2013-10-04 09:58:42.670'
union
select 4,'Soccer','Field','2013-10-04 09:28:42.670'
select EventName, Venue, min(EventDate),
substring((select ',' + cast(ID as varchar)
from #groupings g1
where g1.EventName= g2.EventName
order by g1.EventName
for xml path ('')),2, 1000) [Students]
from #groupings g2
group by EventName, Venue
October 4, 2013 at 10:59 am
A valiant effort, DVA2007, but you'll note in my first post that I cannot group on Event Name because they are disparate (sorry I mislead you with my test data).
For example, you've got:
Football
Raiders vs Broncos
Pigskin huddle
Football
as event names instead of the generic 'football' that I've put in place there. Thus the reason for using the min() function.
More ideas.
October 4, 2013 at 11:05 am
Erin Ramsay (10/4/2013)
More ideas.
Normalize your data and then you aren't fighting this. Honestly the biggest challenge you are facing is the lack of normalization.
That being said you have a real challenge here. You might be able to use the logic from today's article about grouping islands of contiguous dates. I am pretty swamped at the moment but if nobody else jumps in to help I can try to take a look at this early next week.
http://www.sqlservercentral.com/articles/T-SQL/71550/[/url]
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
October 4, 2013 at 11:08 am
Nevermind... if I flatten the hour out of the datetime I can do it. Thanks to SQLPrincess for the co-solution!!
October 4, 2013 at 11:13 am
You know, Sean, if I could normalize the data I would but it's third party data and I don't have that option.. good thought though.
October 6, 2013 at 1:07 pm
I know this is a few days late, but I wanted to have a try at this. Here is what I came up with. Does it do what you require?
with countDates as (
select venue,
eventdate,
dateadd(hour, -1, eventdate) beforedate,
dateadd(hour, 1, eventdate) afterdate,
count(*) over (partition by venue, eventdate) repeated
from #groupings
)
select
eventname,
venue,
ca.eventdate
from #groupings g
cross apply (
select top 1 eventdate
from countDates c
where g.venue = c.venue and
g.eventdate between beforedate and afterdate
order by repeated desc) ca
group by
eventname,
venue,
ca.eventdate;
October 6, 2013 at 6:48 pm
It appears I'm also late to the party but perhaps this is another alternative:
SELECT EventName, Venue, EventDate
,IDs=STUFF((
SELECT ', ' + CAST(ID AS VARCHAR(5))
FROM #groupings b
WHERE a.EventName = b.EventName AND a.Venue = b.Venue
ORDER BY ID
FOR XML PATH('')
), 1, 2, '')
FROM
(
SELECT ID, EventName, Venue, rc, EventDate
,mr=MAX(rc) OVER (PARTITION BY EventName, Venue)
FROM
(
SELECT ID, EventName, Venue, EventDate
,rc=COUNT(*) OVER (PARTITION BY EventName, Venue, EventDate)
FROM #groupings
) a
) a
WHERE mr=rc
GROUP BY EventName, Venue, EventDate;
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply