March 13, 2009 at 12:26 pm
I have data in a table in which every 30 min new row is added. The data is almost last 2 yrs.
I want to select last row of each week or month.
Here is sample data
Date Name Available Cosumed
2009-03-06 00:03:26:970 ABC 120 10
2009-03-06 00:18:26:970 ABC 110 5
2009-03-06 00:32:26:970 ABC 105 1
2009-03-06 00:47:26:970 ABC 104 2
2009-03-06 01:03:26:970 ABC 102 13
March 13, 2009 at 12:34 pm
You can get the last date (and time) of a given month or week or whatever by using Max and Group By.
select max(Date)
from dbo.MyTable
group by datepart(year, Date), datepart(month, Date);
That'll give you the last entry in each month, by year. You can then join to that (or use an In statement) to get the rest of the data from the table for that entry.
select *
from dbo.MyTable
where Date in
select max(Date)
from dbo.MyTable
group by datepart(year, Date), datepart(month, Date));
Something like that.
Look up Datepart and Group By in Books Online and you'll be able to figure out the other intervals you want.
Does that help?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 13, 2009 at 1:50 pm
I need last rows of each week and each month.
March 13, 2009 at 5:09 pm
Yes, we understood that the first time. He told you how to do it. I am posting this code only because I suspect English might be difficult for you. Look at the script to put your data in a sample table variable. Please do that before posting to save the time of the people who are helping you.
declare @sample table (xDate datetime, Name varchar(10), Available int, Consumed int)
insert into @sample
select '2009-03-06 00:03:26:970','ABC',120,10 union all
select '2009-03-06 00:18:26:970','ABC',110,5 union all
select '2009-03-06 00:32:26:970','ABC',105,1 union all
select '2009-03-06 00:47:26:970','ABC',104,2 union all
select '2009-03-06 01:03:26:970','ABC',102,13 union all
select '2009-03-12 01:03:26:970','ABC',102,13
;with lastRowOfWeek as
(select datepart(wk,xDate) as xWeek, max(xdate) as maxDate
from @sample
group by datepart(wk,xdate)
)
select xWeek,S.*
from @sample S
join lastRowOfWeek L on L.maxDate = S.xdate
;with lastRowOfMonth as
(select datepart(mm,xdate) as xMonth, max(xdate) as maxDate
from @sample
group by datepart(mm,xdate)
)
select xMonth,S.*
from @sample S
join lastRowOfMonth L on L.maxDate = S.xdate
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 13, 2009 at 5:34 pm
As above I have created a sample table variable to use as an example. The following methos uses Common Table Expression and seems to work well
--Populate a test table with a couple of years of data
DECLARE @myTable TABLE ([Date] DATETIME, Name CHAR(3), Available INT, Consumed INT)
DECLARE @counter INT
DECLARE @startDate DATETIME
SET @counter = 0
SET @startDate = '01 Jan 2008'
WHILE @startDate < '02 Jan 2010'
BEGIN
INSERT @myTable
VALUES(@startDate,'ABC',@counter+50,@counter)
SET @startDate = DATEADD(mi,30,@startDate)
SET @counter = @counter +1
END
--Use CTE to number the rows by day of week and day of month in reverse
;WITH monkey AS
(
SELECT
--For each day number every entry in reverse order so 1 is the last tnrey for that day
ROW_NUMBER() OVER (PARTITION BY CONVERT(VARCHAR, [Date], 112) ORDER BY [Date] DESC) AS [perDayCount],
--For each month number every entry in reverse order so 1 is the last tnrey for that month
ROW_NUMBER() OVER (PARTITION BY LEFT(MONTH([Date]),6) ORDER BY DAY([Date]) DESC) AS [perMonthCount],
* --The rest of the columns of course
FROM @myTable
)
-- Output relevant results specifying if it is the end of the week and/or end of the month
SELECT[Date], Name, Available, Consumed,
CASE DATENAME(dw,[Date])
WHEN 'Sunday' THEN 'Yes'
ELSE 'No'
END AS isLastDayOfWeek,
CASE perMonthCount
WHEN 1 THEN 'Yes'
ELSE 'No'
END AS isLastDayOfMonth
FROMmonkey
WHEREperMonthCount = 1
OR(perDayCount = 1
ANDDATENAME(dw,[Date]) = 'Sunday')
ORDERBY [Date]
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
March 13, 2009 at 5:57 pm
That's an interesting approach, Rob. Mind if I borrow it to run a time trial?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 13, 2009 at 6:09 pm
Not at all. I have found myself mightily impressed with 2k5s CTE, ROW_NUMBER/RANK/NTILE, PIVOT/UNPIVOT functionality and am basically using interesting problems on forums like this as training excercises on how to use them while also being able to help people out using these great tools.
They generally provide efficient, readable and maintainable solutions and the more people who play around the better. I may get stuck some day with a problem you have already had to overcome.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
March 13, 2009 at 6:28 pm
They are certainly powerful and have a lot of applications.
In this instance, using the GROUP BY/MAX is a lot more efficient on the cpu. From a look at the execution plans, it seems to be because your solution is having to work against all 35000+ input rows at every step up until the end. After the initial table scan(s), the GROUP BY/MAX is only dealing with about 100.
It's also worth noting that the GROUP BY/MAX is producing more rows. Part of this is producing 53 weeks for a year, because even a partial week that falls within one year is distinguished from the rest of the week which falls in the following year. This may not be the solution our man is after; it all depends on what is called "end of the week".
/*
--Populate a test table with a couple of years of data
Create Table #mytable ([Date] DATETIME, Name CHAR(3), Available INT, Consumed INT)
DECLARE @counter INT
DECLARE @startDate DATETIME
SET @counter = 0
SET @startDate = '01 Jan 2008'
WHILE @startDate < '02 Jan 2010'
BEGIN
INSERT #MyTable
VALUES(@startDate,'ABC',@counter+50,@counter)
SET @startDate = DATEADD(mi,30,@startDate)
SET @counter = @counter +1
END
*/
set statistics time on;
--Use CTE to number the rows by day of week and day of month in reverse
;WITH monkey AS
(
SELECT
--For each day number every entry in reverse order so 1 is the last tnrey for that day
ROW_NUMBER() OVER (PARTITION BY CONVERT(VARCHAR, [Date], 112) ORDER BY [Date] DESC) AS [perDayCount],
--For each month number every entry in reverse order so 1 is the last tnrey for that month
ROW_NUMBER() OVER (PARTITION BY LEFT(MONTH([Date]),6) ORDER BY DAY([Date]) DESC) AS [perMonthCount],
* --The rest of the columns of course
FROM #MyTable
)
-- Output relevant results specifying if it is the end of the week and/or end of the month
SELECT [Date], Name, Available, Consumed,
CASE DATENAME(dw,[Date])
WHEN 'Sunday' THEN 'Yes'
ELSE 'No'
END AS isLastDayOfWeek,
CASE perMonthCount
WHEN 1 THEN 'Yes'
ELSE 'No'
END AS isLastDayOfMonth
FROM monkey
WHERE perMonthCount = 1
OR (perDayCount = 1
AND DATENAME(dw,[Date]) = 'Sunday')
ORDER BY [Date]
set statistics time off;
set statistics time on;
with lastRowOfWeek as
(select datepart(wk,Date) as xWeek, max(date) as maxDate
from #MyTable
group by datepart(wk,date)
)
, lastRowOfMonth as
(select datepart(mm,date) as xMonth, max(date) as maxDate
from #MyTable
group by datepart(mm,date)
)
select 'week',xWeek,S.*
from #MyTable S
join lastRowOfWeek L on L.maxDate = S.date
union all
select 'month',xMonth,S.*
from #MyTable S
join lastRowOfMonth L on L.maxDate = S.date
set statistics time off;
(116 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 688 ms, elapsed time = 826 ms.
(132 row(s) affected)
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 109 ms, elapsed time = 604 ms.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 13, 2009 at 6:43 pm
Aye, when you replied I was in the middle of starting comparrisons myself but my "server" is a single CPU dual core mini Dell also running sharepoint and exchange as testbeds. The creation of a real table using the while loop took about 5 mins (I wanted to test my time trials on a real table tather than variable).
I've been working with t-sql for a a long time but am newish to 2k5 so this kind of comparison is a good reminder for me so I don't get carried away with "that's cool, I'll do it like this" all the time.
Even with hindsight it is sometimes too easy to go with the first solution you find or work out based on limited tests and not ever realise the production bottlenecks that might arise.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
March 13, 2009 at 7:51 pm
Not a problem at all, Rob. I have some code that uses a tally table to populate sample tables. It will run a lot quicker than a while loop. Be happy to share it if you'd like.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
March 13, 2009 at 8:03 pm
But yeah, your total time is 6-8 times faster than mine
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
March 13, 2009 at 8:07 pm
OK, so your total cost is 6-8 times faster than mine.
Tally table example would be great. PM me since we have already crowded this topic somewhat.
Thought I'd add this one though as a confirm that your solution is the one to go with.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
March 14, 2009 at 11:14 am
Rob Fisk (3/13/2009)
OK, so your total cost is 6-8 times faster than mine.Tally table example would be great. PM me since we have already crowded this topic somewhat.
Thought I'd add this one though as a confirm that your solution is the one to go with.
Nah... post it all here, if you don't mind. Would love to see what you guys come up with.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 14, 2009 at 12:14 pm
Just thought I run this past you guys...
On the subject of generating two years of test data... there's no need for RBAR. The following will run in about 3 seconds or less...
--=================================================================================================
-- Generate two years of test data and store it in a table.
-- The test data represents 1 "sample" every 30 minutes during the given range of dates.
-- Takes about 3 seconds on my 5 year old single cpu P4 1.8Ghz 1GB Ram.
-- This is NOT a part of the solution... it's just something to tes the solution against.
-- Jeff Moden.
--=================================================================================================
--===== Declare some obviously named variables
DECLARE @StartDate DATETIME,
@EndDate DATETIME,
@PeriodCount INT
--===== Assign the start and end dates and determine how many 30 minute periods are in that range
-- not including the end date itself.
SELECT @StartDate = '01 Jan 2008',
@EndDate = DATEADD(yy,2,@StartDate),
@PeriodCount = DATEDIFF(mi,@StartDate,@EndDate)/30
--===== Now, generate all the data in a very high speed fashion using a pseudo-cursor where the
-- row source is provided by a simple self cross join on a table known to have enough rows
-- to do the job (11k rows * 11k rows = 121M rows... should be more than enough)
SELECT TOP (@PeriodCount)
DATEADD(mi,(ROW_NUMBER() OVER (ORDER BY sc1.ID)-1) * 30, @StartDate) AS Date,
'ABC' AS Name,
ABS(CHECKSUM(NEWID()))%100+100 AS Available,
ABS(CHECKSUM(NEWID()))%100 AS Consumed
INTO #TestTable
FROM Master.sys.SysColumns sc1
CROSS JOIN Master.sys.SysColumns sc2
Then, there's no reason to do things like MAX or GROUP BY... go straight to the heart of the problem...
SET STATISTICS TIME ON
SELECT *, DATENAME(dw,date)
FROM #TestTable
WHERE --==== Find end of NEXT month and subtract 30 minutes to find last entry of month
Date = DATEADD(mi,-30,DATEADD(mm,DATEDIFF(mm,0,Date),1))
OR
--==== Find end of NEXT week and subtract 30 minutes to find last entry of month
-- Use the (-1) to control which day of the week is the end of the week (Sunday, in this case)
Date = DATEADD(mi,-30,DATEADD(wk,DATEDIFF(wk,0,Date),1))-1
SET STATISTICS TIME OFF
Here're the results on my box...
SQL Server Execution Times:
CPU time = [font="Arial Black"]79[/font] ms, elapsed time = 334 ms.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 14, 2009 at 12:58 pm
Jeff, two things.
1. Some code is missing from your first example. But I presume you were going to cross join syscolumns to itself. I already sent Rob similar code to generate a million rows fast.
2. Our production databases run almost 24-7. I see lots of transactions that fall just before midnight. I'm not really getting how your WHERE clause in the second example would help me isolate the very last row.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply