March 9, 2009 at 3:49 pm
Hi,
I am trying to write sql formy report and they are asking report to count like Range By Group (means in a bucaket) and AVG.
By bucket
1.0
2.1
3.2
4.3
5.4
6.5-9
7.10-15
how can I make it in my sql?
My sql is:
select datediff(d,convert(datetime,left(Start,10)),convert(datetime,left(Stop,10)))"No of Days"
from STrack
and i am getting results like:
No of Days
0
0
0
1
1
2
2
3
3
4
4
4
5
5
5
5
5
5
..
Please let me know if you need more details.
Thank you!
March 9, 2009 at 5:54 pm
You haven't exactly made your requirements clear, but what I think you are trying to do is to allocate rows in your STrack table to a data bin ("bucket") based on the date difference between the Start and Stop columns measured in days. Presumably, there is some other column, that you haven't mentioned, for which you want to obtain an average value when the rows are grouped using these bins.
I suggest that you first create a temporary table or a table variable or a table valued function that defines the ranges of the data bins. Below, I'm using a table variable named @dataBin.
DECLARE @dataBin TABLE (
BinId int NOT NULL PRIMARY KEY,
MinValue int NOT NULL,
MaxValue int NOT NULL
)
INSERT INTO @dataBin (BinId, MinValue, MaxValue)
SELECT 1, 0, 0 UNION ALL
SELECT 2, 1, 1 UNION ALL
SELECT 3, 2, 2 UNION ALL
SELECT 4, 3, 3 UNION ALL
SELECT 5, 4, 4 UNION ALL
SELECT 6, 5, 9 UNION ALL
SELECT 7, 10, 15
Here's some test data based on some unconstrained assumptions I've made about your table structure.
DECLARE @STrack TABLE (
[Id] int NOT NULL IDENTITY(1,1) PRIMARY KEY,
[Start] varchar(23) NOT NULL,
[Stop] varchar(23) NOT NULL,
[ColumnToAverage] float
)
INSERT INTO @STrack ([Start], [Stop], [ColumnToAverage])
SELECT '2009-02-15T12:00:00', '2009-02-17T12:30:00', 3.5 UNION ALL
SELECT '2009-02-16T12:00:00', '2009-02-17T13:30:00', 6.5 UNION ALL
SELECT '2009-02-17T12:00:00', '2009-02-19T12:30:00', 8.5 UNION ALL
SELECT '2009-02-18T12:00:00', '2009-02-18T12:30:00', 1.3 UNION ALL
SELECT '2009-02-19T12:00:00', '2009-03-01T12:30:00', 5.2 UNION ALL
SELECT '2009-02-20T12:00:00', '2009-02-21T12:30:00', 1.0 UNION ALL
SELECT '2009-02-21T12:00:00', '2009-02-24T12:30:00', 20.0 UNION ALL
SELECT '2009-02-22T12:00:00', '2009-02-28T12:30:00', 14.5 UNION ALL
SELECT '2009-02-23T12:00:00', '2009-02-27T12:30:00', 9.1 UNION ALL
SELECT '2009-02-24T12:00:00', '2009-03-01T12:30:00', 5.0 UNION ALL
SELECT '2009-02-25T12:00:00', '2009-03-01T12:30:00', 3.1
Once you have the @dataBin table, determining the correct data bin is simple. I've put the DATEDIFF expression in a CTE to try to make the JOIN with the @dataBin table in the query as clear as possible, though this is not strictly necessary.
;WITH cteSTrack AS (
SELECT
DATEDIFF(day, CONVERT(datetime, SUBSTRING([Start], 1, 10)), CONVERT(datetime, SUBSTRING([Stop], 1, 10))) AS [NumDays],
[ColumnToAverage]
FROM @STrack
)
SELECT
R.[BinId],
COUNT(S.[ColumnToAverage]) AS [BinnedCount],
AVG(S.[ColumnToAverage]) AS [BinnedAverage]
FROM cteSTrack S
RIGHT OUTER JOIN @dataBin R ON (S.[NumDays] BETWEEN R.[MinValue] AND R.[Maxvalue])
GROUP BY R.[BinId]
ORDER BY R.[BinId]
A couple of points to note:
1) Can the date difference exceed 15 days? If this possible, the above query will ignore any such rows. If you don't want to ignore these rows you might want to create another "catch-all" data bin with MinValue and MaxValue column values of 16 and 2147483647 respectively, or whatever alternative is appropriate for your business rules.
2) What are the datatypes of the STrack.Start and STrack.Stop columns. From your SQL I guess that they are character strings such as varchar. Is there a good reason for this, and why aren't you using the datetime or smalldatetime datatype instead? Are the date/time strings in a consistent format that can be reliably converted to the correct datetime?
EDIT: Changed INNER JOIN to RIGHT OUTER JOIN in query so that all bins are returned in results even if they contain no rows.
March 10, 2009 at 11:40 am
Thanks Andrew.
I might have requirement changes but i will first try your example and let you know.
Thanks very much for your help and suggestion.
Regards,
March 16, 2009 at 2:27 pm
hi,
I have another question related to this:
I am using datediff function in sql server 2005 but its still retrieve the data if its only time different for the same day.
My current data is like:
Start Stop2008-09-01T09:00:01 2008-09-01T10:00:02
2008-09-01T16:00:00 2008-09-01T16:42:42
Now is it same day but diffrent time but its not consolidating both the date/time in as Day 1.
When i am using with my following query:
datediff(d,convert(datetime,left(Start,10)) ,convert(datetime,left(Stop,10)))"No of Days"
Could you please show me how can i handle it?
My start and stop field is Unfortunately defined as VARCHAR data type instead of dateTime.
March 17, 2009 at 6:03 am
Are all your date/times being stored as varchars in the following format?
yyyy-MM-ddTHH:mm:ss
e.g.
2008-09-01T09:00:01
If so, this is a standard (ISO 8601) format that should be reliably converted to a datetime whatever the SQL server configuration / locale setting, so to calculate the difference in days between the two columns, you should simply be able to use the following expression:
DATEDIFF(day, [Start], [Stop]) AS "No of Days"
March 17, 2009 at 9:28 pm
Thank you very much!
I will double check and let you know.
Thanks!
March 18, 2009 at 7:22 am
Thanks Andrew.
My start and stop field is in VARCHAR but its in following format:
2008-09-01T09:00:012008-09-01T10:00:02
2008-09-01T16:00:002008-09-01T16:42:42
2008-09-01T17:00:002008-09-01T17:05:00
2008-09-04T09:00:012008-09-05T08:00:02
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:00
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:00
I tried to run your following sql but its throw me an conversion error:
DATEDIFF(day, [Start], [Stop]) AS "No of Days"
Error:
Msg 241, Level 16, State 1, Line 1
Conversion failed when converting datetime from character string.
March 18, 2009 at 8:32 am
SQL Server won't be able to implicitly convert the varchar strings like the following to the datetime data type.
'2009-01-27T14:32:37.7389657-08:00'
Having a mixture of different formats is one of the problems you can face when storing date/times as varchars. Since you are calculating the time difference in days (i.e. number of day boundaries crossed) then the timezone offset may have an effect on your results.
What sort of datetimes do you want to use? - UTC or local times.
What sort of datetimes are those rows that do not include the milliseconds and timezone offset in the Start and Stop columns? - UTC or local times.
If you decide to use local times throughout, then it should be safe to ignore the UTC time zone offset in the DATEDIFF calculation. This assumes that the Start and Stop columns are stored with the same timezone offset (or absence thereof) for a given row. The following expression truncates the milliseconds and timezone offset from the varchar columns before the implicit conversion to a datetime.
DATEDIFF(day, SUBSTRING([Start], 1, 19), SUBSTRING([Stop], 1, 19))
The reason that I have truncated the milliseconds is that the SQL datetime type precision is only 3 milliseconds and that if the truncated datetime string were '2009-01-27T23:59:59.999', SQL Server will convert this to a datetime of midnight on the next day ('2009-01-28 00:00:00.000'), which would change your results.
If you need to use UTC datetimes in your DATEDIFF expression, then you will need to adjust your datetimes by the number of hours and minutes in the timezone offset. For example, to convert the following to UTC, you should add 8 hours to the local time:
'2009-01-27T18:32:37.7389657-08:00' (local)
becomes
'2009-01-28T02:32:37.7389657Z' (UTC)
i.e. UTC value is on following day, demonstrating that the timezone offset may have an effect on your final results.
March 18, 2009 at 9:00 am
Thanks once again!
Honestly, we will be getting user data from different country so it won't be local ( Ihope i have answered correctly).
But from the recent data, i have the datetime format which i have supplied in my previous email.
We have just started to get the data.
In my report, i have to count some data per day for reporting purpose.
Is it anyway i can control the incoming data?
Regards,
March 18, 2009 at 9:51 am
When I said local time, I meant the local time whereever the source data was generated, not where the SQL Server is located.
For your reporting, do you need the date/times from different rows to be compared, in which case you will definitely need to convert the times to a standard time zone (e.g. UTC), or do you just need to determine the difference between the Start and Stop columns within the same row, in which case you might be able to avoid converting to UTC (or the time zone where the SQL Server is located). However, as I demonstrated in my previous post, the number of day boundaries between 2 times can vary by +/- 1 depending on the time zone in which the DATEDIFF calculation is performed. This variation may not be important to you for reporting purposes, so you can do the DATEDIFF calculation using the local times, ignoring any time zone offset.
March 18, 2009 at 2:03 pm
Thanks much Andrew for your quick response.
Now I got you what you are pointing for local time.
Thanks for your detail clarificaiton.
I just need to determine the difference between the Start and Stop columns within the same row for each day.
So looking to my all the information, you suggest which one is the best to use?
DATEDIFF(day, SUBSTRING([Start], 1, 19), SUBSTRING([Stop], 1, 19)) As "Num of Days"
or
DATEDIFF(day, [Start], [Stop]) AS "Number of Days"
or
datediff(d,convert(datetime,left(Start,10)) ,convert(datetime,left(Stop,10)))"No of Days"
Thanks for your help!
March 18, 2009 at 2:47 pm
OK - if you are happy to ignore any time zone offsets and variations in time zone offset between different rows, and assuming that the date/time strings stored in your table can have one of 2 formats like the following examples:
2008-09-04T09:00:01
2009-01-27T14:32:37.7389657-08:00
then SQL Server can reliably and consistently convert the first 19 characters of both string formats to a datetime value.
Therefore, the number of day boundaries between the Start and Stop columns can be calculated with the following expression:
DATEDIFF(day, SUBSTRING([Start], 1, 19), SUBSTRING([Stop], 1, 19)) As "Num of Days"
March 19, 2009 at 1:12 pm
Thanks once again Andrew!
I tried to run your suggested DateDiff with SUBSTRING but i am confused after looking data, please correct me if i confused myself wrong way without proper understanding.
Please see the existing records for Start and Stop.
StartStop
2008-09-01T09:00:012008-09-01T10:00:02
2008-09-01T16:00:002008-09-01T16:42:42
2008-09-01T17:00:002008-09-01T17:05:00
2008-09-04T09:00:012008-09-05T08:00:02
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:00
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:00
Please see the data after running DateDiff with SUBSTRING:
StartStopNo of Days
2008-09-01T09:00:012008-09-01T10:00:020
2008-09-01T16:00:002008-09-01T16:42:420
2008-09-01T17:00:002008-09-01T17:05:000
2008-09-04T09:00:012008-09-05T08:00:021
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:000
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:000
I have used following:
DATEDIFF(day, SUBSTRING(Start,1,19),SUBSTRING(Stop, 1, 19))"No of Days"
Thanks for your help!
March 20, 2009 at 3:56 am
From your post on 16th March I assumed that you wanted the difference in days between the Start and Stop columns. The TSQL DATEDIFF function will return the number of date boundaries (of the specified type) between the 2 date/times, so
SELECT DATEDIFF(day, '2009-03-18T23:59:59', '2009-03-19T00:00:01')
will return 1 (day) although there are only 2 seconds between the 2 times.
Is this not what you want?
If you want the time difference between the Start and Stop columns in terms of whole days, then you need to do something like this instead:
SELECT DATEDIFF(second, SUBSTRING([Start], 1, 19), SUBSTRING([Stop], 1, 19)) / 86400
There are 86400 seconds in a day, normally.
If you do want the time difference in terms of the whole number of days then you don't need to worry about the time zone complications I mentioned in earlier posts, as long as the Start and Stop columns for any given row have the same time zone offset.
March 20, 2009 at 9:54 am
Thanks once again for your help and detail explaination!
Let me explain again if i have mis communicated.
I have two columns and data are like:
2008-09-01T09:00:012008-09-01T10:00:02
2008-09-01T16:00:002008-09-01T16:42:42
2008-09-01T17:00:002008-09-01T17:05:00
2008-09-04T09:00:012008-09-05T08:00:02
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:00
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:00
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:00
2009-01-28T12:27:10.910625-08:002009-01-28T12:27:37.42625-08:00
2009-01-28T13:07:07.80125-08:002009-01-28T13:07:38.504375-08:00
2009-01-22T10:48:57.9266883-08:002009-01-22T10:49:19.5967602-08:00
2009-01-22T10:49:30.0178731-08:002009-01-22T10:50:43.0458513-08:00
2009-01-26T11:51:22.9243673-08:002009-01-26T11:52:46.7363309-08:00
2009-01-26T11:54:00.6108581-08:002009-01-26T12:03:19.0291592-08:00
2009-01-26T12:06:13.2467942-08:002009-01-26T13:08:10.285505-08:00
2009-01-26T13:12:15.6589346-08:002009-01-26T13:12:31.5494579-08:00
2009-01-26T15:25:05.4829274-08:002009-01-26T15:54:57.3152096-08:00
2009-01-26T15:55:14.4557249-08:002009-01-26T16:27:32.6933201-08:00
2009-01-26T16:28:24.4429889-08:002009-01-26T16:28:29.0523344-08:00
I need to count something with and without joining other tables for different usage of activities like:
How many types, how many times Printing per day.
So i need to use start and stop columns from that table with joining other table to track how many times printed/day.
I hope i explained correctly.
Please let me know me that if i have different timing for the same day then it will combined (bundled in one) or not?
Because when i am running your query DateDiff then i am getting this results:
Query:
====
SELECT DATEDIFF(second, SUBSTRING([Start], 1, 19), SUBSTRING([Stop], 1, 19)) / 86400
Start Stop(No Of Days)
2008-09-01T09:00:012008-09-01T10:00:020
2008-09-01T16:00:002008-09-01T16:42:420
2008-09-01T17:00:002008-09-01T17:05:000
2008-09-04T09:00:012008-09-05T08:00:020
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:000
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:000
2009-01-28T12:27:10.910625-08:002009-01-28T12:27:37.42625-08:000
2009-01-28T13:07:07.80125-08:002009-01-28T13:07:38.504375-08:000
2009-01-22T10:48:57.9266883-08:002009-01-22T10:49:19.5967602-08:000
2009-01-22T10:49:30.0178731-08:002009-01-22T10:50:43.0458513-08:000
2009-01-26T11:51:22.9243673-08:002009-01-26T11:52:46.7363309-08:000
2009-01-26T11:54:00.6108581-08:002009-01-26T12:03:19.0291592-08:000
2009-01-26T12:06:13.2467942-08:002009-01-26T13:08:10.285505-08:000
2009-01-26T13:12:15.6589346-08:002009-01-26T13:12:31.5494579-08:000
2009-01-26T15:25:05.4829274-08:002009-01-26T15:54:57.3152096-08:000
2009-01-26T15:55:14.4557249-08:002009-01-26T16:27:32.6933201-08:000
2009-01-26T16:28:24.4429889-08:002009-01-26T16:28:29.0523344-08:000
When I run this query:
DATEDIFF(day, SUBSTRING([Start], 1, 19), SUBSTRING([Stop], 1, 19)) As "Num of Days"
Then following data:
DLSPrinterStartDLSPrinterStopNum of Days
2008-09-01T09:00:012008-09-01T10:00:020
2008-09-01T16:00:002008-09-01T16:42:420
2008-09-01T17:00:002008-09-01T17:05:000
2008-09-04T09:00:012008-09-05T08:00:021
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:37.7389657-08:002009-01-27T14:32:37.7389657-08:000
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:000
2009-01-27T14:32:35.6295907-08:002009-01-27T14:32:35.6295907-08:000
2009-01-28T12:27:10.910625-08:002009-01-28T12:27:37.42625-08:000
2009-01-28T13:07:07.80125-08:002009-01-28T13:07:38.504375-08:000
2009-01-22T10:48:57.9266883-08:002009-01-22T10:49:19.5967602-08:000
2009-01-22T10:49:30.0178731-08:002009-01-22T10:50:43.0458513-08:000
2009-01-26T11:51:22.9243673-08:002009-01-26T11:52:46.7363309-08:000
2009-01-26T11:54:00.6108581-08:002009-01-26T12:03:19.0291592-08:000
2009-01-26T12:06:13.2467942-08:002009-01-26T13:08:10.285505-08:000
2009-01-26T13:12:15.6589346-08:002009-01-26T13:12:31.5494579-08:000
2009-01-26T15:25:05.4829274-08:002009-01-26T15:54:57.3152096-08:000
2009-01-26T15:55:14.4557249-08:002009-01-26T16:27:32.6933201-08:000
2009-01-26T16:28:24.4429889-08:002009-01-26T16:28:29.0523344-08:000
2009-01-28T13:07:53.535625-08:002009-01-28T13:08:27.27-08:000
2009-01-20T11:37:59.6777837-08:002009-01-20T17:38:47.6750383-08:000
2009-01-23T10:44:49.1436605-08:002009-01-23T10:44:54.0968172-08:000
Thanks,
Viewing 15 posts - 1 through 15 (of 29 total)
You must be logged in to reply to this topic. Login to reply