July 22, 2010 at 5:24 pm
Hi
How can I convert daily data into weekly data?
My Table is:
CREATE TABLE DSA
(
[CP Code] varchar(50),
[Date] date,
[Total Pageviews] bigint,
[Total Volume in MB] varchar(50)
)
Data in Table is:
CP Code  Date        Total Pageviews  Total Volume in MB
51014    2010-06-27  2623221          99636.0095
51014    2010-06-28  1985718          88588.3845
51014    2010-06-29  1815712          81359.7489
51014    2010-06-30  2126104          93943.4801
51014    2010-07-01  1702672          82257.0021
51014    2010-07-02  1666468          79573.3334
51014    2010-07-03  2019123          86187.5129
51014    2010-07-04  1783373          82654.0857
51014    2010-07-05  1890672          86256.3637
51014    2010-07-06  2424314          98458.3025
51014    2010-07-07  2197743          89729.0344
51014    2010-07-08  2097884          93912.9365
51014    2010-07-09  2029226          87494.0143
51014    2010-07-10  1989296          84737.8993
51014    2010-07-11  1833109          80668.2048
51014    2010-07-12  2398424          97892.0837
51014    2010-07-13  2365550          100190.4429
51014    2010-07-14  1810926          85285.4454
51014    2010-07-15  2483211          100816.9027
51014    2010-07-16  2327007          95540.1672
51014    2010-07-17  1758154          81510.6331
53037    2010-06-27  6059             117383.7422
53037    2010-06-28  6310             124464.7691
53037    2010-06-29  6394             115046.3884
53037    2010-06-30  4627             111166.2992
53037    2010-07-01  3402             101552.575
53037    2010-07-02  3115             93708.7741
53037    2010-07-03  3032             87156.6502
53037    2010-07-04  2995             92872.0334
53037    2010-07-05  3122             102125.7448
53037    2010-07-07  3040             94791.2053
53037    2010-07-08  3638             106615.8714
53037    2010-07-09  3767             97252.8935
53037    2010-07-10  3745             94446.8815
53037    2010-07-11  3576             93185.0068
53037    2010-07-12  3860             104106.1198
53037    2010-07-13  3123             96444.6582
53037    2010-07-14  3458             98725.4491
53037    2010-07-15  3516             96911.039
53037    2010-07-16  3233             90902.3766
53037    2010-07-17  3176             90590.8028
How can I convert or sum this data on a weekly basis?
This is how I want it:
CP Code  Date        Total Pageviews  Total Volume in MB
51014    2010-06-27  Total            Total
51014    2010-07-04  Total            Total
53037    2010-06-27  Total            Total
53037    2010-07-04  Total            Total
"Date" is the Start Date for that week.
"Total" is the Sum of "Total Pageviews" and "Total Volume in MB" for that week for that particular "CP Code".
Thanks
July 22, 2010 at 6:21 pm
First, thank you for posting the table DDL and expected output.
Second, it would really help if you were to actually post insert statements for your data. Please read the article in the first link in my signature for how to automatically generate this. (FYI, it took me longer to get a working environment with your data set up than what it took to code a solution for you.)
And this should do what you're looking for. You may need to adjust your DATEFIRST setting (see the note after the query).
;WITH CTE AS
(
select [CP Code],
StartDate = DateAdd(day, -DatePart(weekday, [Date])+1, [Date]),
[Total Pageviews],
MB = convert(numeric(20,4), [Total Volume in MB])
from DSA
)
SELECT [CP Code],
StartDate,
PageViews = sum([Total Pageviews]),
MB = sum(MB)
FROM CTE
GROUP BY [CP Code],StartDate
order by [CP Code],StartDate
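Regarding that DATEFIRST note: DATEPART(weekday, ...) shifts with the session's setting, so if the StartDate values don't land on Sundays you can pin it explicitly. A quick sketch, assuming you want the Sunday week starts shown in your expected output:

-- Sketch: check what the session is currently using, then pin it.
SELECT @@DATEFIRST AS CurrentDateFirst;   -- 7 (Sunday) on a default us_english install

SET DATEFIRST 7;   -- makes DATEPART(weekday, ...) return 1 for Sunday,
                   -- so the StartDate expression above lands on Sundays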
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 22, 2010 at 6:23 pm
Just for future reference, it helps if you put your data in a usable format like this:
CREATE TABLE #Temp
(
[CP Code] varchar(50),
[Date] date,
[Total Pageviews] bigint,
[Total Volume in MB] decimal(16,4)
)
INSERT INTO #Temp
SELECT 51014, '2010-06-27', 2623221, 99636.0095 UNION ALL
SELECT 51014, '2010-06-28', 1985718, 88588.3845 UNION ALL
SELECT 51014, '2010-06-29', 1815712, 81359.7489 UNION ALL
SELECT 51014, '2010-06-30', 2126104, 93943.4801 UNION ALL
SELECT 51014, '2010-07-01', 1702672, 82257.0021 UNION ALL
SELECT 51014, '2010-07-02', 1666468, 79573.3334 UNION ALL
SELECT 51014, '2010-07-03', 2019123, 86187.5129 UNION ALL
SELECT 51014, '2010-07-04', 1783373, 82654.0857 UNION ALL
SELECT 51014, '2010-07-05', 1890672, 86256.3637 UNION ALL
SELECT 51014, '2010-07-06', 2424314, 98458.3025 UNION ALL
SELECT 51014, '2010-07-07', 2197743, 89729.0344 UNION ALL
SELECT 51014, '2010-07-08', 2097884, 93912.9365 UNION ALL
SELECT 51014, '2010-07-09', 2029226, 87494.0143 UNION ALL
SELECT 51014, '2010-07-10', 1989296, 84737.8993 UNION ALL
SELECT 51014, '2010-07-11', 1833109, 80668.2048 UNION ALL
SELECT 51014, '2010-07-12', 2398424, 97892.0837 UNION ALL
SELECT 51014, '2010-07-13', 2365550, 100190.4429 UNION ALL
SELECT 51014, '2010-07-14', 1810926, 85285.4454 UNION ALL
SELECT 51014, '2010-07-15', 2483211, 100816.9027 UNION ALL
SELECT 51014, '2010-07-16', 2327007, 95540.1672 UNION ALL
SELECT 51014, '2010-07-17', 1758154, 81510.6331 UNION ALL
SELECT 53037, '2010-06-27', 6059, 117383.7422 UNION ALL
SELECT 53037, '2010-06-28', 6310, 124464.7691 UNION ALL
SELECT 53037, '2010-06-29', 6394, 115046.3884 UNION ALL
SELECT 53037, '2010-06-30', 4627, 111166.2992 UNION ALL
SELECT 53037, '2010-07-01', 3402, 101552.575 UNION ALL
SELECT 53037, '2010-07-02', 3115, 93708.7741 UNION ALL
SELECT 53037, '2010-07-03', 3032, 87156.6502 UNION ALL
SELECT 53037, '2010-07-04', 2995, 92872.0334 UNION ALL
SELECT 53037, '2010-07-05', 3122, 102125.7448 UNION ALL
SELECT 53037, '2010-07-07', 3040, 94791.2053 UNION ALL
SELECT 53037, '2010-07-08', 3638, 106615.8714 UNION ALL
SELECT 53037, '2010-07-09', 3767, 97252.8935 UNION ALL
SELECT 53037, '2010-07-10', 3745, 94446.8815 UNION ALL
SELECT 53037, '2010-07-11', 3576, 93185.0068 UNION ALL
SELECT 53037, '2010-07-12', 3860, 104106.1198 UNION ALL
SELECT 53037, '2010-07-13', 3123, 96444.6582 UNION ALL
SELECT 53037, '2010-07-14', 3458, 98725.4491 UNION ALL
SELECT 53037, '2010-07-15', 3516, 96911.039 UNION ALL
SELECT 53037, '2010-07-16', 3233, 90902.3766 UNION ALL
SELECT 53037, '2010-07-17', 3176, 90590.8028
Also, please note that I changed the [Total Volume in MB] from a varchar to a decimal. It's really not a good idea to store numeric data in varchars.
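If you want to fix the real DSA table as well, something like this would do it (a sketch only; it assumes every existing value in the column converts cleanly to a number):

-- Sketch: spot-check for values that won't convert, then change the type in place.
-- ISNUMERIC is only a rough check, but it will catch obvious garbage.
SELECT [Total Volume in MB]
FROM DSA
WHERE ISNUMERIC([Total Volume in MB]) = 0;   -- expect zero rows

ALTER TABLE DSA
    ALTER COLUMN [Total Volume in MB] decimal(16,4) NULL;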
And the query:
SELECT [CP Code],
DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk,0,[Date]), 0)) as [Date],
SUM([Total Pageviews]) as [Total Pageviews],
SUM([Total Volume in MB]) as [Total Volume in MB]
FROM #Temp
GROUP BY [CP Code], DATEDIFF(wk,0,[Date])
ORDER BY [CP Code], DATEDIFF(wk,0,[Date])
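For anyone wondering what that DATEADD/DATEDIFF pair is doing, here's a quick worked sketch (the literal results are what the expressions return against the 1900-01-01 base date):

-- Sketch: DATEDIFF(wk, 0, [Date]) counts week boundaries since 1900-01-01
-- (DATEDIFF always treats Sunday as the week boundary, regardless of DATEFIRST).
-- DATEADD(wk, ..., 0) lands back on that week's Monday, because day 0 was a Monday,
-- and the outer DATEADD(dd, -1, ...) shifts it to the preceding Sunday.
SELECT DATEDIFF(wk, 0, '2010-07-01')                                   AS WeeksSince1900,  -- 5765
       DATEADD(wk, DATEDIFF(wk, 0, '2010-07-01'), 0)                   AS MondayOfWeek,    -- 2010-06-28
       DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk, 0, '2010-07-01'), 0))  AS SundayStart;     -- 2010-06-27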
If you want more information about date functions, Seth Phelabaum wrote a nice article on this site and it can be found here.
July 22, 2010 at 7:45 pm
Wayne and BT...
Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2010 at 8:51 pm
Jeff Moden (7/22/2010)
Wayne and BT...Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂
Well, it depends! That would obviously be a problem if reporting on missing weeks is part of his requirement, but nothing a little join on a tally wouldn't fix.
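Something along these lines would cover it, as a rough sketch (it assumes the #Temp table from my earlier post and the same Sunday week-start convention; for a wider date range you'd use a real Tally/Calendar table instead of the little recursive CTE):

-- Rough sketch: build every Sunday week-start between the min and max dates,
-- cross join to the CP Codes, then LEFT JOIN the detail so that weeks with
-- no rows still show up with zero totals. Fine for a few weeks of data;
-- the default MAXRECURSION of 100 is plenty here.
;WITH Bounds AS
(
    SELECT MinStart = MIN(DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk, 0, [Date]), 0))),
           MaxStart = MAX(DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk, 0, [Date]), 0)))
    FROM #Temp
),
Weeks AS
(
    SELECT WeekStart = MinStart FROM Bounds
    UNION ALL
    SELECT DATEADD(wk, 1, w.WeekStart)
    FROM Weeks w CROSS JOIN Bounds b
    WHERE w.WeekStart < b.MaxStart
)
SELECT c.[CP Code],
       w.WeekStart,
       [Total Pageviews]    = ISNULL(SUM(t.[Total Pageviews]), 0),
       [Total Volume in MB] = ISNULL(SUM(t.[Total Volume in MB]), 0)
FROM Weeks w
CROSS JOIN (SELECT DISTINCT [CP Code] FROM #Temp) c
LEFT JOIN #Temp t
       ON t.[CP Code] = c.[CP Code]
      AND t.[Date] >= w.WeekStart
      AND t.[Date] <  DATEADD(wk, 1, w.WeekStart)
GROUP BY c.[CP Code], w.WeekStart
ORDER BY c.[CP Code], w.WeekStart;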
July 22, 2010 at 9:00 pm
bteraberry (7/22/2010)
Jeff Moden (7/22/2010)
Wayne and BT...Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂
Well, it depends! That would obviously be a problem if reporting on missing weeks is part of his requirement, but nothing a little join on a tally wouldn't fix.
Heh... of course it depends. So let's pretend it's your neck on the line... would you make it so it catches missing weeks even if it weren't written into a requirement? I know I would. At least I would ask the question. So why not give the OP the same break and ask the question? 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2010 at 9:40 pm
Joe Celko (7/22/2010)
Thank you for the DDL, but it stinks....
I can tell you never read a book on data modeling.
Heh... there ya go being all friendly like again, Joe. 😛 You really need to work on your people skills. I'm sure there's an ISO document on the subject. :w00t:
Seriously. Technically, you're mostly correct but your habit is to kill the patient. People might actually listen to you (translation: Buy more of your books ;-)) if you left the daggers at home, Joe. I know killing the patient can be fun but it's more challenging to teach instead. 😛
pageview_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
Just a suggestion... I wouldn't put a default on this column. It allows the user to make the huge mistake of not knowing the correct value for this important column.
You do not convert it; you aggregate it. Very different mindset. What kind of weeks? Fiscal? ISO? US? Let's use ISO-8601 standard week names.
CREATE TABLE WeekRanges
(week_name CHAR() NOT NULL PRIMARY KEY
 CHECK (week_name LIKE '[12][0-9][0-9][0-9]W[0-9][0-9][0-9]'),
 week_start_date DATE NOT NULL UNIQUE,
 week_end_date DATE NOT NULL,
 CHECK (week_start_date < week_end_date)
);
Is that check for week number correct? 3 Digits? If so, that would be a good reason to not follow ISO standards. :hehe: And CHAR()? Since when did you fall into the habit of trusting defaults?
Also, "week_end_date" should be "next_week_start_date" so that the WeekRange table works with dates AND times. Not all queries are going to be "date-only". Of course, if you actually did that, then the check would be (week_start_date = DATEADD(dd, -7, next_week_start_date) instead of allowing the user to enter ANY date less than the end date. While were at it, you should also do a check to validate the start date.
Now do a JOIN and use a BETWEEN predicate:
SELECT W.week_name, DSA.cp_code,
SUM (pageview_cnt) AS wkly_pageview_cnt,
SUM (something_volume) AS wkly_something_volume
FROM DSA, WeekRanges AS W
[font="Arial Black"]WHERE DSA.pageview_date BETWEEN W.week_start_date AND W.eeek_end_date[/font]
GROUP BY W.week_name, DSA.cp_code;
Ummmm... nope. Like I said, the WeekRange table should be able to handle times as well. BETWEEN is a great way to get yourself in trouble with dates and times. The WHERE line above should be...
WHERE DSA.pageview_date >= W.week_start_date
AND DSA.pageview_date < W.next_week_start_date
Of course, that whole query needs a bit of rework because the requirement is to return the starting date of the week... not the week name as you have it.
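Roughly like this, as a sketch only (sticking with the column names from the quoted example and the next_week_start_date change above):

-- Sketch: same rollup, but grouped on and returning the week's start date,
-- with a half-open range instead of BETWEEN so times can't fall through the cracks.
SELECT DSA.cp_code,
       W.week_start_date,
       SUM(DSA.pageview_cnt)     AS wkly_pageview_cnt,
       SUM(DSA.something_volume) AS wkly_something_volume
FROM DSA
JOIN WeekRanges AS W
  ON DSA.pageview_date >= W.week_start_date
 AND DSA.pageview_date <  W.next_week_start_date
GROUP BY DSA.cp_code, W.week_start_date
ORDER BY DSA.cp_code, W.week_start_date;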
If you're going to be nasty to people, you should probably get stuff like that right, ya know? 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
July 23, 2010 at 12:13 pm
Hi
First of all, thank you very much guys for your help. It looks like it's working.
Actually, I'm importing data from a CSV file with BULK INSERT, and the problem is that the CSV file requires a lot of cleanup before I can BULK INSERT it into the table. Even after I clean it up, BULK INSERT still isn't able to load any data. So I first save it as a tab-delimited .txt file, create a table with all varchar data types, and then load the data. After that, I change the data types and column names. So my first priority was just to load the data and then do the other tasks. My table actually has 34 columns and about 250 rows; I cut it short here just to understand how this is done. My data doesn't have any unique values, and hence no primary key. I tried using an IDENTITY column, but I have to load data every week and my table keeps growing, so after adding the IDENTITY column I can no longer BULK INSERT more data into it, because the IDENTITY column isn't in my CSV file. I tried using a format file, but for some reason it's not working; it gives me an "Invalid # of Columns" error.
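The pattern I'm using today looks roughly like this (just a sketch; the path is a placeholder and the column list is cut down, my real table has 34 columns):

-- Rough sketch of the current staging approach: everything lands as varchar
-- in a staging table, then gets converted into the typed table afterwards.
CREATE TABLE DSA_Staging
(
    [CP Code]            varchar(50),
    [Date]               varchar(50),
    [Total Pageviews]    varchar(50),
    [Total Volume in MB] varchar(50)
);

BULK INSERT DSA_Staging
FROM 'C:\import\weekly_extract.txt'          -- placeholder path
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 2);

INSERT INTO DSA ([CP Code], [Date], [Total Pageviews], [Total Volume in MB])
SELECT [CP Code],
       CONVERT(date, [Date]),
       CONVERT(bigint, [Total Pageviews]),
       [Total Volume in MB]                  -- still varchar in DSA as posted
FROM DSA_Staging;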
I'm new to all this and have really been looking for a better way to do it. I'm using this data to build reports in Report Builder, where weekly data is required. I also wasn't sure whether all these tasks could be done in Report Builder 1.0, so I decided to do it in SQL.
Thanks for your help, guys... I'll ask if I have any other problems.
July 23, 2010 at 2:44 pm
Hi
When I run this query as:
CREATE VIEW WeeklyData
AS
WITH CTE AS
(
select [CP Code],
StartDate = DateAdd(day, -DatePart(weekday, [Date])+1, [Date]),
[Total Pageviews],
MB = convert(numeric(20,4), [Total Volume in MB])
from DSA
)
SELECT [CP Code],
StartDate,
PageViews = sum([Total Pageviews]),
MB = sum(MB)
FROM CTE
GROUP BY [CP Code],StartDate
It gives me the required result, i.e., on a weekly basis.
But I want to remove "MB = convert(numeric(20,4), [Total Volume in MB])" and "MB = sum(MB)" from the query, as I don't want that column.
Can this query be used with any number of columns?
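In other words, something like this (just a sketch of what I'm after):

-- Sketch: the same view without the MB column.
CREATE VIEW WeeklyData
AS
WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DateAdd(day, -DatePart(weekday, [Date]) + 1, [Date]),
           [Total Pageviews]
    FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews])
FROM CTE
GROUP BY [CP Code], StartDate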
Thanks
July 23, 2010 at 3:56 pm
Novicejatt (7/23/2010)
Actually, I'm importing data from a CSV file with BULK INSERT... My data doesn't have any unique values, and hence no primary key... I tried using a format file, but for some reason it's not working; it gives me an "Invalid # of Columns" error.
The invalid number of columns error can be fixed in the format file, but it's actually easier to create a view that looks like the file and import into the view. As far as a PK goes, the view approach would still let you keep an IDENTITY column on the table, but don't the "CP Code" and "Date" columns produce unique entries?
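Something like this rough sketch (the table, view, and file names are made up for illustration):

-- Sketch of the "import into a view" trick: the table keeps its IDENTITY column,
-- the view exposes only the columns that exist in the file, and BULK INSERT
-- targets the view, so there's no column-count mismatch.
CREATE TABLE DSA2
(
    RowID                int IDENTITY(1,1) PRIMARY KEY,
    [CP Code]            varchar(50),
    [Date]               date,
    [Total Pageviews]    bigint,
    [Total Volume in MB] decimal(16,4)
);
GO
CREATE VIEW DSA2_Import
AS
SELECT [CP Code], [Date], [Total Pageviews], [Total Volume in MB]
FROM DSA2;
GO
BULK INSERT DSA2_Import
FROM 'C:\import\weekly_extract.txt'          -- placeholder path
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 2);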
--Jeff Moden
Change is inevitable... Change for the better is not.
July 23, 2010 at 4:01 pm
I get data daily for the same code, so it repeats, and I have 30 such codes, which means I have 30 codes with the same date. So no single column has unique values.
Thanks
July 25, 2010 at 1:03 pm
Novicejatt (7/23/2010)
I get data daily for the same code, so it repeats, and I have 30 such codes, which means I have 30 codes with the same date. So no single column has unique values. Thanks
Sorry for the delay...
You've lost me just a bit there... are you saying that combinations of CP_Code and Date form duplicates in the data?
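A quick way to check that (a sketch against the DSA table you posted):

-- Sketch: any row returned here means a [CP Code] + [Date] combination really is
-- duplicated; no rows means that pair of columns would work as a primary key.
SELECT [CP Code], [Date], COUNT(*) AS RowsPerPair
FROM DSA
GROUP BY [CP Code], [Date]
HAVING COUNT(*) > 1;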
--Jeff Moden
Change is inevitable... Change for the better is not.