April 19, 2006 at 9:39 am
Description: I am selecting transactions by customeres who were added within the past 10 years. Then group them by add_date in 12 month groups. Then group transactions by transaction_date. Grouping starts at current month - 12 months
Ex: Customer add_date Group
Transaction_Date Group
So, Customers with an add_date between 3/31/2005 and 3/31/2006 would be grouped together. then their transactions would be group by transaction_date between 3/31/2005 and 3/31/2006, 3/31/2004 and 3/31/2005, etc... going back 10 years. Obviously the most current Customers added would have fewer transaction groups.
Table look like this: BTW - our production transaction table has over 12 million records
Customer_Table (Customer_ID, Add_Date)
Transaction_Table (Transaction_ID, Customer_ID, Transaction_Date, Transaction_Amount)
My query looks like this: Query takes nearly 2 minutes to complete.
Select Disctinct SUM(t.Transaction_Amount) as Amount
-- This code assigns a group number. For example customers added from 3/31/2005 to 4/1/2006 would be assigned a CustomerAddGroup number of 0.
-- The customers added 2 years ago would have a CustomerAddGroup number of 1 and so on...
, FLOOR((DATEADIFF(MM, c.Add_Date, '3/31/2006')-1 / 12) as CustomerAddGroup
, FLOOR((DATEDIFF(MM, t.Transaction_Date, '3/31/2006')-1 / 12) as TransactionDateGroup
From Customer_Table c
Inner Join Transaction_Table t
on c.Customer_ID = t.Customer_ID
Where c.Add_Date <= '3/31/2006'
And c.Add_Date >= dateadd(yy, -10, '3/31/2006')
And t.Transaction_Date <= '3/31/2006'
And t.Transaction_Date >= DateAdd(yy, -10, '3/31/2006')
Group BY FLOOR((DATEADIFF(MM, c.Add_Date, '3/31/2006')-1 / 12) as CustomerAddGroup
, FLOOR((DATEDIFF(MM, t.Transaction_Date, '3/31/2006')-1 / 12) as TransactionDateGroup
April 19, 2006 at 10:00 am
Some Pointers here:
1. All Dates should be converted to datetime variable prior to query execution.
2. Make sure you have indexes in your searched (and joined) columns
3. "Distinct" is not necessary - in my opinion- for that query
4. Not sure what you need to accomplish with: FLOOR((DATEADIFF(MM, c.Add_Date, '3/31/2006')-1 / 12). Can you be more explicit here? Don't you have a missing parenthesis there ?
* Noel
April 19, 2006 at 10:14 am
here is an example of how I would go about it:
Declare @Firstdate datetime, @LastDate datetime
select @FirstDate = dateadd( mm, -12 *10, '20063103'), @LastDate = '20063103'
Select SUM(t.Transaction_Amount) as Amount
, month( c.Add_Date) as CustomerAddGroup
, convert(datetime,datediff(day,0,t.Transaction_Date)) as TransactionDateGroup
From Customer_Table c
Inner Join Transaction_Table t
on c.Customer_ID = t.Customer_ID
Where c.Add_Date <= @LastDate
And c.Add_Date >= @FirstDate
And t.Transaction_Date <= @LastDate
And t.Transaction_Date >= @FirstDate
Group BY month(c.Add_Date)
, convert(datetime,datediff(day,0,t.Transaction_Date))
* Noel
April 19, 2006 at 12:46 pm
I appreciate the response, but i think your solution wont work for us because you are grouping by month. I need to group the results by groups of customers within 12 months then their transactions over a 10 year period in groups of 12 months.
It would look like this in excel. Each group is a sliding 12 month grouping. So for all customers added between 03/06-03/05 would have there transactions listed below that column grouped by transaction group.
Customer Add Groups
Transactions 03/06-03/05 03/05-03/04 03/04-03/03 03/03-03/02
By Customer Add Group
03/06-03/05 100 200 300 400
03/05-03/04 0 100 200 300
03/04-03/03 0 0 100 200
03/03-03/02 0 0 0 100
April 19, 2006 at 2:01 pm
Ok Now I think I understand what you need.
If you can not persist the calculation some how you won't be able to use an index on the transaction_date nor on the add_date for grouping purposes.
If you have already implemeted all the ponts in my previous post all that is left is the implemetation of some sort of persistence.
Create a calulated column or an indexed view that implement those values and if you go with the calculated column don't forget the index on it. At that point your query should look like:
Select SUM(t.Transaction_Amount) as Amount
, Calculated_Customer_group as CustomerAddGroup
, Calculated_transaction_group as TransactionDateGroup
From Customer_Table c
Inner Join Transaction_Table t
on c.Customer_ID = t.Customer_ID
Where c.Add_Date <= @LastDate
And c.Add_Date >= @FirstDate
And t.Transaction_Date <= @LastDate
And t.Transaction_Date >= @FirstDate
Group BY Calculated_Customer_group
, Calculated_transaction_group
hth
* Noel
April 19, 2006 at 2:22 pm
Thanks Noeld. My original system design was to encapsulate this in a process and archive the results. Unfortunately, the project requestor wants to be able to run this for any date range using a web application. Which requires the CustomerAddGroup and TransactionGroups to slide. The 12 month window always slides, so it's not static.
April 19, 2006 at 2:37 pm
I thought about that too and you can make it static by simply defining group date 0 as the earliest date in your db and increment from that to current date. This is the opposit of what you have now, you can always increment and your 0 reference is static
Then it will be extremelly easy to persist those values all needed afterwards is to generate the translation lookup (translate from 'GroupNumber' to 'DateRange') which is also static by the way
* Noel
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply