Average number of days between orders

Question

Post reply

Average number of days between orders

kat35601

SSC Eights!

Points: 825
More actions
April 7, 2015 at 1:39 pm

#302825

I need to calculate the average number of days between customer orders.
my table has customer number and order number and the order date.
I would like to end with customer number last, order date and the average number of days between orders for each customer.
cust_idorder_numorder_date
HOLLGCAB 119482 02/27/2015
JILCO 119484 02/27/2015
KEY 119491 02/27/2015
TURNER 119496 02/27/2015
KEY 119499 02/27/2015
KEY 119500 02/27/2015
EASTSIDE 119486 02/27/2015
FE BELLA 119485 02/27/2015
EASTSIDE 119487 02/27/2015
TYCA 119488 02/27/2015
EASTSIDE 119489 02/27/2015
BOVE 119490 02/27/2015
HENRY 119492 02/27/2015
SUNLINE 119495 02/27/2015
BRUNO 119365 03/02/2015

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Eirikur Eiriksson SSC Guru Points: 182916 More actions · Answer 1

Quick suggestion, can you edit your post so the DDL and sample data is runnable? (create table and insert)

😎

pietlinden SSC Guru Points: 63512 More actions · Answer 2

Found an article at SQLAuthority that covers your options:

Simulate LEAD() and LAG() without Using SQL Server 2012 Analytic Function[/url]

Alan Burstein SSC Guru Points: 61141 More actions · Answer 3

pietlinden (4/7/2015)
Found an article at SQLAuthority that covers your options:
Simulate LEAD() and LAG() without Using SQL Server 2012 Analytic Function[/url]

This is 2008 forum 😉

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

kat35601 SSC Eights! Points: 825 More actions · Answer 4

the results should be on a per customer id bases not between customers so each customer id will have its own average.

pietlinden SSC Guru Points: 63512 More actions · Answer 5

Alan,

Yup, I noticed that... The link I posted is to an article by Pinal Dave simulating LAG/LEAD in 2008. Otherwise, I would have posted a query.

I guess he could also dig through Itzik's book on SQL 2008....

Luis Cazares SSC Guru Points: 183706 More actions · Answer 6

Alan.B (4/7/2015)
pietlinden (4/7/2015)
Found an article at SQLAuthority that covers your options:
Simulate LEAD() and LAG() without Using SQL Server 2012 Analytic Function[/url]
This is 2008 forum 😉

That's why it simulates LEAD and LAG instead of using them. 😉

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Alan Burstein SSC Guru Points: 61141 More actions · Answer 7

Luis Cazares (4/7/2015)
Alan.B (4/7/2015)
pietlinden (4/7/2015)
Found an article at SQLAuthority that covers your options:
Simulate LEAD() and LAG() without Using SQL Server 2012 Analytic Function[/url]
This is 2008 forum 😉
That's why it simulates LEAD and LAG instead of using them. 😉

I have smacked myself in the head to save everyone else the trouble. (not my day)...

Here's my solution (I have not looked at the link [obviously])

/* SAMPLE DATE */

-- create an orders table

DECLARE @orders TABLE (cust_id int,order_num varchar(36),order_date date);

-- note: don't need an orderid but I'm including one anyhow

INSERT INTO @orders VALUES

(1, newid(), '1/1/2014'),

(1, newid(), '1/6/2014'),

(1, newid(), '1/12/2014'),

(1, newid(), '1/15/2014'),

(2, newid(), '1/1/2014'),

(2, newid(), '1/10/2014'),

(2, newid(), '1/21/2014'),

(3, newid(), '1/1/2014'),

(3, newid(), '1/20/2014'),

(3, newid(), '1/30/2014');

--SELECT * FROM @orders;

/* SOLUTION */

WITH

orders AS -- add a rownumber to @orders

(

SELECT rn = ROW_NUMBER() over (partition by cust_id order by order_date), *

FROM @orders

),

prev_orders AS -- get previous orders for that customer

(

SELECT o1.*, order_date_prev = o2.order_date

FROM orders o1

JOIN orders o2

ON o1.rn = o2.rn+1 AND o1.cust_id = o2.cust_id

),

days_between AS -- calculate days between customers' orders

(

SELECT cust_id, Avg_days_between_orders = AVG(0.+DATEDIFF(DD,order_date_prev,order_date)) OVER (Partition by cust_id)

FROM prev_orders

)

-- The max technique should be better than doing a DISTINCT

SELECT cust_id, Avg_days_between_orders = MAX(Avg_days_between_orders)

FROM days_between

GROUP BY cust_id;

Edit: added 0.+ to datediff formula to deal with truncation.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Alan Burstein SSC Guru Points: 61141 More actions · Answer 8

pietlinden (4/7/2015)
Alan,
Yup, I noticed that... The link I posted is to an article by Pinal Dave simulating LAG/LEAD in 2008. Otherwise, I would have posted a query.
I guess he could also dig through Itzik's book on SQL 2008....

I just finished reading the link you posted and it's actually very good. I need to play with some of the techniques posted and comparing them to the self-join technique I just posted. The "No Joins and No Analytic Functions" solution by Geri Reshefis looks like a really excellent solution.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Eirikur Eiriksson SSC Guru Points: 182916 More actions · Answer 9

Alan.B (4/7/2015)
I have smacked myself in the head to save everyone else the trouble. (not my day)...

Happy to help, need a HAND!:-P

😎

Dwain Camps SSC Guru Points: 86978 More actions · Answer 10

Using Alan.B's table variable and sample data:

WITH orders AS -- add a rownumber to @orders

(

SELECT rn1=rn, rn2=rn+1, cust_id, order_date

FROM

(

SELECT rn = ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY order_date), *

FROM @orders

) a

)

SELECT cust_id, Avg_days_between_orders=AVG(0.+x)

FROM

(

SELECT cust_id, x=DATEDIFF(day, MIN(order_date), MAX(order_date))

FROM

(

SELECT cust_id, order_date, rn

FROM orders a

CROSS APPLY (VALUES(rn1),(rn2)) b(rn)

) a

GROUP BY cust_id, rn

HAVING COUNT(*) = 2

) a

GROUP BY cust_id;

On a heap, this is likely to perform a little better because it does a single table scan and one sort as opposed to 2 for Alan.B's solution. With Indexing however the self-join is likely to work better.

Alan.B - You did notice that your solution truncates the average number of days right?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Alan Burstein SSC Guru Points: 61141 More actions · Answer 11

Alan.B - You did notice that your solution truncates the average number of days right?

Yep. I'm out of gas for today. I'll post a correct solution in the morning.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Dwain Camps SSC Guru Points: 86978 More actions · Answer 12

Excuse my Homer Simpson moment there! Doh! :w00t:

SELECT cust_id

,Avg_days_between_orders=DATEDIFF(day, MIN(order_date), MAX(order_date))/(COUNT(*)-1.)

FROM @orders

GROUP BY cust_id

HAVING COUNT(*) > 1;

Edit: Added divide by zero protection (HAVING).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

kat35601 SSC Eights! Points: 825 More actions · Answer 13

One more question I need to add the last order date to this and then add the AVg_days_between_orders to it so that I get Last_order_Date + Avg_days_between_orders

WITH orders

AS -- add a rownumber to @orders

(

SELECT rn = (

ROW_NUMBER() OVER (

PARTITION BY ompCustomerOrganizationID ORDER BY ompcreateddate

)

,s.ompSalesOrderID

,s.ompCustomerOrganizationID

,ompcreateddate

,convert(VARCHAR(10), ompcreateddate, 101) AS order_dat

FROM SalesOrders s

WHERE ompCreatedDate >= '01-01-15'

)

,prev_orders AS -- get previous orders for that customer

(

SELECT o1.*

,order_date_prev = o2.ompcreateddate

FROM orders o1

JOIN orders o2 ON o1.rn = o2.rn + 1

AND o1.ompCustomerOrganizationID = o2.ompCustomerOrganizationID

)

,days_between

AS -- calculate days between customers' orders

(

SELECT ompCustomerOrganizationID

,Avg_days_between_orders = AVG(DATEDIFF(DD, order_date_prev, ompcreateddate)) OVER (PARTITION BY ompCustomerOrganizationID)

FROM prev_orders

)

-- The max technique should be better than doing a DISTINCT

SELECT ompCustomerOrganizationID

,Avg_days_between_orders = MAX(Avg_days_between_orders)

FROM days_between

GROUP BY ompCustomerOrganizationID;

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 14

Kat - you probably want to look at Dwain's answer just above. You really do NOT need to calculate the # of days between EACH order (as you are doing), you just need the average.

Average is easiest computed as:

( # of day between the FIRST order in your timeframe and the LAST order in your timeframe) / (total number of order within the timeframe-1)

for any customer with more than one order in the timeframe. You'd also want to clear up with the business what they intend to do when a custom has NO orders or just 1 one, since the calucation you requested could return something they're not expecting.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?