November 16, 2015 at 5:47 am
Hi all,
I have a scenario where I am dealing with assets. Every business day we get a new asset value. I need a dataset that shows the current asset value (relating to the date on the row) and the previous date's asset value (relating to the previous record of that asset by date).
This can easily be achieved with the LAG() function, but the issue arises when an asset is 'sold'. If, for example, an asset is sold on 16th November, there will not be a record for 16th November. However, I still need to see a row for that asset for 16th November, where the previous date's asset value would relate to the value on 15th November, and the current asset value would be zero.
I hope this makes sense! I somehow need to create a row for an asset on the day it was sold.
November 16, 2015 at 5:51 am
mm7861 (11/16/2015)
Hi all, I have a scenario where I am dealing with assets. Every business day we get a new asset value. I need a dataset that shows the current asset value (relating to the date on the row) and the previous date's asset value (relating to the previous record of that asset by date).
This can easily be achieved with the LAG() function, but the issue arises when an asset is 'sold'. If, for example, an asset is sold on 16th November, there will not be a record for 16th November. However, I still need to see a row for that asset for 16th November, where the previous date's asset value would relate to the value on 15th November, and the current asset value would be zero.
I hope this makes sense! I somehow need to create a row for an asset on the day it was sold.
How about some sample data with expected output?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
November 16, 2015 at 6:01 am
Sample data attached.
The first tab shows the raw data. Asset ID 1 was purchased on 13/11/2015 and sold on 16/11/2015. This is why the only rows that exist are between 13/11/2015 and 15/11/2015, and not 16/11/2015 because it was sold on that date.
The next tab shows what I want my results to look like. I need to see a row for the day it was sold because it still has a relevant 'Previous Value' from 15/11/2015.
Hope this helps.
November 16, 2015 at 6:07 am
mm7861 (11/16/2015)
Sample data attached. The first tab shows the raw data. Asset ID 1 was purchased on 13/11/2015 and sold on 16/11/2015. This is why the only rows that exist are between 13/11/2015 and 15/11/2015, and not 16/11/2015, because it was sold on that date.
The next tab shows what I want my results to look like. I need to see a row for the day it was sold because it still has a relevant 'Previous Value' from 15/11/2015.
Hope this helps.
Please read this article, which describes how to create and present data for easy consumption by folks wishing to help you with your issue.
November 16, 2015 at 6:37 am
Hopefully this is suitable!
Here is the table with my raw data
--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
--===== Create the test table
CREATE TABLE #mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
DateValue DATE,
AssetID INT,
Value INT
)
--===== Setup any special required conditions especially where dates are concerned
SET DATEFORMAT DMY
Here is some sample data that I'm working with
--===== Insert the test data into the test table
INSERT INTO #myTable
(DateValue,AssetId,Value)
VALUES ('13-Nov-2015',1,100)
,('14-Nov-2015',1,200)
,('15-Nov-2015',1,300)
--===== (IDENTITY_INSERT is not needed here; no explicit ID values are inserted)
And my desired result is in the attached spreadsheet in the tab "Query Results"
November 16, 2015 at 6:48 am
maybe a start
WITH CTE AS (
    SELECT DATEADD(day, 1, MAX(Date)) AS Date
         , Asset_ID
         , 0 AS value
    FROM yourtable
    GROUP BY Asset_ID
    UNION
    SELECT Date
         , Asset_ID
         , Value
    FROM yourtable
)
SELECT Date
     , Asset_ID
     , LAG(value, 1, 0) OVER (PARTITION BY Asset_ID ORDER BY Date) AS previousvalue
     , Value AS CurrentValue
FROM CTE
ORDER BY Asset_ID, Date;
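For what it's worth, running this pattern against the sample #mytable data posted earlier (mapping yourtable/Date/Asset_ID onto DateValue/AssetID), I'd expect the result for asset 1 to come out as:

```sql
-- Expected output for Asset_ID 1 (a sketch, not a tested run):
--
-- Date         Asset_ID  previousvalue  CurrentValue
-- 2015-11-13   1         0              100
-- 2015-11-14   1         100            200
-- 2015-11-15   1         200            300
-- 2015-11-16   1         300            0    -- synthetic "sold" row from the first CTE branch
```

Note that the third argument of LAG(value, 1, 0) supplies the 0 shown on the first day the asset is held.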
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
November 16, 2015 at 6:48 am
Sorry to say your script won't run, for the following reasons:
1) The INSERT INTO #mytable doesn't match the definition of the temp table.
2) The SELECT statement has 3 columns but the INSERT INTO lists 5.
3) You turn IDENTITY_INSERT ON but don't provide the identity column.
In short, the insert should read as follows, and the IDENTITY_INSERT ON/OFF can be commented out:
INSERT INTO #myTable
(DateValue,AssetId,Value)
VALUES ('13-Nov-2015',1,100)
,('14-Nov-2015',1,200)
,('15-Nov-2015',1,300)
_________________________________________________________________________
SSC Guide to Posting and Best Practices
November 16, 2015 at 6:57 am
Thanks Jason, copy and paste error! Fixed in original post.
J Livingston, I'll give that a go, thanks!
November 16, 2015 at 8:03 am
Just because LAG/LEAD is shiny and new doesn't mean that it's the best approach for the job. If you were trying to do this on SQL 2008, you would presumably know how to approach this, but you're ignoring the obvious approach because you want to try out your shiny new toys.
SELECT ISNULL(m1.AssetID, m2.AssetID) AS AssetID
     , ISNULL(m1.DateValue, DATEADD(DAY, 1, m2.DateValue)) AS DateValue
     , ISNULL(m2.Value, 0) AS PrevValue
     , ISNULL(m1.Value, 0) AS CurValue
FROM #mytable m1
FULL OUTER JOIN #mytable m2
  ON m1.AssetID = m2.AssetID
 AND m1.DateValue = DATEADD(DAY, 1, m2.DateValue)
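To see how the join produces the extra row: the matched pairs give consecutive-day rows, while the two unmatched sides supply the purchase-day and sold-day rows. Here is how the sample rows for asset 1 would line up (a sketch based on the sample data posted earlier):

```sql
-- m1 (current day)   m2 (previous day)   result (a sketch)
-- 2015-11-13, 100    no match            PrevValue = 0,   CurValue = 100
-- 2015-11-14, 200    2015-11-13, 100     PrevValue = 100, CurValue = 200
-- 2015-11-15, 300    2015-11-14, 200     PrevValue = 200, CurValue = 300
-- no match           2015-11-15, 300     PrevValue = 300, CurValue = 0
--                                        (DateValue = m2.DateValue + 1 day: the sold-day row)
```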
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
November 17, 2015 at 3:20 am
Thanks Drew, out of curiosity why do you think LAG is not best for the job? I tried the LAG version and your version, and the LAG version runs much quicker.
J Livingston, your version works very well for me, however I've just come across an issue that I just spotted.
With the part that adds a day to the MAX(Date), I need it changed so that it only reflects a date that actually exists in the data.
So for example, if we had sold the asset on 13/11/2015, we would want the next day to show up. However, the next day is 14/11/2015, which is a weekend with no data, and the next date in the data is 16/11/2015, which is what I'd need showing.
Any way this would be possible?
November 17, 2015 at 4:04 am
mm7861 (11/17/2015)
Thanks Drew, out of curiosity why do you think LAG is not best for the job? I tried the LAG version and your version, and the LAG version runs much quicker. J Livingston, your version works very well for me, however I've just come across an issue that I just spotted.
With the part that adds a day to the MAX(Date), I need it changed so that it only reflects a date that actually exists in the data.
So for example, if we had sold the asset on 13/11/2015, we would want the next day to show up. However, the next day is 14/11/2015, which is a weekend with no data, and the next date in the data is 16/11/2015, which is what I'd need showing.
Any way this would be possible?
This should give you an idea of how to achieve what you want:
DECLARE @datevalue AS datetime = '20151113';
SELECT CASE WHEN DATENAME(dw , @datevalue) = 'Friday' THEN 3 ELSE 1 END;
I am not sure I agree with you about LAG being quicker.....I don't think it will be.
If you give us an idea of the number of unique assets and the date range for your data, then if I have time I will mock up some test data so that we can test the two methods provided so far.
November 17, 2015 at 4:58 am
At the moment we are using data from 1st September 2015, and have about 180,000 rows that are made up of 10,000 unique assets.
I'm not sure how I am supposed to use that piece of code you just posted in my current CTE setup? It only looks at business days, not at dates that actually exist in the data (these are mostly business days, however there may be some business days that don't exist).
November 17, 2015 at 5:49 am
CASE WHEN DATENAME(dw , MAX(Date)) = 'Friday' THEN DATEADD(day , 3 , MAX(Date))
ELSE DATEADD(day , 1 , MAX(Date)) END as Date
edit...corrected typo
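Folded back into the first branch of the CTE, this might look like the following (a sketch, using the Date/Asset_ID/yourtable names from the earlier post; it still assumes Mon-Fri data with no missing weekdays or holidays):

```sql
SELECT CASE WHEN DATENAME(dw, MAX(Date)) = 'Friday'
            THEN DATEADD(day, 3, MAX(Date))  -- last row is a Friday: jump to Monday
            ELSE DATEADD(day, 1, MAX(Date))  -- otherwise: next calendar day
       END AS Date
     , Asset_ID
     , 0 AS value
FROM yourtable
GROUP BY Asset_ID
```

One caveat: DATENAME(dw, ...) returns language-dependent names, so this comparison assumes an English session language.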
November 17, 2015 at 6:56 am
mm7861 (11/17/2015)
At the moment we are using data from 1st September 2015, and have about 180,000 rows that are made up of 10,000 unique assets.
ok....so will each unique asset have contiguous dates (Mon-Fri) until being sold?
November 17, 2015 at 8:13 am
mm7861 (11/17/2015)
Thanks Drew, out of curiosity why do you think LAG is not best for the job? I tried the LAG version and your version, and the LAG version runs much quicker. J Livingston, your version works very well for me, however I've just come across an issue that I just spotted.
With the part that adds a day to the MAX(Date), I need it changed so that it only reflects a date that actually exists in the data.
So for example, if we had sold the asset on 13/11/2015, we would want the next day to show up. However, the next day is 14/11/2015, which is a weekend with no data, and the next date in the data is 16/11/2015, which is what I'd need showing.
Any way this would be possible?
LAG/LEAD cannot add rows, so you would need some other mechanism to generate them. Weekends do cause a problem for the FULL OUTER JOIN approach, but LAG/LEAD will run into gaps-and-islands problems, because it's harder to recognize the gaps.
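One way to sidestep the calendar arithmetic entirely, sketched below against the sample #mytable structure: take the distinct dates that actually appear in the data and use LEAD over them to find the first real trading date after each asset's last row.

```sql
-- Sketch: derive the sold-day date from dates actually present in the data,
-- so weekends and holidays with no rows are skipped automatically.
WITH TradingDates AS (
    SELECT DateValue
         , LEAD(DateValue) OVER (ORDER BY DateValue) AS NextDateValue
    FROM (SELECT DISTINCT DateValue FROM #mytable) d
),
LastHeld AS (
    SELECT AssetID, MAX(DateValue) AS LastDateValue
    FROM #mytable
    GROUP BY AssetID
)
SELECT td.NextDateValue AS DateValue  -- first trading date after the asset's last value
     , lh.AssetID
     , 0 AS Value                     -- sold, so the current value is zero
FROM LastHeld lh
JOIN TradingDates td ON td.DateValue = lh.LastDateValue
WHERE td.NextDateValue IS NOT NULL;   -- NULL means no later trading date exists yet
                                      -- (the asset may simply still be held)
```

UNION that onto the raw rows and the LAG query works unchanged; note it cannot distinguish an asset sold on the very last trading date from one still held.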
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA