July 17, 2008 at 11:46 am
Hello everyone,
To make my story shorter I will spare you the 'why' or the background info on the topic...
I have several tables that contain the log of every price variation for different item types. For my example I will use 3 tables and limit the number of columns to what we need to worry about for this problem... In real life these table each have few million rows but we don't need that many right now... here is the script to create/populate the sample tables
USE tempdb
GO
CREATE TABLE dbo.ItemAPriceLog(
ItemAPriceID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, Entered DATETIME NOT NULL DEFAULT (GETDATE())
, Cost money NOT NULL
-- , other columns are present, but no need to elaborate here
)
GO
CREATE TABLE dbo.ItemBPriceLog(
ItemBPriceID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, Entered DATETIME NOT NULL DEFAULT (GETDATE())
, Cost money NOT NULL
-- , other columns are present, but no need to elaborate here
)
GO
CREATE TABLE dbo.ItemCPriceLog(
ItemCPriceID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, Entered DATETIME NOT NULL DEFAULT (GETDATE())
, Cost money NOT NULL
-- , other columns are present, but no need to elaborate here
)
GO
-- Insert dummy entries
INSERT INTO dbo.ItemAPriceLog (Entered, Cost)
SELECT DISTINCT DATEADD(mi, ROW_NUMBER() OVER(ORDER BY o.ID * c.Type DESC), o.crdate)
, CONVERT(money, o.ID) * CONVERT(money, c.Type)
FROM dbo.sysobjects o
cross join dbo.syscolumns c
WHERE o.id < 1000 -- Needed otherwise the script will return prices in the billions...
GO
-- Insert dummy entries
INSERT INTO dbo.ItemBPriceLog (Entered, Cost)
SELECT DISTINCT DATEADD(mi, ROW_NUMBER() OVER(ORDER BY o.ID * c.Type DESC) * 2, o.crdate)
, CONVERT(money, o.ID) * CONVERT(money, o.UID) * CONVERT(money, c.xprec)
FROM dbo.sysobjects o
cross join dbo.syscolumns c
WHERE o.id < 1000 -- Needed otherwise the script will return prices in the billions...
GO
-- Insert dummy entries
INSERT INTO dbo.ItemCPriceLog (Entered, Cost)
SELECT DISTINCT DATEADD(mi, ROW_NUMBER() OVER(ORDER BY o.ID * c.TYPE * o.UID DESC) + o.ID + o.UID, o.crdate)
, CONVERT(money, o.ID) * CONVERT(money, c.Type) + CONVERT(money, o.UID) * CONVERT(money, c.Type)
FROM dbo.sysobjects o
cross join dbo.syscolumns c
WHERE o.id < 1000 -- Needed otherwise the script will return prices in the billions...
GO
I am working on a script that will populate some sort of "price mapping" table.
Each row will have a foreign key to the PriceID of each one of the price log tables, as well as an IDENTITY column and a datetime column containing the date/time the row prices were effective as of.
The table should contain one row for each time the price of one of the item changed.
Once the table is fully populated someone can easily query it to find out what were the prices of each item for any given time. (the alternative being to query each one of the individual table with a WHERE clause based on the 'Entered' value...)
Many reports will depend on the IDENTITY column of that mapping table...
Long story short the mapping table would look like something like this:
CREATE TABLE dbo.PriceInstances(
InstanceID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, ItemAPriceID INT NOT NULL CONSTRAINT FK_PriceInstances_ItemAPriceLog FOREIGN KEY REFERENCES dbo.ItemAPriceLog(ItemAPriceID)
, ItemBPriceID INT NOT NULL CONSTRAINT FK_PriceInstances_ItemBPriceLog FOREIGN KEY REFERENCES dbo.ItemBPriceLog(ItemBPriceID)
, ItemCPriceID INT NOT NULL CONSTRAINT FK_PriceInstances_ItemCPriceLog FOREIGN KEY REFERENCES dbo.ItemCPriceLog(ItemCPriceID)
, EffectiveDateTime DATETIME NOT NULL
)
GO
The challenge is to figure out an algorithm that will populate that table without relying on slow cursors or loops... It will run nightly as a job on a reporting database server.
July 17, 2008 at 12:08 pm
Is the "EffectiveDate" in PriceInstances the same as the "DateEntered" in the other three tables?
If so, and you have a DateEntered of 17 July 08 for ItemA, and a DateEntered of 16 July for ItemB, and a DateEntered of 1 Jan 08 for ItemC, and various other dates between 1 Jan and 17 July for ItemA and ItemB, what does it need to do?
Does PriceInstance end up with an entry with an EffectiveDate of 17 July 08, with the ItemA price from 17 July, the ItemB price from 16 July, and the ItemC price from 1 Jan?
On the 16th, when ItemB changed prices, does it get an entry with 16 July EffectiveDate and the price for ItemB from 16 July, the price for ItemA from whatever is most recent before that, and the price for ItemC from 1 Jan?
Or, alternately, does it get an entry at the beginning of each day, with the then-current price for each item?
On another note, why separate price log tables for each item? Why not an "ItemID" column in a generic price log all in one table?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 17, 2008 at 12:47 pm
FYI I just changed the script that populates the 3 price table a little bit that way it generates entries that are a little bit more realistic...
Thanks for your reply,
The reason why the price log table are all different is a long story... and it's outside my control. I could import all the data in a single table for the purpose of my project but it wouldn't help me in finding what all the prices were for each item type at any given time.
The EffectiveDateTime in PriceInstances will end up being equal to the most recent 'Entered' value between the 3 PriceLog tables... I know what I am explaining sounds like crazy non sense... so here is a concrete example... It should all make more sense after you read through it...
-- Populate the tables
delete from dbo.ItemAPriceLog
delete from dbo.ItemBPriceLog
delete from dbo.ItemCPriceLog
delete from dbo.PriceInstances
SET IDENTITY_INSERT dbo.ItemAPriceLog ON
INSERT INTO dbo.ItemAPriceLog (ItemAPriceID, Entered, Cost)
SELECT 1, '6/6/2004 10:28:34', 11.22
UNION ALL SELECT 2, '6/6/2004 11:05:44', 11.11
SET IDENTITY_INSERT dbo.ItemAPriceLog OFF
SET IDENTITY_INSERT dbo.ItemBPriceLog ON
INSERT INTO dbo.ItemBPriceLog (ItemBPriceID, Entered, Cost)
SELECT 1, '6/6/2004 10:25:44', 7.70
UNION ALL SELECT 2, '6/6/2004 10:26:11', 7.60
UNION ALL SELECT 3, '6/6/2004 10:44:22', 7.88
UNION ALL SELECT 4, '6/6/2004 11:00:55', 7.77
UNION ALL SELECT 5, '6/6/2004 11:45:12', 7.10
SET IDENTITY_INSERT dbo.ItemBPriceLog OFF
SET IDENTITY_INSERT dbo.ItemCPriceLog ON
INSERT INTO dbo.ItemCPriceLog (ItemCPriceID, Entered, Cost)
SELECT 1, '6/6/2004 10:25:48', 10.25
UNION ALL SELECT 2, '6/6/2004 11:25:14', 10.33
UNION ALL SELECT 3, '6/6/2004 13:5:22', 10.10
SET IDENTITY_INSERT dbo.ItemCPriceLog OFF
-- Populate PriceInstances table based on what we've got in the individual price log tables
-- FYI: Modify the table structure of PriceInstances to allow for NULL values in the ItemPrice columns...
INSERT INTO dbo.PriceInstances (ItemAPriceID, ItemBPriceID, ItemCPriceID, EffectiveDateTime)
-- At first all we've got is a price for ItemB
SELECT NULL, 1, NULL, (SELECT Entered FROM dbo.ItemBPriceLog WHERE ItemBPriceID = 1)
-- A new price instance is inserted once the price of Item C changes
UNION ALL SELECT NULL, 1, 1, (SELECT Entered FROM dbo.ItemCPriceLog WHERE ItemCPriceID = 1)
-- Price of Item B changes... no change with Item C and we still don't have a price for ItemA
UNION ALL SELECT NULL, 2, 1, (SELECT Entered FROM dbo.ItemBPriceLog WHERE ItemBPriceID = 2)
-- Price for ItemA has changed...
UNION ALL SELECT 1, 2, 1, (SELECT Entered FROM dbo.ItemAPriceLog WHERE ItemAPriceID = 1)
-- Price for ItemB has changed
UNION ALL SELECT 1, 3, 1, (SELECT Entered FROM dbo.ItemBPriceLog WHERE ItemBPriceID = 3)
-- Price for ItemB has changed again
UNION ALL SELECT 1, 4, 1, (SELECT Entered FROM dbo.ItemBPriceLog WHERE ItemBPriceID = 4)
-- Price for ItemA has changed
UNION ALL SELECT 2, 4, 1, (SELECT Entered FROM dbo.ItemAPriceLog WHERE ItemAPriceID = 2)
-- Price for ItemC has changed
UNION ALL SELECT 2, 4, 2, (SELECT Entered FROM dbo.ItemCPriceLog WHERE ItemCPriceID = 2)
-- Price for ItemB has changed
UNION ALL SELECT 2, 5, 2, (SELECT Entered FROM dbo.ItemBPriceLog WHERE ItemBPriceID = 5)
-- Price for ItemC has changed
UNION ALL SELECT 2, 5, 3, (SELECT Entered FROM dbo.ItemCPriceLog WHERE ItemCPriceID = 3)
July 17, 2008 at 12:53 pm
Here's how I'd solve this:
Create a Numbers table.
Use that to create a Calendar table, with all dates in the range you're querying against.
Do a From-To self-join for each log table, by selecting the date and the next entry's date.
Do a join between the From-To record and the calendar table, so you have the entry effective for all dates for each item.
Join those together, and join that to the original data. That will give you a list of all items, what prices were effective, on the dates in each table.
Insert that into the final table.
Is that clear enough, or should I try to write up samples of what I mean?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 17, 2008 at 12:56 pm
Select Entered as EffectiveDateTime
, (select top (1) ItemCPriceID from ItemCPriceLog C where dte.Entered >= C.Entered order by c.Entered ) IDC
, (select top (1) ItemBPriceID from ItemBPriceLog B where dte.Entered >= B.Entered order by B.Entered ) IDB
, (select top (1) ItemAPriceID from ItemAPriceLog A where dte.Entered >= A.Entered order by A.Entered ) IDA
FROM(
select Entered from ItemCPriceLog union
select Entered from ItemBPriceLog union
select Entered from ItemAPriceLog
) dte
* Noel
July 17, 2008 at 12:57 pm
Note: To speed this up you should have indices on all Datetime Columns!
CREATE NONCLUSTERED INDEX IX_ItemCPriceLog ON ItemCPriceLog( Entered )
CREATE NONCLUSTERED INDEX IX_ItemBPriceLog ON ItemBPriceLog( Entered )
CREATE NONCLUSTERED INDEX IX_ItemAPriceLog ON ItemAPriceLog( Entered )
You might as well retrieve Cost insted of the ID and then no joins will be necessary.
* Noel
July 17, 2008 at 4:16 pm
Thanks to both of you, very clever solutions 🙂 I knew there would be an easier way then what I started doing on my side before posting this...
clustered indexes -> The tables have those already, both I haven't mentioned them up there
as for getting the 'costs' instead of the ID and then do the joins... this would be good but in fact I have few other columns in each one of these sub tables (i.e. the UserID of the person who entered the new price value, etc...) so I need to maintain the relationship
thanks again for the solution!
July 17, 2008 at 4:31 pm
small change to the script proposed,
to get what I need in the right order the code has to be...
SELECT Entered AS EffectiveDateTime
, (SELECT TOP (1) ItemAPriceID FROM ItemAPriceLog A WHERE A.Entered <= dte.Entered ORDER BY A.Entered DESC) IDA
, (SELECT TOP (1) ItemBPriceID FROM ItemBPriceLog B WHERE B.Entered <= dte.Entered ORDER BY B.Entered DESC) IDB
, (SELECT TOP (1) ItemCPriceID FROM ItemCPriceLog C WHERE C.Entered <= dte.Entered ORDER BY C.Entered DESC) IDC
FROM(
SELECT Entered FROM ItemCPriceLog UNION
SELECT Entered FROM ItemBPriceLog UNION
SELECT Entered FROM ItemAPriceLog
) dte
ORDER BY Entered
Works perfectly!
thanks again
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply