July 9, 2009 at 5:46 am
Hi,
I'm new to the group and so thank you for your patience as this is my first post.
I've read the forum etiquette notice and tried to follow it but please do not hesitate to let me know where I am falling short and where I need to improve.
My situation is that I am a new developer with C# & SQL server, converting a legacy application that used Btrieve (a cursor based record management system from the 1980's). The application needs to retrieve price series from the database. A price series is a set of prices, one per day, that can come from varying times. Prices are stored in a table with a date column and a time column, multiple prices are stored by day. A price series table contains Item, Date and Time columns whenever the Time a price is taken from changes.
The script below creates and populates stripped down versions of these tables.
--===== If the test tables already exists, drop em
IF OBJECT_ID('TempDB..#myPrice','U') IS NOT NULL
DROP TABLE #myPrice
IF OBJECT_ID('TempDB..#myPriceTimes','U') IS NOT NULL
DROP TABLE #myPriceTimes
--===== Create the test tables with
CREATE TABLE #myPrice
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table
DateValue DATE,
TimeValue TIME,
Price FLOAT
)
CREATE TABLE #myPriceTimes
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table
Item INT,
DateValue DATE,
TimeValue TIME
)
--===== Setup any special required conditions especially where dates are concerned
SET DATEFORMAT DMY
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #myPrice ON
--===== Insert the test data into the test tables
INSERT INTO #myPrice
(ID, DateValue, TimeValue, Price)
SELECT '1','Jul 6 2009','14:00','1' UNION ALL
SELECT '2','Jul 7 2009','14:00','3' UNION ALL
SELECT '3','Jul 8 2009','14:00','5' UNION ALL
SELECT '4','Jul 9 2009','14:00','7' UNION ALL
SELECT '5','Jul 10 2009','14:00','9' UNION ALL
SELECT '6','Jul 6 2009','15:00','2' UNION ALL
SELECT '7','Jul 7 2009','15:00','4' UNION ALL
SELECT '8','Jul 8 2009','15:00','6' UNION ALL
SELECT '9','Jul 9 2009','15:00','8' UNION ALL
SELECT '10','Jul 10 2009','15:00','10'
SET IDENTITY_INSERT #myPrice OFF
SET IDENTITY_INSERT #myPriceTimes ON
INSERT INTO #myPriceTimes
(ID, Item, DateValue, TimeValue)
SELECT '1','1','Jul 6 2009','14:00' UNION ALL
SELECT '2','1','Jul 8 2009','15:00' UNION ALL
SELECT '3','1','Jul 10 2009','14:00' UNION ALL
SELECT '4','2','Jul 6 2009','15:00' UNION ALL
SELECT '5','3','Jul 6 2009','14:00' UNION ALL
SELECT '6','3','Jul 10 2009','15:00'
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #myPriceTimes OFF
So:
price series 1 is 1,3,6,8 & 9
price series 2 is 2,4,6,8 & 10
price series 3 is 1,3,5,7 & 10
Currently in order to get a price series I call a StoredProcedure similar to the SQL below once per day passing in the series and date I require (ie @dateReq & @seriesReq are parameters):
-- Not in my stored proc just here to setup & test (start)
DECLARE @dateReq DATE
DECLARE @seriesReq INT
SET @dateReq = 'Jul 7 2009'
SET @seriesReq = 1
-- Not in my stored proc just here to setup & test (end)
DECLARE @timeReq TIME
SELECT TOP 1 @timeReq = TimeValue
FROM #myPriceTimes
WHERE Item = @seriesReq AND
DateValue <= @dateReq
ORDER BY DateValue DESC
-- we now have the time we need a price for so get the price
SELECT Price
FROM #myPrice
WHERE DateValue = @dateReq AND
TimeValue = @timeReq
This is a straight port of the original code but is "cursor" like, is definitely RBAR and as my price series can be 20 years long it is inefficient.
I was wondering whether this could all be done by a single lump of SQL returning a results set and if so what that SQL was?
Thanks in advance
Jeremy
July 9, 2009 at 7:19 am
I think there is a problem with the query. You select the latest time for the latest date upto and including the parameter date but then use the parameter date to select the price, surely you should be using the date from the first select to access the price.
I do not see any CURSOR/RBAR in the queries unless you have hidden loop that you have not posted.
Joining the two queries into one will not necessarily give you a better query as it will still do the same as separate ones, only well chosen indexes will do that and these types of quries will suffer performance on large data sets.
Just for interest this is my solution to joining both together
SELECTp.Price
FROM#myPrice p
INNER JOIN (
SELECT TOP 1 a.Item,a.DateValue,a.TimeValue
FROM #myPriceTimes a
WHERE a.Item = @seriesReq
AND a.DateValue <= @dateReq
ORDER BY a.DateValue DESC
) pt
ON pt.Item = p.Item
AND pt.DateValue = p.DateValue
AND pt.TimeValue = p.TimeValue
Far away is close at hand in the images of elsewhere.
Anon.
July 9, 2009 at 8:01 am
David Burrows (7/9/2009)
I think there is a problem with the query. You select the latest time for the latest date upto and including the parameter date but then use the parameter date to select the price, surely you should be using the date from the first select to access the price.I do not see any CURSOR/RBAR in the queries unless you have hidden loop that you have not posted.
Sorry if I was not clear in my original post but the loop is *external* to the SQL as in the C# code I am repeatedly calling the Stored procedure once per day for every day for which I want a price, so the original query does do what I intended:-)
I am currently experimenting with the following User defined function:
IF OBJECT_ID (N'dbo.timeForPrice', N'FN') IS NOT NULL
DROP FUNCTION dbo.timeForPrice;
GO
CREATE FUNCTION dbo.timeForPrice
(@seriesReq int, @dateReq DATE)
RETURNS TIME
AS
BEGIN
DECLARE @timeReq TIME
SELECT TOP 1 @timeReq = TimeValue
FROM myPriceTimes
WHERE Item = @seriesReq AND
DateValue <= @dateReq
ORDER BY DateValue DESC
RETURN @timeReq
END
and calling it like this:
DECLARE @seriesReq INT -- stored procedure parameter in my application
SET @seriesReq = 1
SELECT * from myPrice
WHERE TimeValue = dbo.timeForPrice(@seriesReq, DateValue)
ORDER BY DateValue
This seems to do what I want from a single piece of SQL, but I believe it is still querying the priceTimes table for each day, it is this behaviour I would like to lose.
Thanks,
Jeremy
July 9, 2009 at 9:09 am
If you want to do several days then this will give you prices for July 2009. It will produce NULL for any date where no price is found.
SELECTb.[Date],p.Price
FROM(
SELECTc.[Date],a.Item,MAX(a.DateValue) AS [DateValue]
FROMdbo.Calendar c
LEFT JOIN #myPriceTimes a
ON a.Item = @seriesReq
AND a.DateValue <= c.[Date]
WHEREc.[Date] BETWEEEN '20090701' AND '20090731'
) b
LEFT JOIN #myPriceTimes pt
ON pt.Item = b.Item
AND pt.DateValue = b.DateValue
LEFT JOIN #myPrice p
ON p.Item = pt.Item
AND p.DateValue = pt.DateValue
AND p.TimeValue = pt.TimeValue
Far away is close at hand in the images of elsewhere.
Anon.
July 9, 2009 at 1:06 pm
Thanks for this. I think I understand what you are trying to do: generate a Date sequence from the Calendar table, join it to the priceTimes to generate our time sequence and then join both to the price to get the price series.
After creating the calendar as a table full of dates and trying your SQL I get a compilation error:
Msg 8120, Level 16, State 1, Line 71
Column 'myPriceTimes.DateValue' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
If I take out the Max function it compiles but I get multiple entries per day for each of the priceTimes that satisfy the <= date comparison. Presumably this is what you wanted the max to stop. If I leave the max in but put in a GROUP BY clause I get the same.
Any thoughts?
Thanks,
Jeremy
July 10, 2009 at 1:52 am
I thought about this some more overnight and realised I had used the wrong GROUP BY Clause yesterday.
Correcting this the following:
SELECT b.[Date],p.Price
FROM (
SELECT c.[Date] AS [Date], a.Item AS [Item], Max(a.DateValue) AS [DateValue]
FROM Calendar c
LEFT JOIN #myPriceTimes AS a
ON a.Item = @seriesReq AND
a.DateValue <= c.[Date]
WHERE c.[Date] BETWEEN '20090706' AND '20090710'
GROUP BY c.Date, a.Item
) b
LEFT JOIN #myPriceTimes AS pt
ON pt.Item = b.ItemAND
pt.DateValue = b.DateValue
LEFT JOIN #myPrice AS p
ONp.DateValue = b.Date AND
p.TimeValue = pt.TimeValue
Does exactly what I was after!
Thank you for the Calendar suggestion, I certainly would never of thought to approach this problem that way.
Jeremy
July 10, 2009 at 2:10 am
Sorry for the problems, my bad. I did not test my last query as I posted in a hurry, I do not have 2008 and was mucking about on 2000/2005.
Your solution is what I should have posted :blush:
As an alternative to creating a Calendar table and if your date range is small, eg a month then you could use Jeff Moden's Tally Table[/url] to generate the dates.
Far away is close at hand in the images of elsewhere.
Anon.
July 10, 2009 at 2:28 am
No problem, It was the Calendar idea that was key. As ever I think you learn more by having to overcome issues yourself rather than having things put on a plate, this exercise got me to look at GROUP BY in detail and has improved my understanding no end.
Thank you!
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply