Replace cursor like querying please

  • Hi,

    I'm new to the group and so thank you for your patience as this is my first post.

    I've read the forum etiquette notice and tried to follow it but please do not hesitate to let me know where I am falling short and where I need to improve.

    My situation is that I am a new developer with C# & SQL server, converting a legacy application that used Btrieve (a cursor based record management system from the 1980's). The application needs to retrieve price series from the database. A price series is a set of prices, one per day, that can come from varying times. Prices are stored in a table with a date column and a time column, multiple prices are stored by day. A price series table contains Item, Date and Time columns whenever the Time a price is taken from changes.

    The script below creates and populates stripped down versions of these tables.

    --===== If the test tables already exists, drop em

    IF OBJECT_ID('TempDB..#myPrice','U') IS NOT NULL

    DROP TABLE #myPrice

    IF OBJECT_ID('TempDB..#myPriceTimes','U') IS NOT NULL

    DROP TABLE #myPriceTimes

    --===== Create the test tables with

    CREATE TABLE #myPrice

    (

    ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table

    DateValue DATE,

    TimeValue TIME,

    Price FLOAT

    )

    CREATE TABLE #myPriceTimes

    (

    ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table

    Item INT,

    DateValue DATE,

    TimeValue TIME

    )

    --===== Setup any special required conditions especially where dates are concerned

    SET DATEFORMAT DMY

    --===== All Inserts into the IDENTITY column

    SET IDENTITY_INSERT #myPrice ON

    --===== Insert the test data into the test tables

    INSERT INTO #myPrice

    (ID, DateValue, TimeValue, Price)

    SELECT '1','Jul 6 2009','14:00','1' UNION ALL

    SELECT '2','Jul 7 2009','14:00','3' UNION ALL

    SELECT '3','Jul 8 2009','14:00','5' UNION ALL

    SELECT '4','Jul 9 2009','14:00','7' UNION ALL

    SELECT '5','Jul 10 2009','14:00','9' UNION ALL

    SELECT '6','Jul 6 2009','15:00','2' UNION ALL

    SELECT '7','Jul 7 2009','15:00','4' UNION ALL

    SELECT '8','Jul 8 2009','15:00','6' UNION ALL

    SELECT '9','Jul 9 2009','15:00','8' UNION ALL

    SELECT '10','Jul 10 2009','15:00','10'

    SET IDENTITY_INSERT #myPrice OFF

    SET IDENTITY_INSERT #myPriceTimes ON

    INSERT INTO #myPriceTimes

    (ID, Item, DateValue, TimeValue)

    SELECT '1','1','Jul 6 2009','14:00' UNION ALL

    SELECT '2','1','Jul 8 2009','15:00' UNION ALL

    SELECT '3','1','Jul 10 2009','14:00' UNION ALL

    SELECT '4','2','Jul 6 2009','15:00' UNION ALL

    SELECT '5','3','Jul 6 2009','14:00' UNION ALL

    SELECT '6','3','Jul 10 2009','15:00'

    --===== Set the identity insert back to normal

    SET IDENTITY_INSERT #myPriceTimes OFF

    So:

    price series 1 is 1,3,6,8 & 9

    price series 2 is 2,4,6,8 & 10

    price series 3 is 1,3,5,7 & 10

    Currently in order to get a price series I call a StoredProcedure similar to the SQL below once per day passing in the series and date I require (ie @dateReq & @seriesReq are parameters):

    -- Not in my stored proc just here to setup & test (start)

    DECLARE @dateReq DATE

    DECLARE @seriesReq INT

    SET @dateReq = 'Jul 7 2009'

    SET @seriesReq = 1

    -- Not in my stored proc just here to setup & test (end)

    DECLARE @timeReq TIME

    SELECT TOP 1 @timeReq = TimeValue

    FROM #myPriceTimes

    WHERE Item = @seriesReq AND

    DateValue <= @dateReq
    ORDER BY DateValue DESC

    -- we now have the time we need a price for so get the price

    SELECT Price
    FROM #myPrice
    WHERE DateValue = @dateReq AND
    TimeValue = @timeReq

    This is a straight port of the original code but is "cursor" like, is definitely RBAR and as my price series can be 20 years long it is inefficient.

    I was wondering whether this could all be done by a single lump of SQL returning a results set and if so what that SQL was?

    Thanks in advance

    Jeremy

  • I think there is a problem with the query. You select the latest time for the latest date upto and including the parameter date but then use the parameter date to select the price, surely you should be using the date from the first select to access the price.

    I do not see any CURSOR/RBAR in the queries unless you have hidden loop that you have not posted.

    Joining the two queries into one will not necessarily give you a better query as it will still do the same as separate ones, only well chosen indexes will do that and these types of quries will suffer performance on large data sets.

    Just for interest this is my solution to joining both together

    SELECTp.Price

    FROM#myPrice p

    INNER JOIN (

    SELECT TOP 1 a.Item,a.DateValue,a.TimeValue

    FROM #myPriceTimes a

    WHERE a.Item = @seriesReq

    AND a.DateValue <= @dateReq

    ORDER BY a.DateValue DESC

    ) pt

    ON pt.Item = p.Item

    AND pt.DateValue = p.DateValue

    AND pt.TimeValue = p.TimeValue

    Far away is close at hand in the images of elsewhere.
    Anon.

  • David Burrows (7/9/2009)


    I think there is a problem with the query. You select the latest time for the latest date upto and including the parameter date but then use the parameter date to select the price, surely you should be using the date from the first select to access the price.

    I do not see any CURSOR/RBAR in the queries unless you have hidden loop that you have not posted.

    Sorry if I was not clear in my original post but the loop is *external* to the SQL as in the C# code I am repeatedly calling the Stored procedure once per day for every day for which I want a price, so the original query does do what I intended:-)

    I am currently experimenting with the following User defined function:

    IF OBJECT_ID (N'dbo.timeForPrice', N'FN') IS NOT NULL

    DROP FUNCTION dbo.timeForPrice;

    GO

    CREATE FUNCTION dbo.timeForPrice

    (@seriesReq int, @dateReq DATE)

    RETURNS TIME

    AS

    BEGIN

    DECLARE @timeReq TIME

    SELECT TOP 1 @timeReq = TimeValue

    FROM myPriceTimes

    WHERE Item = @seriesReq AND

    DateValue <= @dateReq

    ORDER BY DateValue DESC

    RETURN @timeReq

    END

    and calling it like this:

    DECLARE @seriesReq INT -- stored procedure parameter in my application

    SET @seriesReq = 1

    SELECT * from myPrice

    WHERE TimeValue = dbo.timeForPrice(@seriesReq, DateValue)

    ORDER BY DateValue

    This seems to do what I want from a single piece of SQL, but I believe it is still querying the priceTimes table for each day, it is this behaviour I would like to lose.

    Thanks,

    Jeremy

  • If you want to do several days then this will give you prices for July 2009. It will produce NULL for any date where no price is found.

    SELECTb.[Date],p.Price

    FROM(

    SELECTc.[Date],a.Item,MAX(a.DateValue) AS [DateValue]

    FROMdbo.Calendar c

    LEFT JOIN #myPriceTimes a

    ON a.Item = @seriesReq

    AND a.DateValue <= c.[Date]

    WHEREc.[Date] BETWEEEN '20090701' AND '20090731'

    ) b

    LEFT JOIN #myPriceTimes pt

    ON pt.Item = b.Item

    AND pt.DateValue = b.DateValue

    LEFT JOIN #myPrice p

    ON p.Item = pt.Item

    AND p.DateValue = pt.DateValue

    AND p.TimeValue = pt.TimeValue

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Thanks for this. I think I understand what you are trying to do: generate a Date sequence from the Calendar table, join it to the priceTimes to generate our time sequence and then join both to the price to get the price series.

    After creating the calendar as a table full of dates and trying your SQL I get a compilation error:

    Msg 8120, Level 16, State 1, Line 71

    Column 'myPriceTimes.DateValue' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    If I take out the Max function it compiles but I get multiple entries per day for each of the priceTimes that satisfy the <= date comparison. Presumably this is what you wanted the max to stop. If I leave the max in but put in a GROUP BY clause I get the same.

    Any thoughts?

    Thanks,

    Jeremy

  • I thought about this some more overnight and realised I had used the wrong GROUP BY Clause yesterday.

    Correcting this the following:

    SELECT b.[Date],p.Price

    FROM (

    SELECT c.[Date] AS [Date], a.Item AS [Item], Max(a.DateValue) AS [DateValue]

    FROM Calendar c

    LEFT JOIN #myPriceTimes AS a

    ON a.Item = @seriesReq AND

    a.DateValue <= c.[Date]

    WHERE c.[Date] BETWEEN '20090706' AND '20090710'

    GROUP BY c.Date, a.Item

    ) b

    LEFT JOIN #myPriceTimes AS pt

    ON pt.Item = b.ItemAND

    pt.DateValue = b.DateValue

    LEFT JOIN #myPrice AS p

    ONp.DateValue = b.Date AND

    p.TimeValue = pt.TimeValue

    Does exactly what I was after!

    Thank you for the Calendar suggestion, I certainly would never of thought to approach this problem that way.

    Jeremy

  • Sorry for the problems, my bad. I did not test my last query as I posted in a hurry, I do not have 2008 and was mucking about on 2000/2005.

    Your solution is what I should have posted :blush:

    As an alternative to creating a Calendar table and if your date range is small, eg a month then you could use Jeff Moden's Tally Table[/url] to generate the dates.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • No problem, It was the Calendar idea that was key. As ever I think you learn more by having to overcome issues yourself rather than having things put on a plate, this exercise got me to look at GROUP BY in detail and has improved my understanding no end.

    Thank you!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply