August 20, 2010 at 8:06 am
Hi
while the forum does try to aid things, some of us (me) are quite new to this and would appreciate help. I have tried to tweak the code from the initial article but am having problems. I know the next step would be to get the structure of the query online, but that only works on tables already created.
I really would appreciate help.
August 20, 2010 at 8:14 am
niyinks (8/20/2010)
Hi, while the forum does try to aid things, some of us (me) are quite new to this and would appreciate help. I have tried to tweak the code from the initial article but am having problems. I know the next step would be to get the structure of the query online, but that only works on tables already created.
I really would appreciate help.
I am not sure if I follow you correctly, but are you saying that you got the original sample code to work, but not a variation of it, and cannot post the code here due to this issue? I kind of lost you where you said: but that only works on tables created
Why not post the code for the table creation here then, if that seems to fail for you? It might clarify what you are trying to do!
August 20, 2010 at 8:23 am
The code only takes into account one column for dates, while I have two columns for dates: the first being an entry date, the other an exit date. I am seeking a way to create a query to account for multiple re-entries, i.e. readmissions, so I could not go beyond the first part of the article. I also know the forum rules advise that you should post the structure of the query you are working on, but I am unable to do that as I only know how to copy the structure of created tables.
My query is rather flawed but it is the best I could do
with readmissions
as
(select a.PSEUDO_HESID,a.provspno,test,DISCH_DATE,ADMIN_DATE,a.PROCODE3,a.diag1,a.RowNumber,tally
from
(select PSEUDO_HESID,provspno,DISCH_DATE,PROCODE3,diag1,RowNumber =
Row_Number() OVER(PARTITION BY PSEUDO_HESID ORDER BY DISCH_DATE ASC)
from dbo.HES_APC_200809)a
left join
(select PSEUDO_HESID,provspno test,ADMIN_DATE,PROCODE3,diag1,tally =
Row_Number() OVER(PARTITION BY PSEUDO_HESID ORDER BY ADMIN_DATE ASC)
from dbo.HES_APC_200809)b
on a.procode3=b.procode3
and a.PSEUDO_HESID=b.PSEUDO_HESID
and a.diag1=b.diag1
and a.provspno < test
group by a.PSEUDO_HESID,a.provspno,test,DISCH_DATE,ADMIN_DATE,a.PROCODE3,a.RowNumber,a.diag1,tally)
select distinct*
from readmissions
where DISCH_DATE < ADMIN_DATE
August 20, 2010 at 8:36 am
Obviously, this is an old post and one I would have been interested in 2 years ago. So great article! One thing I'd like to mention is that with databases, naming is everything. When you talk about "some databases would store both the Start and End date" in the table, I think we need to understand the purpose of those columns. When they are named "Start" and "End", it doesn't make much sense in the table, and your example is great at reducing redundancy! However, rename those columns to "EffectiveDate" and "ExpirationDate" and add a "PurposeTypeID" (relating back to a lookup table that maybe has "PriceChange", "Promotion", etc.), and then we can see how Prices can be added for future Promotions and then automatically removed (using a query that takes into account the "ExpirationDate") on the fly.
So, let's say you have an eCommerce solution and the marketing/sales department wants to run a promotion next month for 15 days. You'd be able to enter the data today, and it would "take effect" (EffectiveDate = next month) later. Also, as you're looking for the Price of an Item, you would be able to show both the "base" price (where PurposeTypeID = {BasePrice?}) as well as the current "promotional" price to the user. Once the "ExpirationDate" has elapsed, then the UI no longer displays this promotional price to the user.
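To make the EffectiveDate/ExpirationDate idea concrete, here is a minimal sketch of such a lookup; the dbo.ItemPrice table and its column names are hypothetical, not from the article:

```sql
-- Hypothetical schema: ItemPrice(ItemID, PurposeTypeID, EffectiveDate, ExpirationDate, Price).
-- Returns every price row currently in effect for one item, so both the
-- base price and any active promotional price come back in one query.
DECLARE @ItemID int = 42;

SELECT  ItemID, PurposeTypeID, Price, EffectiveDate, ExpirationDate
FROM    dbo.ItemPrice
WHERE   ItemID = @ItemID
  AND   EffectiveDate <= GETDATE()
  AND   (ExpirationDate IS NULL OR ExpirationDate > GETDATE());
```

A promotion entered today with next month's EffectiveDate simply won't match the filter until then, and it drops out again once ExpirationDate passes, with no batch job needed.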
Hope this makes sense; it's Friday and my thoughts might not be flowing correctly yet this morning...
August 20, 2010 at 8:38 am
The article does present a good educational example of the use of CTEs and the RANK function. However, I don't think ranking is the most practical and efficient solution to the specific business problem of linking a history of price changes. The problem is that row numbers are logical, so joining on a ranked row number makes use of an index impossible. I doubt the solution would perform well at all on a moderately sized table, and price history tables tend to be quite large.
Ideally, related price history rows could be tied together based on a clustered primary key like I've done below. Combining ItemID with another column containing an incremented sequence would be ideal, because you know which ID precedes and follows without performing a GROUP BY or ranking operation, and the sequence combined with the ItemID makes a unique key. This makes joining the previous and next price change efficient and simple.
declare @PriceHistory table
(
ItemId int not null,
ItemSeqID smallint not null,
PriceStartDate smalldatetime not null,
Price smallmoney not null,
primary key clustered ( ItemId, ItemSeqID )
);
insert into @PriceHistory
select 1, 0, '2004-03-01', 250.00 union all
select 1, 1, '2005-06-15', 219.99 union all
select 1, 2, '2007-01-03', 189.99 union all
select 1, 3, '2007-02-03', 200.00 union all
select 2, 0, '2006-07-12', 650.00 union all
select 2, 1, '2007-01-03', 550.00 union all
select 3, 0, '2005-01-01', 1.99 union all
select 3, 1, '2006-01-01', 1.79 union all
select 3, 2, '2007-01-01', 1.59 union all
select 3, 3, '2008-01-01', 1.49;
select
PH.ItemID, PHP.Price OldPrice, PH.Price RangePrice, PH.PriceStartDate StartDate, PHN.PriceStartDate EndDate
from @PriceHistory PH
left join @PriceHistory PHP on PHP.ItemID = PH.ItemID and PHP.ItemSeqID = (PH.ItemSeqID - 1)
left join @PriceHistory PHN on PHN.ItemID = PH.ItemID and PHN.ItemSeqID = (PH.ItemSeqID + 1);
ItemID OldPrice RangePrice StartDate EndDate
----------- --------------------- --------------------- ----------------------- -----------------------
1 NULL 250.00 2004-03-01 00:00:00 2005-06-15 00:00:00
1 250.00 219.99 2005-06-15 00:00:00 2007-01-03 00:00:00
1 219.99 189.99 2007-01-03 00:00:00 2007-02-03 00:00:00
1 189.99 200.00 2007-02-03 00:00:00 NULL
2 NULL 650.00 2006-07-12 00:00:00 2007-01-03 00:00:00
2 650.00 550.00 2007-01-03 00:00:00 NULL
3 NULL 1.99 2005-01-01 00:00:00 2006-01-01 00:00:00
3 1.99 1.79 2006-01-01 00:00:00 2007-01-01 00:00:00
3 1.79 1.59 2007-01-01 00:00:00 2008-01-01 00:00:00
3 1.59 1.49 2008-01-01 00:00:00 NULL
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 20, 2010 at 8:56 am
niyinks (8/20/2010)
The code only takes into account one column for dates, while I have two columns for dates: the first being an entry date, the other an exit date. I am seeking a way to create a query to account for multiple re-entries, i.e. readmissions, so I could not go beyond the first part of the article. I also know the forum rules advise that you should post the structure of the query you are working on, but I am unable to do that as I only know how to copy the structure of created tables. My query is rather flawed but it is the best I could do
with readmissions
as
(select a.PSEUDO_HESID,a.provspno,test,DISCH_DATE,ADMIN_DATE,a.PROCODE3,a.diag1,a.RowNumber,tally
from
(select PSEUDO_HESID,provspno,DISCH_DATE,PROCODE3,diag1,RowNumber =
Row_Number() OVER(PARTITION BY PSEUDO_HESID ORDER BY DISCH_DATE ASC)
from dbo.HES_APC_200809)a
left join
(select PSEUDO_HESID,provspno test,ADMIN_DATE,PROCODE3,diag1,tally =
Row_Number() OVER(PARTITION BY PSEUDO_HESID ORDER BY ADMIN_DATE ASC)
from dbo.HES_APC_200809)b
on a.procode3=b.procode3
and a.PSEUDO_HESID=b.PSEUDO_HESID
and a.diag1=b.diag1
and a.provspno < test
group by a.PSEUDO_HESID,a.provspno,test,DISCH_DATE,ADMIN_DATE,a.PROCODE3,a.RowNumber,a.diag1,tally)
select distinct*
from readmissions
where DISCH_DATE < ADMIN_DATE
I am still not 100% sure what you are trying to do. I get that there are two dates:
* admin_date is likely a start date
* disch_date is likely the end date
Would it be reasonable to assume the ranges these dates form never overlap per partition (pseudo_hesid)?
If so (likely), you can ignore the end date when determining previous and next, simply by ordering on admin_date. You do not even have to find the next record to determine an implicit end date!
I restructured your original code (so I and others can more easily make sense of it):
;
with
readMissionsQ as
(
select
a.pseudo_hesid
, a.procode3
, a.diag1
, a.provspno
, b.provspno as test
, a.disch_date
, b.admin_date
, a.RowNumber
, b.RowNumber as tally
from
(
select
pseudo_hesid
, procode3
, diag1
, provspno
, disch_date
, RowNumber = Row_Number() over( partition by pseudo_hesid order by disch_date asc )
from
dbo.HES_APC_200809
) as a
left join
(
select
procode3
, pseudo_hesid
, diag1
, provspno
, admin_date
, RowNumber = Row_Number() over( partition by pseudo_hesid order by admin_date asc )
from
dbo.HES_APC_200809
) as b
on
a.pseudo_hesid = b.pseudo_hesid and
a.procode3 = b.procode3 and
a.diag1 = b.diag1 and
a.provspno < b.provspno
group by
a.pseudo_hesid
, a.procode3
, a.diag1
, a.provspno
, b.provspno
, a.disch_date
, b.admin_date
, a.RowNumber
, b.RowNumber
)
select distinct
*
from
readMissionsQ
where
disch_date < admin_date
;
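To spell out the "order on admin_date only" idea, here is a minimal sketch that pairs each spell with the patient's next admission; the table and column names are copied from the post, so treat this as an untested starting point rather than a working report:

```sql
-- Number each patient's spells by admission date, then join row n to
-- row n + 1 to pair every discharge with the next admission.
WITH ordered AS
(
    select
        pseudo_hesid
        , provspno
        , admin_date
        , disch_date
        , rn = Row_Number() over( partition by pseudo_hesid order by admin_date asc )
    from
        dbo.HES_APC_200809
)
select
    cur.pseudo_hesid
    , cur.provspno
    , cur.disch_date
    , nxt.admin_date as readmit_date
from
    ordered as cur
inner join
    ordered as nxt
on
    nxt.pseudo_hesid = cur.pseudo_hesid and
    nxt.rn = cur.rn + 1
where
    nxt.admin_date >= cur.disch_date  -- readmitted on or after discharge
;
```

Rows where the next admission falls within a chosen window (say, 28 days of discharge) could then be flagged as readmissions by adding a date-difference condition to the WHERE clause.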
August 20, 2010 at 9:24 am
A precomputed column for the sequence can lead to problems when someone modifies the data. Say a date changes; then you'd need to resequence all the rows at the time of the update. This also makes bulk inserts impossible without triggers or some other sequencing mechanism.
I posted an OUTER APPLY method earlier that should give good index use without requiring you to use an intermediate temp table or having a precomputed column.
http://www.sqlservercentral.com/Forums/FindPost972419.aspx
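For readers who don't follow the link, the general shape of an OUTER APPLY approach looks something like the sketch below (this is my reconstruction, not the exact code behind the link; it assumes the price history lives in a permanent dbo.PriceHistory table with the same columns as the earlier example):

```sql
-- For each price row, fetch the next price change for the same item
-- directly. With an index on (ItemID, PriceStartDate) the TOP (1) becomes
-- a cheap seek, and no precomputed sequence column is required.
SELECT  PH.ItemID,
        PH.Price            AS RangePrice,
        PH.PriceStartDate   AS StartDate,
        PHN.PriceStartDate  AS EndDate
FROM    dbo.PriceHistory AS PH
OUTER APPLY
(
    SELECT TOP (1) N.PriceStartDate
    FROM   dbo.PriceHistory AS N
    WHERE  N.ItemID = PH.ItemID
      AND  N.PriceStartDate > PH.PriceStartDate
    ORDER BY N.PriceStartDate
) AS PHN;
```

The OUTER (rather than CROSS) APPLY keeps the latest price row for each item, returning NULL as its open-ended EndDate.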
Charlie.
August 20, 2010 at 9:24 am
Cade Roux (8/20/2010)
I scrolled through the discussion and I didn't see any mention that CTEs can be stacked. This makes some operations a lot easier to read and maintain (format appropriately into a stack): WITH A AS (), B AS (SELECT * FROM A), C AS (SELECT * FROM B INNER JOIN A) SELECT * FROM C
Yes, it may be slower than intermediate tables, but of course if you tweak something in an intermediate result, it's a lot less work since you don't have to go change the table structure.
If you use SELECT INTO to build your Temp tables, you don't need to change the table structure... it just happens.
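Both ideas can be seen in one small sketch: stacked CTEs where each CTE references the ones before it, feeding a SELECT INTO that builds the temp table from the query itself. The dbo.PriceHistory table is borrowed from the earlier example, with an assumed (ItemID, PriceStartDate, Price) shape:

```sql
-- Stacked CTEs: B references A; the final SELECT references both.
-- SELECT INTO then materializes the result without any CREATE TABLE;
-- #PriceSummary's columns and types are derived from the select list.
WITH A AS
(
    SELECT ItemID, PriceStartDate, Price
    FROM   dbo.PriceHistory
),
B AS
(
    SELECT ItemID, MIN(PriceStartDate) AS FirstChange
    FROM   A
    GROUP BY ItemID
)
SELECT  A.ItemID, A.Price, B.FirstChange
INTO    #PriceSummary
FROM    A
JOIN    B ON B.ItemID = A.ItemID;

SELECT * FROM #PriceSummary;
DROP TABLE #PriceSummary;
```

Add a column to the first CTE and it flows through to the temp table with no structural change anywhere else.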
--Jeff Moden
Change is inevitable... Change for the better is not.
August 20, 2010 at 9:53 am
charles.gildawie (8/20/2010)
A precomputed column for the sequence can lead to problems when someone modifies the data. Say a date changes then you'd need to resequence all the rows at the time of the update. This makes bulk inserts impossible without triggers or some other sequencing mechanism.
An audit table of price history changes doesn't sound like something that needs to get updated or bulk loaded, only inserted into when a price actually changes. If a data entry mistake has been made and the user needs to go back and change the effective StartDate of an item price, then you just need a procedure that accepts @ItemID as a parameter and updates the ItemSeqID for all rows keyed on that item, which can leverage the ROW_NUMBER ranking function. I don't see why such a table would need to be bulk loaded, but whenever it is, have the procedure accept @ItemID as NULL and then re-update the ItemSeqID for all items.
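A sketch of such a resequencing procedure, assuming the @PriceHistory shape from the earlier post exists as a permanent dbo.PriceHistory table (the procedure name is made up for illustration):

```sql
-- Recompute ItemSeqID from date order for one item, or for all items
-- when @ItemID is NULL (e.g. after a bulk load). The "- 1" keeps the
-- sequence zero-based, matching the earlier sample data.
CREATE PROCEDURE dbo.ResequencePriceHistory
    @ItemID int = NULL
AS
BEGIN
    WITH ranked AS
    (
        SELECT  ItemSeqID,
                NewSeqID = ROW_NUMBER() OVER
                           (PARTITION BY ItemID ORDER BY PriceStartDate) - 1
        FROM    dbo.PriceHistory
        WHERE   @ItemID IS NULL OR ItemID = @ItemID
    )
    UPDATE  ranked
    SET     ItemSeqID = NewSeqID
    WHERE   ItemSeqID <> NewSeqID;  -- touch only rows that actually moved
END;
```

Updating through the CTE keeps the statement set-based, and the final WHERE avoids rewriting rows whose sequence is already correct.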
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 20, 2010 at 10:28 am
Jeff Moden (8/20/2010)
If you use SELECT INTO to build your Temp tables, you don't need to change the table structure... it just happens.
Are there limitations surrounding that? (besides just not being able to be used in views or inline table-valued functions)
I know I've had problems in the past - obviously it doesn't work with EXEC, so I know I built tables manually for that.
August 20, 2010 at 10:54 am
Jeff Moden (8/20/2010)
If you use SELECT INTO to build your Temp tables, you don't need to change the table structure... it just happens.
When choosing between ranking the item changes on the fly, leveraging a temporary table, or implementing a precalculated sequence column, the best solution would depend on how this resultset would be filtered and how often it's consumed. For example, would the end user use the query frequently in a dataview but only for a specific item, or would this be a year end report that dumps out the price history for every item in stock? If so, then doing this on the fly might make sense.
However, if the user needs to routinely view the price history for multiple items, then I think it makes sense to have a precalculated sequence column that combines with ItemID to form a primary key. Without a sequence column, then the primary key would be ItemID and StartDate.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 20, 2010 at 10:56 am
Cade Roux (8/20/2010)
Jeff Moden (8/20/2010)
If you use SELECT INTO to build your Temp tables, you don't need to change the table structure... it just happens.
Are there limitations surrounding that? (besides just not being able to be used in views or inline table-valued functions)
I know I've had problems in the past - obviously it doesn't work with EXEC, so I know I built tables manually for that.
Heh... of course there are limitations surrounding that. Just like everything else in T-SQL. You say you've had problems in the past with it... Then you cite the problems with EXEC. You could use OPENROWSET but then you have the limitation of allowing "ad hoc" queries to be executed.
Everything has a limitation. What type of limitation are you looking for with SELECT INTO?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 20, 2010 at 11:02 am
Eric Russell 13013 (8/20/2010)
Jeff Moden (8/20/2010)
If you use SELECT INTO to build your Temp tables, you don't need to change the table structure... it just happens.
When choosing between ranking the item changes on the fly, leveraging a temporary table, or implementing a precalculated sequence column, the best solution would depend on how this resultset would be filtered and how often it's consumed. For example, would the end user use the query frequently in a dataview but only for a specific item, or would this be a year end report that dumps out the price history for every item in stock? If so, then doing this on the fly might make sense.
However, if the user needs to routinely view the price history for multiple items, then I think it makes sense to have a precalculated sequence column that combines with ItemID to form a primary key. Without a sequence column, then the primary key would be ItemID and StartDate.
If you take my comment out of context like you have, then I agree. BUT, I was explaining how it IS possible to NOT have to make changes to Temp table code by using SELECT INTO.
Shifting gears, if, as you say, the user needs to routinely view the price history for multiple items, it makes even better sense to have a fully separate table. That way the historical lookup wouldn't have to read past the current data and current lookups wouldn't have to read past historical data. Of course, it depends. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
August 20, 2010 at 11:18 am
As a side bar, doing a presentation of the daily price changes should probably be left to the GUI. Doing them in SQL Server is going to be a waste of one resource or the other... the "pipe" will suffer a bit because of the extra data that needs to be transmitted to the user's desktop and either some extra batch time will be needed to keep the price changes up to date or extra resources will be used for on-the-fly calculations.
If the GUI does it, the client resource is being used and the "pipe" is being freed from having to transmit the extra bit of data. Considering that you only really need Date and Price instead of Date/Price/Change, you'd be saving the "pipe" and the server IO system a full 33% on the work it had to do and the load it had to carry. Multiply that by a couple thousand users and you'll see the real savings.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 20, 2010 at 1:22 pm
Jeff Moden (8/20/2010)
Cade Roux (8/20/2010)
Jeff Moden (8/20/2010)
If you use SELECT INTO to build your Temp tables, you don't need to change the table structure... it just happens.
Are there limitations surrounding that? (besides just not being able to be used in views or inline table-valued functions)
I know I've had problems in the past - obviously it doesn't work with EXEC, so I know I built tables manually for that.
Heh... of course there are limitations surrounding that. Just like everything else in T-SQL. You say you've had problems in the past with it... Then you cite the problems with EXEC. You could use OPENROWSET but then you have the limitation of allowing "ad hoc" queries to be executed.
Everything has a limitation. What type of limitation are you looking for with SELECT INTO?
I don't recall - it was probably an issue of creation-timing and visibility of the temp table. Enough so that it's not my first choice of design technique.
I think I typically design with CTEs because I like to use views and inline table-valued functions to be able to re-use things easily. And it's easy to convert an ad hoc query using CTEs into either of those, whereas using a temp table does imply a side effect that could stick around.