October 22, 2003 at 9:27 am
I am in need of some Tsql advice. I have a table with about 200,000 records. The Table is for means of tracking transactions daily. Each record has A Unique Identifier(SSN)but this Unique Indentifier is not distinct in the table there are duplicates. The problem I am encountering is that I need to distinguish the type of differences between like records with the same Unique Identifier. Another field in each record named HBMID_CODE defines the type of transaction that Occurred. I want to loop through the entire table grouping on all duplicate unique identifiers and then update a blank field with a code based on the change (if any) of the HBMID_CODE FIELD.
I hope this question is understandable and Not to confusuing.
Thanks
Anthony
Anthony Malone
Anthony Malone
October 22, 2003 at 11:31 am
Can you post small sample of initial data and a list of results you expect after the update.
Far away is close at hand in the images of elsewhere.
Anon.
October 23, 2003 at 8:22 am
A sample of the data would be like this:
Create Table Tracking
SSN Varchar(9),
HBMID_CODE Char(1),
Transaction_Change Char(1),
ENTRY_DATE Datetime,
TERM_DATE Datetime
TIMESTAMP DATETIME
first Transaction
Insert into Tracking
Values('123-34-5678','P','',# 10/01/2003 #,# 10/10/2003 #,TIMESTAMP)
Second Transaction
Insert into Tracking
Values('123-34-5678','R','',# 10/25/2003 #,# 10/29/2003 #,TIMESTAMP)
Third Transaction
Insert into Tracking
Values('123-34-5678','R','',# 10/30/2003 #,# 11/15/03 #,TIMESTAMP)
Fourth Transaction
Insert into Tracking
Values('123-34-5678','S','',# 11/21/2003 #,# 11/30/03 #,TIMESTAMP)
I Need to update the field Transaction_Change
to a 'Y' When The HBMID_CODE Changes From one code to another, the entries are sequenced by the Timestamp. e.g. (transaction 1 HBMID_CODE = P, transaction 2 HBMID_CODE = R),(transaction 3 HBMID_CODE = R, transaction 4 HBMID_CODE = S,) or when no change has occurred update the field Transaction_Change
to a 'N' e.g. (transaction 2 HBMID_CODE = R, transaction 3 HBMID_CODE = R,). Again there are duplicate Unique Identifiers (SSN) in the Database And then there are ones with single entries. I need to loop through The entire Database Grouping on The duplicates and Updating that Transaction_Change Field
Whenever The HBMID_CODE Changes.
Anthony Malone
Anthony Malone
October 23, 2003 at 2:41 pm
This should be faster than a cursor:
UPDATE Tracking
SET Transaction_Change =
CASE WHEN HBMID_CODE =
(SELECT TOP 1 HBMID_CODE
FROM Tracking t
WHERE t.TimeStamp < Tracking.TimeStamp AND t.SSN = Tracking.SSN
ORDER BY t.TimeStamp DESC)
THEN 'N' ELSE 'Y' END
I'm assuming you want the change code to be "Y" for the first (or only) occurence of a SSN. If not, just use something like this:
UPDATE Tracking
SET Transaction_Change =
CASE WHEN HBMID_CODE <>
(SELECT TOP 1 HBMID_CODE
FROM Tracking t
WHERE t.TimeStamp < Tracking.TimeStamp AND t.SSN = Tracking.SSN
ORDER BY t.TimeStamp DESC)
THEN 'Y' ELSE 'N' END
--Jonathan
--Jonathan
October 24, 2003 at 2:01 am
My solution was similar to Jonathan's but I had reservations about performance using it against 200,000 records. If it is a one off then... as long as it works. I'd be interested in how long it takes?
Far away is close at hand in the images of elsewhere.
Anon.
October 24, 2003 at 4:57 am
You can also use trigger incombination with the deleted and inserted table
ROnald
October 24, 2003 at 5:58 am
quote:
My solution was similar to Jonathan's but I had reservations about performance using it against 200,000 records. If it is a one off then... as long as it works. I'd be interested in how long it takes?
I had the same worry. I wonder if this might be more realistic:
CREATE FUNCTION dbo.Changed(@SSN char(11), @HBMID_CODE char, @ts binary(8)) RETURNS char AS
BEGIN
DECLARE @Changed char
SELECT @Changed = CASE WHEN @HBMID_CODE =
(SELECT TOP 1 HBMID_CODE
FROM Tracking t
WHERE t.TimeStamp < @ts AND t.SSN = @SSN
ORDER BY t.TimeStamp DESC) THEN 'N' ELSE 'Y' END
RETURN @Changed
END
CREATE TABLE Tracking(
SSN char(11),
HBMID_CODE char,
Transaction_Change AS dbo.Changed(SSN, HBMID_CODE, TIMESTAMP),
ENTRY_DATE datetime,
TERM_DATE datetime,
TIMESTAMP)
CREATE CLUSTERED INDEX ix_Tracking_SSN ON Tracking(SSN)
--Jonathan
Edited by - Jonathan on 10/24/2003 07:29:02 AM
--Jonathan
October 28, 2003 at 9:20 am
Thanks For Your Help Guys. But I have a problem, the solution of using the update statement which included the case statement has been running for almost 20 hrs with no end sight. 200,000 records are a ton load of records to update with an update statement of that magnitude. Also I am taking a risk running this lengthy update statement because i have put that table in single user mode And this is suppose to be a production table. Secondly I did not understand the second solution to the problem. I created the UDF, the table has an index for the SSN already. The only way to update the informaion is going foward where new records are inserted. What about the records that are currently in the table how does the UDF help to update those records? Explain the second option once more maybe i am missing something.
Thanks,
Anthony
Anthony Malone
Anthony Malone
October 28, 2003 at 9:35 am
quote:
Thanks For Your Help Guys. But I have a problem, the solution of using the update statement which included the case statement has been running for almost 20 hrs with no end sight. 200,000 records are a ton load of records to update with an update statement of that magnitude. Also I am taking a risk running this lengthy update statement because i have put that table in single user mode And this is suppose to be a production table. Secondly I did not understand the second solution to the problem. I created the UDF, the table has an index for the SSN already. The only way to update the informaion is going foward where new records are inserted. What about the records that are currently in the table how does the UDF help to update those records? Explain the second option once more maybe i am missing something.
That's what David and I were worried about. You really need a clustered index on SSN to get good performance; that is the last line in the DDL I posted.
If this is not a one-time update, then the idea behind using a UDF (with or without its use in a computed column) is to calculate the Transaction_Change values "on the fly" only when needed. Otherwise you'll have to run the update again whenever the table is changed. By using the UDF, you can limit the calculations involved to just those rows being queried and only when the Transaction_Change column is needed in the result set, with no logging.
If it's any consolation, the time required almost certainly would have been longer if you had used a cursor to update each row.
Watch your tran log size...
--Jonathan
--Jonathan
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply