October 23, 2008 at 8:58 pm
Guys,
I am looking for suggestions on how I can avoid using cursors and still update the values in the table. As of now I am not sure whether this approach will give any performance benefit, but I just want some suggestions to try out.
I have table that looks something like this
ColA  ColB  ColC  ColD (Identity Column)
001   D1    V1    1
NULL  D2    V2    2
NULL  D3    V3    3
002   D1    V1    4
NULL  D2    V2    5
NULL  D3    V3    6
NULL  D4    V4    7
NULL  D5    V5    8
003   D1    V1    9
NULL  D2    V2    10
NULL  D3    V3    11
NULL  D4    V4    12
Now I need to update these NULL values. The logic is to populate each NULL with the previous non-NULL value in ColA, carrying it forward until the next non-NULL value is found, so the data will look like this:
ColA  ColB  ColC  ColD (Identity Column)
001   D1    V1    1
001   D2    V2    2
001   D3    V3    3
002   D1    V1    4
002   D2    V2    5
002   D3    V3    6
002   D4    V4    7
002   D5    V5    8
003   D1    V1    9
003   D2    V2    10
003   D3    V3    11
003   D4    V4    12
So far I have been able to work out the range of ColD values for each value in ColA that I have to update, i.e. I can get this range:
ColA  MinValue  MaxValue
001   1         3
002   4         8
003   9         12
My UPDATE statement will look like this:
UPDATE Table
SET ColA = '001'
WHERE ColD BETWEEN MinRange (1 here) AND MaxRange (3 here)
Is there any way I can achieve this for all the values in ColA without using a cursor?
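For illustration, what I am imagining is a single set-based statement that applies that range logic for every value at once, something like this sketch, assuming the ranges above have been loaded into a helper table #Ranges (the #Ranges name and its columns are just illustrative, not my real schema):
-- Sketch only: assumes the ranges above are in a helper table
-- #Ranges(ColA, MinValue, MaxValue); names are illustrative
UPDATE t
SET    t.ColA = r.ColA
FROM   MyTable AS t
INNER JOIN #Ranges AS r
        ON t.ColD BETWEEN r.MinValue AND r.MaxValue
WHERE  t.ColA IS NULL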
Thanks in advance
PS
October 23, 2008 at 10:10 pm
This can provide the solution you are looking for:
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
If you need me to help you get it working, I'd be happy to do so if you'd provide some sample data / table structure according to the link in my signature.
October 24, 2008 at 1:22 am
Hi Seth
Thanks for sharing the link to the article. I tried to understand it, but not to the extent of being able to implement it the way I want. :(
I have created some dummy data here. Please see if you can help me implement the UPDATE without using a cursor.
--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
--===== Create the test table
CREATE TABLE #mytable
(
    CoID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    ColA VARCHAR(5),
    ColB CHAR(2),
    ColC CHAR(2)
)
INSERT INTO #mytable (ColA, ColB, ColC)
SELECT '001','D1','V1' UNION ALL
SELECT NULL ,'D2','V2' UNION ALL
SELECT NULL ,'D3','V3' UNION ALL
SELECT '002','D1','V1' UNION ALL
SELECT NULL ,'D2','V2' UNION ALL
SELECT NULL ,'D3','V3' UNION ALL
SELECT NULL ,'D4','V4' UNION ALL
SELECT NULL ,'D5','V5' UNION ALL
SELECT '003','D1','V1' UNION ALL
SELECT NULL ,'D2','V2' UNION ALL
SELECT NULL ,'D3','V3'

SELECT * FROM #mytable
PS
October 24, 2008 at 1:51 am
Either of the following does what you are asking; the second query is marginally faster.
UPDATE tgt
SET    ColA = Src.ColA
FROM   #mytable AS tgt
LEFT JOIN (
        SELECT  TgtID  = t.CoID,
                Src.*,
                RowNum = ROW_NUMBER() OVER (PARTITION BY t.CoID ORDER BY Src.CoID DESC)
        FROM    #mytable AS t
        INNER JOIN #mytable AS Src
                ON Src.CoID < t.CoID
               AND Src.ColA IS NOT NULL
        WHERE   t.ColA IS NULL
       ) AS Src
       ON  Src.TgtID = tgt.CoID
       AND RowNum = 1

UPDATE tgt
SET    ColA = Src.ColA
FROM   #mytable AS tgt
LEFT JOIN (
        SELECT  TgtID = t.CoID,
                SrcID = MAX(Src.CoID)
        FROM    #mytable AS t
        INNER JOIN #mytable AS Src
                ON Src.CoID < t.CoID
               AND Src.ColA IS NOT NULL
        WHERE   t.ColA IS NULL
        GROUP BY t.CoID
       ) AS X
       ON X.TgtID = tgt.CoID
LEFT JOIN #mytable AS Src
       ON Src.CoID = X.SrcID
October 24, 2008 at 5:56 am
No problem. This is the method I was referring to:
[font="Courier New"]DECLARE @ColA VARCHAR(5),
@ColID INT
UPDATE #MyTable
SET @ColA = ISNULL(ColA, @ColA),
ColA = @ColA,
@ColID = ColID
FROM #MyTable WITH (INDEX(0))
[/font]
The key here is the clustered index. For this to work, your clustered index must be on CoID (assuming you want to follow the identity field as the order in which to walk through the rows and update the values). It is in your test data, so that's good. Also, the WITH (INDEX(0)) hint can be changed to WITH (INDEX(Clustered_Index_Name)).
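If you want to sanity-check the result against the test table, a quick look like this is enough (nothing fancy, just confirming no NULLs remain):
-- Sanity check after the update: ColA should be fully populated
SELECT * FROM #mytable ORDER BY CoID

SELECT COUNT(*) AS RemainingNulls
FROM   #mytable
WHERE  ColA IS NULL   -- expect 0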
October 24, 2008 at 6:26 am
A big thank you to each one of you guys. All the solutions provided are working for me. The only thing left is to find the best-performing one, as my tables have millions of records.
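To compare them, I will probably just wrap each candidate with the timing statistics and run each against an identical copy of the data, roughly like this:
-- Rough timing harness: run each candidate UPDATE between these
-- and compare the elapsed time and I/O reported in the messages
SET STATISTICS TIME ON
SET STATISTICS IO ON

-- <candidate UPDATE here>

SET STATISTICS IO OFF
SET STATISTICS TIME OFF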
Thanks Again!!!
Parul
October 25, 2008 at 4:59 pm
AnzioBake (10/24/2008)
Either of the following does what you are asking; the second query is marginally faster...
Both methods use a "triangular join" which, because of the way it's used, generates a full "Square join"... with only the original 11 rows, there are 121 internal rows generated. Try that with just 10,000 original rows and you'll end up generating 100,000,000 internal rows which, in turn, will make the code horribly slow. Look at the actual execution plan for proof.
See the following article for more information on "Square" and "Triangular" joins and how they throw performance and scalability on the floor... 🙂
http://www.sqlservercentral.com/articles/T-SQL/61539/
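If you want to see it for yourself, a simple count against the test table above shows the internal rows the inequality join alone produces (just a sketch, using the #mytable from earlier in the thread):
-- Counts the rows the inequality (triangular) self-join generates:
-- roughly n*(n-1)/2 for n rows, before the rest of the query even runs
SELECT COUNT(*) AS InternalRows
FROM   #mytable AS t
INNER JOIN #mytable AS Src
        ON Src.CoID < t.CoID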
Also, I haven't tested the first bit of code, but the second one gives a wrong answer in that it wipes out the original value of ColA in rows 1, 4, and 9.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 25, 2008 at 5:01 pm
Parul Sharma (10/24/2008)
A big thank you to each one of you guys. All the solutions provided are working for me. The only thing left is to find the best-performing one, as my tables have millions of records. Thanks Again!!!
Parul
I can tell you the answer to that... Seth's will handle a million rows in less than 7 seconds... the others that use a "triangular join" will complete sometime in less than 7 YEARS... 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
October 25, 2008 at 5:21 pm
Jeff Moden (10/25/2008)
Parul Sharma (10/24/2008)
I can tell you the answer to that... Seth's will handle a million rows in less than 7 seconds... the others that use a "triangular join" will complete sometime in less than 7 YEARS... 😉
Patience, I hear, is a virtue. 😀
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
October 25, 2008 at 8:04 pm
rbarryyoung (10/25/2008)
Jeff Moden (10/25/2008)
Parul Sharma (10/24/2008)
I can tell you the answer to that... Seth's will handle a million rows in less than 7 seconds... the others that use a "triangular join" will complete sometime in less than 7 YEARS... 😉
Patience, I hear, is a virtue. 😀
I bet THEY told you that. They did... didn't they!
October 26, 2008 at 7:22 pm
Garadin (10/25/2008)
rbarryyoung (10/25/2008)
Jeff Moden (10/25/2008)
Parul Sharma (10/24/2008)
I can tell you the answer to that... Seth's will handle a million rows in less than 7 seconds... the others that use a "triangular join" will complete sometime in less than 7 YEARS... 😉
Patience, I hear, is a virtue. 😀
I bet THEY told you that. They did... didn't they!
Yeah... just like the T-shirt says... "Patience my a$$... I'm goin' ta kill somethin'" 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
December 18, 2008 at 2:42 am
You are correct, Jeff. Seth's approach was the fastest. I had some 280,000,000+ records in my tables where I had to populate the NULL values. Using Seth's approach I was able to update all my data in some 7-8 minutes. Got appreciation from my Lead as well for using this approach. Of course, I told him the source ;).
Thanks again!!!
Parul
December 18, 2008 at 7:47 am
Now there's a milestone... thanks for the feedback on this, Parul...
--Jeff Moden
Change is inevitable... Change for the better is not.