Updating 566787 Records takes forever

Question

Updating 566787 Records takes forever

KavitaA

Right there with Babe

Points: 794
More actions
November 1, 2012 at 4:30 am

#266496

Hi,
I am writing a script to update bad data in a column of a table. The Table has 566787 rows and this script is taking forever to complete. It updated 1000 records in 5 mins which I think is too much.
There are two triggers and 4 indexes on this table but for now I have deleted the indexes and disabled the triggers to speed up the process
-- The script updates the FullPath column of the table according to the values
-- in Title column and hirNodeParent Column
DECLARE
@hirNode int,
@hirNodeParent int,
@Title varchar(500),
@Return varchar(500),
@LasthirNode int,
@Node int,
@FullPath varchar(1000)
Select hirNode into #hirNodes from hirNodes
While exists (Select * From #hirNodes)
Begin
Set @hirNode = (Select Top 1 hirnode from #hirNodes)
Set @Node = @hirNode
Set @FullPath = (select FullPath from hirNodes where hirNode = @hirNode)
SET @Return = ''
WHILE @hirNode > 0
BEGIN
SELECT @hirNodeParent = hirNodeParent, @Title = Title
FROM hirNodes
WHERE hirNode = @hirNode
SET @LasthirNode = @hirNode
SET @Return = @Title + '\' + @Return
SET @hirNode = @hirNodeParent
END
SET @Return = '\\' + @Return
if(@FullPath <> @Return)
begin
Update hirNodes set FullPath = @Return where hirNode = @Node
end
Delete from #hirNodes where hirNode = @Node
print(@Node)
End
Drop table #hirNodes
--Table Script
CREATE TABLE [dbo].[hirNodes](
[hirNode] [int] IDENTITY(1,1) NOT NULL,
[hirNodeParent] [int] NULL,
[hirLevel] [int] NOT NULL,
[Title] [varchar](500) NULL,
[Brief] [varchar](16) NULL,
[Description] [varchar](256) NULL,
[Active] [bit] NOT NULL,
[DisplayOrder] [varchar](50) NULL,
[FullPath] [varchar](500) NULL,
CONSTRAINT [PK_hirNodes_1] PRIMARY KEY CLUSTERED
(
[hirNode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Ant-Green SSC Guru Points: 113520 More actions · Answer 1

The problem is that this is RBAR as your only ever updating 1 row at a time instead of trying to update 566787 records in one set based operation, meaning that the triggers fire 566787 times instead of once etc etc.

If you would provide some sample data prior to the update and what the expected result after the update should be along with the trigger and index definitions there might be a quicker set based operation to performing the task.

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 2

This looks like you're trying to build a hierachy list using a while loop.

Can you provide some sample data for #HirNodes, as I think you're using a sledgehammer to crack nut, and a Recursive CTE may help but with out sample data its guesswork.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

yuvipoy SSChampion Points: 10469 More actions · Answer 3

What you are trying to do will obviously takes time since you have used loop statement which means it will consume more time.

can you give some sample records at-least 5 records for this table.

Thanks!

KavitaA Right there with Babe Points: 794 More actions · Answer 4

I just added the Sample data..

Do you want the code for Triggers?

The triggers only fire on Update of hirNodeParent column and Title column and I am only updating the full path column and anyway I have disabled the triggers for now.

I am attaching the code for indexes but for now I have deleted them too

/****** Object: Index [idx_hirNodes_fullPath] Script Date: 10/31/2012 19:39:16 ******/

CREATE NONCLUSTERED INDEX [idx_hirNodes_fullPath] ON [dbo].[hirNodes]

(

[FullPath] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

GO

/****** Object: Index [idx_hirNodes_hirLevel] Script Date: 10/31/2012 19:39:52 ******/

CREATE NONCLUSTERED INDEX [idx_hirNodes_hirLevel] ON [dbo].[hirNodes]

(

[hirLevel] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

GO

/****** Object: Index [idx_hirNodes_hirNodeParent] Script Date: 10/31/2012 19:41:47 ******/

CREATE NONCLUSTERED INDEX [idx_hirNodes_hirNodeParent] ON [dbo].[hirNodes]

(

[hirNodeParent] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

GO

/****** Object: Index [idx_hirNodes_Title] Script Date: 10/31/2012 19:47:55 ******/

CREATE NONCLUSTERED INDEX [idx_hirNodes_Title] ON [dbo].[hirNodes]

(

[Title] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

GO

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 5

Definately looks like a job for a recursive CTE in order to build the Path then a simple update after, this is a simple test rig for a select, but the update isnt too bad.

drop table #hirNode

go

Create Table #hirNode(

hirNode Int

,hirNodeParent int

,Title varchar(100)

)

Insert into #hirNode

values (1,NULL,'Start')

,(2,1,'Second')

,(3,2,'Third')

,(4,NULL,'Start2')

,(5,4,'Second2');

WITH BuildPath(hirNode,hirNodeParent, NodeLevel, NodePath)

AS (SELECT hirNode,

hirNodeParent,

1,

CONVERT(varchar(255), '\\'+Title)

FROM #hirNode

WHERE hirNodeParent IS NULL --Start at the Top

UNION ALL

SELECT

h.hirNode,

h.hirNodeParent,

NodeLevel + 1,

CONVERT (varchar(255), RTRIM(NodePath) + '\' + Title)

FROM #hirNode AS h

JOIN BuildPath AS p ON h.hirNodeParent = p.hirNode

)

Select * from BuildPath

The output for NodePath is

\\Start

\\Start\Second

\\Start\Second\Third

\\Start2

\\Start2\Second2

It might need ordering but its just an example, is that what you are looking for?

_________________________________________________________________________
SSC Guide to Posting and Best Practices

KavitaA Right there with Babe Points: 794 More actions · Answer 6

Jason this looks right. But how to update FullPath column using Recursive CTE? I have to update all the rows of the table for FullPath column. Actually the right way would be to check if the FullPath column has correct value if not then update it.

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 7

this should work

drop table #hirNode

go

Create Table #hirNode(

hirNode Int

,hirNodeParent int

,Title varchar(100)

)

Insert into #hirNode

values (1,NULL,'Start')

,(2,1,'Second')

,(3,2,'Third')

,(4,NULL,'Start2')

,(5,4,'Second2');

WITH BuildPath(hirNode,hirNodeParent, NodeLevel, NodePath)

AS (SELECT hirNode,

hirNodeParent,

1,

CONVERT(varchar(255), '\\'+Title)

FROM #hirNode

WHERE hirNodeParent IS NULL --Start at the Top

UNION ALL

SELECT

h.hirNode,

h.hirNodeParent,

NodeLevel + 1,

CONVERT (varchar(255), RTRIM(NodePath) + '\' + Title)

FROM #hirNode AS h

JOIN BuildPath AS p ON h.hirNodeParent = p.hirNode

)

Update

hr

Set

FullPath=bp.NodePath

From

hirNodes hr

JOIN BuildPath bp on hr.hirNode=bp.hirNode

but you should check it

_________________________________________________________________________
SSC Guide to Posting and Best Practices

KavitaA Right there with Babe Points: 794 More actions · Answer 8

This works like a charm.. updated 70k records in 24 sec.

Thanks Jason

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 9

Glad to Help,

You could add a where clause to check the NodePath with the FullPath column eg

WHERE hr.FullPath!=NodePath

Which should reduce the dataset, and save you having unnecessary updates.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

KavitaA Right there with Babe Points: 794 More actions · Answer 10

I just found out that the live server is SQL 2000 and so I can't use Recursive CTE :crying:

I have created this new script that would work on SQL Server 2000. This is way slower than recursive CTE. Please suggest any way of optimizing it.

CREATE TABLE #tbl([hirNode]INT,

[hirNodeParent]INT,

[FullPath]varchar(1000))

INSERT INTO #tbl

SELECT hirNode, hirNodeParent, CONVERT(varchar(1000), '\\'+Title + '\')

FROM hirNodes

WHERE hirNodeParent IS NULL

WHILE EXISTS(SELECT * FROM hirNodes AS t2 join #tbl t1

on t2.hirNodeParent = t1.hirNode

where t2.hirNode NOT IN (SELECT hirNode FROM #tbl))

BEGIN

INSERT INTO #tbl

SELECT t2.hirNode,

t2.hirNodeParent,

CONVERT (varchar(255), RTRIM(t1.FullPath) + Title + '\')

FROM hirNodes AS t2,

#tbl t1

WHERE t2.hirNodeParent = t1.hirNode

AND t2.hirNode NOT IN (SELECT hirNode

FROM #tbl)

END

SELECT hirNodes.hirNode, hirNodes.FullPath, #tbl.FullPath FROM hirNodes

join #tbl on hirNodes.hirNode = #tbl.hirNode

where hirNodes.FullPath <> #tbl.FullPath

Drop Table #tbl

Thanks,

Kavita

Sam S Kolli SSC Eights! Points: 873 More actions · Answer 11

Kavita,

I know this is not a perfect solution, but it might work for you on SQL Server 2000. I am assuming your hierarchies are at most 7 levels deep. (If they are deeper, you can alter the script to accomodate for additional depth; though this would increase the processing time too)

The idea is to build the list of nodes involved in each hierarchy and concatenate their titles upfront and then use this info to do an update.

I ran this script on my laptop's sql server 2008 for around 540000 rows and it took around 40-45 seconds.

Let me know how this works out for you.

----Build Hierarchy

SELECT IDENTITY(INT, 1, 1) AS RowId, H1.hirNode As L1, H2.hirnode AS L2, H3.hirNode AS L3, H4.hirNode AS L4, H5.hirNode AS L5, H6.hirNode As L6, H7.hirNode AS L7,

H1.Title AS LT1,

H1.Title + '\\' + H2.Title AS LT2,

H1.Title + '\\' + H2.Title + '\\' + H3.Title AS LT3,

H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title As LT4,

H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title + '\\' + H5.Title AS LT5,

H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title + '\\' + H5.Title + '\\' + H6.Title AS LT6,

H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title + '\\' + H5.Title + '\\' + H6.Title + '\\' + H7.Title AS LT7

INTO #a

from dbo.HirNodes H1

LEFT JOIN dbo.HirNodes H2 ON (H1.hirNode = H2.hirNodeParent)

LEFT JOIN dbo.HirNodes H3 ON (H2.hirNode = H3.hirNodeParent)

LEFT JOIN dbo.HirNodes H4 ON (H3.hirNode = H4.hirNodeParent)

LEFT JOIN dbo.HirNodes H5 ON (H4.hirNode = H5.hirNodeParent)

LEFT JOIN dbo.HirNodes H6 ON (H5.hirNode = H6.hirNodeParent)

LEFT JOIN dbo.HirNodes H7 ON (H6.hirNode = H7.hirNodeParent)

--SELECT * FROM #a

----From the above, use only those rows where the complete hierarchy is represented

SELECT A1.*

INTO #b

FROM #a A1

LEFT JOIN #a A2 ON (A1.L1 = A2.L2)

WHERE A2.L1 IS NULL

--SELECT * FROM #b

SELECT DISTINCT A.Node, A.Title AS FullPath

INTO #c

FROM (

SELECT B1.L1 AS Node, B1.LT1 AS Title FROM #b B1

UNION

SELECT B2.L2 AS Node, B2.LT2 AS Title FROM #b B1 INNER JOIN #b B2 ON (B1.L1 = B2.L1)

UNION

SELECT B3.L3 AS Node, B3.LT3 AS Title FROM #b B2 INNER JOIN #b B3 ON (B2.L2 = B3.L2)

UNION

SELECT B4.L4 AS Node, B4.LT4 AS Title FROM #b B3 INNER JOIN #b B4 ON (B3.L3 = B4.L3)

UNION

SELECT B5.L5 As Node, B5.LT5 AS Title FROM #b B4 INNER JOIN #b B5 ON (B4.L4 = B5.L4)

UNION

SELECT B6.L6 AS Node, B6.LT6 AS Title FROM #b B5 INNER JOIN #b B6 ON (B5.L5 = B6.L5)

UNION

SELECT B7.L7 AS Node, B7.LT7 AS Title FROM #b B6 INNER JOIN #b B7 ON (B6.L6 = B7.L6)

) AS A

WHERE A.Node IS NOT NULL

/*

---- Use this to get a complete list of hierarchy for each node

SELECT A.RowId, A.R1, A.Node, A.Title AS FullPath, A.Lvl

INTO #c

FROM (

SELECT B1.RowId, B1.L1 AS R1, B1.L1 AS Node, B1.LT1 AS Title, 1 As Lvl FROM #b B1

UNION

SELECT B1.RowId, B1.L1 AS R1, B2.L2 AS Node, B2.LT2 AS Title, 2 As Lvl FROM #b B1 INNER JOIN #b B2 ON (B1.L1 = B2.L1)

UNION

SELECT B2.RowId, B2.L1 AS R1, B3.L3 AS Node, B3.LT3 AS Title, 3 As Lvl FROM #b B2 INNER JOIN #b B3 ON (B2.L2 = B3.L2)

UNION

SELECT B3.RowId, B3.L1 AS R1, B4.L4 AS Node, B4.LT4 AS Title, 4 As Lvl FROM #b B3 INNER JOIN #b B4 ON (B3.L3 = B4.L3)

UNION

SELECT B4.RowId, B4.L1 AS R1, B5.L5 As Node, B5.LT5 AS Title, 5 As Lvl FROM #b B4 INNER JOIN #b B5 ON (B4.L4 = B5.L4)

UNION

SELECT B5.RowId, B5.L1 AS R1, B6.L6 AS Node, B6.LT6 AS Title, 6 As Lvl FROM #b B5 INNER JOIN #b B6 ON (B5.L5 = B6.L5)

UNION

SELECT B6.RowId, B6.L1 AS R1, B7.L7 AS Node, B7.LT7 AS Title, 7 As Lvl FROM #b B6 INNER JOIN #b B7 ON (B6.L6 = B7.L6)

) AS A

WHERE A.Node IS NOT NULL

ORDER BY RowId, R1, Lvl

*/

--SELECT COUNT(*) FROM #c

--SELECT COUNT(*) FROM dbo.HirNodes

----Use Hierarchy

UPDATE H SET H.FullPath = C.FullPath

--SELECT H.hirNode, H.hirNodeParent, H.hirLevel, H.Title, H.FullPath, C.FullPath

FROM dbo.HirNodes H

INNER JOIN #c C ON (H.hirNode = C.Node)

DROP TABLE #a, #b

DROP TABLE #c

KavitaA Right there with Babe Points: 794 More actions · Answer 12

Thanks Sam.

It IS faster though its not returning the correct fullPath.. I am working on it.

I'll let you know how it goes

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 13

Hi Kavita,

Sorry the CTE didnt help on the client site due to having SQL 2000 installed, just file it away for future reference, and beat the client with a big stick until they upgrade to SQL2008 or later.:-D

_________________________________________________________________________
SSC Guide to Posting and Best Practices

KavitaA Right there with Babe Points: 794 More actions · Answer 14

I apologize Sam. Its working perfectly fine.

I ran it for 460K records and it took ~5mins which is really not bad.

Thank You!

Jason - I wish but it was a great learning experience. I am excited to use it in the future.

Updating 566787 Records takes forever

Cookies on SQLServerCentral