Index performance problem...

Question

Index performance problem...

mayank jarsaniya

Hall of Fame

Points: 3532
More actions
March 31, 2010 at 9:16 am

#139065

CREATE TABLE [dbo].[Soft](
[FileID] [int] NOT NULL,
[DocID] [nvarchar](50) NOT NULL,
[PageID] [nvarchar](50) ,
[Name] [nvarchar](50) ,
CONSTRAINT [PK_Soft] PRIMARY KEY CLUSTERED
(
[FileID] ASC,
[DocID] ASC,
[PageID] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
insert into soft (fileid,docid,pageid,name)
values(1,'00200557_00000','0200557_00000_00001','29050215')
insert into soft (fileid,docid,pageid,name)
values(1,'00200558_00000','00200558_00000_00001','129050215')
insert into soft (fileid,docid,pageid,name)
values(1,'00200762_00000','00200762_00000_00001','129050215')
insert into soft (fileid,docid,pageid,name)
values(1,'00200821_00000','00200821_00000_00001','129050215')
insert into soft (fileid,docid,pageid,name)
values(1,'00205072_00000','00205072_00000_00001','129050215')
-- now i have update query:
update Soft
set name ='MAYANK'
WHERE FileID=1 AND PageID='0200557_00000_00001'
-- I need to update 5 millions records one by one.It works slow.....takes 5 hrs to finish........so i have created on index as follow:
CREATE UNIQUE NONCLUSTERED INDEX [IndexForfilepage]
ON [dbo].[Soft]
(
[PageID] ASC,
[FileID] ASC
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
-- it boost the performance...........max 1 hrs to update 5 million......
-- Now My question is in this index seek got faster than clustered index scan...........why?
--can any one explain me which index is faster.................
--and in my case can any one explain me why index seek is faster than clusterd index seek.
--Thanks in advance..........

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

J-F Bergeron SSCrazy Eights Points: 8593 More actions · Answer 1

From what I can understand, your update query hits 5 million rows.

The only filter that is in your update is the PageID, and FileID, and you said you were able to create a UNIQUE nonclustered index on those exact 2 columns, therefore, limiting the numbers of rows to be updated to 1.

Now, if the PageID, and FileID are unique, how can you update 5 million rows with that query? Or, did I not get your point, and you issue 5 million update statements? Then that would be the issue...

Cheers,

J-F

mayank jarsaniya Hall of Fame Points: 3532 More actions · Answer 2

ya u r right sorry for that ...........

not a one update query ....there are 5 million update queries are fired against this table.

declare @name varchar(50)

declare @fileid varchar(50)

declare @pageid varchar(50)

update Soft

set name =@name

WHERE FileID=@fileid AND PageID=@pageid;

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 3

An index seek is like using the telephone directory to go straight to "Mr M Brown", because it's ordered by surname. A clustered index scan is like reading the entire telephone directory cover to cover.

http://sqlinthewild.co.za/index.php/2007/11/15/execution-plan-operations-scans-and-seeks/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

J-F Bergeron SSCrazy Eights Points: 8593 More actions · Answer 4

How do you build your update queries? From what field do you base your statement to be able to generate the update queries?

Because we could change that, and therefore run just 1 update statement, that would join to a table where you have all the documents to update.

That would be far better then executing 5 million updates, which makes no sense, in a set based solution. This is where you have a performance problem.

Cheers,

J-F

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 5

mjarsaniya (3/31/2010)
not a one update query ....there are 5 million update queries are fired against this table.

Why are you doing 5 million updates one at a time?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

J-F Bergeron SSCrazy Eights Points: 8593 More actions · Answer 6

GilaMonster (3/31/2010)
mjarsaniya (3/31/2010)
not a one update query ....there are 5 million update queries are fired against this table.
Why are you doing 5 million updates one at a time?

Good Question! :w00t::w00t:

Cheers,

J-F

mayank jarsaniya Hall of Fame Points: 3532 More actions · Answer 7

I have one file processing application in which

developer read row by row from database (physical file path,file name)then process source file on some conditions.

after processing its generates a new value so each record needs one update to sql server record to update a new value.

Ex.

(fileid,docid,pageid,name,filepath)

while (fileid<=5000000)

BEGIN

--first developer reads fileid and location of file from sql server db...

--go to file and process that file using c# code.

--it gets key words,no.of pages ,and other info from this file.

--this info need to updated in my table soft

--there is one column for each information.......

I have used column 'name ' in previous example which to be updated but there may be 20 to 30 records i have to update.

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 8

mjarsaniya (3/31/2010)
-- now i have update query:
update Soft
set name ='MAYANK'
WHERE FileID=1 AND PageID='0200557_00000_00001'

Is it from any Stored proc ? if NO then i think u can incorporate this Update in any SP and it will call by application (c#); one update will get executed every time.

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

J-F Bergeron SSCrazy Eights Points: 8593 More actions · Answer 9

You have a big problem with the client application. It must NOT call once per row to get the information, you have to find a way to revamp the procedure so it returns sets of data to the C# application, and the application must work within a dataset, then return the data back to you, in a temp table, or anything, so the data can be updated in the database.

It is a really big design problem you are talking about, and you should talk to your developpers about working in sets, within a C# dataset, that is where your answer lies...

If you have problems with certain area of the transmission or with the concept, then we can help.

Cheers,

J-F