1st SELECT after INSERT

  • Hello,

    I create a business entity (about 20 records inserted into various tables) in our MSSQL 7 database. I immediately retrieve the business entity with a series of SELECT statements. The 1st time I retrieve the business entity, it takes about 10x longer than subsequent retrievals. What gives? One would think that after the set of INSERTs, the data and index pages in question would still be in the buffer pool. I would guess that MSSQL is going out to disk on the 1st retrieval, but I'm not sure. Can anyone shed some light on why the 1st retrieval of this data after an INSERT would be 10x slower than subsequent retrievals?

    Thanks,

    Dave

  • You could run Profiler and check exactly what SQL Server is doing and where the bottlenecks are.

  • Hi hbkdba,

    What should I be looking for in particular with the profiler? Running the profiler, I see that the SELECTs being run are identical. The only difference is that for the 1st retrieval, the duration for each individual SELECT is much longer than for subsequent retrievals of the business entity. How can I get more detail as to what is happening under the covers?

    Thanks,

    Dave

  • Does it happen the first time every time, or the first time per session? If the latter, it could be the time to build your query plan.

    Why would you insert the data and immediately retrieve it?

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • Hi Andy,

    Yeah, yeah. The app shouldn't be inserting the data and then immediately retrieving it. But that's an app problem to be solved another day.

    Nonetheless, here are some timings using the MSSQL Profiler:

    - Created the business entity, then fetched it for the 1st time after the inserts: 1900ms
    - Fetched the business entity a second time: 160ms
    - Restarted my app server, fetched the business entity again: 80ms
    - Restarted MSSQL, restarted the app server, fetched the business entity: 2994ms
    - Fetched the business entity again: 100ms
    - Restarted the app server, fetched the business entity again: 110ms

    So the 1st fetch after the inserts is on par timing-wise with a 1st time fetch after restarting MSSQL. This leads me to believe that MSSQL is doing some IO for that 1st fetch for some reason.

    I thought perhaps there was some contention occurring on MSSQL because the retrieval of the business entity was occurring so soon after the insert. Perhaps MSSQL was still busy doing something as a result of the INSERTs. Anyway, I added a 10-second delay on my app server prior to retrieving the entity. This had no effect: the 1st retrieval still took 1800ms on MSSQL.

    Dave

  • Is the SELECT that reads the data back in a proc? How about adding WITH RECOMPILE to force it to build a new plan each time, and see what happens?
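    For instance, a minimal sketch (the proc and table names here are made up for illustration):

    CREATE PROCEDURE usp_GetBusinessEntity
        @EntityID int
    WITH RECOMPILE  -- forces a fresh plan on every execution
    AS
    SELECT *
    FROM dbo.BusinessEntity  -- hypothetical table
    WHERE EntityID = @EntityID
    GO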

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • It's not in a proc. It is a series of SELECTs wrapped in a BEGIN TRAN/ROLLBACK TRAN.
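    Roughly like this (the table names are placeholders, not the app's actual schema):

    BEGIN TRAN
    SELECT * FROM dbo.EntityHeader WHERE EntityID = 42  -- 42 is a placeholder ID
    SELECT * FROM dbo.EntityDetail WHERE EntityID = 42
    -- ...and so on for the other tables...
    ROLLBACK TRAN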

    Dave

  • It sounds like it is the query plan. If your app is sending simple textual SQL statements with constants as SARGs, then there won't be a match in the procedure cache until after that exact statement has been (compiled, optimised and) executed for the first time.

    You can confirm that the statement is being cached by examining master.dbo.syscacheobjects after the first execution. Look for the 1st 128 chars of the sql text in syscacheobjects.[sql].
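    For example, something along these lines:

    SELECT cacheobjtype, objtype, usecounts, [sql]
    FROM master.dbo.syscacheobjects
    WHERE [sql] LIKE 'SELECT%'  -- narrow this to the first chars of your actual statement

    If usecounts climbs on repeated executions, the plan is being reused rather than recompiled.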

    If I am right, then to avoid the problem, you will have to change your app, I think.

    You would need to parameterize the query in some way (stored proc, sp_executesql...) so that one plan can be reused for different instances of your entity, and you don't get this problem on first execution.
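    A rough sketch with sp_executesql (the table, column and variable names are invented for illustration):

    DECLARE @EntityID int
    SET @EntityID = 42
    EXEC sp_executesql
        N'SELECT * FROM dbo.BusinessEntity WHERE EntityID = @ID',
        N'@ID int',
        @ID = @EntityID

    Because the statement text is now identical for every entity, a single cached plan serves all executions.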

    See Using Stored Procedures and Caching Mechanisms in BOL for info on this topic.

    Tim

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • I wrote 'procedure cache' - meant 'plan cache'.

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • I think you might be expecting too much of the SQL engine. YOU know that the 20-odd rows just inserted will satisfy the query you subsequently execute, but how is SQL to know that? I think it does what it thinks is the right thing and goes off to the database and looks for all rows that satisfy your SELECT criteria.

    Of course, I have no facts to back up my hypothesis; I'm just stating what I'd do if I were a SQL server.

    Cheers,

    - Mark



  • Hi Tim,

    I will take a look at the query plan, statement caching and stats, and see if this is an issue.

    Mark, the issue I think is at a somewhat lower level. The records were just inserted, so it seems to me that they should be reflected in pages that are still in the buffer pool (in memory) when I run the query. So even though MSSQL doesn't "know" that the records I just inserted would satisfy the query, the pages that contain those records should already be in memory, so the retrieval should be fast. I'm guessing that the 10x diff in SELECT performance could be explained by IO occurring in the 1st retrieval, and no IO occurring in the next, but I'm not sure. I'm just looking for a way to prove it one way or another.
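    One way I might try to prove it is to run the same SELECTs from Query Analyzer with IO statistics on, e.g.

    SET STATISTICS IO ON
    -- run the business entity SELECTs here
    SET STATISTICS IO OFF

    If the 1st run shows physical reads > 0 and subsequent runs show 0, that would confirm the disk IO theory.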

    Dave

  • Profiler allows you to look at IO and CPU costs per statement; I would think that would be proof enough.

    Signature is NULL

  • Hi Calvin,

    I agree, but I've been unable to get any data in the MSSQL 7 Profiler for the Duration, Reads, Writes and CPU columns for statements. The column cells are blank for statement rows. I do get values for database connection rows. If I could get these values for each statement, that would be perfect. Any idea how to get the Profiler to give me these values for statements (INSERT, SELECT, etc.)?

    Thanks,

    Dave

  • Yeah, that's weird. Honestly, I've never used the 7.0 Profiler; I'm a newbie and have always used 2K. I've never had problems like that.

    There is an answer, but it's a little complicated. Not too hard to implement, though. It consists of two tables: one called Batch and one called BatchDetail. Batch is the parent and holds "Batch"-level data. BatchDetail holds data about queries within a "Batch".

    First you need to create a batch:

    ---------------------------------------------

    declare @iBDid int,
        @iBatchID int,
        @iRowCnt int

    Insert Batch (BatchStartDate, BatchDescription) Values (getdate(), 'Test1')

    set @iBatchID = @@identity

    ---------------------------------------------

    Then you can use this for as many statements as you want:

    ---------------------------------------------

    exec @iBDid = sp_BatchDetailInsert @iBatchID = @iBatchID,
        @vcDescription = 'Describe Query uniquely'

    Print 'Execute Some Statement'  -- stand-in for the statement you are timing

    set @iRowCnt = @@RowCount  -- capture the row count into a variable right away

    exec sp_BatchDetailUpdate @iBDid = @iBDid,
        @iRowCnt = @iRowCnt

    ---------------------------------------------

    Below is the db schema and the sprocs:

    -------------------------------------

    CREATE TABLE [Batch] (
        [BatchID] [int] IDENTITY (1, 1) NOT NULL ,
        [BatchStartDate] [datetime] NOT NULL ,
        [BatchEndDate] [datetime] NULL ,
        [BatchDescription] [varchar] (1000) NULL ,
        [BatchOwner] [varchar] (100) NULL ,
        PRIMARY KEY CLUSTERED ( [BatchID] ) ON [PRIMARY]
    ) ON [PRIMARY]

    GO

    CREATE TABLE [BatchDetail] (
        [BDid] [int] IDENTITY (1, 1) NOT NULL ,
        [BatchID] [int] NOT NULL ,
        [Type] [varchar] (50) NULL CONSTRAINT [DF__BatchDetail_Type] DEFAULT ('Details'),
        [Description] [varchar] (100) NULL ,
        [DML] [varchar] (100) NULL ,
        [TableName] [varchar] (100) NULL ,
        [RowCnt] [int] NULL ,
        [StartDate] [datetime] NULL ,
        [EndDate] [datetime] NULL ,
        [TotalSecond] [decimal](18, 0) NULL ,
        [RowBySecond] [decimal](18, 0) NULL ,
        [IO] [decimal](18, 4) NULL ,
        [CPU] [decimal](18, 4) NULL ,
        [SPID] [int] NULL ,
        CONSTRAINT [FK_Batch_BatchID] FOREIGN KEY ( [BatchID] ) REFERENCES [Batch] ( [BatchID] )
    ) ON [PRIMARY]

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO

    CREATE procedure sp_BatchDetailInsert
        @iBatchID int,
        @vcType varchar(50) = 'Detail',
        @vcDescription varchar(100) = 'Not Provided',
        @vcDML varchar(100) = NULL,
        @vcTableName varchar(100) = 'Not Provided'
    AS
    set nocount on

    declare @identity int

    -- snapshot the connection's cumulative physical IO and CPU at the start;
    -- sp_BatchDetailUpdate subtracts these to get per-query deltas
    Insert BatchDetail
        (BatchID, Type, [Description], DML, TableName, StartDate, [IO], [CPU])
    Select top 1 @iBatchID, @vcType, @vcDescription, @vcDML, @vcTableName, getdate(),
        s.physical_io,
        s.cpu
    from master.dbo.sysprocesses s
    Where s.spid = @@spid

    set @identity = @@identity

    set nocount off

    return @identity

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO


    --> See next POST

    Signature is NULL

  • And here's the last sproc. Hope this is helpful... it gives you IO and CPU based on what's showing for the SPID. I've found it to be pretty dependable. (More on where the numbers come from after the code.)

    create procedure sp_BatchDetailUpdate
        @iBDid int = NULL,
        @iRowCnt int = NULL
    AS
    Declare @decIO decimal (18,4),
        @decCPU decimal (18,4)

    -- current cumulative counters for this connection
    Select @decIO = s.physical_io,
        @decCPU = s.cpu
    from master.dbo.sysprocesses s
    Where s.spid = @@spid

    Update BatchDetail
    Set RowCnt = @iRowCnt,
        EndDate = Getdate(),
        TotalSecond = datediff(ss, StartDate, Getdate()),
        RowBySecond = Case datediff(ss, StartDate, Getdate())
            When 0 then @iRowCnt
            Else cast(@iRowCnt as decimal) / cast(datediff(ss, StartDate, Getdate()) as decimal)
            END,
        [IO] = @decIO - [IO],   -- delta since the insert-time snapshot
        [CPU] = @decCPU - [CPU]
    Where BDid = @iBDid

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO
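    For what it's worth, the counters these procs read are just the per-connection cumulative values in sysprocesses:

    SELECT cpu, physical_io
    FROM master.dbo.sysprocesses
    WHERE spid = @@spid

    so the IO and CPU columns end up holding the difference between the two snapshots, i.e. roughly the cost of whatever ran in between.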

    Signature is NULL
