1st SELECT after INSERT

  • Hello,

    I create a business entity (about 20 records inserted into various tables) in our MSSQL 7 database. I immediately retrieve the business entity with a series of SELECT statements. The 1st time I retrieve the business entity, it takes about 10x longer than subsequent retrievals. What gives? One would think that after the set of INSERTs, the data and index pages in question would still be in the buffer pool. I would guess that MSSQL is going out to disk on the 1st retrieval, but I'm not sure. Can anyone shed some light on why the 1st retrieval of this data after an INSERT would be 10x slower than subsequent retrievals?

    Thanks,

    Dave

  • You could run Profiler and check exactly what SQL Server is doing and where the bottlenecks are.

  • Hi hbkdba,

    What should I be looking for in particular with the profiler? Running the profiler, I see that the SELECTs being run are identical. The only difference is that for the 1st retrieval, the duration for each individual SELECT is much longer than for subsequent retrievals of the business entity. How can I get more detail as to what is happening under the covers?

    Thanks,

    Dave

  • Does it happen the first time every time, or the first time per session? If the latter, it could be the time to build your query plan.

    Why would you insert the data and immediately retrieve it?

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • Hi Andy,

    Yeah, yeah. The app shouldn't be inserting the data and then immediately retrieving it. But that's an app problem to be solved another day.

    Nonetheless, here are some timings using the MSSQL Profiler:

    - Created the business entity, then fetched it for the 1st time after the inserts: 1900ms
    - Fetched the business entity a second time: 160ms
    - Restarted my app server, fetched the business entity again: 80ms
    - Restarted MSSQL, restarted the app server, fetched the business entity: 2994ms
    - Fetched the business entity again: 100ms
    - Restarted the app server, fetched the business entity again: 110ms

    So the 1st fetch after the inserts is on par timing-wise with a 1st time fetch after restarting MSSQL. This leads me to believe that MSSQL is doing some IO for that 1st fetch for some reason.

    I thought perhaps there was some contention occurring on MSSQL because the retrieval of the business entity was occurring so soon after the insert. Perhaps MSSQL was still busy doing something as a result of the INSERTs. Anyway, I added a 10-second delay on my app server prior to retrieving the entity. This had no effect: the 1st retrieval still took 1800ms on MSSQL.

    Dave

  • Is the SELECT that reads the data back in a proc? How about adding WITH RECOMPILE to force it to build a new plan each time, and see what happens?
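    For instance, a minimal sketch (the proc and table names here are made up for illustration):

    CREATE PROCEDURE usp_GetBusinessEntity
        @EntityID int
    WITH RECOMPILE  -- forces a fresh plan on every execution
    AS
    SELECT *
    FROM dbo.BusinessEntity  -- hypothetical table
    WHERE EntityID = @EntityID
    GO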

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • It's not in a proc. It is a series of SELECTs wrapped in a BEGIN TRAN/ROLLBACK TRAN.
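    Roughly like this (the table names are placeholders, not the app's actual schema):

    BEGIN TRAN
    SELECT * FROM dbo.EntityHeader WHERE EntityID = 42  -- 42 is a placeholder ID
    SELECT * FROM dbo.EntityDetail WHERE EntityID = 42
    -- ...and so on for the other tables...
    ROLLBACK TRAN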

    Dave

  • It sounds like it is the query plan. If your app is sending simple textual SQL statements with constants as SARGs, then there won't be a match in the procedure cache until after that exact statement has been (compiled, optimised and) executed for the first time.

    You can confirm that the statement is being cached by examining master.dbo.syscacheobjects after the first execution. Look for the 1st 128 chars of the sql text in syscacheobjects.[sql].
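    For example, something along these lines:

    SELECT cacheobjtype, objtype, usecounts, [sql]
    FROM master.dbo.syscacheobjects
    WHERE [sql] LIKE 'SELECT%'  -- narrow this to the first chars of your actual statement

    If usecounts climbs on repeated executions, the plan is being reused rather than recompiled.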

    If I am right, then to avoid the problem, you will have to change your app, I think.

    You would need to parameterize the query in some way (stored proc, sp_executesql...) so that one plan can be reused for different instances of your entity, and you don't get this problem on first execution.
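    A rough sketch with sp_executesql (the table, column and variable names are invented for illustration):

    DECLARE @EntityID int
    SET @EntityID = 42
    EXEC sp_executesql
        N'SELECT * FROM dbo.BusinessEntity WHERE EntityID = @ID',
        N'@ID int',
        @ID = @EntityID

    Because the statement text is now identical for every entity, a single cached plan serves all executions.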

    See Using Stored Procedures and Caching Mechanisms in BOL for info on this topic.

    Tim

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • I wrote 'procedure cache' - meant 'plan cache'.

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • I think you might be expecting too much of the SQL engine. YOU know that the 20-odd rows just inserted will satisfy the query you subsequently execute, but how is SQL to know that? I think it does what it thinks is the right thing and goes off to the database and looks for all rows that satisfy your SELECT criteria.

    Of course, I have no facts to back up my hypothesis; I'm just stating what I'd do if I were a SQL server.

    Cheers,

    - Mark



  • Hi Tim,

    I will take a look at the query plan, statement caching and stats, and see if this is an issue.

    Mark, the issue I think is at a somewhat lower level. The records were just inserted, so it seems to me that they should be reflected in pages that are still in the buffer pool (in memory) when I run the query. So even though MSSQL doesn't "know" that the records I just inserted would satisfy the query, the pages that contain those records should already be in memory, so the retrieval should be fast. I'm guessing that the 10x diff in SELECT performance could be explained by IO occurring in the 1st retrieval, and no IO occurring in the next, but I'm not sure. I'm just looking for a way to prove it one way or another.
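    One way I might try to prove it is to run the same SELECTs from Query Analyzer with IO statistics on, e.g.

    SET STATISTICS IO ON
    -- run the business entity SELECTs here
    SET STATISTICS IO OFF

    If the 1st run shows physical reads > 0 and subsequent runs show 0, that would confirm the disk IO theory.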

    Dave

  • Profiler allows you to look at IO and CPU costs per statement; I would think that would be proof enough.

    Signature is NULL

  • Hi Calvin,

    I agree, but I've been unable to get any data in the MSSQL 7 Profiler for the Duration, Reads, Writes and CPU columns for statements. The column cells are blank for statement rows. I do get values for database connection rows. If I could get these values for each statement, that would be perfect. Any idea how to get the Profiler to give me these values for statements (INSERT, SELECT, etc.)?

    Thanks,

    Dave

  • Yeah, that's weird. Honestly, I've never used the 7.0 Profiler; I'm a newbie and have always used 2K. I've never had problems like that.

    There is an answer, but it's a little complicated. Not too hard to implement, though. It consists of two tables: one called Batch and one called BatchDetail. Batch is the parent and holds "Batch"-level data. BatchDetail holds data about queries within a "Batch".

    First you need to create a batch:

    ---------------------------------------------

    declare @iBDid int,
        @iBatchID int,
        @iRowCnt int

    Insert Batch (BatchStartDate, BatchDescription) Values (getdate(), 'Test1')

    set @iBatchID = @@identity

    ---------------------------------------------

    Then you can use this for as many statements as you want:

    ---------------------------------------------

    exec @iBDid = sp_BatchDetailInsert @iBatchID = @iBatchID,
        @vcDescription = 'Describe Query uniquely'

    Print 'Execute Some Statement'  -- stand-in for the statement you are timing

    set @iRowCnt = @@RowCount  -- capture the row count into a variable right away

    exec sp_BatchDetailUpdate @iBDid = @iBDid,
        @iRowCnt = @iRowCnt

    ---------------------------------------------

    Below is the db schema and the sprocs:

    -------------------------------------

    CREATE TABLE [Batch] (
        [BatchID] [int] IDENTITY (1, 1) NOT NULL ,
        [BatchStartDate] [datetime] NOT NULL ,
        [BatchEndDate] [datetime] NULL ,
        [BatchDescription] [varchar] (1000) NULL ,
        [BatchOwner] [varchar] (100) NULL ,
        PRIMARY KEY CLUSTERED ( [BatchID] ) ON [PRIMARY]
    ) ON [PRIMARY]

    GO

    CREATE TABLE [BatchDetail] (
        [BDid] [int] IDENTITY (1, 1) NOT NULL ,
        [BatchID] [int] NOT NULL ,
        [Type] [varchar] (50) NULL CONSTRAINT [DF__BatchDetail_Type] DEFAULT ('Details'),
        [Description] [varchar] (100) NULL ,
        [DML] [varchar] (100) NULL ,
        [TableName] [varchar] (100) NULL ,
        [RowCnt] [int] NULL ,
        [StartDate] [datetime] NULL ,
        [EndDate] [datetime] NULL ,
        [TotalSecond] [decimal](18, 0) NULL ,
        [RowBySecond] [decimal](18, 0) NULL ,
        [IO] [decimal](18, 4) NULL ,
        [CPU] [decimal](18, 4) NULL ,
        [SPID] [int] NULL ,
        CONSTRAINT [FK_Batch_BatchID] FOREIGN KEY ( [BatchID] ) REFERENCES [Batch] ( [BatchID] )
    ) ON [PRIMARY]

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO

    CREATE procedure sp_BatchDetailInsert
        @iBatchID int,
        @vcType varchar(50) = 'Detail',
        @vcDescription varchar(100) = 'Not Provided',
        @vcDML varchar(100) = NULL,
        @vcTableName varchar(100) = 'Not Provided'
    AS
    set nocount on

    declare @identity int

    -- snapshot the connection's cumulative physical IO and CPU at the start;
    -- sp_BatchDetailUpdate subtracts these to get per-query deltas
    Insert BatchDetail
        (BatchID, Type, [Description], DML, TableName, StartDate, [IO], [CPU])
    Select top 1 @iBatchID, @vcType, @vcDescription, @vcDML, @vcTableName, getdate(),
        s.physical_io,
        s.cpu
    from master.dbo.sysprocesses s
    Where s.spid = @@spid

    set @identity = @@identity

    set nocount off

    return @identity

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO


    --> See next POST

    Signature is NULL

  • And here's the last sproc. Hope this is helpful... it gives you IO and CPU based on what's showing for the SPID. I've found it to be pretty dependable. (More on where the numbers come from after the code.)

    create procedure sp_BatchDetailUpdate
        @iBDid int = NULL,
        @iRowCnt int = NULL
    AS
    Declare @decIO decimal (18,4),
        @decCPU decimal (18,4)

    -- current cumulative counters for this connection
    Select @decIO = s.physical_io,
        @decCPU = s.cpu
    from master.dbo.sysprocesses s
    Where s.spid = @@spid

    Update BatchDetail
    Set RowCnt = @iRowCnt,
        EndDate = Getdate(),
        TotalSecond = datediff(ss, StartDate, Getdate()),
        RowBySecond = Case datediff(ss, StartDate, Getdate())
            When 0 then @iRowCnt
            Else cast(@iRowCnt as decimal) / cast(datediff(ss, StartDate, Getdate()) as decimal)
            END,
        [IO] = @decIO - [IO],   -- delta since the insert-time snapshot
        [CPU] = @decCPU - [CPU]
    Where BDid = @iBDid

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    SET ANSI_NULLS ON

    GO
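    For what it's worth, the counters these procs read are just the per-connection cumulative values in sysprocesses:

    SELECT cpu, physical_io
    FROM master.dbo.sysprocesses
    WHERE spid = @@spid

    so the IO and CPU columns end up holding the difference between the two snapshots, i.e. roughly the cost of whatever ran in between.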

    Signature is NULL
