April 7, 2016 at 2:33 am
Hi all
I've got a real head-thrasher on my hands.
I've got a stored procedure that takes a parameter and runs an insert (I'll post the code at the bottom).
If I run the base code in SSMS it takes around 30 seconds to insert the data into a table.
If I run the stored procedure from a task in SSMS or Visual Studio (just a normal execute SQL task calling the stored procedure with the parameter) it takes anything up to 20 minutes.
We're using all the indexes that are available to us (and we can't add any more as it's a 3rd-party application).
Has anyone got any ideas as to why there's a difference?
The stored procedure is here:-
USE [DataWarehouse_PreStaging]
GO
/****** Object: StoredProcedure [dbo].[usp_Load_pl_MT_APC_Episode] Script Date: 07/04/2016 09:29:25 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
/*=====================================================================================================================
Create a process list for the DW staging update process by getting updated records from all tables used to populate the
NAC Attendance table
NOTE:This procedure is to be created on the same server as the snapshots otherwise performance is
severely impeded
=======================================================================================================================
InitialsDate/Time of ChangeChange Made
SL11/02/2016Initial Coding
SL09/03/2016Added in additional tables
=======================================================================================================================
Example Call: EXEC usp_Load_pl_MT_APC_Episode '20150909 00:00:00'
=======================================================================================================================*/
CREATE PROCEDURE [dbo].[usp_Load_pl_MT_APC_Episode]
@LastUpdateDateTime AS DATETIME
AS
BEGIN
SET DATEFORMAT dmy
TRUNCATE TABLE pl_MT_APC_Episode
INSERT INTO pl_MT_APC_Episode
(
SourceID
,AbstractID
,LatestRowUpdateDateTime
)
SELECT
SourceID
,AbstractID
,MAX(RowUpdateDateTime) AS LatestRowUpdateDateTime
FROM
(
--We need a select statement below to get updated records
--for each MT table used to populate the working table
--AbstractData
SELECT
SourceID
,AbstractID
,RowUpdateDateTime
FROM livedb_daily.dbo.AbstractData AD
WHERE
AD.PtStatus = 'IN'
ANDRowUpdateDateTime > @LastUpdateDateTime
--AdmPatUks
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,ADMPUK.RowUpdateDateTime
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AdmPatUks ADMPUK
ONAD.SourceID= ADMPUK.SourceID
ANDAD.VisitID= ADMPUK.VisitID
WHERE
AD.PtStatus = 'IN'
ANDADMPUK.RowUpdateDateTime > @LastUpdateDateTime
--AbsServices
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,SER.RowUpdateDateTime
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AbsServices SER
ONAD.SourceID= SER.SourceID
ANDAD.AbstractID= SER.AbstractID
WHERE
AD.PtStatus = 'IN'
ANDSER.RowUpdateDateTime > @LastUpdateDateTime
--AbsPatUkConsultantEpisodes
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,EP.RowUpdateDateTime
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AbsPatUkConsultantEpisodes EP
ONAD.SourceID= EP.SourceID
ANDAD.AbstractID= EP.AbstractID
WHERE
AD.PtStatus = 'IN'
ANDEP.RowUpdateDateTime > @LastUpdateDateTime
--MisSpec_Main & MisSpec_Codes (TFC)
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,RowUpdateDateTime= (
SELECT
CASE
WHEN MAX(TREATFUNC.RowUpdateDateTime) > MAX(TREATFUNCTC.RowUpdateDateTime)
THEN MAX(TREATFUNC.RowUpdateDateTime)
ELSE
MAX(TREATFUNCTC.RowUpdateDateTime)
END
FROM
livefocdb_daily.dbo.MisSpec_Main TREATFUNC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSpec_Codes TREATFUNCTC
ONTREATFUNC.SourceID= TREATFUNCTC.SourceID
ANDTREATFUNC.MisSpecID= TREATFUNCTC.MisSpecID
AND TREATFUNCTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
TREATFUNC.SourceID= ADMPUK.SourceID
ANDTREATFUNC.MisSpecID= ADMPUK.SubSpecialty
)
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AdmPatUks ADMPUK
ONAD.SourceID= ADMPUK.SourceID
ANDAD.VisitID= ADMPUK.VisitID
WHERE
AD.PtStatus = 'IN'
ANDAD.AdmitDateTime IS NOT NULL
ANDADMPUK.SubSpecialtyIN (
SELECT
TREATFUNC.MisSpecID
FROM
livefocdb_daily.dbo.MisSpec_Main TREATFUNC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSpec_Codes TREATFUNCTC
ONTREATFUNC.SourceID= TREATFUNCTC.SourceID
ANDTREATFUNC.MisSpecID= TREATFUNCTC.MisSpecID
AND TREATFUNCTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
TREATFUNC.RowUpdateDateTime > @LastUpdateDateTime
ORTREATFUNCTC.RowUpdateDateTime > @LastUpdateDateTime
)
--MisSvc_Main & MisSvc_Codes (Specialty)
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,RowUpdateDateTime= (
SELECT
CASE
WHEN MAX(MAINSPEC.RowUpdateDateTime) > MAX(MAINSPECTC.RowUpdateDateTime)
THEN MAX(MAINSPEC.RowUpdateDateTime)
ELSE
MAX(MAINSPECTC.RowUpdateDateTime)
END
FROM
livefocdb_daily.dbo.MisSvc_Main MAINSPEC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSvc_Codes MAINSPECTC
ONMAINSPEC.SourceID= MAINSPECTC.SourceID
ANDMAINSPEC.MisSvcID= MAINSPECTC.MisSvcID
AND MAINSPECTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
MAINSPEC.SourceID= SER.SourceID
ANDMAINSPEC.MisSvcID= SER.AbsServiceID
)
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AbsServices SER
ONAD.SourceID= SER.SourceID
ANDAD.AbstractID= SER.AbstractID
WHERE
AD.PtStatus = 'IN'
ANDSER.AbsServiceIDIN (
SELECT
MAINSPEC.MisSvcID
FROM
livefocdb_daily.dbo.MisSvc_Main MAINSPEC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSvc_Codes MAINSPECTC
ONMAINSPEC.SourceID= MAINSPECTC.SourceID
ANDMAINSPEC.MisSvcID= MAINSPECTC.MisSvcID
AND MAINSPECTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
MAINSPEC.RowUpdateDateTime > @LastUpdateDateTime
ORMAINSPECTC.RowUpdateDateTime > @LastUpdateDateTime
)
) UpdatedRecords
GROUP BY
SourceID
,AbstractID
END
GO
I've had a look at the Activity Monitor while the procedure is running in both scenarios and they appear to be identical. I've even tinkered with the Cost Threshold for Parallelism but it doesn't seem to make any difference.
All we want to do is get the details of any record that has changed from any of the above tables since the last load (hence the parameter).
Any help gratefully received.
::edit::
I've just added the execution plan as well.
::edit2::
Just noticed an error in my original post so updated it.
April 7, 2016 at 2:43 am
Have you tried looking at the wait types? You can use sp_whoisactive to make that easier (search for that and download it if you don't already have it).
John
April 7, 2016 at 2:49 am
The only wait type I can see is CXPACKET which I tried to get rid of by altering the threshold value but no joy there.
As far as I can see, there's no-one else running anything on the server while I'm trying to sort this out.
April 7, 2016 at 7:02 am
I am at this time not able to look at your problem in detail. But one thing in your description points to a possible explanation.
Cached execution plans are stored along with various settings (such as ANSI_NULLS, ARITHABORT, etc). When the same query is executed with the same settings, the cached plan will be reused. When the settings are different, a new plan will be used.
It is possible that your SSMS default settings are not the same as those used by SQL Agent or Visual Studio. In that case, the latter would use a plan that is not optimal, whereas the first uses a plan that is just fine - perhaps because they were generated based on another sniffed parameter?
April 7, 2016 at 7:20 am
@hugo - I've tried clearing the plan cache and then running the code again but it was still giving me the same issue.
I can run the basic SELECT or the INSERT INTO in SSMS in around 30 seconds but the stored procedure (with the same parameter) also run in SSMS (running it using EXEC and passing the parameter) still takes a lot longer.
I'm really scratching my head on this one.
For example.....
Running this code
DECLARE
@LastUpdateDateTime AS DATETIME
SET @LastUpdateDateTime='20150601'
SET DATEFORMAT dmy
TRUNCATE TABLE pl_MT_APC_Episode
INSERT INTO pl_MT_APC_Episode
(
SourceID
,AbstractID
,LatestRowUpdateDateTime
)
SELECT
SourceID
,AbstractID
,MAX(RowUpdateDateTime) AS LatestRowUpdateDateTime
FROM
(
--We need a select statement below to get updated records
--for each MT table used to populate the working table
--AbstractData
SELECT
SourceID
,AbstractID
,RowUpdateDateTime
FROM livedb_daily.dbo.AbstractData AD
WHERE
AD.PtStatus = 'IN'
ANDRowUpdateDateTime > @LastUpdateDateTime
--AdmPatUks
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,ADMPUK.RowUpdateDateTime
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AdmPatUks ADMPUK
ONAD.SourceID= ADMPUK.SourceID
ANDAD.VisitID= ADMPUK.VisitID
WHERE
AD.PtStatus = 'IN'
ANDADMPUK.RowUpdateDateTime > @LastUpdateDateTime
--AbsServices
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,SER.RowUpdateDateTime
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AbsServices SER
ONAD.SourceID= SER.SourceID
ANDAD.AbstractID= SER.AbstractID
WHERE
AD.PtStatus = 'IN'
ANDSER.RowUpdateDateTime > @LastUpdateDateTime
--AbsPatUkConsultantEpisodes
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,EP.RowUpdateDateTime
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AbsPatUkConsultantEpisodes EP
ONAD.SourceID= EP.SourceID
ANDAD.AbstractID= EP.AbstractID
WHERE
AD.PtStatus = 'IN'
ANDEP.RowUpdateDateTime > @LastUpdateDateTime
--MisSpec_Main & MisSpec_Codes (TFC)
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,RowUpdateDateTime= (
SELECT
CASE
WHEN MAX(TREATFUNC.RowUpdateDateTime) > MAX(TREATFUNCTC.RowUpdateDateTime)
THEN MAX(TREATFUNC.RowUpdateDateTime)
ELSE
MAX(TREATFUNCTC.RowUpdateDateTime)
END
FROM
livefocdb_daily.dbo.MisSpec_Main TREATFUNC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSpec_Codes TREATFUNCTC
ONTREATFUNC.SourceID= TREATFUNCTC.SourceID
ANDTREATFUNC.MisSpecID= TREATFUNCTC.MisSpecID
AND TREATFUNCTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
TREATFUNC.SourceID= ADMPUK.SourceID
ANDTREATFUNC.MisSpecID= ADMPUK.SubSpecialty
)
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AdmPatUks ADMPUK
ONAD.SourceID= ADMPUK.SourceID
ANDAD.VisitID= ADMPUK.VisitID
WHERE
AD.PtStatus = 'IN'
ANDAD.AdmitDateTime IS NOT NULL
ANDADMPUK.SubSpecialtyIN (
SELECT
TREATFUNC.MisSpecID
FROM
livefocdb_daily.dbo.MisSpec_Main TREATFUNC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSpec_Codes TREATFUNCTC
ONTREATFUNC.SourceID= TREATFUNCTC.SourceID
ANDTREATFUNC.MisSpecID= TREATFUNCTC.MisSpecID
AND TREATFUNCTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
TREATFUNC.RowUpdateDateTime > @LastUpdateDateTime
ORTREATFUNCTC.RowUpdateDateTime > @LastUpdateDateTime
)
--MisSvc_Main & MisSvc_Codes (Specialty)
UNION ALL SELECT
AD.SourceID
,AD.AbstractID
,RowUpdateDateTime= (
SELECT
CASE
WHEN MAX(MAINSPEC.RowUpdateDateTime) > MAX(MAINSPECTC.RowUpdateDateTime)
THEN MAX(MAINSPEC.RowUpdateDateTime)
ELSE
MAX(MAINSPECTC.RowUpdateDateTime)
END
FROM
livefocdb_daily.dbo.MisSvc_Main MAINSPEC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSvc_Codes MAINSPECTC
ONMAINSPEC.SourceID= MAINSPECTC.SourceID
ANDMAINSPEC.MisSvcID= MAINSPECTC.MisSvcID
AND MAINSPECTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
MAINSPEC.SourceID= SER.SourceID
ANDMAINSPEC.MisSvcID= SER.AbsServiceID
)
FROM
livedb_daily.dbo.AbstractData AD
INNER JOIN livedb_daily.dbo.AbsServices SER
ONAD.SourceID= SER.SourceID
ANDAD.AbstractID= SER.AbstractID
WHERE
AD.PtStatus = 'IN'
ANDSER.AbsServiceIDIN (
SELECT
MAINSPEC.MisSvcID
FROM
livefocdb_daily.dbo.MisSvc_Main MAINSPEC
LEFT JOIN [MEDITECHDR01-M1].livefocdb_daily.dbo.MisSvc_Codes MAINSPECTC
ONMAINSPEC.SourceID= MAINSPECTC.SourceID
ANDMAINSPEC.MisSvcID= MAINSPECTC.MisSvcID
AND MAINSPECTC.SubmissionType_MisSubmTypeID= 'NHS CDS'
WHERE
MAINSPEC.RowUpdateDateTime > @LastUpdateDateTime
ORMAINSPECTC.RowUpdateDateTime > @LastUpdateDateTime
)
) UpdatedRecords
GROUP BY
SourceID
,AbstractID
returns 280969 records in 29 seconds.
Running the stored proc with the same date as above in SSMS has just run in 31 seconds (no problem so far).
Running the stored proc from Visual Studio (again with the same parameter) has just taken 22 seconds.
What is going on?????
April 7, 2016 at 7:51 am
In your execution plan, you've got some very large discrepancies between the estimated rows and the actual rows. Are your statistics up to date? You might also consider rewriting the query to eliminate some of those UNION ALLs - you may be able to get the same results with fewer index scans by using LEFT JOINs instead. And since they are clustered index scans, adding one or two carefully chose non-clustered indexes to the table might help.
You mentioned execution times of 22, 29 and 31 seconds. That's all in the same order of magnitude. Do you consider such variation to be excessive? Did you clear the data cache after each one?
John
April 7, 2016 at 8:38 am
@john-2 - we can't add any indexes at all to any of the tables as it's a 3rd-party app that pushes its data through to SQL. The only thing we can (and do) do is bring the statistics up to date every night and do index maintenance at the same time.
I used DBCC FREEPROCCACHE to clear the cache (but I think I might have used the wrong command) but only just before the first run of the second set (if that makes sense).
Even after that, the VS version and executing the proc itself (instead of just the INSERT INTO) was taking over 10 minutes against 30 seconds.
Now everything is back to something resembling normality (without me doing anything else) so I'll have to keep an eye on it.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply