How to identify query causing huge drop in Page Life Expectancy

Question

How to identify query causing huge drop in Page Life Expectancy

winston Smith

SSCoach

Points: 19500
More actions
April 6, 2012 at 2:38 am

#256040

I have a database server that has a page life expectancy drop to almost 0 each morning. intuition is telling me that it is caused by a sproc called in a job.
I need more substantial proof than intuition though. Are there any DMV's i can use to help identify what may be causing pages to be pushed out of the buffer?
Note I cannot use profiler on this server.
Thanks!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 398886 More actions · Answer 1

You shouldn't run Profiler on a production system anyway. But by that do you mean you can't run a trace script?

Since you're posting in a 2008 forum what about using extended events? They're more lightweight than trace. You can script the start and stop time to coincide with the approximate time of your issue and then you'll know what caused it.

You can try running queries against sys.dm_exec_query_stats. That has an aggregate view of all queries in cache. Assuming your query is only run as part of a data load or something at night, you might be able to get it that way. But, if that query is aged out of cache, you won't be able to tell. Also, if that query is run more than once, you won't be able to tell which execution of it is the severe one. You also won't be able to check parameters.

All those limitations are why capturing the actual execution is the best way to go. I'd suggest using extended events.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 2

winston Smith (4/6/2012)
Note I cannot use profiler on this server.

Why's that?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Perry Whittle SSC Guru Points: 234013 More actions · Answer 3

I'd start by runnuning through and verifying any agent jobs that run or have run around that time just to know what they're all doing.

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

winston Smith SSCoach Points: 19500 More actions · Answer 4

Hey guys.

I can run profiler, but there are a lot of restrictions on long running traces, profiler or server side on certain servers in this particular company and is just a whole lot of bother getting authorization.

I had first looked at the query plan cache DMV, but the aggregated results didn't help, so I tout I must be missing out on knowledge of a DMV to help break Down what queries take what duration and have what IO. looks like there are no other helpful DMV's in this situation. That's still good knowledge to have.

Regarding the jobs, that was one of the first things I checked and I did narrow it down to 1 job that kicked off 1 min before the PLE drop. Problem is that intuition won't be listened to here and I need hard facts!

Thanks for the help guys. Il use what I have and see if I can get a server side trace run. Or perhaps create a job that grabs a snapshot of e plan cache stats at regular intervals to allow me to narrow down the IO on individual plans.

Thanks!

Perry Whittle SSC Guru Points: 234013 More actions · Answer 5

Server side traces have much less impact apart the potential for required disk space depending on what you are monitoring

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

winston Smith SSCoach Points: 19500 More actions · Answer 6

Perry Whittle (4/7/2012)
Server side traces have much less impact apart the potential for required disk space depending on what you are monitoring

I know, I use server side traces a lot in other companies. The red tape is ridiculous, but it's there and so I have to work with it/around it. It's not a rule I am imposing in the system!

Grant Fritchey SSC Guru Points: 398886 More actions · Answer 7

winston Smith (4/7/2012)
Perry Whittle (4/7/2012)
Server side traces have much less impact apart the potential for required disk space depending on what you are monitoring
I know, I use server side traces a lot in other companies. The red tape is ridiculous, but it's there and so I have to work with it/around it. It's not a rule I am imposing in the system!

So try with the extended events. They offer a much lower footprint and still capture the same information (and more) with better abilities to filter what's captured. It should be an easier sell.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 8

winston Smith (4/6/2012)
I have a database server that has a page life expectancy drop to almost 0 each morning. intuition is telling me that it is caused by a sproc called in a job.
I need more substantial proof than intuition though. Are there any DMV's i can use to help identify what may be causing pages to be pushed out of the buffer?
Note I cannot use profiler on this server.
Thanks!

Since it happens each morning, check the scheduled jobs and see what's happening around that time frame. You can severely limit what a server side trace would need to look at by knowing the job name. As a side bar, most things that cause PLE to tank are jobs that have very large rowcounts and a longer than usual duration. You might even be able to figure this out just by looking at the duration in the jobs log or by looking at one of the standard reports available when you select "Reports" on a database (including the rowcounts).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Perry Whittle SSC Guru Points: 234013 More actions · Answer 9

Jeff Moden (4/7/2012)
check the scheduled jobs and see what's happening around that time frame.

Great minds think alike 🙂 😉

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 10

Perry Whittle (4/7/2012)
Jeff Moden (4/7/2012)
check the scheduled jobs and see what's happening around that time frame.
Great minds think alike 🙂 😉

Perry Whittle (4/7/2012)
I'd start by runnuning through and verifying any agent jobs that run or have run around that time just to know what they're all doing.

They do indeed. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)