June 16, 2011 at 12:28 pm
Has anyone encountered the issue where you; have a job, the job step advanced options are set to log to table, the job step runs, and in the step executes a select that logs returned data... however, in that select there is a varchar data type and what is then recorded into MSDB..sysjobstepslogs is padded as though it was selecting a char data type?
To reproduce, I create a job, create a step on the job put in the below code, ran the step, and checked the table for log data growth.
set ansi_padding off
set nocount on
declare @varcharmax varchar(max), @counter int
set @counter = 1
while @counter < 100
begin
set @varcharmax = 'moo'
select rtrim(@varcharmax)
set @counter = @counter + 1
end
Checking data amount that was stored:
select
log_size,
sl.date_created,
sl.date_modified,
log_id,
name,
step_name,
step_id,
database_name,
last_run_date,
last_run_time,
last_run_duration,
description
from
sysjobstepslogs sl
join sysjobsteps ss on sl.step_uid = ss.step_uid
join sysjobs sj on sj.job_id = ss.job_id
order by 1 desc
The log size or len(log) data returned a massive amount of whitespace. I have been unable to search and find answers on how to limit this padding amount. My MSDB database is growing very quickly due to many jobs logging to table and the space is all padded in as though I wanted to see it and used a char data type in the select. Putting settings for padding and explicit trimming do not seem to help.
June 22, 2011 at 1:39 pm
I see a few references to http://social.msdn.microsoft.com/forums/en-US/sqldatabaseengine/thread/ac5e64f7-01ee-4412-8482-9468b1e99bf4/ but no solution. If I truncate the table that keeps growing, would that solve the problem?
I do not know how to hunt for ghosts. Curious if anyone has run into this before and hunted ghosts.
June 22, 2011 at 2:45 pm
In case this comes up for anyone else, truncate to the table msdb..sysjobstepslogs will clean up the space. Just was able to free up 36 GB after doing a quick select into to a logs temp table. Only had 113 rows, but a lot of "hidden" space that was not getting reclaimed properly. Have not found the root of the problem, other than mentioned by the post linked to, but the solution I did not need to rebuild the MSDB database as some other findings stated to do.
Bam. Dun. Now just need to find the root.
June 22, 2011 at 3:01 pm
matt.newman (6/22/2011)
In case this comes up for anyone else, truncate to the table msdb..sysjobstepslogs will clean up the space. Just was able to free up 36 GB after doing a quick select into to a logs temp table. Only had 113 rows, but a lot of "hidden" space that was not getting reclaimed properly. Have not found the root of the problem, other than mentioned by the post linked to, but the solution I did not need to rebuild the MSDB database as some other findings stated to do.Bam. Dun. Now just need to find the root.
Can you check if that table has a clustered index?.. nothing else comming to mind atm.
June 22, 2011 at 3:44 pm
It does, and it was pretty trashed at the time. Though, it was on an int that supposedly only had 113 rows. I saw some other postings that rebuilding the clustered index did not have a successful result. Due to size held, I did not want to rebuild and lock jobs from executing, causing a pile up on resources. If I locked that table and has 23 jobs running simultaneously that wanted to log, more problems. The truncate after backup was the cleanest for this table from what the reported issue was.
June 22, 2011 at 5:47 pm
Did you consider doing this sequence?
select * from base into ##temp
truncate base
insert into base select * from ##temp
Might have solved it for you.
June 23, 2011 at 8:19 am
Not sure what "base" refers to here, but I did
select * into [|TempStorageTableInCaseNeededLater|] from sysjobstepslogs
truncate sysjobstepslogs
jobstepslogs was then already refreshed soon after due to job processes running (within 10 seconds new records were present). I would not "correctly" be able to insert back into without some time and effort of which I need more time for other things. As was the issue explained in the above link regarding non-reclaimed space from the table, Snapshot Isolation, if this was the issue, was retaining records due to timestamp validation rules. It appears the rule was broken sometime late last month. I had done an audit to remove varhcar(max) data from being logged to the table as it appears that data is not stored well at all (whitespace). Then, I did an assement of growth in mail queue and attachements. Although there was growth from early April, not enough to merit the space used for msdb. Then, and audit of sysjobstepslogs where I did a len of log * 8 / 1024 to get data size from all records using sum. Then a sp_spaceused as well as a couple other audit checks because the numbers were not adding up, then Google the link listed above, then the decision was truncate and hope it works based on what the issue was. I have multiple jobs running at 1 minute intervals for various tasks. To complete the operation you suggessted I would need to stop SQL Agent serverices. Although feesable, It was best to just do a temp store of the data elsewhere, truncate, and let the job queue resume as needed.
There was a combination of problems leading to msdb growth that I needed resolved. New jobs doing new things, logging varchar(max) to sysjobstepslogs logged it with massive whitespace, then the log records when they were supposed to be deleted for the next log because append was not used did not reclaim space. Isolation on, I needed to flush the table. Everything is clean now and I have 34 GB padding. I will be putting an audit check in place nightly to verify log data size and compare to freespace to the table itself.
Long winded, I was never really good at online explainations. Sorry. I think we're on the same page, but I had a few more things going on. I just listed them all out though I believe you have an understanding in case someone else runs into any of the issues will be able to search and find what I had found and hopefully get a solution in place using the above.
Thanks!
June 23, 2011 at 10:15 am
Looks like this may go into a recurring issue for me and did not resolve the final underlaying problem.
select sum(len(log) * 8 / 1024) from sysjobstepslogs
go
sp_spaceused sysjobstepslogs
A rough comparison on these two results should show a growth in line with one another, but one number has leveled off while the other is showing an increase. Not sure why that space would not release but will keep tabs and any findings for later.
June 23, 2011 at 10:21 am
Switch to datalength instead of len so that it includes trailing spaces.
Also why the *8 / 1024? It's the 8 that I don't understand.
sp_spaceused returns kbs so to covert bytes to kb all you need is / 1024.
June 23, 2011 at 10:30 am
Oops... len * 2... 2 bytes per character.
Thanks for the catch! Looks like the issue then is most definitely recurring! I will have to look into turning snapshot isolation off, though don't yet know the costs of that. Don't want to have to rebuild MSDB, may need to look at restarting the agent service.
June 23, 2011 at 10:39 am
matt.newman (6/23/2011)
Oops... len * 2... 2 bytes per character.Thanks for the catch! Looks like the issue then is most definitely recurring! I will have to look into turning snapshot isolation off, though don't yet know the costs of that. Don't want to have to rebuild MSDB, may need to look at restarting the agent service.
I don't see why snapshot isolation would affect this. The only that makes sense is a lob that gets constantly updated. Then IIIRC you have to clean the table with rebuild index or dbcc cleantable (not sure about that part).
If you say you only have 113 rows, then what's the space used by the temp table where you dumped the records to?
June 23, 2011 at 10:39 am
P.S. datalenght will give you the correct size with trailing spaces and datatype. This is what I'd use to make sure I have the correct value there.
June 23, 2011 at 10:45 am
Temp Table: 113 rows
namerowsreserveddataindex_sizeunused
TMPsysJobStepsLogs062211113 219064 KB219000 KB16 KB48 KB
sum datalength = 214144
Prod Table: 86 rows
namerowsreserveddataindex_sizeunused
sysjobstepslogs86 917344 KB916936 KB24 KB384 KB
sum datalength = 179032
Snapshot isolation was mentioned in: http://social.msdn.microsoft.com/forums/en-US/sqldatabaseengine/thread/ac5e64f7-01ee-4412-8482-9468b1e99bf4/
June 23, 2011 at 10:57 am
Wow you learn something new everyday!
Do you still have the temp table laying around?
You could try all the usual tricks (drop / rebuild clustered index, lob compaction on, etc).
If that works you could create another job that does this little dance maybe once a week (select into, repair, truncate live table and reinsert).
If the final table only has a couplds mbs the whole thing "should" run in only a few seconds...
June 23, 2011 at 11:37 am
If I can't find an online solution my plan is try and select top 0 * into sysjobstepslogs2 from sysjobstepslogs, mod the table to reflect correct constraint on pkey, create clustered on pkey, stop the agent service, rename old table filling up to old suffix, rename new table to drop 2 suffix, and restart the service. If the issue is still apparent after that, I will need to know what turning off snapshot isolation will provide for benefit. Then, need to later assess if having jobs running at 1 minute aprt is attributing, and if all else fails, schedule something to truncate the table nightly since it does reclaim the space. Sadly it's all shoot from the hip stuff with no real solution found.
Thanks for the pointers!
Viewing 15 posts - 1 through 15 (of 26 total)
You must be logged in to reply to this topic. Login to reply