October 20, 2008 at 7:37 am
Hi all-
I have a very large distribution database (12.5GB) and have found that my sp_MSdistribution_cleanup job doesn't appear to be doing it's job (at least not like I believe it should).
While reviewing the details:
-SQLVersion 9.00.3054
-EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 336
-Repication in our environment is generating about 500K transactions per day.
-The cleanup job is removing approximately that same amount (about 500K transactions per day).
-It appears that each transaction remains in the tables until the @max_distretention is reached.
-The job (sp_MSdistribution_cleanup) runs for 1-2 hours.
my understanding is that @max_distretention is the maximum amount of time commands will be stored in the distribution database IF subscribers have not picked them up, or if you are using anonymous subscribers.
-We don't use Anonymous subscriptions
Any ideas?
Thanks in Advance;
Ben
October 20, 2008 at 10:29 am
btw- We use Transactional replication with Pull subscriptions
Ben
October 20, 2008 at 11:02 am
Are you sure that all subscribers are getting the data? Might be one out there that's missing things.
Also, what's the recovery model? Log backups running on a full mode db?
October 21, 2008 at 12:58 pm
Hmm. That's a mighty good question! How does one tell?
Unfortunately, we don't have vanilla replication table => table... We've got some pretty ugly custom stored procedures - sometimes updating a completely different table than the publisher
Ben
October 21, 2008 at 12:58 pm
Expand the publications then right click on the subscriptions and select "View Synchronization Status". You can also use Replication Monitor (right click on riplication > select "Launch Replication Monitor").
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
October 21, 2008 at 1:19 pm
well- looking at all the publications (after forcing synchs) we have none pending.
I'll see if I can drill down to some particulars via xact_seqno to determine from there.
Any more thoughts I'd appreaciate it
Thanks
Ben
October 21, 2008 at 1:52 pm
Is your Distribution Clean up job running?
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
October 21, 2008 at 2:05 pm
yes- (See the 1st msg in the thread)
It's typically running from 50 minutes to 2 hours.
I started tracking the size of MSrepl_commands on 10/14
(running sp_spaceused MSrepl_commands) each time the job runs.)
and currently the rowcount is 9883164 (a net difference of -281530 rows)
Today- between the run at 2:55am and 3:55am I have a gain of 431459 rows (heavy replication activity)
but then the run at 4:55 shows a loss of -415129 rows, so it appears that it's working, but I can still see transactions with xact_seqno that shows up in MSdistribution_history as having completed.
Hopefully I didn't just add a bunch of confusion to this
Thanks for your help
Ben
October 21, 2008 at 3:38 pm
Yeah, sorry, didn't read the full post only the part prior to my first response.
bpflanzer (10/21/2008)
Today- between the run at 2:55am and 3:55am I have a gain of 431459 rows (heavy replication activity)but then the run at 4:55 shows a loss of -415129 rows, so it appears that it's working, but I can still see transactions with xact_seqno that shows up in MSdistribution_history as having completed.
In the above statment I am assuming that you are stating the commands still exist in MSrepl_commands even though they are showing up in the MSdistribution_history table. Is that correct? This definitely seems odd, I think. 🙂
Why does the database size concern you? Is it growing continually or is it staying approximately the same size? Ours is actually bigger but we do have a LOT of replication. However, if the database is staying the same size and you haven't increased nor descreased the amount of activity, that may be what it requires in order to handle all the processing.
Sorry, more questions than help.... 🙂
David
@SQLTentmaker“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
October 21, 2008 at 4:02 pm
no apology needed-- You're helping 🙂
Your assumption is correct-
if I look at the history and identify a xact_seqno, then run sp_browsereplcmds for it-- it's still there
I'm mostly concerned with size because it seems sudden (we had to add more SAN space) and I don't know of changes. I unfortunately, don't have any specs from before that.
Now - I'm just confused/concerned as to why the data is staying out there!
Many Thanks!
October 22, 2008 at 8:14 am
hmm- this isn't making sense based upon my analysis to date, but yesterday I changed the retention to 13 days rather than the 14 it was set for and dropped 1.5M rows. (what's odd about it is that everything so far as said we're appoximately 500K rows per day.)
At any rate- it proved that the transactions/commands are lingering until the expiration date occurs.
I also realize that I never pointed out that the publishers and distibutor were upgrades from SQL2000 but that was awhile ago (months).
To be continued...
Thanks for your thoughts
November 7, 2008 at 9:46 am
It's been awhile since I logged this, and actually it's been resolved for awhile too, but I thought I'd add the resolution.
It turned out that I DID have one anonymous publication (not intended) on 4 different servers. These total to the 500K rows per day I was retaining in the distribution database.
Once I change it to NOT Anonymous- they were all gone the next day 🙂
Now the task ahead it to identify if it's even appropriate for all that data to move. I'm thinking thats the entire table in each case (like development left off a where clause :angry:
November 7, 2008 at 11:10 am
Interesting .... Thanks for sharing the solution.
* Noel
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply