Yesterday (6/22/2006) was the day to send out our monthly vendor mailing. Last year we started selling one newsletter a month to a vendor, a message just from them with nothing else inside. The first few weren't successful, but they've gotten better and they bring in some much-appreciated revenue for the site.
I try to get these sent out the 2nd week of the month because the last week has the end of month newsletter. This month was Idera and they were slow in getting me materials, so I was in a hurry to get it out yesterday. With next Saturday being the EOM newsletter, I didn't want to deluge people with too many newsletters in too short a period of time. I'm sensitive to the overflow of information as well as the need to generate revenue to make this company continue to grow.
So I'm hurrying, going through the Idera content and adding our unsubscribe stuff, with Andy on the phone. We send out the newsletters for the vendors through our own email system. Usually I dread this because we dump all emails into a table and then blast them out. Since the sign up confirmations go out through the same system, that's the day I usually babysit the system as well as respond to lots of "I signed up and didn't get a confirmation email" messages.
I know I'm going about this in a roundabout way, but stay with me. This is good, and embarrassing for me.
A while back I asked Andy to work on getting a priority column set up so the confirms would go out high priority and everything else low priority. He's a better .NET programmer for an exe, so I let him mess with it. I'm rusty and I'd probably spend twice the time doing it and forget something.
So like most projects, we talked about it on the phone and debated how many priorities to have, using tokens to send emails, storing the message once vs. once per email recipient, etc. One of the things we did decide to implement is a sending machine column so we could run this from multiple machines and scale it out. We had a few issues and some delays with the training center, so we were still working on the final version of this.
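To make the priority and sending machine idea concrete, here is a minimal sketch, not our actual emailer. It uses Python with SQLite standing in for SQL Server, and the table name `email_queue`, its columns, and the `claim_batch` helper are all made up for illustration. Each sender claims a batch of unclaimed rows by stamping its own name on them, taking high priority rows first.

```python
import sqlite3


def create_queue(conn):
    """Create a hypothetical queue table (schema is an assumption, not the real one)."""
    conn.execute(
        """
        CREATE TABLE email_queue (
            id INTEGER PRIMARY KEY,
            recipient TEXT NOT NULL,
            priority INTEGER NOT NULL,   -- 1 = high (confirmations), 2 = low (bulk)
            sending_machine TEXT,        -- NULL until a sender claims the row
            sent INTEGER NOT NULL DEFAULT 0
        )
        """
    )


def claim_batch(conn, machine, batch_size):
    """Stamp up to batch_size unclaimed rows with this machine's name,
    highest priority first, then return this machine's unsent rows."""
    with conn:  # commit the claim as one small transaction
        conn.execute(
            """
            UPDATE email_queue
               SET sending_machine = ?
             WHERE id IN (
                   SELECT id FROM email_queue
                    WHERE sending_machine IS NULL AND sent = 0
                    ORDER BY priority, id
                    LIMIT ?)
            """,
            (machine, batch_size),
        )
    return conn.execute(
        "SELECT id, recipient, priority FROM email_queue "
        "WHERE sending_machine = ? AND sent = 0 ORDER BY priority, id",
        (machine,),
    ).fetchall()
```

With two senders calling `claim_batch` under different machine names, a confirmation queued after a pile of bulk rows still gets picked up in the next claim, and no row is handed to both machines.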
What does this have to do with yesterday? Well, as I was working on getting the emailer exe on a second machine, I started loading the emails for the vendor send. With some great growth this year and no bounce checker, I loaded 199k email addresses into a table, with a copy of a 20k message for each. Needless to say that's quite a load of data.
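For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes "20k" means about 20 KB per message and ignores the addresses and any row overhead; the function name is made up for illustration.

```python
def queue_payload_bytes(recipients, message_bytes, per_recipient=True):
    """Estimate message bytes stored: one copy per recipient, or one copy total."""
    return recipients * message_bytes if per_recipient else message_bytes


RECIPIENTS = 199_000
MESSAGE_BYTES = 20 * 1024  # assumption: "20k" is roughly 20 KB

copied = queue_payload_bytes(RECIPIENTS, MESSAGE_BYTES)
shared = queue_payload_bytes(RECIPIENTS, MESSAGE_BYTES, per_recipient=False)

print(f"one copy per recipient: {copied / 1024**3:.1f} GiB")
print(f"message stored once:    {shared / 1024:.0f} KiB")
```

Under those assumptions the per-recipient copies come to a bit under 4 GiB of message text in a single insert, which is why the "store the message once" debate matters.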
Enough that as I'm talking to Andy about how this works, I see my INSERT INTO statement is taking a long time. Like minutes instead of seconds. I'm distracted and confused, so I'm wondering what's going on when I get the dreaded "the log file is full for database 'sqlservercentral'" error. I was a little distracted with a .NET 2.0 upgrade on my second machine and I'm not sure how long the error was up there, but it probably wasn't more than 5 minutes.
Immediately I know what to do: truncate the log, check the space, allocate a touch more to the log, and rerun the insert. It runs this time and I start the emailer. A success there, as we send about 40k messages in the first hour from 2 machines. That's about 3 times the speed it typically ran at before, so in addition to adding some features, the .NET 2.0 upgrade added some speed.
However, it was the middle of the day and a number of you sent me notes about the error, which appeared on the site. I know we should trap the error, mail it to ourselves, etc., but I just haven't bothered to work on the website error system. I'm not a web expert, though I am a database expert, which you might doubt after seeing the error. However, I did a root cause analysis and determined...
I'm an idiot.
Well, not completely, but I certainly wasn't paying attention. This was one of the largest data loads we've done, and even though there are gigabytes of disk space for the log to grow into, growing takes time and freezes things, and even then it might cause issues since this is one large transaction. We have hourly T-log backups running, but I didn't think about it and mistimed the load. If I'd waited 5 minutes, the next backup would have run, the log would have been mostly empty, and the load would have succeeded.
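One common way to avoid a single huge transaction is to load in batches and commit after each one. The sketch below is only an illustration of that idea, not what I actually ran: it uses Python with SQLite standing in for SQL Server, and the `vendor_send` table, its columns, and the batch size are assumptions. On SQL Server, small transactions by themselves don't shrink the log; they just mean a log backup that runs between batches can free the space, instead of one open transaction pinning all of it.

```python
import sqlite3


def create_send_table(conn):
    """Create a hypothetical load table (names are assumptions)."""
    conn.execute(
        "CREATE TABLE vendor_send ("
        "id INTEGER PRIMARY KEY, recipient TEXT NOT NULL, message_id INTEGER NOT NULL)"
    )


def load_in_batches(conn, addresses, message_id, batch_size=5000):
    """Insert one row per address, committing after every batch.
    Returns the number of batches committed."""
    batches = 0
    for start in range(0, len(addresses), batch_size):
        chunk = addresses[start:start + batch_size]
        with conn:  # each batch is its own transaction
            conn.executemany(
                "INSERT INTO vendor_send (recipient, message_id) VALUES (?, ?)",
                [(addr, message_id) for addr in chunk],
            )
        batches += 1
    return batches
```

Note this version also stores a message id rather than a copy of the message per row, which is the "store it once" option from the design debate.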
It was kind of a conglomeration of things happening, but the bottom line is that I should have paid more attention to the load.
But I'm not the only one having issues. The SQLJunkies and MSDN blogs have been having fits the last two weeks as I've been scanning them for Database Daily content. About half the time they don't respond. 🙂