The company I work has been running SQL forever it seems, about three years
ago we upgraded to v7 and later to v2K. Over that time we've gone from one db
server to three, and upgraded all of those a couple times in some way or the
other. Recently we finally got we needed to cluster everything, which called for
a little rearranging. Going into the move, we had this:
Server 1
- Log files in C:\SQL Log Files
- Data in D:\SQLData\Data
Server 2
- Log files in D:\MSSQL7\Data
- Data files in D:\MSSQL7\Data
Server 3
- Brand new, no data yet
I know you'll be wondering why the unusual file placement for Server 1.
Basically at one point we were disk bound, we had available capacity on the OS
mirrored set so we moved the log files to it to leverage IO that was barely
being used.
As part of the clustering we were also moving to a SAN (see more on that here).
Because we have 200 plus db's on a server, we decided to put all the files on
the same RAID set and to use RAID-5. Again, I'm assuming you're wondering why?
Conventional wisdom is to use separate mirrored drives for the log files since
most IO is sequential, compared to more random for db access. With 200 log
files, I need 400 drives - which seems excessive! I've had all 200 on a mirror
set for a while and average utilization is astoundingly low. We're not heavily
using all 200 at once, on any given day we might access 50-60 of them. Wouldn't
it be nice to use those available IO's for something else, like normal db
access? This is not to say that what I did here will work for everyone, but
based on the usage patterns we've had I was comfortable putting everything onto
the same disk set (knowing in the back of my head that the cache on the SAN
would help too). RAID-5 vs 10 is a pure cost issue - twice as many drives, twice
as many enclosures, have to have power, cooling, room in the server room.
Because we like to make simple things complicated, we decided to rename our
servers to fall in line with our newest naming convention and to standardize
file placement. The first seems simple enough, just put an alias in DNS so that
anything pointed to the old server name resolves to the new server name, then
gradually update the connection strings in our apps as part of routine
maintenance. It worked, but we missed a couple places where we either had the
server name in a table, or we were using @@Servername. Fairly easily resolved.
For file placement, our standard is quorum drive for the cluster is always Q,
data goes on Z. That meant every mdf and ldf had to be moved. Consider that for
a moment, what's an easy way to do that? Detach? Did I mention the databases are
replicated and you can't detach those? Or you could backup/restore all of them,
using the very handy move option to change the file placement. Too bad you can't
do an alter table to adjust it. Not seeing a good option, I dropped all my
publications, detached, copied to new location, reattached, restored
replication. Lot's of time, but writing some code can reduce the drudgery some.
Our standard for files is now:
Z:\SQL | Main container folder |
Z:\SqlData\MSSQL | When we install SQL, we point it to Z:\sqldata, it creates MSSQL and all the sub folders (data, backup, etc) under that |
Z:\SqlData\MSSQL\Data | All mdf and ldf. Only one of each per db. |
Z:\SqlData\MSSQL\Backup | All backup files, we back up to disk first, then to tape, and we don't use devices |
Z:\SqlData\CompanyName | Container for anything we add to the server that is SQL specific |
Z:\SqlData\CompanyName\Scripts | Any script that we might need as part of a process, typically these are post snapshot scripts for us. We share this as SCRIPTS. |
Z:\SqlData\CompanyName\DLL | We have a couple internal DLL's we use for jobs, these are stored here |
Z:\SqlData\CompanyName\EXE | Sometimes we also have exe's for jobs, they go here |
Over time I expect to add more folders under the CompanyName subfolder. The
goal is to keep everything organized and manageable, whether it works - we'll
see!
So was it worth the work? I think so. Hard to put a price tag on keeping
things neat (and yes, you could argue I should have from the start - the truth
is sometimes you don't see a need for a pattern with one server, or have the
time to implement one if you do). Backup and restore scripts can be the same on
all servers. Installing a cluster is one step simpler (as far as I know SQL only
sets up the main data drive as a cluster resource, if you use a different drive
for the logs you have to set that resource yourself). Training a new DBA should
be easier, it's not a hard pattern to learn.
Wrapping up, I hope you'll take away a couple things from my ramblings. One
is that you have to know your data access patterns to make good decisions, it's
not always as simple as following the generalized best practices. The other is
that failing to implement standards for file placement (and everything else)
hurts more the longer it goes on. If you can stop it from happening to start
with, terrific. If it's already happened, start working on fixing it, you'll be
glad you did!