tl;dr; Add PRINT and/or SELECT statements. Comment/uncomment out pieces of code to isolate problems. Change INSERT/UPDATE/DELETE statements to SELECTs.
Disclaimer: These are techniques I use all the time. There are almost certainly others. Use what you find works best for you.
Debugging stored procedures, functions, even views is something I end up doing quite a bit. Now, if you are a developer you are probably used to being able to step through your code, set watch values to see the contents of variables, etc. And technically, you can do this with SQL Server if you have access to the debugger. Of course, that requires sysadmin access and getting that can be problematic in anything but your own personal box. Not to mention if you have to figure out why this query isn’t pulling the data you expect in production .. well .. you’re out of luck there.
So what can we do instead? I have three major categories of debugging techniques.
Add PRINT and/or SELECT statements
This is a great way to find out what’s in a variable, or maybe what the query for an EXISTS statement is going to return. It’s also a great way to see which paths your code is taking. PRINT vs SELECT is pretty subjective and frequently you can use either or.
Kind of a contrived example, but let’s say we need to know what’s going on here.
DECLARE @MyDate datetime = getdate(); DECLARE @MyCounter int = 0, @MaxCounter int = 100; WHILE @MyCounter < @MaxCounter BEGIN IF @MyCounter % 3 = 0 SET @MyDate = @MyDate + @MyCounter; ELSE IF @MyCounter BETWEEN 20 AND 30 SET @MyDate = DATEADD(second, @MyCounter, @MyDate); IF @MyCounter BETWEEN 70 AND 100 SET @MyDate = DATEADD(hour, @MyCounter, @MyDate); END
Printing out @MyDate could be useful but most likely the first thing we want to do is find out why this is in a never ending loop. So at the end of the loop print out @MyCounter and @MaxCounter.
DECLARE @MyDate datetime = getdate(); DECLARE @MyCounter int = 0, @MaxCounter int = 100; WHILE @MyCounter < @MaxCounter BEGIN IF @MyCounter % 3 = 0 SET @MyDate = @MyDate + @MyCounter; ELSE IF @MyCounter BETWEEN 20 AND 30 SET @MyDate = DATEADD(second, @MyCounter, @MyDate); IF @MyCounter BETWEEN 70 AND 100 SET @MyDate = DATEADD(hour, @MyCounter, @MyDate); PRINT 'MyCounter ' + CAST(@MyCounter as varchar(20)); PRINT 'MaxCounter ' + CAST(@MaxCounter as varchar(20)); END
A few things of note here: I break formatting for these things. I always put them on the far left so that they are easy for me to find and remove. Either that or put something you can search on like maybe –Ken’s debug print before each of them. This can become very important when you have a few thousand lines of code and you’ve been working on it for a while. It can be embarrassing to have odd text printing out when it makes it into production.
Next, I printed out a label to go with the variable so I know what it is. Also, because of the way I’m handling this, if the variable is null the label won’t show up. I could use a SELECT @MyCounter, @MaxCounter; instead and neither of those would have been necessary. Anyway, obviously, I’m not incrementing @MyCounter since it’s always coming back as 0. So we fix that.
Now I want to know if/when it’s going into each of the possible if conditions.
DECLARE @MyDate datetime = getdate(); DECLARE @MyCounter int = 0, @MaxCounter int = 100; WHILE @MyCounter < @MaxCounter BEGIN IF @MyCounter % 3 = 0 SET @MyDate = @MyDate + @MyCounter; ELSE IF @MyCounter BETWEEN 20 AND 30 BEGIN PRINT 'IF @MyCounter BETWEEN 20 AND 30' PRINT 'MyCounter ' + CAST(@MyCounter as varchar(20)); SET @MyDate = DATEADD(second, @MyCounter, @MyDate); END IF @MyCounter BETWEEN 70 AND 100 BEGIN PRINT 'IF @MyCounter BETWEEN 70 AND 100' PRINT 'MyCounter ' + CAST(@MyCounter as varchar(20)); SET @MyDate = DATEADD(hour, @MyCounter, @MyDate); END SET @MyCounter = @MyCounter + 1; END
You’ll notice I had to add BEGIN .. END statements. That’s because IF only effects the next line and I want to run multiple lines of code within the condition. I’ve done prints that are just numbers (1, 2, 3 etc) but it’s generally best to do something descriptive. Also, while I’m at it I’m printing out the appropriate variable. It could have been @MyCounter, it could have been @MyDate, it could have even been a piece of code like DATEADD(hour, @MyCounter, @MyDate).
In a particularly long piece of code where I’m getting a fairly non-descript error, I’ve just thrown those PRINT 1, PRINT 2 etc statements all through the SP just to narrow down where the error is coming from.
Anyway, you get the idea.
Comment/uncomment out pieces of code to isolate problems.
I use this to simplify my code. For example, if I’m not getting enough rows of data back from a SELECT statement. The first thing I do is figure out one or two rows that aren’t there but should be. Next, I start removing pieces of the where clause until it shows up (I might add an additional condition to ONLY pull the rows I want). Then I can toggle back and forth (commenting and uncommenting conditions) until I’m certain what’s the problem. From there I can decide if it’s a logic problem or a data problem.
For example
SELECT name, number, type, low, high, status FROM spt_values WHERE low IS NOT NULL AND name LIKE '%data%';
On my machine (I’ll explain why in a minute) this returns nothing. But if I run this:
SELECT name, number, type, low, high, status FROM spt_values WHERE low IS NOT NULL -- AND name LIKE '%data%';
I now get back quite a bit of data. I’m going to pick one row that I’m not getting, but I think I should be.
So let’s add to the WHERE clause so pull back just the row I want.
SELECT name, number, type, low, high, status FROM spt_values WHERE low IS NOT NULL AND name = 'DB Data Reader' -- AND name LIKE '%data%'
Now I get back that one row. And if I uncomment out the LIKE condition I get back nothing. Interestingly, if I uncomment the LIKE condition and comment the IS NOT NULL condition I still get nothing back.
SELECT name, number, type, low, high, status FROM spt_values WHERE -- low IS NOT NULL --AND name = 'DB Data Reader' AND name like '%data%';
After reviewing the data, and the condition it looks like it’s not a problem with the data. So what’s wrong with the WHERE condition? In the end, the answer is that my instance is case sensitive. So putting it all back together I end up with this:
SELECT name, number, type, low, high, status FROM spt_values WHERE low IS NOT NULL AND name = 'DB Data Reader' AND name like '%Data%';
And I get back the one row I’m looking for. At this point, I can remove my testing condition and I have the query I wanted.
SELECT name, number, type, low, high, status FROM spt_values WHERE low IS NOT NULL AND name like '%Data%';
Yes, I realize this is a somewhat simplistic example but it does demonstrate the technique pretty well. This get’s even more powerful when you have a dozen conditions (some of them fairly complicated with subqueries etc).
Change INSERT/UPDATE/DELETE statements to SELECTs.
When you are debugging you want to make as few actual data changes as you can since you are changing the data you are querying against, adding complexity when you try to re-run the statement and see what happens. So where possible/necessary change things like UPDATE, INSERT, DELETE to SELECT. Comment out the command line and change it to SELECT. Typically you’ll grab any columns in the WHERE clause and then for an UPDATE you’ll then want to grab the column you are changing and what you are changing it to. For a DELETE the WHERE clause info is usually enough and for an INSERT grab the WHERE clause info and the columns you are inserting.
Other
Know what your results should be!
If you don’t know what your output is supposed to be then you won’t know when you get it correct will you? This is surprisingly more of a problem than you would think. I read a lot of forum posts where the poster doesn’t really give you an accurate idea of what kind of output they want. As people start to question them it becomes obvious that even they don’t really know what they want in the first place.
Work in a development environment.
You’d think this would be obvious but even testing/debugging a relatively simple query should be done in a development/test environment. Your query may not be doing much but it’s still going to require some resources and take some locks. This can (and probably will) adversely affect production queries.
Have good representative data
This is the hardest part of the whole process. In order to make that development environment worthwhile, you need to make sure that you not only have data that matches your production data (although probably quite a bit less of it) but it should also have some good edge cases.