February 12, 2009 at 3:46 am
Nice feature to get trapped on migration from SQL 2000 to 2005.
February 12, 2009 at 5:03 am
archie flockhart (2/12/2009)
Here's an interesting combination:This code works perfectly OK ( the unmatched block comment marker causes no problems because it is part of an inline comment):
print '1'
--/*note unmatched block comment marker on this line, following an inline comment marker
print '2'
print '3'
result: 1 2 3
Now, suppose we are asked to disable the part of the code that prints 1 and 2 ... easy, we say 🙂 ... just put block comment markers around it ...
/*
print '1'
--/*note unmatched block comment marker on this line, following an inline comment marker
print '2'
*/
print '3'
result : Server: Msg 113, Level 15, State 1, Line 1
Missing end comment mark '*/'.
oops !
My attempt to explain the behaviour is: as soon as you are withiin a BLOCK COMMENT, the system ignores all INLINE COMMENT markers. For most statements, this doesn't matter, because the statement will not be executed anyway since it is part of the BLOCK COMMENT.
But START BLOCK COMMENT /* or END BLOCK COMMENT */ instructions which look as though they should be ignored because they follow a "--" , are actually processed, and may open a nested comment block or close the existing block or cause a syntax error.
Does that sound about right ?
Yeah. But you gotta try this. You'll love it!
/*
print '1'
--/**/note unmatched block comment marker on this line, following an inline comment marker
print '2'
*/
print '3'
February 12, 2009 at 5:19 am
Actually that just about makes sense, based on the assumption that "--" is ignored within block comments
so in this code:
/*
print '1'
--/**/note unmatched block comment marker on this line, following an inline comment marker
print '2'
*/
print '3'
... line 1 opens a comment block
print '1' is within the comment block so it doesn't print
on the line starting with "--" , the system ignores "--" , it then finds a complete (but empty) nested block comment ; followed by some text. The text is outside the nested comment, but INSIDE the original comment block that was started on line 1, so it does not cause an error.
print '2' is still within the original comment block
the next line ends the original comment block
and print '3' then executes
February 12, 2009 at 11:36 am
This problem originally came to mind when I started work on a tool for comparing the text from stored procedures (http://jessesql.blogspot.com/2009/02/comparing-stored-procedures-part-1.html). I wanted to allow the option of omitting all comments from the comparison, which of course would require me to identify and parse out comments from the text. In my research, I came across the question of how to treat nested comments - the "order of operation" of inline versus block comments, nested blocks, etc., and that's when I came across the scenario that formed this question of the day. I have to say that I was surprised, almost to the point of shock, that commenting functioned this way.
Anyways, glad everyone seemed to like the question (maybe not getting it wrong, but in terms of learing something surprising).
February 12, 2009 at 11:51 am
Jesse McLain (2/12/2009)
...Anyways, glad everyone seemed to like the question (maybe not getting it wrong, but in terms of learing something surprising).
Thanks Jesse McLain for this question.
I surely learned something :D. But I want my developers away from writing comments like this. Comments are there to help you, not to confuse you :P.
KISS is the key -> Keep It Simple Stupid. Doing so, any dba or developer can easily understand your code.
D. Couturier
Database Administrator & Architect
February 13, 2009 at 4:43 am
Regarding your problem of stored procedure comparison, Windiff is still available (just Google "windiff download", one of the places is Microsoft) so I'd have just used that (and regularly do!).
Of course,if you start writing your own tool, so you can ignore comments, then you are starting on a complex path since, ideally, you need to build a complete parser to check thatselect a,b,c from tbl
is equivalent toSELECT
d.a ,
d.b ,
d.c
FROM
tbl AS dor even thatselect a.a, b.b from a join b on a.c=b.c
is the same asselect a.a, b.b from b join a on b.c=a.c
And that's before comments have been introduced!
Sometimes the solution is just to use a simpler tool like Windiff along with the mk.1 Eyeball as used by an experience SQL writer. 😀
Derek
February 13, 2009 at 9:34 am
Derek Dongray (2/13/2009)
Regarding your problem of stored procedure comparison, Windiff is still available (just Google "windiff download", one of the places is Microsoft) so I'd have just used that (and regularly do!)....
...
Sometimes the solution is just to use a simpler tool like Windiff along with the mk.1 Eyeball as used by an experience SQL writer. 😀
I agree with your admonition against reinventing the wheel, as Windiff does a great job. But my original intention was to provide quantitative matching results, not necessarily the visual representation of the matching. I imagined this as being part of a larger tool to compare databases (for example, between development, testing, and production versions of a database, or instances of independently maintained databases that began as identical copies), which would involve automatic comparisons of many objects.
The more subtle differences you mentioned above make the problem even more interesting, and in my comparison of the two spds of 1000+ lines each, I found many instances of such. You're right, the complexity goes upwards from there. This sort of "fuzzy matching" is exactly what's involved in name/address deduplication in direct marketing. For example, how do we match the two addresses:
Jon G. Smith
100 E. Main St
Baltimore, MD
J. Smyth
100 East Main Street
Balt. Maryland
These sorts of problems have been tackled (in fact, I've written parsing, scoring, and matching algorithms myself to handle it). I think the SQL parser would be of similar complexity. Using the current method of capturing lines of code from the spds, we would have to parse those lines into "words" based on white space, etc. Taking the example of the SELECT you made, we would have to standardize the JOIN sequence and data nicknames, compare the metadata of the results, etc. I'm not sure I would want to go in that direction, but it's possible.
February 15, 2009 at 7:34 am
this question really got me. Amazing question. Really good one....:)
February 16, 2009 at 2:55 am
Jesse McLain (2/13/2009)I agree with your admonition against reinventing the wheel, as Windiff does a great job. But my original intention was to provide quantitative matching results, not necessarily the visual representation of the matching.
I've found in the past that sometimes it's necessary to reinvent the wheel to add an extra bit of functionality that a tool doesn't provide. It depends how much cost is involved.
This sort of "fuzzy matching" is exactly what's involved in name/address deduplication in direct marketing.
I too have had to write routines to match and format addresses when loading data from legacy business systems into newer replacements. Since, usually, these are only going to be used over a short period, I've found that the last step often comes down to a manual/visual check. Obviously, when it's finance data, finding duplicates is much more important thatn when it's a mailshot!
Using the current method of capturing lines of code from the spds, we would have to parse those lines into "words" based on white space, etc. Taking the example of the SELECT you made, we would have to standardize the JOIN sequence and data nicknames, compare the metadata of the results, etc. I'm not sure I would want to go in that direction, but it's possible.
Actually, with tools like Flex and Bison it might not be too hard to build something.The metadata comparison might get quite complicated (consider re-ordered equijoins) and, of course, may still fail to identify functionally equivalent code.
Derek
Viewing 9 posts - 46 through 53 (of 53 total)
You must be logged in to reply to this topic. Login to reply