Parsing SQL

  • I am currently working on a program I have named SqlWrite, a light-weight SQL editor. I have implemented limited autocomplete logic and am currently working on parsing algorithms. (See the website for details.) Ultimately, I would like to be able to generate a full-fledged [T-]SQL parse tree. However, for now I would be satisfied with grabbing any trigger declarations, variable declarations, the FROM/INNER JOIN tables, and function calls, so that I can provide Intellisense-like information, as well as pull up the arguments for system functions (I find myself looking up arguments much too often). With slightly more detail, I'd be able to beautify code.

    I did a bit of looking around online, and it seems that a complete solution would involve something like generating a full BNF/EBNF specification for T-SQL, or at least some sort of grammar. Then, I would use something like Grammatica to generate a parser. Now, I am not sure if generating a parser like this is what I want, as it needs to be fast enough to work real-time. I know I can limit how much it parses to the node currently being edited, although managing that might be a bit tricky. I'm not sure if the first version should implement a full-blown parser. In addition, I'd want the complete parser to produce very nice error messages, something Grammatica claims to do.

    An alternative I have come up with is to make a skeleton parser which grabs trigger declarations, variable declarations, control statements (or perhaps just begin/end pairs), and select/insert/update/delete statements. I'd probably also want to grab function calls and stored procedure execs. I'm thinking that I would identify the beginning and ends of nodes, and do a bit of extra processing on certain nodes. It seems that this will work and be fast enough to work real-time, as one is writing SQL.

    Having only done non-traditional parsing in the past (I've never written up a proper grammar), I want to ensure I haven't missed anything. Is the above reasonable? I'm assuming that my skeleton parser wouldn't be nearly as much work as a full-blown one, since I'd only be parsing a small subset of SQL grammar. I'll still be generating a parse tree; it'll just be a sparse one.

    Thanks in advance for any advice or comments.

  • Sorry Luke, I can't help with your reqest, but I have had a look at your app, and it looks interesting.  Just thought I'd let you know. Look forward to seeing a release which lets me use it against one of my DB's.

    Cheers,

    Martin

  • The current version let's you specify the DB, but it doesn't parse select statements for intellisense (that being the current project). If you want, I can still pack it up for you, but it doesn't really do much besides populate the tree and have (yet-to-be-fully-debugged) infinite undo/redo, a welcome reprieve from Query Analyzer. The intellisense you see in the screen dump comes from the database, but is prompted by strings in a text file, unless you type out a full table name before the period.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply