The CLR

  • Matt Miller (5/13/2008)


    All right - let me wipe the egg off my face for a second. You're right - I overlooked SqlChars as an output parameter for functions. I confused this with the User-Defined Aggregates limit, which DOES cap output at 8,000 bytes (thanks to its MaxByteSize property).

    Don't worry about it ;). It took me long enough to figure that out originally, and there is very little coverage of handling NVARCHAR(MAX) and SqlChars in either the documentation or the forums; at least there wasn't when I was trying to figure it out.

    Regarding the memory limitation of CLR UDAs and UDTs, you might want to look at:

    http://www.sqlservercentral.com/articles/SS2K5+-+CLR+Integration/3208/

    SQL# - https://SQLsharp.com/ (SQLCLR library of over 340 Functions and Procedures)
    Sql Quantum Lift - https://SqlQuantumLift.com/ (company)
    Sql Quantum Leap - https://SqlQuantumLeap.com/ (blog)
    Info sites: Collations • Module Signing • SQLCLR

  • developmentalmadness (5/13/2008)


    - the Table-valued functions need to materialize their entire recordset before being able to return anything, so they can quickly "swamp" the memory area.

    This is true for T-SQL TVFs but not for CLR TVFs. The CLR actually "streams" the results. From BOL:

    Differences Between Transact-SQL and CLR Table-Valued Functions

    Transact-SQL TVFs materialize the results of calling the function into an intermediate table. Since they use an intermediate table, they can support constraints and unique indexes over the results. These features can be extremely useful when large results are returned.

    In contrast, CLR TVFs represent a streaming alternative. There is no requirement that the entire set of results be materialized in a single table. The IEnumerable object returned by the managed function is directly called by the execution plan of the query that calls the TVF, and the results are consumed in an incremental manner. This streaming model ensures that results can be consumed immediately after the first row is available, instead of waiting for the entire table to be populated. It is also a better alternative if you have very large numbers of rows returned, because they do not have to be materialized in memory as a whole. For example, a managed TVF could be used to parse a text file and return each line as a row.

    http://msdn.microsoft.com/en-us/library/ms131103.aspx

    I understand what BOL states, but the reality doesn't seem to match up to their description. Every table-valued function in CLR implements IEnumerable, which SHOULD implicitly stream, but in every implementation I've managed - the entire array/collection is built FIRST, and then the FillRow method kicks in. Perhaps I've been running across nothing but bad examples - but it's been something I just can't seem to make happen.
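    For what it's worth, whether a CLR TVF really streams seems to depend on how the IEnumerable is produced. Many published samples build an ArrayList and return it at the end, which forces full materialization; a C# iterator (yield return) hands rows back lazily instead. A minimal sketch, with hypothetical names and untested against a live server:

    ```csharp
    using System.Collections;
    using System.Data.SqlTypes;
    using Microsoft.SqlServer.Server;

    public class StreamingTvf
    {
        // Returning a lazily-evaluated iterator: each line is yielded as it
        // is produced, so the engine can start pulling rows through FillRow
        // before the whole set exists in memory. Building a List/ArrayList
        // first and returning it at the end is what defeats the streaming model.
        [SqlFunction(FillRowMethodName = "FillRow",
                     TableDefinition = "line NVARCHAR(4000)")]
        public static IEnumerable SplitLines(SqlString input)
        {
            if (input.IsNull)
                yield break;
            foreach (string line in input.Value.Split('\n'))
                yield return line;   // row handed off immediately
        }

        public static void FillRow(object obj, out SqlString line)
        {
            line = new SqlString((string)obj);
        }
    }
    ```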

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • jfmccabe (5/13/2008)


    Depends what you mean by complex problems. I'm involved in educational admin and we have a standard allocation problem at the start of each academic year.

    Have solved precisely this problem using T-SQL; implemented a genetic algorithm to tackle it - all in sprocs & functions & with no need to use CLR. Fast too, though I've no idea what the relative speed would have been had I gone with a CLR solution - gut feel is slower though since most of the processing is set-based.

  • Regexp Expressions

    Compression of Blobs

    Concatenate() aggregate function

    All the above have proved useful.

    **** The 8K limit is an urban myth!!! ****

    see the following for a sample:

    http://www.codeproject.com/KB/database/blob_compress.aspx

  • Renato Buda (5/14/2008)


    Regexp Expressions

    Compression of Blobs

    Concatenate() aggregate function

    All the above have proved useful.

    **** The 8K limit is an urban myth!!! ****

    see the following for a sample:

    http://www.codeproject.com/KB/database/blob_compress.aspx

    What kind of concatenation function? Concatenation of a column across like rows, or ???

    Most people believe concatenation in T-SQL to be slow because they make a very simple mistake which is flushed out in the following article...

    http://www.sqlservercentral.com/articles/Test+Data/61572/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Renato Buda (5/14/2008)


    Regexp Expressions

    Compression of Blobs

    Concatenate() aggregate function

    All the above have proved useful.

    **** The 8K limit is an urban myth!!! ****

    see the following for a sample:

    http://www.codeproject.com/KB/database/blob_compress.aspx

    Yup - I've already been corrected on the function side, but I don't think the 8K is negotiable on the aggregate side. The code won't compile with a MaxByteSize greater than 8000, or if you skip putting it in.

    Not that I wouldn't love to be wrong on this one, 'cause I think it's a bit wrong-headed to be limited like that.
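    For readers hitting the same wall: in SQL Server 2005 a Format.UserDefined aggregate must declare a MaxByteSize between 1 and 8000 for its serialized intermediate state, which is exactly the compile-time limit described above (SQL Server 2008 later relaxed this by also accepting MaxByteSize = -1, allowing up to 2 GB). An illustrative sketch of a concatenating aggregate, not taken from any post in this thread:

    ```csharp
    using System;
    using System.Data.SqlTypes;
    using System.IO;
    using System.Text;
    using Microsoft.SqlServer.Server;

    [Serializable]
    [SqlUserDefinedAggregate(Format.UserDefined,
        MaxByteSize = 8000)]   // 2005 rejects anything > 8000; 2008 also allows -1
    public struct Concat : IBinarySerialize
    {
        private StringBuilder sb;

        public void Init() { sb = new StringBuilder(); }

        public void Accumulate(SqlString value)
        {
            if (!value.IsNull)
                sb.Append(value.Value).Append(',');
        }

        public void Merge(Concat other) { sb.Append(other.sb.ToString()); }

        public SqlString Terminate()
        {
            return new SqlString(sb.ToString().TrimEnd(','));
        }

        // The state written here is what MaxByteSize constrains.
        public void Read(BinaryReader r)  { sb = new StringBuilder(r.ReadString()); }
        public void Write(BinaryWriter w) { w.Write(sb.ToString()); }
    }
    ```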

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Solomon Rutzky (5/13/2008)


    developmentalmadness (5/13/2008)


    - the Table-valued functions need to materialize their entire recordset before being able to return anything, so they can quickly "swamp" the memory area.

    This is true for T-SQL TVFs but not for CLR TVFs. The CLR actually "streams" the results. From BOL:

    Differences Between Transact-SQL and CLR Table-Valued Functions

    Transact-SQL TVFs materialize the results of calling the function into an intermediate table. Since they use an intermediate table, they can support constraints and unique indexes over the results. These features can be extremely useful when large results are returned.

    In contrast, CLR TVFs represent a streaming alternative. There is no requirement that the entire set of results be materialized in a single table. The IEnumerable object returned by the managed function is directly called by the execution plan of the query that calls the TVF, and the results are consumed in an incremental manner. This streaming model ensures that results can be consumed immediately after the first row is available, instead of waiting for the entire table to be populated. It is also a better alternative if you have very large numbers of rows returned, because they do not have to be materialized in memory as a whole. For example, a managed TVF could be used to parse a text file and return each line as a row.

    http://msdn.microsoft.com/en-us/library/ms131103.aspx

    I do read the above documentation as you do, but it is somewhat true (maybe just in a technical sense) that CLR TVFs DO unfortunately need to build the entire result set before returning it. Now, this might be semantics in terms of how the internals of SQL Server merge those results with the calling query, but in the .NET code you do need to store your entire data structure while the function is executing, and only at the very end does the function return the results. So, yes, it consumes as much memory as the result set needs before it can return it, even if it does so more efficiently than a T-SQL TVF. For this reason I have sometimes used a CLR proc instead, which can send a result row as you create it and hence consumes very little memory.

    Better explanation than I could manage...:) Agreed.
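    The send-a-row-as-you-build-it pattern described above can be sketched with SqlPipe's SendResultsStart / SendResultsRow / SendResultsEnd. Hypothetical example, not from the thread:

    ```csharp
    using System.Data;
    using System.Data.SqlTypes;
    using Microsoft.SqlServer.Server;

    public class StreamingProc
    {
        // Each row is shipped to the client as soon as it is built, so the
        // procedure never holds more than one row's worth of state in memory.
        [SqlProcedure]
        public static void SendNumbers(SqlInt32 count)
        {
            SqlDataRecord row = new SqlDataRecord(
                new SqlMetaData("n", SqlDbType.Int));

            SqlContext.Pipe.SendResultsStart(row);
            for (int i = 1; i <= count.Value; i++)
            {
                row.SetInt32(0, i);
                SqlContext.Pipe.SendResultsRow(row);  // streamed immediately
            }
            SqlContext.Pipe.SendResultsEnd();
        }
    }
    ```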

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I personally like the idea of bringing managed code inside SQL Server. Sometimes saving a few milliseconds is not the most important thing: having cleaner and more manageable code than old-fashioned T-SQL makes it worthwhile.

    I agree that you can do nearly anything with T-SQL (although I have yet to see a universal string concat function, not tied to a single table, to which you would just pass your column name in any of your SELECT statements, as you can do with a CLR aggregate), but sometimes you have to do nearly illegible tricks that are very hard to maintain in time.

    Yes, if I have to say where I have used SQL-CLR yet, guess what, the answer is in string concatenation and in string manipulation with Regex. This means that probably MS (or ANSI) should do something about this in T-SQL, as they seem to be very widespread requirements.
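    The regex side needs only a thin wrapper over System.Text.RegularExpressions, since T-SQL has nothing beyond LIKE and PATINDEX. A minimal sketch, with an illustrative function name:

    ```csharp
    using System.Data.SqlTypes;
    using System.Text.RegularExpressions;
    using Microsoft.SqlServer.Server;

    public class RegexFunctions
    {
        // Exposes .NET regex matching to T-SQL, e.g.:
        //   SELECT dbo.RegexIsMatch(N'abc123', N'^[a-z]+[0-9]+$');
        [SqlFunction(IsDeterministic = true, IsPrecise = true)]
        public static SqlBoolean RegexIsMatch(SqlString input, SqlString pattern)
        {
            if (input.IsNull || pattern.IsNull)
                return SqlBoolean.Null;
            return Regex.IsMatch(input.Value, pattern.Value);
        }
    }
    ```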

    The bad thing about the CLR I have found so far is how poorly security is implemented and documented. It is becoming a nightmare for me. I still have to find out how to grant the EXECUTE privilege on my custom aggregates: they worked until last Friday, but now that I have redeployed my solution to the server from within VS2008, they only work when run by an administrative account. BOL is poorly organized and confusing on this topic. It took me a few minutes to write the aggregate functions; it is taking days to find out how to make them work. And the incredible thing is that they used to work!

  • Great thoughts and information. Any of you want to write some articles with examples of what's worked or hasn't?

  • Perhaps rather than arguing about whether there is ANY merit to CLR procedures we could just accept it as another tool in the ol' tool chest. It's there when you need it and if you never need it, well at least you have the option.

    I have seen suggestions that CLR code can perform better for computation-bound processing, while database I/O is best done in T-SQL. Personally, I've never seen a computationally bound database procedure, but I suppose they're out there.

    I know this is a forum by and for SQL Server bigots but there is also the possibility that those of us who are strong in the .Net CLR and not so strong in T-SQL (I'm more of an Oracle PL/SQL kinda guy myself) would be better off creating strong CLR code instead of weak T-SQL code.

    One place I do see a definite benefit to the CLR capability is in sharing code that might be needed both in your Data Access Layer and Business code. We are currently passing a disturbing and increasing number of authorization logic parameters to our T-SQL code. I'm thinking that we should just pass the credentials and let the database code invoke the (CLR) authorization methods directly.

    Will


    Thanks ,

    Will Dougherty

  • I know this is a forum by and for SQL Server bigots but there is also the possibility that those of us who are strong in the .Net CLR and not so strong in T-SQL (I'm more of an Oracle PL/SQL kinda guy myself) would be better off creating strong CLR code instead of weak T-SQL code.

    I can't agree with that. They would be better off either getting stronger in T-SQL, or getting someone to do it for them, than writing underperforming, CPU-hogging CLR code to do what it was never intended to do.

    Tom Garth
    Vertical Solutions

    "There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers
  • Will Dougherty (5/15/2008)


    I know this is a forum by and for SQL Server bigots but there is also the possibility that those of us who are strong in the .Net CLR and not so strong in T-SQL (I'm more of an Oracle PL/SQL kinda guy myself) would be better off creating strong CLR code instead of weak T-SQL code.

    No... we let .Net bigots in, as well... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • We are pro-SQL Server, but not bigoted against other solutions. It's just that we often see procedural or OOP type solutions that don't perform well in SQL Server. SQL Server is built to fundamentally work well with sets of data and that code is harder to write well.

    I'd love to see the strong .NET folks build some good assemblies we can use. I'm struggling to figure out what the good uses are. I think writing .NET to rip through data row by row, instead of using the power of T-SQL to handle sets, is a mistake, since it could (potentially) eat up more resources than needed on the database server, which is a limited resource.

Viewing 13 posts - 46 through 57 (of 57 total)
