When is 8000 too small?

  • Steve Jones said,

    "XML doesn't work everywhere, and I'm not sure I think that's the answer."

    Where does XML not work?

    Please explain, because XML has worked well on several projects that I have had in production for years.

    Also, is it an accident that you pictured the first system using the Zilog 16-bit chip that used extra registers to address double the amount of memory?

    Did you just pick it because it had 8000 in its name, or are you suggesting we should use solutions like this to remove limitations?

  • Steve Jones - Editor (2/11/2010)


    This is definitely a place where the OS and SQL should align the partitions and align the cluster/page size when you create a DB as well.

    Yes, but this discussion is about manipulating arrays of bytes (in memory) rather than how they get stored on disk. On current hardware, allocating an MB block of memory to process a very long string function will have a minimal impact on GBs of memory space. Getting an MB of contiguous memory may be an issue, but it's still a minor one, and one which the database already needs to manage.

    Limiting any function is archaic, and it costs us (the customers) time and effort to provide our own unrestricted version via xp_ or CLR for something that should work out of the box.

  • XML imposes a lot of overhead, and many systems are built on relational data, not XML. If you've bought into a solution and use it extensively, it will work fine. However, there are lots of applications, and a lot of work has gone into building relational databases. They perform better than an XML database in many applications.

    Moving all large text to XML, IMHO, is not a solution for functions that do not support > 8000 bytes.

    The image was picked strictly for the name. No other meaning intended.

  • Steve Jones - Editor (2/11/2010)


    This is definitely a place where the OS and SQL should align the partitions and align the cluster/page size when you create a DB as well.

    Also consider the RAID stripe size. The database page size needs to be a submultiple of (or equal to) the stripe size to get the best performance; for example, with a 64KB stripe, 8KB pages fit exactly eight per stripe, so no page straddles a stripe boundary. It also needs to be equal to, or a multiple of, the cluster size.

    There's no reason why different page sizes shouldn't exist in a database, but I would want all pages in a given table, or in a given non-clustered index, to be the same size, and I certainly wouldn't want to allocate different page sizes within a single OS file. In effect that means every filegroup has a single page size, and all filegroups in any given partition scheme have the same page size.

    As for the range of page sizes: going down from 8KB is not very interesting, I guess, but there are some very small tables in most databases, and allocating 8KB to a table which can never contain more than one row (enforced by a check constraint and a unique constraint on the same column, which means another 8KB for the index used by the unique constraint) does seem a bit wasteful, particularly if the single row is very small. I don't see any good reason to have an upper bound below 128KB (maybe even 1MB, or higher).

    Reverting to the original topic - the 8000-byte limitation on many T-SQL operators/functions: I think it ought to be killed off. It's a real pain having to write code to work around it. It's a pain that many of us are very used to, having lived with SQLS 2K for a decade, but we surely don't want to live with it forever.
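
    For instance, the sort of thing we all end up writing (REPLICATE silently truncates its result at 8,000 bytes unless the input is already varchar(max)):

      -- Plain varchar input: the result is silently truncated at 8,000 bytes.
      SELECT LEN(REPLICATE('x', 10000));                        -- returns 8000

      -- The usual workaround: cast the input to varchar(max) first.
      SELECT LEN(REPLICATE(CAST('x' AS varchar(max)), 10000));  -- returns 10000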

    Tom

  • I think that there is a serious issue of backwards compatibility here. Changing the existing functions would quite likely break current uses. This would require either an optional parameter on the interface (defaulting to the limited behaviour of the current implementation) or a set of parallel, separate functions that work in the same way but remove or relax the current limitations, as sketched below.
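
    Something like that second option, for example - a purely hypothetical parallel version of REPLICATE (the name and shape are invented here, just to illustrate the idea):

      -- Hypothetical parallel function: behaves like REPLICATE, but the
      -- varchar(max) parameter lifts the 8,000-byte ceiling on the input.
      CREATE FUNCTION dbo.ReplicateMax (@s varchar(max), @n int)
      RETURNS varchar(max)
      AS
      BEGIN
          RETURN REPLICATE(@s, @n);
      END;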

    As for the issue with page size, I only understand this on a conceptual level but given that some of the previous posters have described valid reasons for control over the page size then I cannot see any justification for not making this alterable.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I am more worried about functions that return an INT rather than a BIGINT.

    Calculations on larger datasets can easily involve values that throw overflow exceptions.
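
    For example (the values are illustrative; SUM over an INT column returns INT, so it can overflow even though every individual value fits):

      DECLARE @t TABLE (n INT);
      INSERT INTO @t VALUES (2000000000), (2000000000);

      -- SELECT SUM(n) FROM @t;               -- arithmetic overflow error
      SELECT SUM(CAST(n AS BIGINT)) FROM @t;  -- 4000000000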

    If there is a uniqueidentifier data type of 16 bytes, can we have a 16-byte integer data type added? God knows what it would be called. HUGEINT?

    COLOSSALINT? :hehe:

  • Lynn Pettis (2/10/2010)


    I take it that the 8060-byte limit has its roots with Sybase. ...

    Actually, Sybase had different page sizes that were determined at initial configuration of the server: 2KB, 4KB, 8KB or 16KB. A bit like the collation setting. The problem was then restoring a database to a server with a different page size. But then again, the City used Sybase because of its speed compared to the totally flexible row size of Oracle.

    Just as well .NET assemblies can be added to SQL Server - it gets over a lot of problems.

  • Yet Another DBA (10/23/2014)


    ...Just as well .NET assemblies can be added to SQL Server - it gets over a lot of problems.

    Not sure that I like the thought of being forced to enable and use CLR assemblies to fix T-SQL. It seems like a risky workaround. Possibly a valid - or at least understandable - workaround in some circumstances, but it does increase the attack surface.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (10/23/2014)


    Yet Another DBA (10/23/2014)


    ...Just as well .NET assemblies can be added to SQL Server - it gets over a lot of problems.

    Not sure that I like the thought of being forced to enable and use CLR assemblies to fix T-SQL. It seems like a risky workaround. Possibly a valid - or at least understandable - workaround in some circumstances, but it does increase the attack surface.

    True, but this sort of thing can be declared as SAFE, which keeps the risks to a minimum. Some DBAs I have worked with will never enable this type of thing on the grounds that they don't understand what is in the assembly. Not that they query what is in a service pack, or review all the T-SQL code that goes into the database.
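
    For example (the assembly name and path are purely illustrative):

      -- SAFE assemblies cannot touch external resources (files, network, registry).
      CREATE ASSEMBLY StringUtilities
      FROM 'C:\assemblies\StringUtilities.dll'
      WITH PERMISSION_SET = SAFE;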

  • Yet Another DBA (10/23/2014)


    Gary Varga (10/23/2014)


    Yet Another DBA (10/23/2014)


    ...Just as well .NET assemblies can be added to SQL Server - it gets over a lot of problems.

    Not sure that I like the thought of being forced to enable and use CLR assemblies to fix T-SQL. It seems like a risky workaround. Possibly a valid - or at least understandable - workaround in some circumstances, but it does increase the attack surface.

    True, but this sort of thing can be declared as SAFE, which keeps the risks to a minimum. Some DBAs I have worked with will never enable this type of thing on the grounds that they don't understand what is in the assembly. Not that they query what is in a service pack, or review all the T-SQL code that goes into the database.

    People can unfairly treat artifacts of a similar risk level with very different levels of trust.

    I guess my biggest issue is that it would be using one technology to work around the limitations in another's API.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (10/23/2014)


    Yet Another DBA (10/23/2014)


    ...Just as well .NET assemblies can be added to SQL Server - it gets over a lot of problems.

    Not sure that I like the thought of being forced to enable and use CLR assemblies to fix T-SQL. It seems like a risky workaround. Possibly a valid - or at least understandable - workaround in some circumstances, but it does increase the attack surface.

    I agree. I wonder if having the workaround puts less pressure on MS to create a solution in T-SQL.

  • The page limitation ties into another discussion Steve started a few months ago (in 2014, that is) about the slowing pace of change in SQL Server in general. Given that server hardware is now mostly 64-bit and memory is far greater than in 1999, when SQL Server moved from 2KB to 8KB pages (with the release of v7), it makes sense that the page size should be expanded.

    But this is now just one of many growing limitations of SQL Server compared to others. So perhaps the better question is: will SQL Server be a viable option in five or ten years? I started with SQL Server back in 1995, with version 4.21 and v6. Back then it was a great choice. Now it is just a good choice. My sense is that Microsoft has now set up SQL Server as a cash cow, with the expectation that in ten years it will be gone.

    For those of you with close ties to MS, perhaps you can wake them up!

    The more you are prepared, the less you need it.

  • Steve Jones - SSC Editor (2/11/2010)


    jparker2 (2/11/2010)


    This has already been resolved.

    If you want to work with strings greater than 8K then declare them as XML or TEXT.

    In Access use a Note data type.

    Text has been deprecated. varchar(max) is what you use, which allows 2^31 - 1 bytes. (http://msdn.microsoft.com/en-us/library/ms176089.aspx) That's a good change, but if all the code and functions we have don't handle that 8001st byte, that's an issue.

    XML doesn't work everywhere, and I'm not sure I think that's the answer.

    2^31 - 1 bytes (2GB) is still a restriction that can cause problems. Back in 2003 a limit of 2^32 - 1 was causing problems for me in another context, making life pretty complicated, and if we had tried to put that data in SQL Server (presumably using text or image at that date, as there was no varbinary(max) and no FILESTREAM data), the 2^31 - 1 limitation would have made it impossible. I don't think the binary blobs that arise in science, in entertainment, and sometimes elsewhere have got any smaller since then.

    I agree that XML doesn't always provide an answer (in my experience a decision to use XML has more often been part of the problem than part of the solution).

    Tom

  • Lynn Pettis (2/10/2010)


    I take it that the 8060-byte limit has its roots with Sybase. Not all database systems have such a limitation. I had the opportunity to work with one of those many years ago. It was part of a head-to-head competition between MS SQL Server 6.5 and Borland InterBase. Now, InterBase is a good product and has many interesting features (like storing multidimensional arrays - why, I'm not sure). It didn't have a fixed page size. If your record was 64K long, it didn't care.

    The main reason InterBase lost the competition was that the lead programmer determined it was easier to code against SQL Server than InterBase using Visual Basic (I can't remember if it was VB 5 or VB 6, but I think it was VB 6). Coding against the InterBase ODBC driver was extremely bloody, according to this individual.

    Back at the time of the Sybase/MS SQL Server split, Sybase's page size was 2k.

    Since Sybase ASE 12.x, I believe, you can choose between 2K, 4K, 8K and 16K page sizes.

  • TomThomson (10/23/2014)


    ...I agree that XML doesn't always provide an answer (in my experience a decision to use XML has more often been part of the problem than part of the solution).

    Often it is misused as a way to avoid converting XML documents/fragments. Or it is stipulated as a requirement by someone who is not an implementer (i.e. neither a coder nor a DBA) or, wholly unforgivably, by a lazy coder.

    If data needs to be processed by an RDBMS then it needs to be stored as relational data. Storing XML in an RDBMS, for me, is the equivalent of storing media files, e.g. I would not use SQL Server to sharpen an image file.
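
    Shredding the XML into rows at the point of entry is usually straightforward. A minimal sketch (the element and column names are invented for illustration):

      DECLARE @x xml = N'<orders>
        <order id="1" total="9.99" />
        <order id="2" total="24.50" />
      </orders>';

      -- Keep the shredded rows relationally rather than storing the XML itself.
      SELECT o.n.value('@id',    'int')          AS order_id,
             o.n.value('@total', 'decimal(9,2)') AS total
      FROM   @x.nodes('/orders/order') AS o(n);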

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!
