I missed this yesterday, but I saw it on the schedule again, and I knew
I had to make this one. It's titled "Beyond Relational", which I expect
to mean it's focused on enhancements not necessarily related to rows
and columns. But I didn't see the abstract, so I'm not 100% sure what
it will be. I'm hoping it's not too much time on GIS stuff.
Jnue CTP is out and is the first public release. They are definitely
looking for feedback on the features that are already baked in.
4 pillars of SQL Server
- enterprise platform - all the "-abilities"
- Beyond relational
- Dynamic development - moving towards model based development.
- Pervuasive Insight - BI everywhere, all data can be used in BI.
Beyond Relational - The explosion of data which means more and more types of data will need to be hosted.
Survey of the audience: how many have more than 1TB at home? 75-80% did.
Applications work with different types of data to be rich.
- relational (rows/colummns), structured data
- XML
- documents and multimedia data.
- spatial data
An example is an insurance claims application. In the past the
application was forms mapped to tables in rows. Today we have pictures
and videos, relating to other documents like witness reports, moving
data in XML, and even mapping those claims to geographical locations.
Idea on new types of data is to ensure that the functionality available
for traditional types of data is also available for new data types.
Goals
- Reduce cost of managing all types of data
- simply development of apps using both relational and non-relational data
- Extend services for relational data to non-relational data.
Additions in Katmai
- XML upgrades
- Remote BLOB Store API, FILESTREAM, Integrated FTS
- support geometry and geography types and functions
- Large UDTs, flexible columns, wide tables, filtered indexes, hierarchyID
WinFS? - most of team was folded into SQL Server team. A lot of
the work went into SQL Server. The FILESTREAM type and large UDTs came
out of this project.
XML Upgrades
- Full support for storing and validating Office 12 document formats.
- Support for lax validation
- Full xs:dateTime support
- support for no timezone values and timezone preservation
- support for lists and union types
- There is XQUERY support for the let-clause.
- Added support for insert sql:variable("@xml") into /a/b
Doucments and Multimedia
Why store in the db?
- integrated management and data level consistency
- problems with poor data streaming, size limitations, and high cost/GB
- right now you can use VARBINARY(MAX)
Alternatives
- File server storage or dedicated BLOB store server (RMC Centera, Fujitsu Nearline).
None of these is the right way. Each works in different situations.
SS2K8 should deliver a few different platforms.
- FILESTREAM feature. Addresses limitations of SQL BLOBs ni making them work better.
- REMOTE BLOB storage - allows blobs to live in file server or dedicated store with reference in db.
- SQL BLOBs - more support for internal storage.
FILESTREAMS
storage attribute on VARVINARY(MAX)
- unstrcutured data is stored in the file systems (NTFS)
- dual programming model.
- TSQL (same as blob)
- WIN32 streaming APIs with T-SQL transactional semantics.
- Gives data consistency and integrated manageability (backup/restore, administraton)
- Size limit is the file system volume size
- uses SQL Server security stack
- demo - Kevin Farlee
- create db with filegroups, one of which has an attribute as
containing filestreams. the location is a directory that will be
created. Different tables can have filestreams in different locations.
Just like tables.
- table is a "varbinary(max) filestream" type.
- insert a row and we see
- to work with the data, need to get a handle and logical path to get to the data in the filesystem
"select get_filestream_transaction_context(), field.pathname from MyTable"
- Use a Win32 call to read and write this data from the file system (fascreat??)
- Showed transaction rollback as well as trigger fire from update to the file system data.
- backup, drop db, restore db, and data is there, including filestream data.
- First release in 2008 should have these limitations
- remote storage of filestream not supported
- DB snapshot and mirroring are not supported (log shipping will work)
- encryption and table value parameters are not supported.
My question is what does this do to the log?
REMOTE STORE BLOB API
- applications can link via a provider. So each application provider needs to deliver their own provider for their own system.
- simple api for fetch, create, enumerate, Garbage collection, delete
- returns reference that is stored in the db. By going through the db,
there is some maintenance of the references. This should ensure some
link consistency and allows the application to be loosely linked to the
BLOB store.
Full-Text Indexing
Challenges
- indexes outside SS have manageability problems
- mixed query performance suffers from pull over complete full-text set
- scaling issues on big boxes
SS2K8
- FTI fully integrated into the engine.
- Mixed queries should scale better. (query on "contains() and xx = i)
Spatial Data
- New types (geometry, geography)
- New methods (intersects, buffer, more)
- new indexes for similar performance to other types.
- Full set of OGC/SQL MM, ISO 19125 components
- Integration with Virtual Earth
- Demo - michael rys
- intersection of map with zip code query.
- added spatial index, performance is improved.
- showed integration with Virtual Earth as well
Relational Data
- HierarchyID - store arbitrary hierarchies of data and efficiently query them. New UDT to implement hierarchies.
- Large UDTs - no more 8k limit
- Sparse columns - optimized storage for sparsely populated columns.
- Wide tables - support for hundreds of thousands of sparse columns.
- Filtered indexes - define indexes over subsets of data. Index on rows in a table with a particular value.