[read this post on Mr. Fox SQL blog]
Recently we had a requirement to perform SQL Spatial functions on data that was stored in Azure SQL DW. Seems simple enough as spatial has been in SQL for many years, but unfortunately, SQL Spatial functions are not natively supported in Azure SQL DW (yet)!
If interested – this is the link to the Azure Feedback feature request to make this available in Azure SQL DW – https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/10508991-support-for-spatial-data-type
So, to use spatial data in Azure SQL DW we need to look at alternative methods. Luckily, a recent feature in Azure SQL DB, Elastic Query to Azure SQL DW, now gives us the ability to perform these SQL Spatial functions on data within Azure SQL DW via a very simple method!
So the purpose of this blog is to show how to perform native SQL Spatial functions on data within Azure SQL DW.
How does the Solution Work?
These are the steps needed to get this solution moving…
- SQL Spatial data types (geometry, geography) are not supported in Azure SQL DW tables, so you must use the varbinary(max) data type for the spatial column. SQL spatial data can very happily be stored in varbinary columns!
- Create a small standalone Azure SQL DB next to your primary Azure SQL DW – on the same virtual Azure SQL Server.
- Create a SQL Login / User on the primary Azure SQL DW which will be used by Azure SQL DB to connect. Grant the database user rights to select from the table containing spatial data.
- Create a SQL Credential in Azure SQL DB which uses the above SQL Login to connect to Azure SQL DW
- Create an External Data Source in Azure SQL DB that points to the Azure SQL DW and uses the SQL Credential to authenticate
- Create an External Table in Azure SQL DB that “points” to the primary table in your existing Azure SQL DW (the one containing spatial data) using the External Data Source
- NOW – the Azure SQL DB becomes the new query entry point for any/all SQL queries that require spatial functionality. Any SELECT on the External Table in the Azure SQL DB will source its data from the existing Azure SQL DW via a Remote Query – and bring the selected data back into Azure SQL DB where SQL Spatial functionality is available!
So the setup looks something like this…
Connecting Azure SQL DB and Azure SQL DW
There’s already a great tutorial on MS DOCS on how to connect Azure SQL DB to Azure SQL DW via Elastic Query, so I’m not going to repeat it here.
To see the SQL code for the above steps see the tutorial here – https://docs.microsoft.com/en-us/azure/sql-data-warehouse/tutorial-elastic-query-with-sql-datababase-and-sql-data-warehouse
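For orientation only, the connection steps listed earlier might look roughly like the sketch below. The login, user, credential, server and database names, and the passwords, are hypothetical placeholders (only the data source name ASDW comes from this post), so treat the linked tutorial as the authoritative version.

```sql
-- NOTE: all names and passwords below are hypothetical placeholders.

-- RUN ON master (the logical server hosting both databases): create the login
CREATE LOGIN SpatialReaderLogin WITH PASSWORD = '<strong password>';

-- RUN ON AZURE SQL DW: create the user and let it read the spatial table
-- (run the GRANT once dbo.PolyTable exists)
CREATE USER SpatialReaderUser FOR LOGIN SpatialReaderLogin;
GRANT SELECT ON [dbo].[PolyTable] TO SpatialReaderUser;

-- RUN ON AZURE SQL DB: a master key is needed before a credential can be stored
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<another strong password>';

-- RUN ON AZURE SQL DB: credential that holds the SQL DW login details
CREATE DATABASE SCOPED CREDENTIAL SpatialReaderCredential
WITH IDENTITY = 'SpatialReaderLogin',
     SECRET   = '<strong password>';

-- RUN ON AZURE SQL DB: external data source pointing at the SQL DW
CREATE EXTERNAL DATA SOURCE [ASDW]
WITH
(
    TYPE          = RDBMS,
    LOCATION      = '<yourserver>.database.windows.net',
    DATABASE_NAME = '<your SQL DW name>',
    CREDENTIAL    = SpatialReaderCredential
);
```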
Setting up SQL Spatial Tables and Data
The following SQL will set up a table in Azure SQL DW which contains 2 rows of sample SQL Spatial data. Note that the data type used for all commands in Azure SQL DW is varbinary(max) and not the standard SQL spatial types.
-- RUN ON AZURE SQL DW: Create Spatial Table
CREATE TABLE [dbo].[PolyTable]
(
    [PolyName] [varchar](255) NULL,
    [PolyVarBin] [varbinary](max) NOT NULL
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
)

-- Create 2 spatial objects in the table
INSERT INTO [dbo].[PolyTable] SELECT 'Point1', CONVERT(VARBINARY(MAX), '0x…
INSERT INTO [dbo].[PolyTable] SELECT 'Point2', CONVERT(VARBINARY(MAX), '0x…
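If you need to produce your own hex strings for inserts like these, one possible approach is to build the shape somewhere spatial types are available (such as the Azure SQL DB) and convert it to a '0x...' string. This is only a sketch: the polygon coordinates and SRID 0 are made-up values for illustration, not the sample data used in this post.

```sql
-- RUN ON AZURE SQL DB (or any SQL Server with spatial support)
-- Hypothetical shape: a 10 x 10 square, SRID 0
DECLARE @g GEOMETRY = geometry::STGeomFromText('POLYGON((50 30, 60 30, 60 40, 50 40, 50 30))', 0);

-- Style 1 renders the binary value as a hex string with a leading 0x
SELECT CONVERT(VARCHAR(MAX), CONVERT(VARBINARY(MAX), @g), 1) AS PolyHex;
```

The resulting string can then be turned back into binary on the Azure SQL DW side with CONVERT(VARBINARY(MAX), '<hex string>', 1), where style 1 tells CONVERT to parse the text as hex rather than store its characters.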
The following SQL will set up an External Table in Azure SQL DB which will connect across to the Azure SQL DW table containing the SQL Spatial data. Note that for this demo I have named my External Data Source “ASDW”.
-- RUN ON AZURE SQL DB: Create External Table Pointing to SQL DW
CREATE EXTERNAL TABLE [dbo].[PolyTable]
(
    [PolyName] [varchar](255) NOT NULL,
    [PolyVarBin] [varbinary](max) NOT NULL
)
WITH
(
    DATA_SOURCE = [ASDW],
    SCHEMA_NAME = N'dbo',
    OBJECT_NAME = N'PolyTable'
)
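Before moving on to spatial queries, a quick sanity check along these lines can confirm that the external table is wired up to the expected data source. This is a minimal sketch using the standard catalog views; the column aliases are my own.

```sql
-- RUN ON AZURE SQL DB: list external tables and the data source each one uses
SELECT t.name     AS ExternalTableName,
       s.name     AS DataSourceName,
       s.location AS DataSourceLocation
FROM   sys.external_tables       AS t
JOIN   sys.external_data_sources AS s
       ON s.data_source_id = t.data_source_id;

-- Round trip to the Azure SQL DW: counts the rows in the remote table
SELECT COUNT(*) AS RemoteRowCount
FROM   [dbo].[PolyTable];
```

If the count query fails, the login, credential or firewall settings on the Azure SQL DW side are the usual places to look first.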
Querying SQL Spatial Data from Azure SQL DB
So now that we have the source Azure SQL DW table containing our spatial data, and the External Table in Azure SQL DB pointing to the source table, we can run some SQL Spatial Queries!
Connect to the Azure SQL DB using SQL Server Management Studio (SSMS) and run the following “classic” SQL Spatial queries.
For my demo I’m only using a couple of spatial rows, so it’s worth pointing out that you will need to validate this architecture for your data set, specifically at scale (i.e. if you are pulling back millions of rows over Remote Query). This article spells out some of the recommended best practices when setting this up at scale – https://docs.microsoft.com/en-us/azure/sql-data-warehouse/how-to-use-elastic-query-with-sql-data-warehouse
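One pattern that may help at scale is to pull only the rows you need across the Remote Query once, land them in a local temp table, and run the spatial functions against that local copy. The sketch below assumes the temp table name #LocalPoly and the filter values, both chosen purely for illustration; how much of the filter is evaluated on the Azure SQL DW side is up to the engine, so check it against your own workload.

```sql
-- RUN ON AZURE SQL DB: fetch the needed rows once and convert them on arrival
SELECT PolyName,
       CAST(PolyVarBin AS GEOMETRY) AS Poly
INTO   #LocalPoly
FROM   [dbo].[PolyTable]
WHERE  PolyName IN ('Point1', 'Point2');  -- filter as early as possible

-- Later spatial queries read the local copy and make no further remote calls
SELECT PolyName,
       Poly.STArea() AS PolyArea
FROM   #LocalPoly;
```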
Query 1 – Simple Geometry Select
SELECT PolyName, cast(PolyVarBin as GEOMETRY) as PolyVarBin FROM [dbo].PolyTable
Query Result;
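If you prefer to inspect the shapes as text rather than in the SSMS spatial results tab, a variation like the following can help. It is a sketch only, and the column aliases are my own.

```sql
-- RUN ON AZURE SQL DB: show each shape as Well-Known Text plus some basic checks
SELECT PolyName,
       CAST(PolyVarBin AS GEOMETRY).STAsText()       AS PolyWkt,
       CAST(PolyVarBin AS GEOMETRY).STGeometryType() AS PolyType,
       CAST(PolyVarBin AS GEOMETRY).STIsValid()      AS IsValid
FROM   [dbo].[PolyTable];
```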
Query 2 – Spatial Boundary Function
DECLARE @g GEOMETRY;
SELECT @g = PolyVarBin FROM [dbo].PolyTable WHERE [PolyName] = 'Point1';
select @g.STBoundary()
UNION ALL
select @g.STEnvelope();
Result;
Query 3 – Spatial Area Function
DECLARE @g GEOMETRY;
SELECT @g = PolyVarBin FROM [dbo].PolyTable WHERE [PolyName] = 'Point1';
select @g.STArea() as PolyArea;
Result;
Query 4 – Spatial Distance Function
DECLARE @g GEOMETRY;
DECLARE @g2 GEOMETRY;
SELECT @g = PolyVarBin FROM [dbo].PolyTable WHERE [PolyName] = 'Point1';
SELECT @g2 = PolyVarBin FROM [dbo].PolyTable WHERE [PolyName] = 'Point2';
select @g.STDistance(@g2) as DistanceP1toP2;
Result;
Query 5 – The Old Classic “Nearest Neighbour” Spatial Function
DECLARE @g GEOMETRY = 'POINT(57 39)';
SELECT @g = @g.STBuffer(2);
SELECT
    --TOP 1 -- Uncomment this to only show closest polygon
    PolyName + ' [' + cast(cast(PolyVarBin as GEOMETRY).STDistance(@g) as varchar(250)) + ']' as PolyName,
    cast(PolyVarBin as GEOMETRY) as PVB,
    cast(PolyVarBin as GEOMETRY).STDistance(@g) as Distance
FROM [dbo].PolyTable
UNION ALL
SELECT
    'To Point' as PolyName,
    @g as PVB,
    0 as Distance
-- With UNION ALL, sort by the output column alias rather than a source expression
ORDER BY Distance;
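If you only want the single closest shape and do not need the search point in the result set, a more compact form might look like this. It is a sketch that assumes the stored shapes use SRID 0 (STDistance returns NULL when the SRIDs differ); the CTE name Polys is my own.

```sql
-- RUN ON AZURE SQL DB: return only the nearest shape to the search point
DECLARE @g GEOMETRY = geometry::STGeomFromText('POINT(57 39)', 0);

WITH Polys AS
(
    -- Convert once so the cast is not repeated in every expression
    SELECT PolyName,
           CAST(PolyVarBin AS GEOMETRY) AS Poly
    FROM   [dbo].[PolyTable]
)
SELECT TOP (1)
       PolyName,
       Poly.STDistance(@g) AS Distance
FROM   Polys
ORDER BY Distance;
```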
Summary
So there you have it, a pretty simple method to perform SQL Spatial functions on data within Azure SQL DW – even though spatial data types aren’t supported there!
So as usual, and as I always say, please test this out yourself on your own data and validate your scalability needs as your mileage may vary!
References
Some great MS DOCS references I can call out here…
- How to use Elastic Query with SQL Data Warehouse – https://docs.microsoft.com/en-us/azure/sql-data-warehouse/how-to-use-elastic-query-with-sql-data-warehouse
- Configure Elastic Query with SQL Data Warehouse – https://docs.microsoft.com/en-us/azure/sql-data-warehouse/tutorial-elastic-query-with-sql-datababase-and-sql-data-warehouse
- Create External Data Source – https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-data-source-transact-sql
- Create External Table – https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql
- SQL Server Spatial Data – https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-data-sql-server
Disclaimer: all content on Mr. Fox SQL blog is subject to the disclaimer found here