There are new OPENJSON and FOR JSON commands in SQL Server 2016. Let's see if we can use OPENJSON to grab publically available data from the thousands of JSON formatted API's across the internet. Let's create the sample Blockchain database with the btc.Blocks table.
USE [master] GO CREATE DATABASE [Blockchain] GO ALTER DATABASE [Blockchain] SET COMPATIBILITY_LEVEL = 130 GO EXEC sys.sp_db_vardecimal_storage_format N'Blockchain', N'ON' GO USE [Blockchain] GO CREATE SCHEMA [btc] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [btc].[Blocks]( [BlockID] [int] NOT NULL, [BlockVersion] [int] NULL, [BlockHash] [nvarchar](32) NULL, [PreviousBlockHash] [nvarchar](32) NULL, [BlockTimestamp] [datetime] NULL, PRIMARY KEY CLUSTERED ( [BlockID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO USE [master] GO ALTER DATABASE [Blockchain] SET READ_WRITE GO
In order to reach the API, we need to enable Ole Automation to open up the sp_OA system stored procedures.
From there we will create two functions. The first function will allow you to enter the block height of BlockChain.info's API. To see the data in your browser, you can click on this link: https://blockchain.info/block-height/0?format=json. The block height that will pull is block 0, Satoshi's genesis block. You can change the 0 in the URL to pull any block up to the current block height (a little over 400,000 blocks).
Unfortunately BlockChain.info stores the original UNIX timestamp in the blockXXX.dat data files from the chain. In order to convert this to SQL Server datetime, the script below will add and a second function (btc.fn_UNIXConvertToDateTime) for the conversion.
-- Enable Ole Automation sp_configure 'show advanced options', 1; GO RECONFIGURE; GO sp_configure 'Ole Automation Procedures', 1; GO RECONFIGURE; GO sp_configure 'show advanced options', 0; GO RECONFIGURE; GO -- This function will use the CLR system stored procedures to make a request to the API url and bring data back as an NVARCHAR(4000) variable CREATE FUNCTION [btc].[fn_GetBlockFromAPI] (@BlockHeight INT) RETURNS NVARCHAR(MAX) AS BEGIN DECLARE @urlVARCHAR(255); DECLARE @objINT; DECLARE @hrINT; DECLARE @msgVARCHAR(8000); DECLARE @jsonNVARCHAR(4000);--Cannot use MAX with CLR stored procedures SET @url = 'https://blockchain.info/block-height/' + CAST(@BlockHeight AS VARCHAR(255)) + '?format=json'; EXEC @hr = sp_OACreate 'WinHttp.WinHttpRequest.5.1', @obj OUT -- Creates an instance of an OLE object IF @hr <> 0 BEGIN SET @Msg = 'sp_OACreate WinHttp.WinHttpRequest.5.1 failed. Cannot connect to Blockchain.info' GOTO Error END EXEC @hr = sp_OAMethod @obj, 'Open', NULL, 'GET', @Url, false -- Calls a method of an OLE object. IF @hr <> 0 BEGIN SET @msg = 'sp_OAMethod Open failed. Cannot connect to Blockchain.info' GOTO Error END EXEC @hr = sp_OAMethod @obj, 'SetRequestHeader', NULL, 'Content-Type', 'multipart/form-data' IF @hr <> 0 BEGIN SET @msg = 'sp_OAMethod SetRequestHeader failed. Cannot connect to Blockchain.info' GOTO Error END EXEC @hr = sp_OAMethod @obj, SEND, NULL, '' IF @hr <> 0 BEGIN SET @msg = 'sp_OAMethod Send failed. Cannot connect to Blockchain.info' GOTO Error END EXEC @hr = sp_OAGetProperty @Obj, 'ResponseText', @json OUTPUT IF @hr <> 0 EXEC sp_OAGetErrorInfo @Obj EXEC @hr = sp_OADestroy @obj RETURN @json Error: EXEC @hr = sp_OADestroy @obj RETURN @msg END GO -- API is holding UNIX time, so we will need to convert to SQL Server datetime format with this function CREATE FUNCTION [btc].[fn_UNIXConvertToDateTime] (@DateTime BIGINT) RETURNS DATETIME AS BEGIN DECLARE @LocalTimeOffset BIGINT ,@AdjustedLocalDatetime BIGINT; SET @LocalTimeOffset = DATEDIFF(second,GETDATE(),GETUTCDATE()) SET @AdjustedLocalDatetime = @Datetime - @LocalTimeOffset RETURN (SELECT DATEADD(SECOND,@AdjustedLocalDatetime, CAST('1970-01-01 00:00:00' AS DATETIME))) END;
Now lets call the btc.fn_GetBlockFromAPI function to grab the JSON data and insert into the @Block temp table. From there we will insert the @Block data into the btc.Blocks table IF (SELECT MainChain FROM @BlcokTemp) = 1. These are main chain blocks (not orphaned) and we will also convert the nasty UNIX timestamp.
DECLARE @BlockHeight INT; SET @BlockHeight = 0-- Height of block on the chain you want data for DECLARE @Block VARCHAR(MAX); SET @Block = (SELECT [btc].[fn_GetBlockFromAPI](@BlockHeight)); DECLARE @BlockTemp TABLE( [BlockID]INTNOT NULL, [BlockVersion]INTNOT NULL, [BlockHash]VARCHAR(32)NOT NULL, [PreviousBlockHash] VARCHAR(32)NOT NULL, [BlockTimestamp]VARCHAR(10)NOT NULL, [MainChain]BITNOT NULL ); INSERT INTO @BlockTemp ([BlockID], [BlockVersion], [BlockHash], [PreviousBlockHash], [BlockTimestamp], [MainChain]) SELECT * FROM OPENJSON (@Block, '$.blocks') WITH ( BlockIDINT'$.height', BlockVersionINT'$.ver', BlockHashVARCHAR(32)'$.hash', PreviousBlockHashVARCHAR(32)'$.prev_block', BlockTimestampVARCHAR(10)'$.time', MainChainBIT'$.main_chain' ) -- Now only INSERT the data from the temp table if MainChain = 1 (True) and also convert the UNIX timestamp IF (SELECT MainChain FROM @BlockTemp) = 1 BEGIN INSERT INTO btc.Blocks ([BlockID], [BlockVersion], [BlockHash], [PreviousBlockHash], [BlockTimestamp]) SELECT [BlockID], [BlockVersion], [BlockHash], [PreviousBlockHash], btc.fn_UNIXConvertToDateTime([BlockTimestamp]) FROM @BlockTemp END
Result set from the INSERT statement:
Of course you can add a loop to pull rows sequentially, but some API providers are sensitive about their load. If the API provider is hit too hard they may ban your IP, unless you space out your API calls or sign up for and API Code to bypass limits.
Disadvantages:
- One major restriction to be aware of is the 4000 character limit of sp_OACreate. If you need to pull JSON data from an API that may have over 4000 characters, the result set will return NULL. However, this process can be highly effective for small amounts of data you need to consume frequently if you know the JSON data source will never exceed 4000 characters.
- Must turn on Ole Automation, be sure to check your security needs and settings.
Advantages:
- Bypasses Python, Java, PHP, Ruby, etc libraries to reach the JSON target (and the devs that go with them)
- Can be implemented and managed by the DBA or SQL Developer natively within SQL Server 2016
Now that you know how to pull data from an API natively, you can search for publicly available data from the world largest API directory. This tool should be great for data warehousing or reporting needs to compare public data with your private date to see how they benchmark.