SQLServerCentral Article

JSON in Microsoft SQL Server: A Comprehensive Guide

Introduction

JSON (JavaScript Object Notation) has become a popular data format for storing and exchanging information. Microsoft SQL Server, starting from version 2016, introduced built-in support for JSON, allowing developers to work with JSON data more efficiently within the relational database environment. This article will explore how to store, retrieve, and manipulate JSON data in MSSQL, along with best practices and performance considerations.

We'll cover everything from basic storage to advanced querying techniques, with practical examples you can run in your own environment. By the end of this guide, you'll have a thorough understanding of how to effectively work with JSON in SQL Server.

Setting Up the Environment

Let's start by creating a sample database and tables that we'll use throughout this guide. We'll create two tables: one for user data and another for product information, both storing JSON data. This will help you follow along and experiment with the examples.

-- Create a test database
CREATE DATABASE JsonDemo;
GO
USE JsonDemo;
GO
-- Create a table for storing user data
CREATE TABLE Users (
    UserID INT PRIMARY KEY,
    UserData NVARCHAR(MAX)
);
-- Create a table for storing product catalog
CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductDetails NVARCHAR(MAX)
);
-- Insert sample user data
INSERT INTO Users (UserID, UserData)
VALUES
(1, '{"name": "John Doe", "age": 30, "email": "john@example.com", "address": {"street": "123 Main St", "city": "Boston"}, "skills": ["SQL", "Python"]}'),
(2, '{"name": "Jane Smith", "age": 28, "email": "jane@example.com", "address": {"street": "456 Oak Ave", "city": "Seattle"}, "skills": ["Java", "JavaScript"]}');
-- Insert sample product data
INSERT INTO Products (ProductID, ProductDetails)
VALUES
(1, '{"name": "Gaming Laptop", "price": 1299.99, "specs": {"cpu": "i7", "ram": "16GB", "storage": "1TB SSD"}, "inStock": true}'),
(2, '{"name": "Wireless Mouse", "price": 49.99, "specs": {"dpi": "16000", "buttons": 8}, "inStock": false}');

JSON Storage and Validation

When storing JSON in SQL Server, we use the NVARCHAR(MAX) data type. However, it's crucial to ensure that only valid JSON data is stored. SQL Server provides the ISJSON function that we can use with a CHECK constraint to validate JSON data before insertion. Here's how to implement this validation:

-- Create a table with JSON validation
CREATE TABLE ValidatedUsers (
    UserID INT PRIMARY KEY,
    UserData NVARCHAR(MAX),
    CONSTRAINT CHK_UserData_JSON CHECK (ISJSON(UserData) = 1)
);

Let's test this validation with both valid and invalid JSON:

-- This insert will succeed because the JSON is valid
INSERT INTO ValidatedUsers (UserID, UserData)
VALUES (1, '{"name": "John Doe", "age": 30}');
-- This insert will fail because the JSON is invalid (missing quote)
INSERT INTO ValidatedUsers (UserID, UserData)
VALUES (2, '{"name": "Invalid JSON, missing quote}');

The second INSERT statement will fail because the JSON string is malformed. This demonstrates how the CHECK constraint helps maintain data integrity by preventing invalid JSON from being stored in your database.

Working with JSON Data

SQL Server provides several methods for querying and manipulating JSON data. Let's explore these capabilities, starting with basic extraction and moving to more complex operations.

The JSON_VALUE function is used to extract scalar values (strings, numbers, etc.) from JSON data, while JSON_QUERY retrieves objects and arrays. Here's how to use both:

-- Extract simple values using JSON_VALUE
SELECT
    UserID,
    JSON_VALUE(UserData, '$.name') AS Name,
    JSON_VALUE(UserData, '$.age') AS Age,
    JSON_VALUE(UserData, '$.address.city') AS City
FROM Users;

This query will return:

  • The user's name from the 'name' property
  • Their age from the 'age' property
  • Their city from the nested 'address.city' property

UserID    Name        Age    City
1         John Doe    30     Boston
2         Jane Smith  28     Seattle

For complex properties like objects and arrays, use JSON_QUERY:

-- Extract objects and arrays using JSON_QUERY
SELECT
    UserID,
    JSON_QUERY(UserData, '$.address') AS AddressObject,
    JSON_QUERY(UserData, '$.skills') AS SkillsArray
FROM Users;

This query returns:

  • The complete address object with all its nested properties
  • The skills array containing all the user's skills

UserID    AddressObject                                          SkillsArray
1         {"street": "123 Main St", "city": "Boston"}           ["SQL", "Python"]
2         {"street": "456 Oak Ave", "city": "Seattle"}          ["Java", "JavaScript"]
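
These functions work the same way against the Products table created earlier. Here's a quick sketch that filters on the JSON boolean property; note that JSON_VALUE returns the boolean true as the string 'true':

```sql
-- Extract scalar and object values from the product catalog,
-- returning only products that are in stock
SELECT
    ProductID,
    JSON_VALUE(ProductDetails, '$.name') AS ProductName,
    CAST(JSON_VALUE(ProductDetails, '$.price') AS DECIMAL(10, 2)) AS Price,
    JSON_QUERY(ProductDetails, '$.specs') AS Specs
FROM Products
WHERE JSON_VALUE(ProductDetails, '$.inStock') = 'true';
```

With the sample data, only the Gaming Laptop row is returned, because the Wireless Mouse has "inStock": false.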

Using NVARCHAR(MAX) with Validation

SQL Server stores JSON data as text using the NVARCHAR(MAX) data type, so ensuring data integrity means validating that any stored text is valid JSON. The ISJSON() function returns 1 for valid JSON and 0 for invalid JSON, and placing it in a CHECK constraint, as in the ValidatedUsers table above, automatically validates all JSON data before it's inserted or updated.

The validation works by:

  1. Creating a CHECK constraint that calls ISJSON() on the JSON column
  2. Automatically checking all inserts and updates against this constraint
  3. Rejecting any operations that would store invalid JSON
  4. Allowing only properly formatted JSON strings to be stored

When you run the earlier example, you'll see that:

  • The first INSERT succeeds because it contains well-formed JSON
  • The second INSERT fails with a constraint violation error because the JSON is malformed (missing a closing quote)
  • The CHECK constraint automatically prevents invalid JSON from corrupting your data

This validation ensures that your JSON column always contains parseable JSON data, making it safer to use in queries and preventing data integrity issues down the line.

Advanced Querying with OPENJSON

While JSON_VALUE and JSON_QUERY are great for simple extractions, sometimes you need to transform JSON data into a traditional relational format. This is where OPENJSON comes in. OPENJSON can:

  • Convert JSON properties into regular table columns
  • Flatten nested JSON structures
  • Transform JSON arrays into rows
  • Apply SQL operations to the extracted data

Here's how to use OPENJSON to convert JSON properties to columns:

-- Convert JSON properties to columns
SELECT
    u.UserID,
    JsonData.*
FROM Users u
CROSS APPLY OPENJSON(u.UserData)
WITH (
    Name NVARCHAR(100) '$.name',
    Age INT '$.age',
    Email NVARCHAR(100) '$.email',
    City NVARCHAR(100) '$.address.city'
) AS JsonData;

This query produces a traditional table format:

UserID    Name        Age    Email               City
1         John Doe    30     john@example.com    Boston
2         Jane Smith  28     jane@example.com    Seattle

OPENJSON is particularly useful for working with arrays. Here's how to transform an array into rows:

-- Working with nested arrays
SELECT u.UserID,
    SkillName
FROM Users u
CROSS APPLY OPENJSON(u.UserData, '$.skills') WITH (
    SkillName VARCHAR(50) '$'
) AS Skills;

This query "unpivots" the skills array into separate rows:

UserID    SkillName
1         SQL
1         Python
2         Java
2         JavaScript

These transformations make it easier to:

  • Join JSON data with regular tables
  • Apply grouping and aggregation
  • Create reports and analytics
  • Export data to systems that expect traditional row-column format

The results will show the data in a traditional relational format, making it easier to use in reports or applications.
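
For instance, once OPENJSON has turned the skills array into rows, ordinary GROUP BY aggregation applies. Here's a small sketch against the sample Users table that counts how many users list each skill:

```sql
-- Count how many users list each skill
SELECT
    s.SkillName,
    COUNT(*) AS UserCount
FROM Users u
CROSS APPLY OPENJSON(u.UserData, '$.skills')
WITH (
    SkillName NVARCHAR(50) '$'
) AS s
GROUP BY s.SkillName
ORDER BY UserCount DESC, s.SkillName;
```

With only the two sample users, each skill appears once; on a real dataset this pattern gives you skill frequencies directly from the JSON.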

Modifying JSON Data

SQL Server provides the JSON_MODIFY function for updating JSON data. It is your primary tool for making changes to JSON documents, and it offers several powerful capabilities:

  1. Modifying existing values
  2. Adding new properties
  3. Updating nested objects
  4. Managing arrays
  5. Removing properties

Let's explore each type of modification and understand how they work.

The JSON_MODIFY function takes three parameters:

  • The JSON document to modify
  • The path to the property you want to change
  • The new value to set

The syntax is:

JSON_MODIFY(<jsonDocument>, <path>, <value>)

SQL Server provides several ways to modify JSON data. Here are common scenarios with examples:

To update a basic property, specify the direct path to that property:

-- Update a simple property
UPDATE Users
SET UserData = JSON_MODIFY(UserData, '$.age', 31)
WHERE UserID = 1;
-- Before:
-- {"name": "John Doe", "age": 30, ...}
-- After:
-- {"name": "John Doe", "age": 31, ...}

JSON_MODIFY will automatically create a new property if it doesn't exist. Here we add a phone number as a new property at the top level of the JSON document.

-- Add a new property
UPDATE Users
SET UserData = JSON_MODIFY(UserData, '$.phoneNumber', '555-0123')
WHERE UserID = 1;
-- Before:
-- {"name": "John Doe", "age": 31, ...}
-- After:
-- {"name": "John Doe", "age": 31, "phoneNumber": "555-0123", ...}

Working with Nested Objects

You can navigate to nested properties using dot notation, which traverses the structure of the JSON object by specifying the hierarchy of keys from the root to the desired property, with each key separated by a dot (.).

For instance:

  • $ represents the root of the JSON object.
  • .address accesses the address key within the object.
  • .zipCode drills further into the address object to access the zipCode property.

Putting this together, updating a deeply nested property is a single call:

-- Update nested object property
UPDATE Users
SET UserData = JSON_MODIFY(UserData, '$.address.zipCode', '02108')
WHERE UserID = 1;
-- Before:
-- {"address": {"street": "123 Main St", "city": "Boston"}}
-- After:
-- {"address": {"street": "123 Main St", "city": "Boston", "zipCode": "02108"}}

  1. The JSON_MODIFY function updates the zipCode property of the address object within the UserData JSON column.
  2. The $ symbol specifies the root of the JSON object, and the path $.address.zipCode specifies the property to be updated.
  3. The new value '02108' is assigned to the zip code property.

This approach makes it straightforward to work with JSON data, even when the structure is complex or deeply nested.

Array Operations

SQL Server provides a special syntax for modifying JSON arrays with JSON_MODIFY. One such operation is appending a value to an array, using the append keyword, which instructs SQL Server to add a new value at the end of the specified JSON array.

Here’s a step-by-step explanation of the process:

  1. Target the array: The $ symbol represents the root of the JSON object, and the path specifies the key that contains the array ($.skills in this case).
  2. Specify the operation: The append keyword ensures the new value will be added to the end of the existing array, rather than replacing its contents.
  3. Provide the new value: The value to be appended (e.g., 'TypeScript') is included as an argument.

Here is the operation in practice:

-- Append to an array
UPDATE Users
SET UserData = JSON_MODIFY(UserData, 'append $.skills', 'TypeScript')
WHERE UserID = 1;
-- Before:
-- {"skills": ["SQL", "Python"]}
-- After:
-- {"skills": ["SQL", "Python", "TypeScript"]}

Removing Properties

To remove a property, set its value to NULL:

-- Remove a property
UPDATE Users
SET UserData = JSON_MODIFY(UserData, '$.phoneNumber', NULL)
WHERE UserID = 1;
-- Before:
-- {"name": "John", "phoneNumber": "555-0123"}
-- After:
-- {"name": "John"}

After each modification, you can verify the changes:

-- Check the modified data
SELECT 
    UserID,
    UserData,
    JSON_VALUE(UserData, '$.age') AS Age,
    JSON_QUERY(UserData, '$.skills') AS Skills
FROM Users
WHERE UserID = 1;

The JSON_MODIFY function:

  • Raises an error if the input is not valid JSON (it returns NULL only when the input document itself is NULL)
  • Creates the target property automatically if it doesn't exist (in the default lax path mode)
  • Removes a property when its value is set to NULL (again, in lax mode)

This comprehensive approach to JSON modifications allows you to:

  • Maintain data integrity while updating
  • Handle complex nested structures
  • Manage arrays effectively
  • Track changes to your JSON documents
  • Ensure your modifications are successful
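
Because JSON_MODIFY returns the modified document, calls can be nested to change several properties in a single statement. Here's a sketch against the sample Users table (the email value is just illustrative):

```sql
-- Update the age and email in one UPDATE by nesting JSON_MODIFY calls
UPDATE Users
SET UserData = JSON_MODIFY(
                   JSON_MODIFY(UserData, '$.age', 32),
                   '$.email', 'john.doe@example.com')  -- illustrative value
WHERE UserID = 1;
```

The inner call runs first, and its result becomes the input document for the outer call, so the order of nesting reads inside-out.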

Performance Optimization

When working with JSON data in SQL Server, performance considerations become crucial as your data grows. Here are key techniques for optimizing JSON queries.

Creating Computed Columns and Indexes

For frequently queried JSON properties, creating computed columns and indexing them can significantly improve query performance. Let’s break this process down step by step and demonstrate how it works.

A computed column is a virtual column derived from other columns in the same table. In this example, we extract specific JSON properties ($.name and $.address.city) using the JSON_VALUE() function and expose them as computed columns. The process takes a few steps.

Add Computed Columns

Extract the JSON properties using JSON_VALUE() and define them as computed columns.

ALTER TABLE Users
ADD UserName AS JSON_VALUE(UserData, '$.name'),
    UserCity AS JSON_VALUE(UserData, '$.address.city');

These columns are “computed” from the JSON data stored in the UserData column.

Create Indexes on Computed Columns

Indexing these computed columns allows SQL Server to optimize queries that filter or search on these properties.

CREATE INDEX IX_Users_UserName ON Users(UserName);
CREATE INDEX IX_Users_UserCity ON Users(UserCity);

Run Queries Using the Indexed Columns

Queries referencing the computed columns will benefit from the indexes:

SELECT UserID, UserName
FROM Users
WHERE UserName LIKE 'J%';

Comparing Query Performance

To measure the impact of indexing computed columns, let’s compare two scenarios:

Query Using the Computed Column

SELECT UserID
FROM Users
WHERE UserName = 'John Doe';

This query leverages the index on UserName, resulting in faster execution.

Query Using JSON_VALUE() Directly

SELECT UserID
FROM Users
WHERE JSON_VALUE(UserData, '$.name') = 'John Doe';

Without an index, this query must parse the JSON data for each row, making it significantly slower.

Results: Why Is This Better?

  • Performance: Queries on indexed computed columns are much faster than those using JSON_VALUE() directly, as the latter involves row-by-row JSON parsing.
  • Execution Plan: Use SET STATISTICS IO ON; and SET STATISTICS TIME ON; to compare the number of logical reads and CPU time between the two methods. You’ll observe that the indexed query uses fewer resources.
  • Scalability: When querying large datasets, the performance improvement becomes even more pronounced.

As a practical tip, use computed columns and indexing for JSON properties that are frequently accessed or filtered. This technique transforms semi-structured JSON data into a structure that SQL Server can optimize efficiently.

Performance Testing

The query using the indexed computed column (ValueColumn, created in the script below) will be significantly faster than the query that uses JSON_VALUE directly.

  • The indexed query benefits from SQL Server's ability to use the index to quickly find matching rows, reducing the number of logical reads and overall execution time.
  • The non-indexed query requires SQL Server to perform a full table scan, extracting the JSON property on the fly for each row, which is much more resource-intensive.

Here's a script to test performance differences between indexed and non-indexed JSON queries:

-- Create a large test dataset
CREATE TABLE PerformanceTest (
    ID INT IDENTITY PRIMARY KEY,
    JsonData NVARCHAR(MAX)
);
-- Insert test data (example with 1000 rows)
DECLARE @i INT = 1;
WHILE @i <= 1000
BEGIN
    INSERT INTO PerformanceTest (JsonData)
    VALUES (
        JSON_MODIFY(
            JSON_MODIFY(
                JSON_MODIFY('{}', '$.id', @i),
                '$.value', 'Test' + CAST(@i AS VARCHAR(10))
            ),
            '$.timestamp', GETDATE()
        )
    );
    SET @i += 1;
END;
-- Add computed column and index
ALTER TABLE PerformanceTest
ADD ValueColumn AS JSON_VALUE(JsonData, '$.value');
CREATE INDEX IX_PerformanceTest_Value ON PerformanceTest(ValueColumn);
-- Compare performance
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Query with index
SELECT ID
FROM PerformanceTest
WHERE ValueColumn = 'Test500';
-- Query without index
SELECT ID
FROM PerformanceTest
WHERE JSON_VALUE(JsonData, '$.value') = 'Test500';

The indexed query will typically show significantly better performance metrics in terms of logical reads and execution time.

Best Practices and Guidelines

While SQL Server doesn’t natively support JSON schema validation, you can create a custom stored procedure to ensure JSON data adheres to your expected schema. Here’s an example procedure to validate JSON structure and required fields:

CREATE PROCEDURE ValidateUserJson
    @JsonData NVARCHAR(MAX)
AS
BEGIN
    IF (ISJSON(@JsonData) = 0)
        THROW 50000, 'Invalid JSON format', 1;

    IF NOT EXISTS (
        SELECT 1
        FROM OPENJSON(@JsonData)
        WITH (
            name NVARCHAR(100) '$.name',
            age INT '$.age',
            email NVARCHAR(100) '$.email'
        )
        WHERE name IS NOT NULL
          AND age IS NOT NULL
          AND email IS NOT NULL
    )
        THROW 50000, 'Missing required properties', 1;

    RETURN 1;
END;
GO
-- Test the validation (this call fails because email is missing)
BEGIN TRY
    EXEC ValidateUserJson '{"name": "John", "age": 30}';
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS ValidationError;
END CATCH;

How Does This Work?

  1. Validating JSON Format: The ISJSON() function checks if the input string is a valid JSON object. If it’s not, the procedure raises an error using the THROW statement, providing the message 'Invalid JSON format'.
  2. Checking for Required Properties:
    1. The OPENJSON() function parses the JSON input and maps its structure to a relational format.
    2. The WITH clause defines the expected schema, mapping keys like $.name, $.age, and $.email to SQL data types.
    3. A WHERE clause ensures all required properties (name, age, email) are present and not null.
    4. If any property is missing, the procedure raises an error with the message 'Missing required properties'.
  3. Successful Validation: If the JSON passes both checks, the procedure returns 1, signaling successful validation.
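
For contrast, a document that includes all three required properties passes both checks, and the procedure returns 1:

```sql
-- This call succeeds because name, age, and email are all present
EXEC ValidateUserJson '{"name": "John", "age": 30, "email": "john@example.com"}';
```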

Error Handling

Error handling in SQL Server (MSSQL) for JSON data ensures that invalid or unexpected JSON input does not disrupt application functionality, corrupt data, or lead to runtime errors. Here’s how error handling works and why it’s essential when dealing with JSON. Let’s break down the following example to see how error handling is applied.

Always implement proper error handling when working with JSON data:

BEGIN TRY
    -- Attempt to insert invalid JSON
    INSERT INTO ValidatedUsers (UserID, UserData)
    VALUES (3, '{"name": "Bad JSON}');
END TRY
BEGIN CATCH
    SELECT
        ERROR_NUMBER() AS ErrorNumber,
        ERROR_MESSAGE() AS ErrorMessage,
        ERROR_SEVERITY() AS ErrorSeverity,
        ERROR_STATE() AS ErrorState,
        ERROR_LINE() AS ErrorLine,
        ERROR_PROCEDURE() AS ErrorProcedure;
END CATCH;

In the BEGIN TRY Block, the code attempts to insert data into the ValidatedUsers table. The UserData column expects valid JSON, but the value provided ('{"name": "Bad JSON}') is malformed.

For the BEGIN CATCH Block, if an error occurs during the INSERT operation, control transfers to the CATCH block. This block executes and retrieves error details using functions like ERROR_MESSAGE() and ERROR_LINE().

The CATCH block outputs valuable debugging information. This is the data captured:

  • ErrorNumber: Unique identifier for the error.
  • ErrorMessage: Description of the error.
  • ErrorSeverity: Level of severity (e.g., critical or informational).
  • ErrorState: Internal SQL Server state linked to the error.
  • ErrorLine: Line number where the error occurred.
  • ErrorProcedure: Name of the procedure (if applicable) causing the error.

Why Is This Important?

In this example, invalid JSON triggers an error, and the CATCH block captures its details. Without error handling, such issues could remain undetected, causing data corruption or system crashes. By proactively capturing errors, developers can log them, alert administrators, and design recovery mechanisms.

As a practical tip, when working with JSON in SQL Server, validate your data before attempting to insert it. Consider using functions like ISJSON() to ensure the data format is correct.
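
A simple pre-check with ISJSON() lets the application reject bad input gracefully instead of relying on the constraint error. A minimal sketch (UserID 4 is just an illustrative key):

```sql
DECLARE @json NVARCHAR(MAX) = '{"name": "Alice", "age": 25}';

IF ISJSON(@json) = 1
    INSERT INTO ValidatedUsers (UserID, UserData) VALUES (4, @json);
ELSE
    PRINT 'Rejected: input is not valid JSON';
```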

Conclusion

SQL Server's JSON capabilities provide a robust foundation for working with semi-structured data within a relational database. By following the best practices and examples outlined in this guide, you can effectively implement JSON storage and querying in your applications while maintaining good performance and data integrity.

Remember these key takeaways:

  1. Always validate JSON data before storage
  2. Use computed columns and indexes for frequently queried properties
  3. Consider the performance implications of JSON operations on large datasets
  4. Implement proper error handling for JSON operations
  5. Use appropriate JSON functions (JSON_VALUE, JSON_QUERY, OPENJSON) based on your specific needs

With these tools and techniques, you can successfully integrate JSON capabilities into your SQL Server applications while maintaining the benefits of a relational database system.
