This blog is intended to be a follow up from the SQL Saturday 2022 in Oregon & SW Washington. In that session I presented an introduction to FHIR and JSON data produced from the Azure Health Services API’s.
With the recent updated mandates in the healthcare environment in the United States, Microsoft has continued to expand its capability to support the FHIR standard for integrating healthcare data. While the standard is well documented and Microsoft’s capabilities are expansive, it falls on data professionals to interpret that data and build meaningful reports and produce meaningful insights from the data as it is collected and integrated across environments. This requires a good working knowledge of JSON in SQL to manipulate complex data models. In the session, we did a short review of the FHIR standard and the overall implementation of FHIR in Azure. From there we reviewed the resulting data in the data lake and in Synapse. That was followed up with an overview into the heart of complex SQL using JSON functions in Synapse. Whether or not you are active in healthcare today, this will be an enlightening session on how to use JSON SQL functions within the Azure SQL platforms.
What is FHIR and why should you care?
FHIR stands for Fast Healthcare Interoperability Resources. this is the latest specification for interoperability in healthcare produced by HL7. To be clear the word fast has nothing to do with performance, but more about the ability to implement and integrate data quickly. With the latest regulations around the world in health care, this standard is the established standard for integrating healthcare data and we’ll continue to be on the forefront of this work. If you do any work in health care, you will need to understand FHIR because you will likely run across data formatted to the standard from many different sources.
FHIR is very well documented. In many ways when the standard is properly followed the JSON documents or other supported formats are effectively self-documenting. It is commonly understood that the core FHIR specification handles about 80% of the use cases in healthcare. It is designed to be flexible so that it can support specialized needs within regions or healthcare areas. For example, in the US there is a need to support race and ethnicity. The U.S. Core Implementation Guide provides guidance on the specification enhancements to support this need for U.S. healthcare organizations. You will find similar support for other countries as well as specific implementations for healthcare vendors such as Epic.
Neither the notebook, the presentation, or this blog is expected to be and exhaustive coverage of FHIR. before we move on to some of the other implementation pieces, it is important to understand one key aspect of FHIR is the basic building block called a resource. A resource is the core exchangeable content within the specification. All resources share the following characteristics:
- A common way to define and represent the resource including data types and patterns
- A common set of metadata which can be discovered easily
- A human readable part
For more detailed information on the supported resources and other details around FHIR implementation, you should visit the following website:
Azure Health Services and the FHIR API
I will not be digging into a lot of the health care services information nor the FHIR support within Azure in this post. The important things to understand is that Microsoft has made a concerted effort to support this specification which includes technology and architectures for the extraction of data from various healthcare systems which will then use the FHIR APIs to standardize that extracted data into the FHIR spec typically in JSON files in the data lake. Because of the standardized format, Microsoft is able to supply a set of common schemas that can be used in serverless synapse to create external tables and views to accelerate the implementation and usage of data produced from the APIs. It is from this starting point that we are able to start working with the data in reporting and analytics solutions.
At this point I want to put a plug in for the company I work for. If you're interested in learning how Azure health services and the FHIR specification can be implemented at your company, we have FHIR Quick Start and FHIR Data Blueprint solutions. These solutions have been used by many other customers to achieve high levels of integration in their health care data estate. If you're interested in learning more, please reach out to us at: https://3cloudsolutions.com/get-started/
Working with the data from the FHIR API using JSON in SQL
As noted in the previous section, Azure Health Services comes with setup serverless tables and views to be used with the extracted data. However due to the complexity of FHIR, there are a number of columns within those tables and views which still contain JSON snippets. For example, there is one field for name which has several objects and arrays to support the specification. You cannot simply select the name from the table and use that as you move forward. There are many different fields like this throughout the data. For the rest of this blog and in the notebook, we will work through a number of scenarios to build a view of the patient resource that can be used for simple reporting. This view will contain a few JSON functions from SQL Server and solve simple to complex scenarios in the illustration.
The functions we will be using:
- ISJSON
- JSON_VALUE
- OPENJSON
In addition to these functions, we will also be using the CROSS APPLY operator in SQL to join our data with relational data.
The examples in the notebook are built on the tables resulting from working with the Azure FHIR API. I am unable to provide a sample of the data to use with the set of information in the notebook currently. However, the SQL will work if you have your own FHIR implementation and a Patient resource to work with. rather than rewrite the entire contents of the notebook in the blog post, here is a link to the notebook.
If you plan to implement this in the same way, you will need Azure Data Lake, Azure Synapse serverless, and Azure Data Studio. the notebook can be opened in Azure Data Studio. If you are unfamiliar with working with notebooks inside of Azure Data Studio, you are not alone. Check out this post which discusses how to implement your first notebook in Azure Data Studio.
Building our view and SQL with JSON functions
If you decide not to open the notebook but are curious what the view looks like here is a finished product that we created in the notebook.
SELECT TOP (20) p.resourceType + '/' + p.id as PatientResourceID
, p.resourceType as ResourceType
, p.id as ResourceID
, cast(p.[meta.versionId] as int) as VersionID
, cast(p.[meta.lastUpdated] as DATETIME2(7)) as LastUpdated
, JSON_VALUE(p.[name], '$[0].family') as LastName
, JSON_VALUE(p.[name], '$[0].given[0]') as FirstName
, cast(p.active as bit) as IsActive
, p.gender as Gender
, CAST(p.birthDate as date) as BirthDate
, CASE WHEN p.[maritalStatus.coding] is null THEN NULL
WHEN JSON_VALUE(p.[maritalStatus.coding], '$[0].system') = 'http://terminology.hl7.org/CodeSystem/v3-MaritalStatus'
THEN JSON_VALUE(p.[maritalStatus.coding], '$[0].code')
ELSE NULL
END as MaritalStatus
, CASE WHEN JSON_VALUE(p.[address], '$[0].use') = 'home' THEN JSON_VALUE(p.[address], '$[0].state')
WHEN JSON_VALUE(p.[address], '$[1].use') = 'home' THEN JSON_VALUE(p.[address], '$[1].state')
WHEN JSON_VALUE(p.[address], '$[2].use') = 'home' THEN JSON_VALUE(p.[address], '$[2].state')
WHEN JSON_VALUE(p.[address], '$[3].use') = 'home' THEN JSON_VALUE(p.[address], '$[3].state')
ELSE NULL
END as HomeStateOrProvince
, e.Ethnicity
, r.Race
FROM fhir.Patient p
INNER JOIN (SELECT id, max([meta.versionId]) as currentVersion FROM fhir.Patient GROUP BY id) cp
ON p.[meta.versionId] = cp.currentVersion
AND p.id = cp.id
LEFT JOIN
(SELECT p.id
, CASE WHEN JSON_VALUE(ext.value,'$.extension[0].url') = 'ombCategory'
THEN
CASE WHEN JSON_VALUE(ext.value, '$.extension[1].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[1].valueString')
WHEN JSON_VALUE(ext.value, '$.extension[0].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[0].valueString')
ELSE JSON_VALUE(ext.value, '$.extension[0].valueCoding.display')
END
ELSE JSON_VALUE(ext.value, '$.valueCodeableConcept.coding[0].display')
END AS Ethnicity
FROM
(
SELECT fp.id, fp.extension FROM fhir.Patient fp
INNER JOIN (SELECT id, max([meta.versionId]) as currentVersion FROM fhir.Patient GROUP BY id) cp
ON fp.[meta.versionId] = cp.currentVersion
AND fp.id = cp.id
WHERE ISJSON(fp.extension) =1
) p
CROSS APPLY
OPENJSON(p.extension,'$'
) as ext
WHERE JSON_VALUE(ext.value,'$.url') = 'http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity'
) e on e.id = p.id
LEFT JOIN
(SELECT p.id
, CASE WHEN JSON_VALUE(ext.value,'$.extension[0].url') = 'ombCategory'
THEN
CASE WHEN JSON_VALUE(ext.value, '$.extension[3].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[3].valueString')
WHEN JSON_VALUE(ext.value, '$.extension[2].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[2].valueString')
WHEN JSON_VALUE(ext.value, '$.extension[1].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[1].valueString')
WHEN JSON_VALUE(ext.value, '$.extension[0].valueString') IS NOT NULL THEN JSON_VALUE(ext.value, '$.extension[0].valueString')
ELSE JSON_VALUE(ext.value, '$.extension[0].valueCoding.display')
END
ELSE JSON_VALUE(ext.value, '$.valueCodeableConcept.coding[0].display')
END AS Race
FROM
(
SELECT fp.id, fp.extension FROM fhir.Patient fp
INNER JOIN (SELECT id, max([meta.versionId]) as currentVersion FROM fhir.Patient GROUP BY id) cp
ON fp.[meta.versionId] = cp.currentVersion
AND fp.id = cp.id
WHERE ISJSON(fp.extension) =1
) p
CROSS APPLY
OPENJSON(p.extension,'$'
) as ext
WHERE JSON_VALUE(ext.value,'$.url') = 'http://hl7.org/fhir/us/core/StructureDefinition/us-core-race'
) as r on r.id = p.id
Here is a sample of the results from that view:
PatientResourceID | ResourceType | ResourceID | VersionID | LastUpdated | LastName | FirstName | IsActive | Gender | BirthDate | MaritalStatus | HomeStateOrProvince | Ethnicity | Race |
Patient/d8af7bfa-5008-4a0f-85d1-0af3448a31dd | Patient | d8af7bfa-5008-4a0f-85d1-0af3448a31dd | 2 | 2022-05-31 18:07:03.2150000 | DUCK | DONALD | 1 | male | 1965-07-14 | NULL | ON | NULL | NULL |
Patient/78cf7725-a0e1-44a4-94d4-055482781afb | Patient | 78cf7725-a0e1-44a4-94d4-055482781afb | 1 | 2022-05-31 18:07:30.7490000 | Gretzky | Wayne | NULL | NULL | 1990-05-31 | NULL | NULL | NULL | NULL |
Patient/9e909e52-61a1-be50-1878-a12ef8c36346 | Patient | 9e909e52-61a1-be50-1878-a12ef8c36346 | 4 | 2022-05-31 18:39:58.1780000 | EVERYMAN | ADAM | NULL | male | 1988-08-18 | M | NULL | Non Hispanic or Latino | White+Asian |
Patient/585f3cc0-c727-4989-9214-a7a7b60a2ade | Patient | 585f3cc0-c727-4989-9214-a7a7b60a2ade | 1 | 2022-05-31 13:14:57.0640000 | DUCK | DONALD | 1 | male | 1965-07-15 | NULL | ON | NULL | NULL |
Patient/29a819c4-f553-8189-2354-9441b86d37ef | Patient | 29a819c4-f553-8189-2354-9441b86d37ef | 1 | 2022-05-18 15:18:40.1560000 | FORD | ELAINE | NULL | female | 1992-03-10 | NULL | NULL | NULL | NULL |
Patient/d5fe6802-a680-e762-8f43-9659340b00ac | Patient | d5fe6802-a680-e762-8f43-9659340b00ac | 3 | 2022-05-18 14:39:52.2550000 | EVERYMAN | ADAM | NULL | male | 1961-06-15 | S | NULL | NULL | C |
Patient/4d661053-a8d0-148c-7023-54508fd04a52 | Patient | 4d661053-a8d0-148c-7023-54508fd04a52 | 1 | 2022-05-21 13:48:24.9720000 | EVERYMAN | sam | NULL | male | 1966-05-07 | M | NULL | Not Hispanic or Latino | White |
Wrapping it up
As you can see, understanding the specification well enough to build a complex SQL statement using JSON functions is required to work within FHIR effectively. Due to the complex nature of the nested JSON, you may not be able to reconcile this in tools such as power BI. Being able to build this out in SQL guarantees that you have provided you will report writers and analysts with a solid result set which can be used with confidence.
Resources summary:
- Link to notebook
- Link to FHIR spec
- Link to 3Cloud FHIR
- Link to Azure Health Services