Problem
Combining overlapping intervals in order to compute the missing intervals could be a complex task for advanced programmers. This article presents a new solution based on usage of geometry data type and on a unknown function (if not, maybe less used function) specific designed for geometry objects.
Sample: Give following intervals [1, 5], [64, 70], [66, 72], [150, 200], [130, 300], the question is: what are the missing intervals of numbers? In this sample the merged intervals are [1, 5], [64, 72], [130, 300] and then the missing intervals are [6, 63], [73, 129].
Proposed Solution
To find out the missing intervals, the first step is to transform every interval (Start, End) into a POLYGON (spatial object) thus:
POLYGON((Start 0, End 0, End 1, Start 1, Start 0)): (Start 1)-(End 1) | | (Start 0)-(End 0)
Thus, first range (1, 5) become POLYGON((1 0,5 0,5 1,1 1,1 0)):
DECLARE @pol GEOMETRY = 'POLYGON((1 0,5 0,5 1,1 1,1 0))' SELECT @pol AS IntAsObject, @pol.STAsText() AS [Description]
Then, these polygons will be merged in order to computed the non-overlapping intervals using the UnionAggregate() function:
INSERT INTO @tab(Col1) SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((1 0,5 0,5 1,1 1,1 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((64 0,70 0,70 1,64 1,64 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((66 0,72 0,72 1,66 1,66 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((150 0,200 0,200 1,150 1,150 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((130 0,300 0,300 1,130 1,130 0))', 0); DECLARE @ranges GEOMETRY; SELECT @pol = GEOMETRY::UnionAggregate(t.Col1) FROM @tab t SELECT @pol.STAsText();
The end result is a MULTIPOLYGON object containing non-overlapping polygons:
MULTIPOLYGON
(
((130 0, 150 0, 200 0, 300 0, 300 1, 200 1, 150 1, 130 1, 130 0)),
((64 0, 66 0, 70 0, 72 0, 72 1, 70 1, 66 1, 64 1, 64 0)),
((1 0, 5 0, 5 1, 1 1, 1 0))
)
This new object is nothing more than a collection of polygons having intermediary points. Next step is to compute the bounding box of MULTIPOLYGON using STEnvelope() function, the end results being a polygon:
POLYGON ((1 0, 300 0, 300 1, 1 1, 1 0)).
Next step (which is also the final step) is to compute the missing polygons from this envelope using the STSymDifference() function like this:
SELECT @ranges = @pol.STEnvelope().STSymDifference(@pol); SELECT @ranges.STAsText();
The end results being a new MULTIPOLYGON spatial object
MULTIPOLYGON ( ((72 0, 130 0, 130 1, 72 1, 72 0)), ((5 0, 64 0, 64 1, 5 1, 5 0)) )
At this moment, we have to extract the start point and the end point of every polygon([72, 130], [5, 64]) and then:
- Add +1 to start point ([73, 130], [6, 64]) and
- Subtract -1 from end point ([73, 129], [6, 63]).
Source Code
The final source code is:
DECLARE @pol GEOMETRY DECLARE @tab TABLE(Col1 GEOMETRY); INSERT INTO @tab(Col1) SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((1 0,5 0,5 1,1 1,1 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((64 0,70 0,70 1,64 1,64 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((66 0,72 0,72 1,66 1,66 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((150 0,200 0,200 1,150 1,150 0))', 0) UNION ALL SELECT RangePol = GEOMETRY::STPolyFromText('POLYGON((130 0,300 0,300 1,130 1,130 0))', 0); DECLARE @ranges GEOMETRY; SELECT @pol = GEOMETRY::UnionAggregate(t.Col1) FROM @tab t SELECT @pol.STAsText(); SELECT @pol.STEnvelope().STAsText() SELECT @ranges = @pol.STEnvelope().STSymDifference(@pol); SELECT @ranges.STAsText(); SELECT Range_N = val.number, Range_Polygon = @ranges.STGeometryN(val.number), Range_PolygonT = @ranges.STGeometryN(val.number).ToString(), Range_Start = @ranges.STGeometryN(val.number).STPointN(1).STX+1, Range_End = @ranges.STGeometryN(val.number).STPointN(2).STX-1 FROM master.dbo.spt_values val WHERE val.type = N'P' AND val.number >= 1 AND val.number <= @ranges.STNumGeometries();
Final Conclusions
This article describe a new solution that could be used to discover for a collection of overlapping intervals the missing intervals. This new solution is based on the usage of POLYGON, MULTIPOLYGON spatial objects and also on following functions specific designed for geometry objects: UnionAggregate, STEnvelope, STSymDifference, STNumGeometries(), STGeometryN().
Note: Please read also this article https://www.sqlservercentral.com/articles/collections