This article is just a short follow up of previous article (http://www.sqlservercentral.com/articles/SQL+Server/158068/) regarding filegroup restoring and it describes one issue with join elimination optimization.
Filegroup restore Scenario
The scenario described in previous article assumes that
- there are two types of tables (tables with PRIMARY KEY constraints and tables with FOREIGN KEY constraints) placed on different filegroups (example: dbo.TablePK on SECONDARY01 and dbo.TableFK on SECONDARY02)
- at moment T, a full backup is taken (all filegroups included) with following states: dbo.TablePK has two rows (1 and 2), dbo.TableFK has one row (1)
- at moment T+1, new rows are inserted into both tables. Now, current state is as follow: dbo.TablePK has three rows (1, 2 and 3), dbo.TableFK has three rows (1, 2 and 3)
- at moment T+2, a restore of filegroup SECONDARY01 (used only by dbo.TablePK) is initiated and current state becomes as follow: dbo.TablePK has only 2 rows (1 and 2) because of filegroup restore (using backup from moment T) and dbo.TableFK (filegroup SECONDARY02) is unchanged having three rows (1, 2, 3).
After these steps, tables with foreign keys could become inconsistent because some rows from the PRIMARY KEY tables could be missing. In this case the row 3 from dbo.TableFK has a missing reference to dbo.TablePK, which now, at time T+2, has only two rows (1 and 2). Another important thing is that the state of foreign keys contraints (sys.foreign_keys.is_not_trusted) is unchanged, in this scenario the foreign key constraint remains trusted (is_no_trusted = 0).
Join elimination and One Possible Issue
Let's create a simple view using both tables and then execute a query using this view as data source:
USE FGRestore GO SELECT *, 'TablePK' AS TableName FROM dbo.TablePK SELECT *, 'TableFK' AS TableName FROM dbo.TableFK GO CREATE VIEW dbo.SourceData AS SELECT x.ID AS PK_ID, y.ID AS FK_ID FROM dbo.TablePK x INNER JOIN dbo.TableFK y ON x.ID = y.ID GO SELECT sd.PK_ID, sd.FK_ID FROM dbo.SourceData sd GO
For the first two queries we get the following results
For the last query, we get just two rows (as expected):
In this case, the execution plan includes two data access operator (one Table Scan on dbo.TableFK and another Clustered Index Seek on dbo.TablePK) as expected:
Let's change a little bit final query by removing first column (PK_ID) from SELECT clause:
SELECT sd.FK_ID FROM dbo. SourceData sd GO
This time, in a unexpected turn, we are getting not two, but three rows:
These results are wrong. Because of the INNER JOIN, we should get only common IDs; the two IDs in our case.
The execution plan is also changed. Instead of having two data access operators, the execution plan includes just one data access operator (for dbo.TableFK):
This behavior is caused by join elimination optimization described in following blog post https://blogs.msdn.microsoft.com/conor_cunningham_msft/2009/11/12/conor-vs-foreign-key-join-elimination/
According to above blog post this optimization happens only when foreign key constraint is trusted. This is our case because
SELECT fkeys.name, fkeys.is_not_trusted, fkeys.is_disabled FROM sys.foreign_keys fkeys WHERE fkeys. parent_object_id = OBJECT_ID('dbo.TableFK')
returns
On short term, one solution to avoid this behavior would be to transform trusted foreign key constraint into untrusted ones after a filegroup restore:
-- Disable constraint ALTER TABLE dbo. TableFK NOCHECK CONSTRAINT [FK__TableFK__ID__24927208] GO -- Re-enable constraint (CHECK) without verying data (WITH NOCHECK) ALTER TABLE dbo. TableFK WITH NOCHECK CHECK CONSTRAINT [FK__TableFK__ID__24927208] GO
The status of the foreign key is changed now:
and we get the expected results:
Also, the execution plan is changed, including a Nested Loops (Inner Join)
The medium/long term solution should be be, from my point of view, to find all missing references by using
and then changing data and re-enabling disabled constraints.
Final conclusion
After a filegroup restore, not only could the database become inconsistent, but we could also get the wrong query results if the restored filegroup contains tables referenced by tables from other filegroups. This article shows how to avoid this issue.