Introduction
In the article, named Incremental Data loading through ADF using Change Tracking – SQLServerCentral, I discussed the step-by-step process for incremental data loading using Change Tracking. Incremental data loading was shown for a table, named Student, from an on-premise SQL Server to an Azure SQL database. In this article, I will perform the incremental data loading for three related tables using Azure Data Factory and Change Tracking.
Implementation
I will create three tables, named Faculty, Department, and Student, in an on-premise SQL database. I will create three tables, named AFaculty, ADepartment, and AStudent in Azure SQL database with a similar structure to the on-premise tables. Then, I will do the incremental data loading from the on-premise SQL tables to the Azure SQL tables.
Configuration and Table Creation in SQL Server
I start SSMS and connect to the existing on-premise SQL Server. I open a SQL script in the existing database, named ResearchWork. First, I alter the database to set change tracking. Then I drop the tables Student, Department, and Faculty in order, in case the tables already exist.
The Student table has a foreign key, which references the Department table. The Department has a foreign key, which references Faculty table. So, we need to maintain the order of the tables while using DROP, CREATE, INSERT, UPDATE, DELETE statements on the tables.
After dropping the tables, I create the tables Faculty, Department, and Student in order. Next, I alter the three tables to enable Change Tracking.
--1. change tracking set to ON for the database ALTER DATABASE ResearchWork SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON) --2. drop the table Student IF OBJECT_ID('dbo.Student','U') IS NOT NULL DROP TABLE dbo.Student --3. drop the table Department IF OBJECT_ID('dbo.Department','U') IS NOT NULL DROP TABLE dbo.Department --4. drop the table Faculty IF OBJECT_ID('dbo.Faculty','U') IS NOT NULL DROP TABLE dbo.Faculty --5. create the table Faculty CREATE TABLE dbo.Faculty ( facultyId INT IDENTITY(1,1) NOT NULL PRIMARY KEY, facultyName VARCHAR(50), DOJ Date, createDate datetime2 NULL, updateDate datetime2 NULL ) --6. create the table Department CREATE TABLE dbo.Department ( deptId INT IDENTITY(1,1) NOT NULL PRIMARY KEY, deptName VARCHAR(50), deptHeadId INT NOT NULL FOREIGN KEY REFERENCES dbo.Faculty (facultyId), createDate datetime2 NULL, updateDate datetime2 NULL ) --7. create the table Student CREATE TABLE dbo.Student ( studentId int IDENTITY(1,1) NOT NULL PRIMARY KEY, studentName varchar (100) NULL, deptId INT NOT NULL FOREIGN KEY REFERENCES dbo.Department (deptId), marks int NULL, createDate datetime2 NULL, updateDate datetime2 NULL ) GO --8. alter the Student table to set change tracking ALTER TABLE dbo.Student ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON) --9. alter the Department table to set change tracking ALTER TABLE dbo.Department ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON) --10. alter the Faculty table to set change tracking ALTER TABLE dbo.Faculty ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON)
Create and Populate the ChangeTrackingStore Table
I create a table, named ChangeTrackingStore, to maintain the change tracking version information for the three tables.
I retrieve the change tracking version associated with the last committed transaction of the database and insert this value for all three tables into ChangeTrackingStore. This version information will be used as a reference for reflecting the changes from source to target tables.
--11. drop the table ChangeTrackingStore IF OBJECT_ID('dbo.ChangeTrackingStore','U') IS NOT NULL DROP TABLE dbo.ChangeTrackingStore --12. create the ChangeTrackingStore table create table dbo.ChangeTrackingStore ( tableName varchar(255), SYS_CHANGE_VERSION BIGINT, ); --13. retrieve the latest value from CHANGE_TRACKING_CURRENT_VERSION() and populate in ChangeTrackingStore for the three tables DECLARE @changeTrackingVersion BIGINT SET @changeTrackingVersion = CHANGE_TRACKING_CURRENT_VERSION(); SELECT @changeTrackingVersion INSERT INTO dbo.ChangeTrackingStore VALUES ('Student', @changeTrackingVersion), ('Department', @changeTrackingVersion), ('Faculty', @changeTrackingVersion) --14. select data from ChangeTrackingStore table SELECT tableName,SYS_CHANGE_VERSION FROM dbo.ChangeTrackingStore
Populate Data in the Three Source Tables
First, I insert four records in the Faculty table and verify the inserted data. Then, I insert two records in Department table. The deptHeadId column is a foreign key and references the facultyId column of Faculty table. So, the value inserted in the deptHeadId column must already be present in the facultyId column of Faculty table.
Next, I insert six records in Student table. The deptId column is a foreign key and references the deptId column of Department table. So, the value inserted in the deptId column must already be present in the deptId column of Department table.
--15. populate data in Faculty table INSERT INTO dbo.Faculty (facultyName,DOJ,createDate,updateDate) VALUES ('fname1','01-01-2021',GETDATE(), GETDATE()), ('fname2','01-02-2021',GETDATE(), GETDATE()), ('fname3','01-03-2021',GETDATE(), GETDATE()), ('fname4','01-04-2021',GETDATE(), GETDATE()) SELECT facultyId,facultyName,DOJ,createDate,updateDate FROM dbo.Faculty --16. populate data in Department table INSERT INTO dbo.Department (deptName,deptHeadId,createDate,updateDate) VALUES ('dept1',1,GETDATE(), GETDATE()), ('dept2',2,GETDATE(), GETDATE()) SELECT deptId,deptName,deptHeadId,createDate,updateDate FROM dbo.Department --17. populate data in student table INSERT INTO dbo.Student (studentName,deptId,marks,createDate,updateDate) VALUES ('st1',1,94,GETDATE(), GETDATE()), ('st2',1,95,GETDATE(), GETDATE()), ('st3',1,92,GETDATE(), GETDATE()), ('st4',1,93,GETDATE(), GETDATE()), ('st5',1,90,GETDATE(), GETDATE()), ('st6',2,89,GETDATE(), GETDATE()) SELECT studentId,studentName,deptId,marks,createDate,updateDate FROM dbo.Student
Create the Staging and Final Tables in Azure SQL database
I create a staging table named stgAFaculty. This table has three extra columns other than the Faculty table columns. These three columns will be returned from the CHANGETABLE function. Similarly, the staging tables named stgADepartment and stgAStudent are created for storing Department and Student table data respectively.
I drop the tables AStudent, ADepartment and AFaculty in order, in case the tables already exist. AStudent, ADepartment, and AFaculty tables have the foreign keys similar with the on-premise tables named Student, Department, and Faculty respectively. After dropping the tables, I create the tables, AFaculty, ADepartment, and AStudent, in order.
--18. drop and create the Faculty staging table IF OBJECT_ID('dbo.stgAFaculty','U') IS NOT NULL DROP TABLE dbo.stgAFaculty CREATE TABLE dbo.stgAFaculty ( facultyId INT NOT NULL, facultyName VARCHAR(50), DOJ Date, createDate datetime2 NULL, updateDate datetime2 NULL, SYS_CHANGE_VERSION BIGINT, SYS_CHANGE_OPERATION CHAR(5), SYS_CHANGE_COLUMNS VARBINARY(50) ) --19. drop and create the Department staging table IF OBJECT_ID('dbo.stgADepartment','U') IS NOT NULL DROP TABLE dbo.stgADepartment CREATE TABLE dbo.stgADepartment ( deptId INT NOT NULL PRIMARY KEY, deptName VARCHAR(50), deptHeadId INT NULL, --FOREIGN KEY REFERENCES dbo.AFaculty (facultyId), createDate datetime2 NULL, updateDate datetime2 NULL, SYS_CHANGE_VERSION BIGINT, SYS_CHANGE_OPERATION CHAR(5), SYS_CHANGE_COLUMNS VARBINARY(50) ) --20. drop and create the Student staging table IF OBJECT_ID('dbo.stgAStudent','U') IS NOT NULL DROP TABLE dbo.stgAStudent CREATE TABLE dbo.stgAStudent ( studentId int NOT NULL PRIMARY KEY, studentName varchar (100) NULL, deptId INT NULL, --FOREIGN KEY REFERENCES dbo.ADepartment (deptId), marks int NULL, createDate datetime2 NULL, updateDate datetime2 NULL, SYS_CHANGE_VERSION BIGINT, SYS_CHANGE_OPERATION CHAR(5), SYS_CHANGE_COLUMNS VARBINARY(50) ) --21. drop the Student table IF OBJECT_ID('dbo.AStudent','U') IS NOT NULL DROP TABLE dbo.AStudent --22. drop the Department table IF OBJECT_ID('dbo.ADepartment','U') IS NOT NULL DROP TABLE dbo.ADepartment --23. drop the Faculty table IF OBJECT_ID('dbo.AFaculty','U') IS NOT NULL DROP TABLE dbo.AFaculty --24. create the Faculty table CREATE TABLE dbo.AFaculty ( facultyId INT NOT NULL PRIMARY KEY, facultyName VARCHAR(50), DOJ Date, createDate datetime2 NULL, updateDate datetime2 NULL ) --25. create the Department table CREATE TABLE dbo.ADepartment ( deptId INT NOT NULL PRIMARY KEY, deptName VARCHAR(50), deptHeadId INT NOT NULL FOREIGN KEY REFERENCES dbo.AFaculty (facultyId), createDate datetime2 NULL, updateDate datetime2 NULL ) --26. create the Student table CREATE TABLE dbo.AStudent ( studentId int NOT NULL PRIMARY KEY, studentName varchar (100) NULL, deptId INT NOT NULL FOREIGN KEY REFERENCES dbo.ADepartment (deptId), marks int NULL, createDate datetime2 NULL, updateDate datetime2 NULL )
Integration Runtime
In the Azure portal, I create a Data Factory named 'adf-multi-table'. I go to the Manage tab and create a self-hosted Integration Runtime, named selfhostedIR1-sd. This Integration Runtime is required for movement of data from an on-premise SQL Server to an Azure SQL Database.
The Azure Integration Runtime, named AutoResolveIntegrationRuntime, was already available and is required to copy data between cloud data stores.
Linked Services
I create two linked services as detailed below:
- AzureSQLDatabase1 for Azure SQL Database
- SQLServer1 for SQL Server (on-premise)
Datasets
I create three datasets for the Azure SQL database tables and four other datasets for the on-premise SQL database tables. The dataset details are as given below:
- AzureSqlTable1 is created for dbo.stgAFaculty
- AzureSqlTable2 is created for dbo.stgADepartment
- AzureSqlTable3 is created for dbo.stgAStudent
- SqlServerTable1 is created for dbo.ChangeTrackingStore
- SqlServerTable2 is created for dbo.Department
- SqlServerTable3 is created for dbo.Faculty
- SqlServerTable4 is created for dbo.Student
Create a Pipeline
I go to the Author tab of the ADF resource and create a new pipeline. I name it pipeline_incrload_change_tracking.
Lookup Activities
I create the first Lookup activity, named lkupLastChngTrackVer. The source dataset is set as SqlServerTable1, pointing to the dbo.ChangeTrackingStore table. This activity returns the three rows of the table that contain the change tracking version value for the three tables. All the changes made in the three source tables after this version will be considered for loading (insert, update, delete) in the three target tables in Azure SQL database.
I assume here that before this version, the three source tables were empty.
I create the second Lookup activity, named lkupCurChngTrackVer.
I write the following query to retrieve the change tracking version associated with the last committed transaction of the on-premise SQL database. In this task, the 'First Row Only' checkbox is selected.
SELECT CHANGE_TRACKING_CURRENT_VERSION() as CurrentChangeTrackingVersion
I will consider tracking the changes in the source tables up to this version. Output of the two lookup activities helps to decide from which change tracking version to which version the changes to be considered for data loading at each iteration of the ADF execution.
Copy Activity for Faculty Staging table
I create the Copy data activity named CopyToStgAFaculty and add the output links from the two lookup activities as input to the Copy data activity. In the source tab, the source dataset is set to SqlServerTable3.
I write the following query to retrieve all the records from the Faculty table with the change operation and the change version details after performing a right outer join with CHANGETABLE. Extra records from CHANGETABLE are also retrieved for deleted records from the Faculty table.
SELECT c.facultyId, f.facultyName, f.DOJ, f.createDate, f.updateDate, c.SYS_CHANGE_VERSION, c.SYS_CHANGE_OPERATION, c.SYS_CHANGE_COLUMNS FROM Faculty AS f RIGHT OUTER JOIN CHANGETABLE( CHANGES Faculty, @{activity('lkupLastChngTrackVer').output.value[2].SYS_CHANGE_VERSION}) as c ON f.[facultyId] = c.[facultyId] where c.SYS_CHANGE_VERSION <= @{activity('lkupCurChngTrackVer').output.firstRow.CurrentChangeTrackingVersion}
In the sink tab, I select AzureSqlTable1 as the sink dataset. I write the pre copy script to truncate the staging table stgAFaculty every time before data loading.
TRUNCATE TABLE [dbo].[stgAFaculty]
Copy Activity for Department Staging table
I create the Copy data activity named CopyToStgADepartment next to the CopyToStgAFaculty activity. In the source tab, the source dataset is set to SqlServerTable2.
I write the following query to retrieve all the records from the Department table with the change operation and the change version details after performing a right outer join with CHANGETABLE. Extra records from CHANGETABLE are also retrieved for deleted records from the Department table.
SELECT c.deptId, d.deptName, d.deptHeadId, d.createDate, d.updateDate, c.SYS_CHANGE_VERSION, c.SYS_CHANGE_OPERATION, c.SYS_CHANGE_COLUMNS FROM Department AS d RIGHT OUTER JOIN CHANGETABLE( CHANGES Department, @{activity('lkupLastChngTrackVer').output.value[1].SYS_CHANGE_VERSION}) as c ON d.[deptId] = c.[deptId] where c.SYS_CHANGE_VERSION <= @{activity('lkupCurChngTrackVer').output.firstRow.CurrentChangeTrackingVersion}
In the sink tab, I select AzureSqlTable2 as the sink dataset. I write the pre copy script to truncate the staging table stgADepartment every time before data loading.
TRUNCATE TABLE [dbo].[stgADepartment]
Copy Activity for Student Staging table
I create the Copy data activity named CopyToStgAStudent next to the CopyToStgADepartment activity. In the source tab, the source dataset is set to SqlServerTable4.
I write the following query to retrieve all the records from the Student table with the change operation and the change version details after performing a right outer join with CHANGETABLE. Extra records from CHANGETABLE are also retrieved for deleted records from the Student table.
SELECT c.studentId, s.studentName, s.deptId, s.marks, s.createDate, s.updateDate, c.SYS_CHANGE_VERSION, c.SYS_CHANGE_OPERATION, c.SYS_CHANGE_COLUMNS FROM Student AS s RIGHT OUTER JOIN CHANGETABLE( CHANGES Student, @{activity('lkupLastChngTrackVer').output.value[0].SYS_CHANGE_VERSION}) as c ON s.[studentId] = c.[studentId] where c.SYS_CHANGE_VERSION <= @{activity('lkupCurChngTrackVer').output.firstRow.CurrentChangeTrackingVersion}
In the sink tab, I select AzureSqlTable3 as the sink dataset. I write the pre copy script to truncate the staging table stgAStudent every time before data loading.
TRUNCATE TABLE [dbo].[stgAStudent]
Stored Procedure Activity
I create a Stored Procedure activity next to the Copy Data activity named CopyToStgAStudent. I select the linked service, AzureSqlDatabase1, and the stored procedure, usp_Modify_Tables. The purpose of this stored procedure is to delete the records from the three Azure SQL tables that are already deleted from the source tables after the last data load. Update and insertion of records are also done on the Azure SQL target tables from the staging tables, based on the SYS_CHANGE_OPERATION column value.
The stored procedure code is given below. Records are first deleted from the AStudent, ADepartment, and AFaculty tables in order. Then, records are updated in AFaculty, ADepartment, and AStudent tables in order. Next, new records are inserted in AFaculty, ADepartment, and AStudent tables in order.
IF OBJECT_ID('dbo.usp_Modify_Tables','P') IS NOT NULL DROP PROCEDURE dbo.usp_Modify_Tables GO CREATE PROCEDURE dbo.usp_Modify_Tables AS BEGIN BEGIN TRY BEGIN TRANSACTION --delete records from the three tables in order DELETE FROM dbo.AStudent FROM dbo.AStudent AS S INNER JOIN dbo.stgAStudent AS Ct ON S.studentId = Ct.studentId WHERE Ct.SYS_CHANGE_OPERATION = 'D' DELETE FROM dbo.ADepartment FROM dbo.ADepartment AS D INNER JOIN dbo.stgADepartment AS Ct ON D.deptId = Ct.deptId WHERE Ct.SYS_CHANGE_OPERATION = 'D' DELETE FROM dbo.AFaculty FROM dbo.AFaculty AS F INNER JOIN dbo.stgAFaculty AS Ct ON F.facultyId = Ct.facultyId WHERE Ct.SYS_CHANGE_OPERATION = 'D' --update records in the three tables in order UPDATE dbo.AFaculty SET facultyName = Ct.facultyName, DOJ = Ct.DOJ, createDate = Ct.createDate, updateDate = Ct.updateDate FROM dbo.AFaculty AS F INNER JOIN dbo.stgAFaculty AS Ct ON F.facultyId = Ct.facultyId WHERE Ct.SYS_CHANGE_OPERATION = 'U' UPDATE dbo.ADepartment SET deptName = Ct.deptName, deptHeadId = Ct.deptHeadId, createDate = Ct.createDate, updateDate = Ct.updateDate FROM dbo.ADepartment AS D INNER JOIN dbo.stgADepartment AS Ct ON D.deptId = Ct.deptId WHERE Ct.SYS_CHANGE_OPERATION = 'U' UPDATE dbo.AStudent SET studentName = Ct.studentName, deptId = Ct.deptId, marks = Ct.marks, createDate = Ct.createDate, updateDate = Ct.updateDate FROM dbo.AStudent AS S INNER JOIN dbo.stgAStudent AS Ct ON S.studentId = Ct.studentId WHERE Ct.SYS_CHANGE_OPERATION = 'U' --insert records in the three tables in order INSERT INTO dbo.AFaculty (facultyId,facultyName,DOJ,createDate,updateDate) SELECT facultyId,facultyName,DOJ,createDate,updateDate FROM dbo.stgAFaculty WHERE SYS_CHANGE_OPERATION = 'I' INSERT INTO dbo.ADepartment (deptId,deptName,deptHeadId,createDate,updateDate) SELECT deptId,deptName,deptHeadId,createDate,updateDate FROM dbo.stgADepartment WHERE SYS_CHANGE_OPERATION = 'I' INSERT INTO dbo.AStudent (studentId, studentName,deptId,marks,createDate,updateDate) SELECT studentId, studentName,deptId,marks,createDate,updateDate FROM dbo.stgAStudent WHERE SYS_CHANGE_OPERATION = 'I' COMMIT TRANSACTION END TRY BEGIN CATCH SELECT 'Error' ROLLBACK TRANSACTION RETURN -1 END CATCH RETURN 0 END GO
Second Stored Procedure Activity
I create the second Stored Procedure activity, named uspUpdateChangeTrackingStore. I select the linked service, SQLServer1, and stored procedure, uspUpdateChangeTrackingStore. The purpose of this stored procedure is to update the SYS_CHANGE_VERSION column value of ChangeTrackingStore table with the current value of change tracking version for the three records in the table.
This procedure takes four parameters: changeTrackingVersion, t1, t2, and t3.
IF OBJECT_ID('dbo.uspUpdateChangeTrackingStore','P') IS NOT NULL DROP PROCEDURE dbo.uspUpdateChangeTrackingStore GO CREATE PROCEDURE dbo.uspUpdateChangeTrackingStore @changeTrackingVersion BIGINT, @t1 varchar(255), @t2 varchar(255), @t3 varchar(255) AS BEGIN UPDATE dbo.ChangeTrackingStore SET SYS_CHANGE_VERSION = @changeTrackingVersion WHERE tableName IN (@t1,@t2, @t3) END
The parameter details supplied in the Stored Procedure activity are as given below:
changeTrackingVersion int64 -- @{activity('lkupCurChngTrackVer').output.firstRow.CurrentChangeTrackingVersion} t1 string -- @{activity('lkupLastChngTrackVer').output.value[0].tableName} t2 string -- @{activity('lkupLastChngTrackVer').output.value[1].tableName} t3 string -- @{activity('lkupLastChngTrackVer').output.value[2].tableName}
Debug the Pipeline for the Initial Data Load
I publish all the changes and execute the pipeline by pressing the Debug button. I follow the debug progress and the execution of all the activities completed successfully.
Check data in the Target tables
I select data from the Azure SQL tables. These tables now contain all the data as available in the source SQL database tables.
SELECT facultyId,facultyName,DOJ,createDate,updateDate FROM dbo.AFaculty /* facultyIdfacultyNameDOJcreateDateupdateDate 1fname12021-01-012021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 2fname22021-01-022021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 3fname32021-01-032021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 4fname42021-01-042021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 */SELECT deptId,deptName,deptHeadId,createDate,updateDate FROM dbo.ADepartment /* deptIddeptNamedeptHeadIdcreateDateupdateDate 1dept112021-09-13 12:35:05.04666672021-09-13 12:35:05.0466667 2dept222021-09-13 12:35:05.04666672021-09-13 12:35:05.0466667 */SELECT studentId,studentName,deptId,marks,createDate,updateDate FROM dbo.AStudent /* studentIdstudentNamedeptIdmarkscreateDateupdateDate 1st11942021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 2st22952021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 3st31922021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 4st42932021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 5st51902021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 6st62892021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 */
Modify Data in On-Premise source SQL database
Now, I connect to the source SQL database. I delete one record from Student, one record from Department, and one record from Faculty. Before deletion, I need to make sure that the Department record to be deleted should not be referred to in the Student table. I also ensure the Faculty record to be deleted should not be referred to in the Department table.
Next, I update one record in each of the three tables. Then, I insert two new records in each of the three tables.
--delete existing record DELETE FROM dbo.Student WHERE studentId = 6 DELETE FROM dbo.Department WHERE deptId = 2 DELETE FROM dbo.Faculty WHERE facultyId = 4 --update existing record UPDATE dbo.Faculty SET facultyName = 'fname3New' WHERE facultyId = 3 UPDATE dbo.Department SET deptName = 'dept1New' WHERE deptId = 1 UPDATE dbo.Student SET marks = 70 WHERE studentId = 5 --new records inserted INSERT INTO dbo.Faculty (facultyName,DOJ,createDate,updateDate) VALUES ('fname5','01-05-2021',GETDATE(), GETDATE()), ('fname6','01-06-2021',GETDATE(), GETDATE()) INSERT INTO dbo.Department (deptName,deptHeadId,createDate,updateDate) VALUES ('dept3',5,GETDATE(), GETDATE()), ('dept4',6,GETDATE(), GETDATE()) INSERT INTO dbo.Student (studentName,deptId,marks,createDate,updateDate) VALUES ('st7',3,60,GETDATE(), GETDATE()), ('st8',4,61,GETDATE(), GETDATE())
Debug the pipeline again for incremental data load
I execute the pipeline again by pressing the Debug button. I follow the debug progress and the execution of all the activities completed successfully.
Check Data the Azure SQL Database
Now, I check data in both the staging tables and final tables in the Azure SQL database. Each staging table has entries for the inserted, updated and deleted records. In case of deleted record, only the Id column has value and rest of the table columns contain NULL. Because, in case of deletion, record does not exist in the table in on-premise SQL Server database, But, the primary key of the deleted record is returned from CHANGETABLE.
The final tables in Azure SQL Database reflect all the changes done in the source tables. So, Azure SQL tables now contain the same set of data as present in the source SQL tables.
SELECT facultyId,facultyName,DOJ,createDate,updateDate,SYS_CHANGE_VERSION,SYS_CHANGE_OPERATION,SYS_CHANGE_COLUMNS FROM dbo.stgAFaculty /* facultyIdfacultyNameDOJcreateDateupdateDateSYS_CHANGE_VERSIONSYS_CHANGE_OPERATIONSYS_CHANGE_COLUMNS 3fname3New2021-01-032021-09-13 12:34:59.82333332021-09-13 12:34:59.823333331U 0x0000000002000000 4NULLNULLNULLNULL30D NULL 5fname52021-01-052021-09-13 13:41:04.62000002021-09-13 13:41:04.620000034I NULL 6fname62021-01-062021-09-13 13:41:04.62000002021-09-13 13:41:04.620000034I NULL */SELECT deptId,deptName,deptHeadId,createDate,updateDate,SYS_CHANGE_VERSION,SYS_CHANGE_OPERATION,SYS_CHANGE_COLUMNS FROM dbo.stgADepartment /* deptIddeptNamedeptHeadIdcreateDateupdateDateSYS_CHANGE_VERSIONSYS_CHANGE_OPERATIONSYS_CHANGE_COLUMNS 1dept1New12021-09-13 12:35:05.04666672021-09-13 12:35:05.046666732U 0x0000000002000000 2NULLNULLNULLNULL29D NULL 3dept352021-09-13 13:41:04.62000002021-09-13 13:41:04.620000035I NULL 4dept462021-09-13 13:41:04.62000002021-09-13 13:41:04.620000035I NULL */SELECT studentId,studentName,deptId,marks,createDate,updateDate,SYS_CHANGE_VERSION,SYS_CHANGE_OPERATION,SYS_CHANGE_COLUMNS FROM dbo.stgAStudent /* studentIdstudentNamedeptIdmarkscreateDateupdateDateSYS_CHANGE_VERSIONSYS_CHANGE_OPERATIONSYS_CHANGE_COLUMNS 5st51702021-09-13 12:35:10.35000002021-09-13 12:35:10.350000033U 0x0000000004000000 6NULLNULLNULLNULLNULL28D NULL 7st73602021-09-13 13:41:04.62000002021-09-13 13:41:04.620000036I NULL 8st84612021-09-13 13:41:04.62000002021-09-13 13:41:04.620000036I NULL */SELECT facultyId,facultyName,DOJ,createDate,updateDate FROM dbo.AFaculty /* facultyIdfacultyNameDOJcreateDateupdateDate 1fname12021-01-012021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 2fname22021-01-022021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 3fname3New2021-01-032021-09-13 12:34:59.82333332021-09-13 12:34:59.8233333 5fname52021-01-052021-09-13 13:41:04.62000002021-09-13 13:41:04.6200000 6fname62021-01-062021-09-13 13:41:04.62000002021-09-13 13:41:04.6200000 */SELECT deptId,deptName,deptHeadId,createDate,updateDate FROM dbo.ADepartment /* deptIddeptNamedeptHeadIdcreateDateupdateDate 1dept1New12021-09-13 12:35:05.04666672021-09-13 12:35:05.0466667 3dept352021-09-13 13:41:04.62000002021-09-13 13:41:04.6200000 4dept462021-09-13 13:41:04.62000002021-09-13 13:41:04.6200000 */SELECT studentId,studentName,deptId,marks,createDate,updateDate FROM dbo.AStudent /* studentIdstudentNamedeptIdmarkscreateDateupdateDate 1st11942021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 2st22952021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 3st31922021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 4st42932021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 5st51702021-09-13 12:35:10.35000002021-09-13 12:35:10.3500000 7st73602021-09-13 13:41:04.62000002021-09-13 13:41:04.6200000 8st84612021-09-13 13:41:04.62000002021-09-13 13:41:04.6200000 */
Conclusion
Initial full load and Incremental data load is now complete for three related tables from the on-premise source SQL database to the target Azure SQL database. More tables can be added in this process, if required. As the tables Student, Department and Faculty are related to each other with Referential Integrity, the incremental data loading together makes the process effective and useful. Unrelated tables also can be modified together using a single Data Factory pipeline.