Introduction
As a developer or a database administrator we all have to deal with in-house developed applications and 3rd party vendor applications. If there is an issue in the design of the database for in-house applications; it is easier to refine and fix it. But when 3rd party application have issues, it's rarely possible to fix such changes. Since in 3rd party applications you don't have access to their code; however it is usually possible to fix the back end database issues long as it doesn't affect the front-end application. I ran into a case where the vendor had no front-end component to allow updates to one of their key tables; they had recommended for us to go directly to the back end to make these changes. We had implemented their product for multiple groups of people; so having to do manual updates was going to be cumbersome and in addition, while looking at their update script, I discovered that the table was not normalized.
The table held the information for resources within the application; it was designed to hold information for each resource the application had and which groups within the application had permission to access that resource. It was implemented by means of storing multiple group values in one column indicating what groups had access to the resource. These groups were bundled together like "Group1|Group2|Group3"; no easy way to remove and add the groups in as needed.
Since there was no front-end component that required writing ability; I came up with the following way to normalize the database without breaking the application; it is possible to rebuild the table structure into a database view without having to worry about getting the vendor fix their code.
Note: Structure has been altered from original, but it still demonstrates the point properly.
Present Structure and Data
Figure 1. Present denormalized structure
Figure 2. Current Data in denormalized form
The current structure doesn't seem like any issue; when you look at it without the data, it is a normal table with a few columns, but when you look at it with the data we notice that groups are grouped together using "|" (pipe) indicating which groups have access to what resource.
When I look at this a few things come to mind:
- A field in a record should only store one value; here we have multiple values being stored.
- The data value being stored should be unique for all rows; i.e. it cannot be pulled out to minimize the storage of the data.
So our goal here is to change the above information in a way to minimize the above mentioned issues, to that I propose the following changes:
- Divide up ResourceName to its own table with an identity column.
- Divide up Groups (Uniquely) to its own table with an identity column.
- Create a link table that shows the relationship between Resources and Groups.
- Since we don't want to break our current application, create a new view that will return the information as the application expects it.
- In addition we need to create a function that will combine the groups as in Figure 2 so the end result is transparent to the application.
Design Changes Explanation
Create a new table Resource to hold just resource information. This table will contain a new int column with the original ResourceName and IsEnabled fields.
CREATE TABLE [dbo].[Resource] ( [ResourceID] [int] IDENTITY(1,1) NOT NULL, [ResourceName] [varchar](50) NOT NULL, [IsEnabled] [char](1) NOT NULL CONSTRAINT [DF_Resource_IsEnabled] DEFAULT ('Y'), CONSTRAINT [PK_Applications] PRIMARY KEY CLUSTERED ( [ResourceID] ASC ) ON [PRIMARY] ) ON [PRIMARY] GO
Create a new table SecurityGroup to hold just group information. This table will contain a new int column with the new GroupName field. Notice this field is now ¼ the size of the original field because it will only be storing one value per row for each group.
CREATE TABLE [dbo].[SecurityGroup] ( [GroupID] [int] IDENTITY(1,1) NOT NULL, [GroupName] [varchar](255) NOT NULL, CONSTRAINT [PK_SecurityGroup_GroupID] PRIMARY KEY CLUSTERED ( [GroupID] ASC) ON [PRIMARY] ) ON [PRIMARY] GO
Create a new table ResourceSecurityGroup to hold the link information between Resources and Groups. This table allows us to define the relationship between the resources and groups; which defines which groups have access to which resources. We create two int fields, which are foreign key constraints back to their main table.
CREATE TABLE [dbo].[ResourceSecurityGroup] ( [ResourceID] [int] NOT NULL, [GroupID] [int] NOT NULL, CONSTRAINT [PK_ResourceSecurityGroup] PRIMARY KEY CLUSTERED ( [ResourceID] ASC, [GroupID] ASC ) ON [PRIMARY] ) ON [PRIMARY] GO ALTER TABLE [dbo].[ResourceSecurityGroup] WITH CHECK ADD CONSTRAINT [FK_ResourceSecurityGroup_Resource_ResourceID] FOREIGN KEY([ResourceID]) REFERENCES [dbo].[Resource] ([ResourceID]) ALTER TABLE [dbo].[ResourceSecurityGroup] CHECK CONSTRAINT [FK_ResourceSecurityGroup_Resource_ResourceID] ALTER TABLE [dbo].[ResourceSecurityGroup] WITH CHECK ADD CONSTRAINT [FK_ResourceSecurityGroup_SecurityGroup_GroupID] FOREIGN KEY([GroupID]) REFERENCES [dbo].[SecurityGroup] ([GroupID]) ALTER TABLE [dbo].[ResourceSecurityGroup] CHECK CONSTRAINT [FK_ResourceSecurityGroup_SecurityGroup_GroupID] GO
Now all our transformation of the data structure is complete. We have successfully broken up the one table into three tables to separate the data so it will be normalized (please reference Appendix A, on the data transformation Routines).
Transparent Transformation
The whole objective of this task is to make this as transparent for the front-end application as possible. If we were to now just leave the data in those three tables the application will not be able to access to data. Therefore we are going to define a few additional objects to complete the transformation:
Rename the original table; we don't want to delete it because it is needed for data transformation routines. Also since we will be creating a new view to replace this table we can't have two objects with the same name within a database.
sp_rename 'ApplicationResources', 'oldApplicationResources' GO
Create a new function CombineGroupNames that will combine all the groups back together for the view. This function is required to get the data from our new normalized tables back into the format the application will understand. Therefore based on the resource ID passed; a simple cursor walks through all the groups that have access to the resource and concatenates it together as "Grp1|Grp2|Grp3|etc..."
CREATE FUNCTION CombineGroupNames ( @ResourceID int ) RETURNS VarChar(MAX) AS BEGIN DECLARE @Groups VarChar(MAX) DECLARE @GroupName VarChar(255) DECLARE GroupsCursor CURSOR FOR SELECT G.GroupName FROM Resource R INNER JOIN ResourceSecurityGroup RG ON R.ResourceID = RG.ResourceID INNER JOIN SecurityGroup G ON G.GroupID = RG.GroupID WHERE R.ResourceID = @ResourceID OPEN GroupsCursor FETCH NEXT FROM GroupsCursor INTO @GroupName WHILE @@FETCH_STATUS = 0 BEGIN IF (LEN(@Groups) > 1) BEGIN SET @Groups = @Groups + '|' + @GroupName END ELSE BEGIN SET @Groups = @GroupName END FETCH NEXT FROM GroupsCursor INTO @GroupName END CLOSE GroupsCursor DEALLOCATE GroupsCursor RETURN @Groups END
The final step in the transformation is to create a new view with the same name as the original table, ApplicationResources. This view returns the same columns as the application is expecting with the same ordering and the same data format.
Create View ApplicationResources AS SELECT R.[ResourceName] ,R.[IsEnabled] ,dbo.CombineGroupNames(R.ResourceID) AS Groups FROM dbo.Resource R
Now we have completed our transformation, please note this change will work for read-only applications without issues. Data updatable applications will require additional work on top of the view (another article).
After-Affects of the Change
After all the normalization, we get the following three tables plus a view (sorry no screen shot):
Figure 3. Normalized Table Structure
Figure 4. Resource Table
Figure 5. SecurityGroup Table
Figure 6. ResourceSecurityGroup Table
Figure 7. Output from the ApplicationResources view
Conclusion
The goal was to make all changes transparent enough so the front-end application doesn't notice or break and we still get an easier set of tables to manage in the future. Developing a new design layout by separating the original table into three tables allowed for us to simplify the data management.
Since we didn't want to break the application the table name still had to exist, which we created as a database view. As a front-end application it doesn't know the difference between a view and a table; in this case the front-end only had read ability. If the write ability existed and normalization was still required it would be possible to do this with an addition of INSTEAD-OF INSERT trigger on the view that properly stores the data into the normalized tables. Figure 7 represents a list of all the resources and their respective groups like the applications would expect from the new view created in step 7. And when comparing it to the Figure 2, we notice there is no difference; thus completing the transformation.
However even though this change has made it easier to manage the data, it came with a bit of a price tag on performance. Before the change all the groups were combined together and no additional processing was required to return the results when the ApplicationResource table was queried; but now in the database view this information is combined each time; creating a slight performance drag.
Appendix A
To transform the data from the original table we need to do the following parts:
- Copy the ResourceName and IsEnabled property to the new Resource table without any group information.
- Copy over each distinct group from the list of groups available in the Groups field.
- Create a mapping between resources and the new security group table to allow for the final view.
The attached script, Normalizing-DenormalizedTable-AppendixASQLScript.sql, gives you example code on how to do that.
** Normalizing-DenormalizedTable.sql script is all the SQL Code shown in this article **