Introduction
In the world of software development you find lots of problems during software development and to overcome those problems we developers mostly depend on Google searches. But it is very true that all the time you cannot succeed with this strategy; sometimes you have to think about a new solution with a new mindset. I could not easily solve one problem and had to develop my own solution.
Cleaning up garbage from data is a big problem for a developer. It's a time intensive and tedious job. It also involves risk because there is a chance to lose your important data if you have not enough experience to handle this situation. So I want to share a painless and generic way to achieve this goal. Before I go further and describe the generic solution, let me explain the typical solution to achieve this goal.
Typical Solution
If the total number of tables is within 5 to 15 then there is no issue. You can simply delete / truncate the garbage data using simple SQL. Let me show you an example. I am assuming there is a database table for this example named DBTable1. Here is the generic SQL syntax of a DELETE command using the WHERE clause to delete data from table
DELETE FROM DBTable1 WHERE [Any Column Name] = [Some Value]
Here is generic SQL syntax of DELETE command to delete all data from table.
DELETE FROM DBTable1
Here is generic SQL syntax of TRUNCATE command to delete all data from table.
TRUNCATE TABLE DBTable1
Key points to remember:
** TRUNCATE TABLE permissions default to the table owner and are not transferable.
Problem Secnerio
Imagine if you working on a project which has lots of database tables, i.e. 50, 100, 200 or more than that. In my scenario I can imagine 1800 tables in an ERP solution. Now the question that arises here: How you identify the tables from those you want to perform the cleanup operation?
Let me point out the pain points are
- You have to check tables that are related to another tables with a foreign key relationship.
- You have to analysis the database diagram to know the relationship.
- You have to prepare a final list of those tables from which you want to perform the cleanup operation.
- You have to take the decision on which table you want to run delete / truncate statement
It is a tough job now based on above mentioned typical solution, isn't it? To overcome this problem I found the bellow solution which is very useful for me. That's why I want to share my code here. Check it out.
Generic Solution to Cleaning up garbage from data
My goal is to achieve the above mentioned tedious & painful job in a simplified way. To achieve my goal you have to create stored procedure below (stp_CleanGarbageData) and then you have to create the function GetTableName. Let me explain how you execute the code
Example: Let's assume you have a database with 500 tables. Now you want to remove garbage data from 496 tables and keep the data in [500-496 = 4] tables. So you have to put 4 tables name in any order as comma separated value in the SQL statement below. Please keep in mind you only have to provide those table names from which you don't want to remove data.
SQL Syntax:
EXEC StoredProcedureName 'Table name as comma separated value within String Parameter'
So as per example SQL syntax will be
EXEC stp_CleanGarbageData 'Table_1, Table_2, Table_3, Table_n'
After executing the SQL syntax as I mentioned you get a list of table names in Query Analyzer. It shows on how many tables this query performs a delete operation and truncate operation respectively. And you don't need to worry about the pain points now.
** Please don't forget to replace Table_1 & Table_2, Table_3 ... Table_n with original table Name, Only those table name from which you don't want to remove the data.
Code Snippet: Stored Procedure
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[stp_CleanGarbageData] @TableNames varchar(8000) AS
BEGIN DECLARE @DynamicString varchar(8000) EXEC sp_MSForEachTable '
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
BEGIN
PRINT ''--DELETED TABLE NAME--''
PRINT ''----------------------''
PRINT ''?''
END
ELSE
BEGIN
PRINT ''--TRUNCATED TABLE NAME--''
PRINT ''----------------------''
PRINT ''?''
END
'
-- disable referential integrity
EXEC sp_MSForEachTable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'
Set @DynamicString = '
IF (object_id(''?'') not in (select object_id(Id) from dbo.GetTableName(''' + @TableNames + ''')))
BEGIN
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
BEGIN
DELETE FROM ?
END
ELSE
BEGIN
TRUNCATE TABLE ?
END
END
'
EXEC sp_MSForEachTable @DynamicString -- enable referential integrity again
EXEC sp_MSForEachTable 'ALTER TABLE ? CHECK CONSTRAINT ALL' END GO
Code Snippet:Function
CREATE FUNCTION dbo.GetTableName
(
@Ids varchar(5000)
)
RETURNS @TempTable TABLE (Id varchar(8000))
AS
BEGIN IF(@Ids is not null)
begin
DECLARE @Id varchar(5000), @Pos int
SET @Ids = LTRIM(RTRIM(@Ids))+ ','
SET @Pos = CHARINDEX(',', @Ids, 1)
IF REPLACE(@Ids, ',', '') <> ''
BEGIN
WHILE @Pos > 0
BEGIN
SET @Id = LTRIM(RTRIM(LEFT(@Ids, @Pos - 1)))
IF @Id <> ''
BEGIN
INSERT INTO @TempTable (Id) VALUES (CAST(@Id AS varchar(8000))) --Use Appropriate conversion
END
SET @Ids = RIGHT(@Ids, LEN(@Ids) - @Pos)
SET @Pos = CHARINDEX(',', @Ids, 1)
END
END
end
else
INSERT INTO @TempTable (Id) VALUES (null)
RETURN
END
Explanation
Let me explain my code snippet of Stored Procedure and Function. So I start with stp_CleanGarbageData stored procedure, within this procedure I need to check referential integrity for all tables in a database.
SQL Syntax to check referential integrity
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
To perform the same actions for all tables in a database, you can create cursor or you can use the sp_MSforeachtable undocumented stored procedure to accomplish the same goal with less work.
For example, you can use the sp_MSforeachtable stored procedure to rebuild all indexes in a database:
EXEC sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?')"
I use sp_MSforeachtable to check referential integrity for all tables in a database and prepare a list of that using bellow syntax
EXEC sp_MSForEachTable '
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
BEGIN
PRINT ''--DELETED TABLE NAME--''
PRINT ''----------------------''
PRINT ''?''
END
ELSE
BEGIN
PRINT ''--TRUNCATED TABLE NAME--''
PRINT ''----------------------''
PRINT ''?''
END
'
to confirm the table names in which the Delete or Truncate operation done. In my case I want to know the list of table names on which Delete or Truncate operation performed.
Then I disable referential integrity checking for all tables in the database. See the bellow syntax
EXEC sp_MSForEachTable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'
Then I build a dynamic string to perform the DELETE or TRUNCATE actions respectively for all tables in a database based on referential integrity, excluding those tables from which I don't want to delete data. To build this dynamic string I called GetTableName function, it takes comma separated Table Names (** those table names, which I already passed as a parameter of stp_CleanGarbageData stored procedure) as an input parameter and returns respective table name.
Set @DynamicString = '
IF (object_id(''?'') not in (select object_id(Id) from dbo.GetTableName(''' + @TableNames + ''')))
BEGIN
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
BEGIN
DELETE FROM ?
END
ELSE
BEGIN
TRUNCATE TABLE ?
END
END
'
EXEC sp_MSForEachTable @DynamicString
To complete DELETE or TRUNCATE operation after checking 'TableHasForeignRef' I use bellow syntax.
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
BEGIN
DELETE FROM ?
END
ELSE
BEGIN
TRUNCATE TABLE ?
END
I want to DELETE all data from a table without any condition. See bellow syntax
DELETE FROM ?
If you want to DELETE data based on condition then please modify bellow syntax based your requirement.
IF OBJECTPROPERTY(object_id(''?''), ''TableHasForeignRef'') = 1
BEGIN
DELETE FROM ? -- Write your own custom delete logic here.
END
** ' ? ' This question mark actually specifies the table name dynamically to complete the DELETE or TRUNCATE operation.
After all, I use sp_MSForEachTable to enable referential integrity checking again and complete my work.
EXEC sp_MSForEachTable 'ALTER TABLE ? CHECK CONSTRAINT ALL'
Conclusion
My approach towards the Cleaning up garbage from data I found it is very useful, reusable and painless solution. Because it will help you a lot to overcome above described problem. You also noticed that I use powerful features of SQL Server with my custom logic to overcome the problem. Hope you like the solution.
🙂 Happy Coding