June 16, 2010 at 9:29 am
A little back story before my question
I work for a company where when data corrections were needed in SQL Server instead of backing up the entire database the developers would back up the tables by INTO statements (EG SELECT * INTO Table2 From Table). However they would not go back and drop the table. There are now hundreds of backed up tables and the backed up tables are now taking up half the memory of the database (In some scenarios a 20 GB database is 10 gbs of backed up table). Yes I know this is stupid and there are many better ways to do things but they are magic programmers and not SQL programmers ;-).
My question is this
I have set up a routine where it will cycle through SysObjects and drop any table not a part of my IN statement (which has nearly a thousand records in it). However I'd like to automate this instead of having to constantly add new records to it whenever a new table is added and there is always the possibility that I've missed a table. Is there any way to tell a created table versus a table that has been backed up via an into statement that isn't the name? Create date is not an option. I've tried looking for info on sysobjects but most of the columns desc are "Reserved for internal use". Any assistance or direction would be appreciated
June 16, 2010 at 9:53 am
I can't think of anything that would tell you how the table was created, there may be a way that i do not know however, one thought though is that it could be possible to tell by the primary key / indexes on the table.
I assume the real tables would have indexes / primary keys and the ad-hoc backups will not have these, unless the magic developers added them but this seems unlikely.
June 16, 2010 at 2:00 pm
Depending on how they created the tables and what roles they had at the time, you might be able to go off table owner.
select name from sysobjects where xtype = 'U' and uid = user_id('magic programmers')
_____________________________________________________________________
- Nate
June 16, 2010 at 2:38 pm
There was a place I worked for that had a special database for developers to put data they wanted saved. It would delete anything created over a month ago on a nightly basis. Not the best solution, but it might be better than what you currently have. All the developers were of course aware the data would be cleaned up after a month.
June 16, 2010 at 2:53 pm
ok try this as a proof of concept: in your Sandbox, do a Select * into newtable from sometable, and then run this query:
i'm using FORXML to concat all the column names together, then row number to partition them by the columns;
all the tables witht eh same column names will have an row_number() value greater than 1, so you can infer by the create_date which is the older, more original;
see what you think of this:
--example results:
RW create_date TheTable TheColumns
1 2010-05-17 19:02:58.053 AllIndexes schema_id,schema_name,object_id,object_name,index_id,index_name,Rows,SizeMB,IndexDepth,type,type_desc,fill_factor,is_unique,is_primary_key,is_unique_constraint,index_columns_key,index_columns_include
2 2010-06-16 16:45:17.127 MyClone schema_id,schema_name,object_id,object_name,index_id,index_name,Rows,SizeMB,IndexDepth,type,type_desc,fill_factor,is_unique,is_primary_key,is_unique_constraint,index_columns_key,index_columns_include
select ROW_NUMBER() OVER(PARTITION BY [TheColumns] ORDER BY [TheColumns],[TheTable]) AS RW,*
FROM
(SELECT
max(s3.create_date) AS create_date,
s3.name As [TheTable],
stuff(( SELECT ',' + name
FROM sys.columns s2
WHERE OBJECT_NAME(s2.object_id)= s3.name --- must match GROUP BY below
ORDER BY s2.column_id
FOR XML PATH('')
),1,1,'') as [TheColumns]
FROM sys.columns s1
inner join sys.objects s3 on s1.object_id = s3.object_id
GROUP BY s3.name --- without GROUP BY multiple rows are returned
--ORDER BY OBJECT_NAME(s1.object_id)
)MyAlias
Lowell
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply