April 20, 2016 at 2:21 pm
Comments posted to this topic are about the item Brute Force data search tool
April 27, 2016 at 8:04 am
This proc is awesome! But I've noticed it consumes a lot of memory, which is fine, but it doesn't appear to release it once the query is done and the session is closed. The only way I've found, so far, to release the memory is to restart SQL Server's service. (see attached)
Is there a way to release the memory without having to restart the service?
April 27, 2016 at 11:22 am
Yeah, it can definitely be a memory consumer, depending on the search, the size of the database, and the amount of data it contains. From what I have read about SQL Server and how it manages memory, I think this has less to do with the query and more to do with how SQL Server manages memory internally. I know this sort of issue is more common with SQL Server 2008 R2 and earlier; I don't know whether that changed in 2012, so it may still apply to the newer versions.
Basically, SQL Server grabs memory for a batch/thread and will not automatically release it. Even if you close the query in SSMS or kill its process, it keeps the thread the query or batch ran on, and with it the memory it grabbed. It does this as a performance gain: spooling up another thread is fairly costly, so it tries to reuse threads and any memory they are holding. It should, however, release that memory as soon as something else in the OS needs it. SQL Server generally likes being the only process on the machine, so it is designed to take everything it can; if the OS needs memory for something, it sends a low-memory notification to SQL Server, which then releases memory back to the OS. Everything I have read points to this being a normal part of how SQL Server manages memory.
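If you want to see what SQL Server itself thinks it is holding (rather than what Task Manager shows), you can peek at the process memory DMV. This is just a quick sketch from memory for 2008 R2 and later, so double check the column names on your version:
--how much memory the SQL Server process is actually using, and whether it is under pressure
SELECT physical_memory_in_use_kb / 1024 AS physical_memory_in_use_mb,
       memory_utilization_percentage,
       process_physical_memory_low,
       process_virtual_memory_low
FROM sys.dm_os_process_memory;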
I did find a few articles via a Google search about this issue, and the one below not only explains the same thing I described above but also has a few ideas on how to force SQL Server to release memory without restarting the service, so I think it is worth a read. The parts you would be interested in are the last few posts in that thread.
here is the link
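If you need the memory back sooner, the usual trick (just a sketch, it assumes you have sysadmin rights, and the MB values below are placeholders you would pick for your server) is to temporarily lower 'max server memory' with sp_configure, which makes SQL Server trim its caches, and then set it back:
--temporarily cap SQL Server's memory so it trims its caches
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 2048; --placeholder lower cap
RECONFIGURE;
--wait a minute or two, then restore your normal cap
EXEC sp_configure 'max server memory (MB)', 8192; --placeholder normal cap
RECONFIGURE;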
Hopefully this will help you out:)
John
May 2, 2016 at 11:54 am
Thanks for the script.
May 2, 2016 at 5:10 pm
Hello -
Very nice script!
I have two questions:
1.) What is the difference, if any, between "tblqry" and "sqltext" in the results?
2.) Is there a way to conditionally print the errors?
SELECT 'Errors Encountered';
SELECT * FROM #TempBruteForce WHERE LEN(errortxt) > 0;
Thank You
May 2, 2016 at 10:43 pm
Yeah, originally they were slightly different. tblqry was the SQL for the counts, and sqltxt was the same query but with the column selected instead of the count, which let me copy and paste the result and see all the qualifying rows in that column of the table where the value was found. I removed that in the last update: previously the script looked at each column individually, which made it easy to turn the sqltxt value into a SELECT with the same WHERE clause as tblqry, but it made the run really slow on large databases. Once I changed it to its current format of doing all the columns of a table at once in the WHERE clause, I decided, for now, not to bring that back.
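Just to illustrate (this uses a made-up table and column, not real output), for a hit in a hypothetical [dbo].[Customers].[LastName] the old version produced something like:
tblqry: select count(1) from [dbo].[Customers] where [LastName] like '%smith%'
sqltxt: select [LastName] from [dbo].[Customers] where [LastName] like '%smith%'
whereas the current version builds one query per table with all of its qualifying columns OR'd together in the where clause.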
The error text can be formatted any way you like and return anything you like; it is all done in the single CATCH block, so it would be easy to modify and format. I just did not have much need to see the actual errors encountered while the script is running unless every table errors, and then it is a quick copy and paste from those results into a new query window to see what the error was.
With all that said, I do see how these would be nice features to have, so if you want, I can modify the script to include them and post it here for you to grab, or you can do it yourself. I'll leave that up to you.
Thanks
John
May 3, 2016 at 11:22 am
Hi John,
For the present time, I modified the end of the procedure as follows:
[font="Courier New"]CLOSE dbnames
DEALLOCATE dbnames
--SELECT 'Search Results';
SELECT * FROM #TempBruteForce
WHERE Matches > 0 OR LEN(errortxt) > 0
ORDER BY Matches DESC;
--SELECT 'Errors Encountered';
--SELECT * FROM #TempBruteForce WHERE LEN(errortxt) > 0;
I am curious as to what type of errors might be encountered. I suppose if a person did not have read rights on a table, that could trigger an error.
What I really need is an equivalent procedure for Oracle (11.2).
Tom
May 3, 2016 at 12:05 pm
I haven't had that one come up, but I imagine it's possible. Usually the errors are for computed columns or improper search terms, e.g. searching for a date that is not valid like 1-32-2016, and a host of other odd issues that, to be honest, I never pay much attention to. There actually aren't many errors that can be encountered with this, but rather than check every single possibility I decided to cheat and use TRY...CATCH, which lets it just work. I normally do care about error checking and the like; it's just that with this script being ad hoc and not for production, it seemed less important than finding the data I'm looking for. If I found nothing but had errors on some tables, I might look further into it, but 99.9% of the time the data is found and any errors are simply ignored.
I haven't worked with Oracle that much, so I don't have anything written for it, but I imagine it wouldn't be too much of a change, at least in the basic idea. Oracle has the data dictionary views to get the table and column info, and since cursors are a natural fit in Oracle, looping through them would be fairly easy as well; the code will look completely different, but the theory is the same. I did some googling for you and found a few; here is the search I did: https://www.google.com/#q=oracle+find+data+in+any+column. There seem to be more than a few scripts out there to look at as examples. Most do the same thing as mine: query up the tables and columns, build a dynamic query, execute it, and look for results. Hopefully that helps you out :)
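Just to sketch the idea (I have not tested this against Oracle, so treat it as a rough illustration only; 'yoursearchvalue' is a placeholder), those Oracle scripts generally start from the data dictionary and generate one count query per string column, which you then run by hand or wrap in a PL/SQL loop with EXECUTE IMMEDIATE:
--generate one count query per character column from the data dictionary
SELECT 'SELECT ''' || owner || '.' || table_name || '.' || column_name
       || ''' AS found_in, COUNT(*) AS cnt FROM "' || owner || '"."' || table_name
       || '" WHERE "' || column_name || '" LIKE ''%yoursearchvalue%'';' AS qry
FROM all_tab_columns
WHERE data_type IN ('CHAR','VARCHAR2','NVARCHAR2','NCHAR');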
John
May 3, 2016 at 12:16 pm
Thanks for doing the Google search for an Oracle method. I will certainly have a look, as that would be very useful for me at work.
I tried changing:
SELECT t.tbl,'select count(1) from ' + tbl + '
to this:
SELECT t.tbl,'select * from ' + tbl + '
but the result is that the errortxt column gets populated with an identical SQL statement. As long as the two columns "tblqry" and "sqltxt" are always the same, my preference would be to eliminate one of them and substitute a SELECT * for the SELECT COUNT(1).
Can I take you up on this offer, below? 🙂
"With all that said I do see how it would be cool to have these features, so if you want i can modify the script for you to include this stuff, and post it here for you to grab, or you can do it yourself."
May 3, 2016 at 1:04 pm
Yup, not a problem at all; here you go. The tblqry column now has the "select count(1)" query, and sqltxt now has the "select *" version. Errors are now formatted with the error info, and I also added a check so the error results are only returned when there is an error. If you want to see an error, just execute this on it:
EXEC DBBruteForce '01-32-1980','date',0,'contains';
It should error on every table with "Conversion failed when converting date and/or time from character string."
IF OBJECT_ID ( 'DBBruteForce', 'P' ) IS NOT NULL
DROP PROCEDURE [dbo].DBBruteForce;
GO
CREATE PROCEDURE [dbo].[DBBruteForce]
@searchvalue varchar(4000),
@type varchar(200),
@fuzzy BIT,
@fuzzyoperator VARCHAR(200)
AS
BEGIN
--set nocount on as there could be thousands of queries here
SET NOCOUNT ON;
--build up helper table for types used later to limit rows we will query
DECLARE @typelist TABLE(rowid INT IDENTITY, TYPE VARCHAR(255), typename varchar(255), PRIMARY KEY(type,typename));
--feel free to add more in here if you need; currently this only supports number, string and date
INSERT INTO @typelist(type,typename)
VALUES
('number','decimal'),
('number','numeric'),
('string','char'),
('number','smallint'),
('string','varchar'),
('date','datetime'),
('string','nvarchar'),
('string','text'),
('string','ntext'),
('number','int'),
('number','bigint'),
('date','smalldatetime'),
('number','float'),
('number','money');
--now remove the temp tables if they already exist
IF OBJECT_ID('tempdb..#TempCols') IS NOT NULL
DROP TABLE #TempCols;
IF OBJECT_ID('tempdb..#TempBruteForce') IS NOT NULL
DROP TABLE #TempBruteForce;
--create the temp table needed for the search
CREATE TABLE #TempCols(id INT IDENTITY PRIMARY KEY, tbl VARCHAR(255) NOT null, col VARCHAR(255) NOT null, TYPE varchar(255));
CREATE TABLE #TempBruteForce(id INT IDENTITY PRIMARY KEY, tbl VARCHAR(255) NOT null, tblqry Nvarchar(max), cnt INT DEFAULT (0),processed BIT DEFAULT(0),sqltxt Nvarchar(max),errortxt NVARCHAR(max));
--there shouldn't be a huge need for an index here so I'm skipping that
--now we have 2 separate ways to deal with this, one for strings and one for the rest
IF(@type = 'string')
BEGIN
INSERT INTO #TempCols(tbl,col,type)
SELECT DISTINCT '[' + ss.name + '].[' + t.name + ']' AS tblname, c.name AS colname, st.name AS coltype
FROM sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.types st ON st.system_type_id = c.system_type_id AND st.name != 'sysname'
INNER JOIN sys.schemas ss ON ss.schema_id = t.schema_id
WHERE EXISTS (SELECT 1 FROM @typelist WHERE type = @type AND typename = st.name)
AND (st.name IN ('ntext','text') OR c.max_length = -1 OR c.max_length >= LEN(@searchvalue)) --max_length = -1 covers (n)varchar(max) columns
--now let's combine these to lessen the load on the server, depending on what was requested, by grouping these by table
--and combining all the columns into one where clause; this reduces the number of searches to the number of tables with qualifying columns in them
--changing it a little to work with all the different options available
IF(@fuzzy = 1 AND @fuzzyoperator = 'beginswith')
BEGIN
--begins with search
INSERT INTO #TempBruteForce(tbl,tblqry,cnt,processed,sqltxt,errortxt)
SELECT t.tbl,'select count(1) from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',0,0,'select * from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',null
FROM #TempCols t
OUTER APPLY ( SELECT STUFF(( SELECT '|' + ('[' + tt.col + '] LIKE '''+@searchvalue+'%''')
FROM #TempCols tt
WHERE tt.tbl = t.tbl
FOR
XML PATH(''), TYPE ).value('.[1]', 'nvarchar(max)'
), 1, 1, '') AS csvvalue
) x
GROUP BY tbl,x.csvvalue
END
ELSE IF(@fuzzy = 1 AND @fuzzyoperator = 'endswith')
BEGIN
--ends with search
INSERT INTO #TempBruteForce(tbl,tblqry,cnt,processed,sqltxt,errortxt)
SELECT t.tbl,'select count(1) from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',0,0,'select * from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',null
FROM #TempCols t
OUTER APPLY ( SELECT STUFF(( SELECT '|' + ('[' + tt.col + '] LIKE ''%'+@searchvalue+'''')
FROM #TempCols tt
WHERE tt.tbl = t.tbl
FOR
XML PATH(''), TYPE ).value('.[1]', 'nvarchar(max)'
), 1, 1, '') AS csvvalue
) x
GROUP BY tbl,x.csvvalue
END
ELSE IF(@fuzzy = 0)
BEGIN
--string exact match, using like to work around text and ntext columns but with no wildcards
INSERT INTO #TempBruteForce(tbl,tblqry,cnt,processed,sqltxt,errortxt)
SELECT t.tbl,'select count(1) from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',0,0,'select * from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',null
FROM #TempCols t
OUTER APPLY ( SELECT STUFF(( SELECT '|' + ('[' + tt.col + '] LIKE '''+@searchvalue+'''')
FROM #TempCols tt
WHERE tt.tbl = t.tbl
FOR
XML PATH(''), TYPE ).value('.[1]', 'nvarchar(max)'
), 1, 1, '') AS csvvalue
) x
GROUP BY tbl,x.csvvalue
END
ELSE
BEGIN
--default to contains
INSERT INTO #TempBruteForce(tbl,tblqry,cnt,processed,sqltxt,errortxt)
SELECT t.tbl,'select count(1) from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',0,0,'select * from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',null
FROM #TempCols t
OUTER APPLY ( SELECT STUFF(( SELECT '|' + ('charindex('''+@searchvalue+''',[' + tt.col + '], 1) > 0')
FROM #TempCols tt
WHERE tt.tbl = t.tbl
FOR
XML PATH(''), TYPE ).value('.[1]', 'nvarchar(max)'
), 1, 1, '') AS csvvalue
) x
GROUP BY tbl,x.csvvalue
end
END
ELSE IF(@type = 'number')
BEGIN
--build up the columns for number search
INSERT INTO #TempCols(tbl,col,type)
SELECT DISTINCT '[' + ss.name + '].[' + t.name + ']' AS tblname, c.name AS colname, st.name AS coltype
FROM sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.types st ON st.system_type_id = c.system_type_id AND st.name != 'sysname'
INNER JOIN sys.schemas ss ON ss.schema_id = t.schema_id
WHERE EXISTS (SELECT 1 FROM @typelist WHERE type = @type AND typename = st.name)
--build up query texts
INSERT INTO #TempBruteForce(tbl,tblqry,cnt,processed,sqltxt,errortxt)
SELECT t.tbl,'select count(1) from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',0,0,'select * from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',null
FROM #TempCols t
OUTER APPLY ( SELECT STUFF(( SELECT '|' + ('[' + tt.col + '] = '+@searchvalue+'')
FROM #TempCols tt
WHERE tt.tbl = t.tbl
FOR
XML PATH(''), TYPE ).value('.[1]', 'nvarchar(max)'
), 1, 1, '') AS csvvalue
) x
GROUP BY tbl,x.csvvalue
END
ELSE IF(@type = 'date')
BEGIN
--build up TABLE AND col list OF datetime columns
INSERT INTO #TempCols(tbl,col,type)
SELECT DISTINCT '[' + ss.name + '].[' + t.name + ']' AS tblname, c.name AS colname, st.name AS coltype
FROM sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.types st ON st.system_type_id = c.system_type_id AND st.name != 'sysname'
INNER JOIN sys.schemas ss ON ss.schema_id = t.schema_id
WHERE EXISTS (SELECT 1 FROM @typelist WHERE type = @type AND typename = st.name)
--in this case we cast datetimes as dates to do a simple comparison, mainly to avoid a 2 millisecond difference causing a non-match.
--It's better to err on the side of more rows identified than fewer.
INSERT INTO #TempBruteForce(tbl,tblqry,cnt,processed,sqltxt,errortxt)
SELECT t.tbl,'select count(1) from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',0,0,'select * from ' + tbl + ' where 1 = 1 and ('+REPLACE(x.csvvalue,'|',' or ')+')',null
FROM #TempCols t
OUTER APPLY ( SELECT STUFF(( SELECT '|' + ('cast([' + tt.col + '] as date) = cast('''+@searchvalue+''' as date)')
FROM #TempCols tt
WHERE tt.tbl = t.tbl
FOR
XML PATH(''), TYPE ).value('.[1]', 'nvarchar(max)'
), 1, 1, '') AS csvvalue
) x
GROUP BY tbl,x.csvvalue
END
ELSE
BEGIN
--if the type can't be determined, raise an error and stop
RAISERROR ('Unknown type encountered! string, number, and date are the only accepted types', 16, 1) WITH NOWAIT;
RETURN;
END
--declarations
DECLARE @rowcnt INT;
DECLARE @operator VARCHAR(200);
DECLARE @where varchar(200);
DECLARE @sqltxt NVARCHAR(max);
DECLARE @sql NVARCHAR(max);
DECLARE @SQL_cnt NVARCHAR(max);
DECLARE @ParmDefinition nVARCHAR(500);
DECLARE @I INT;
DECLARE @id INT;
DECLARE @tbl varchar(255);
DECLARE @col VARCHAR(255);
DECLARE @fldtype VARCHAR(255);
DECLARE @counter INT;
DECLARE dbnames CURSOR LOCAL FORWARD_ONLY FAST_FORWARD FOR
SELECT id,tbl,tblqry,sqltxt FROM #TempBruteForce
OPEN dbnames
FETCH NEXT FROM dbnames INTO @id,@tbl,@sqltxt,@SQL
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @SQL_cnt = 'SELECT @IOUT = ('+@sqltxt+');';
--SELECT @sql = @sqltxt;
SET @ParmDefinition = N'@IOUT INT OUTPUT';
BEGIN TRY
EXEC sp_executesql
@SQL_cnt,
@ParmDefinition,
@IOUT=@I OUTPUT;
--get count results and update the temptable, mark as processed
UPDATE #TempBruteForce SET cnt = @I, sqltxt = @sql, processed = 1 WHERE id = @id;
--if @i is greater than 0 then let's print the query to the messages tab; this allows a preview of the data found before it finishes,
--since it can take an hour on large databases with large tables and lots of rows, and sometimes I don't like to wait.
IF(@I > 0)
BEGIN
RAISERROR (@SQL, 0, 1) WITH NOWAIT;
END
ELSE
BEGIN
--I like to at least print the table name for ones where nothing was found, mainly to show where it is in the process
DECLARE @temptxt VARCHAR(255);
SET @temptxt = @tbl + ': [no data found]'
RAISERROR (@temptxt, 0, 1) WITH NOWAIT;
end
END TRY
BEGIN CATCH
--in this case I am interested in what errored, so let's just record it back in the temp table as an error
DECLARE @errorstring NVARCHAR(max),@errornum INT, @errorseverity INT, @errorstate INT, @errorline INT, @errormsg NVARCHAR(4000)
SELECT @errornum = ERROR_NUMBER(),
@errorseverity = ERROR_SEVERITY(),
@errorstate = ERROR_STATE(),
@errorline = ERROR_LINE(),
@errormsg = ERROR_MESSAGE();
SET @errorstring = ISNULL(('Error occurred in table: '+@tbl+'' + CHAR(13) + CHAR(10)),'');
SET @errorstring = @errorstring + ISNULL(('Error Number: ' + CAST(@errornum AS nvarchar) + CHAR(13) + CHAR(10)),'')
SET @errorstring = @errorstring + ISNULL(('Error Severity: ' + CAST(@errorseverity AS nvarchar) + CHAR(13) + CHAR(10)),'')
SET @errorstring = @errorstring + ISNULL(('Error State: ' + CAST(@errorstate AS nvarchar) + CHAR(13) + CHAR(10)),'')
SET @errorstring = @errorstring + ISNULL(('Error Line: ' + CAST(@errorline AS nvarchar) + CHAR(13) + CHAR(10)),'')
SET @errorstring = @errorstring + ISNULL(('Error Message: ' + @errormsg),'')
PRINT('Error occurred in table: '+@tbl+'');
UPDATE t SET processed = 1,errortxt = @errorstring
FROM #TempBruteForce t
where t.id = @id;
END CATCH
FETCH NEXT FROM dbnames INTO @id,@tbl,@sqltxt,@SQL
END
CLOSE dbnames
DEALLOCATE dbnames
SELECT 'Search Results';
SELECT tbl ,tblqry ,cnt ,sqltxt FROM #TempBruteForce WHERE cnt > 0 ORDER BY cnt DESC;
IF(SELECT COUNT(1) FROM #TempBruteForce WHERE LEN(errortxt) > 0) > 0
BEGIN
SELECT 'Errors Encountered';
SELECT tbl ,tblqry ,errortxt FROM #TempBruteForce WHERE LEN(errortxt) > 0;
end
SET NOCOUNT OFF;
END
GO
--example usage-----------------------------------------------------------------
--string begins with
EXEC DBBruteForce 'yoursearchvalue','string',1,'beginswith';
--string ends with
EXEC DBBruteForce 'yoursearchvalue','string',1,'endswith';
--string contains
EXEC DBBruteForce 'yoursearchvalue','string',1,'contains';
--string exact match
EXEC DBBruteForce 'yoursearchvalue','string',0,'contains';
--date
EXEC DBBruteForce '01-31-1980','date',0,'contains';
--number
EXEC DBBruteForce 12348756453,'number',0,'contains';
May 3, 2016 at 1:16 pm
Excellent.
Thank You.
May 3, 2016 at 1:43 pm
Excellent.
Thank You.
May 4, 2016 at 9:20 am
I came up with a similar tool when trying to understand the structure of a database. Mine counts the number of occurrences of the searched item for each column in each table of the db.
If the database/application uses uniqueidentifiers to generate IDs, this can be very useful for reverse-engineering an undocumented database. Not so much when the IDs being searched on are integers.
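Roughly, the generator side of it looks something like this (just a sketch of the per-column idea, not my actual script; the search value and the type list are placeholders):
--generate one count query per character column; run the output to get per-column hit counts
DECLARE @search NVARCHAR(200) = N'yoursearchvalue';
SELECT 'SELECT ''' + s.name + '.' + t.name + '.' + c.name + ''' AS col, COUNT(*) AS cnt FROM ['
       + s.name + '].[' + t.name + '] WHERE [' + c.name + '] LIKE ''%' + @search + '%'';' AS qry
FROM sys.tables t
INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.types ty ON ty.user_type_id = c.user_type_id
WHERE ty.name IN ('char','varchar','nchar','nvarchar');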
Happy to find out I am not the only one who needed to find a specific value in the database when I did not know in which table and which column to look for.
May 4, 2016 at 3:25 pm
Yeah, I actually have the same thing over here, it's just generally slower than the one I posted. This script still gives you the counts, just not per qualifying column. It is still very helpful in giving me a good idea of how the data is laid out in a database, even in its current format. For example, lookup tables usually have a count of one, which lets me look at just those if all I'm trying to do is find the lookup for a column in the db. I don't use it many times a day, but it is one of the most important tools in my toolbox for importing a client's data into our systems over here, mainly because my clients are generally not database savvy and all they can ever seem to provide is a screenshot of some random data they want imported. So it's critical in giving me a way to find data, but also to get a picture of the structure of the data itself. I generally like to just look at the database and run some queries well before I do any script writing for an import; I find it saves a lot of time to have a good mental picture of the database and how it's laid out. So good to hear I'm not the only one needing this. 😀
May 5, 2016 at 8:11 am
Yup. In a database designed by a third party, where some user-added fields are labelled with non-meaningful names and filled with non-human-readable uniqueidentifier values, finding the table that had an exact count of 1 for a given value is PRECISELY what I was looking for. Hence the need for the count.
Having a count by column might be overdoing things a bit, but since such reverse-engineering is not a standard process in a production environment, I could afford to run it off hours without bothering anyone.
Regards