Identifying the problem
I am in a unique situation where I work: I wear two hats. Of course, I'm the local DBA, doing everything SQL from administration to development. But I also wear the MCSE hat. As such, my CIO asks me all sorts of things, from "Can you fix the email and give me reports on usage?" to, more recently, "Can you tell me how best to clean up the Home Drives folder, which is full?"
Identifying the solution
Interestingly enough, a solution to the first question spawned a solution to the second. I had already developed some code to read IMAP files into SQL in order to produce email reports (a subject for another article). When the CIO came to me with the problem of the home directory drive, I immediately thought of the IMAP solution as a tool to help me help him. The home directory was a 60 GB partition on our NT 4.0 cluster server (yes, I know, NT4 goes out of support at the end of the year, but "We're planning on having it upgraded by then..." Yeah, right). This 60 GB partition was the home directory for over 300 employees and had been running virtually unmanaged for over six years. An attempt was made, at one point, to implement disk quotas, but it didn't work for various technical reasons. We knew that there were old files out there. We knew that much of the data could be archived to CD... but how to intelligently pick out that data? How could we tell which users had the oldest files? Which users had the largest? Answering each of these questions with Windows Explorer would take 20 minutes. Change one parameter, and you invest another 20 minutes just to find out the parameter didn't give you what you wanted. Enter SQL.
Developing the solution
I quickly adapted the code I had written to read IMAP email files from a hierarchical mail folder structure so that it would read all the files on the home drives. In a nutshell, I used the DOS command DIR to do a recursive search of the home drive. Luckily, I'm a domain administrator, so I had all the access permissions I needed. If you're not a domain admin, you might try having your admin staff add the SQL Server Agent account to the Backup Operators group; that group usually has all the read permissions you'll need. I captured the output of the recursive dir ("DIR \\FILESERVER\HOME$\*.* /S") and read it into a SQL temp table. This I then parsed into a SQL table variable (faster than writing to disk). Once the data was parsed, I inserted it into a permanent table and the script ended.
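Stripped to its essentials, the capture step looks like this (the full script appears under "The Code..." below; the UNC path here is a placeholder, and running XP_CMDSHELL requires appropriate rights):

-- Capture raw DIR output, one line of text per row.
-- "\\FILESERVER\HOME$" is a placeholder; substitute a share you can read.
CREATE TABLE #DOS_RTN (DOS_RTN VARCHAR(2000))

INSERT INTO #DOS_RTN
EXEC MASTER..XP_CMDSHELL 'DIR "\\FILESERVER\HOME$\*.*" /S'

Everything after that is just parsing text that happens to live in a table.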
Implementing the solution
\\ccathers\
\\ccathers\dev folders\
\\ccathers\Articles\
\\ccathers\dev folders\project Autotime\
A sample of the folder structure on the Home Drive
Luckily for me, my admin staff had been very consistent in creating the home folders. Each user's folder was named for their network ID, and this 'user root' folder sat in the root of the home drive. I could therefore pattern-search the output, taking everything from the second character position up to the first occurrence of a backslash ('\') as the user (see the example above). The rest of the folder tree was user created and held the pathing information needed to find files, so I wanted to retain that as well.
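In T-SQL that parse is a one-liner. This is the same expression the reporting scripts below use; by the time a path lands in the FILE_LIST table, the leading '\\server\share\' portion has already been trimmed off, so the owner is simply everything before the first remaining backslash:

-- Everything before the first backslash is the user's network ID;
-- everything after it is the user-created folder tree.
SELECT CASE
           WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
           ELSE M_PATH
       END AS OWNER,
       CASE
           WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, CHARINDEX('\', M_PATH) + 1, LEN(M_PATH))
           ELSE 'ROOT'
       END AS PATH
FROM FILE_LIST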
Once I had pulled the data in, I was able to create some reports that the admin staff just ate up. I could tell them which files made up the largest 10 percent on the disk, who owned them, how old they were, and how much space could be reclaimed if they were deleted.
Another report ranked the oldest files. From it I produced a report showing which of the oldest files would have to be deleted to reclaim 30 GB, half of the drive's space. (Try that one with a Windows search.)
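That "reclaim X bytes" variation isn't spelled out in the scripts below, but the idea is a running total over the files ordered by date. One sketch of it, using a correlated subquery (fine for SQL 7.0/2000, though it is an O(n-squared) query, so give it time on large tables, and it assumes the M_DATE strings cast cleanly to DATETIME; pre-1753 dates would need the same CASE guard the reports below use):

-- Oldest files whose combined size stays under 30 GB: for each file,
-- sum the sizes of everything at least as old, and keep the rows where
-- that running total has not yet passed the target.
SELECT F.M_PATH, F.M_FILE, F.M_DATE, F.M_SIZE
FROM FILE_LIST F
WHERE (SELECT SUM(CAST(REPLACE(F2.M_SIZE, ',', '') AS FLOAT))
       FROM FILE_LIST F2
       WHERE CAST(F2.M_DATE AS DATETIME) <= CAST(F.M_DATE AS DATETIME))
      <= 30000000000.0
ORDER BY CAST(F.M_DATE AS DATETIME)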
I was also able to provide reports that ranked users by the number of files in their home folder, by the size of their home folder, or by who had the oldest data.
The time savings were tremendous. It took just 27 minutes to gather the data on 276,933 files in 3,827 folders, and the reports produced with the power of T-SQL saved over two hours compared to using Windows Explorer.
Now, I recognize that Active Directory, SMS, or any number of third-party packages could do the same thing. But we are still an NT 4.0 shop: no SMS (yet), and no interest from the CIO in spending more money on third-party solutions. I ran the data gathering for this report on a Friday afternoon. On Monday, at 9:30 am, the home drive filled to capacity and users started having problems with applications like Outlook (don't yell at me, I didn't put the PST files there!). I was able to produce the reports and hand them to the admin staff, who in turn could make informed decisions about which files to delete and which users had to be notified to clean up their home drives.

There is talk of running this weekly or monthly to monitor the home drives until we can implement Windows 2000 or 2003 disk quotas. I have also entertained requests for collecting data on other drives. We have a SOX documentation folder that Admin wanted me to start tracking. There is also a document retrieval system our engineering department created that stores and prints PDF files; an automated system constantly adds files and folders to its structure, and the engineering staff has expressed interest in my solution. It takes them an hour and a half to gather data on 190,000 files in 30,000 folders. My system gathered the same data in 13 minutes and 5 seconds. That's when I realized that others might be interested in what I had done.
The Code…
The first statement after the title comments, bracketed by equal signs (=================================================), is where you change the path to one that is relevant in your network. When I ran this script I did so from the Northwind database context, so the script automatically created a new permanent table, FILE_LIST, in Northwind and dumped the data into it. You, however, can run the script from any database you see fit. SQL 7.0 users: you will need to do a find-and-replace of @FILE_LIST with #FILE_LIST to make this script 7.0 compliant (and change the DECLARE @FILE_LIST TABLE statement to a CREATE TABLE #FILE_LIST). The performance benefit of a table variable is lost, though, and the savvy coder may want to reconfigure the script to write directly to the permanent table instead.
-------------------------- Script to gather data ----------------------------------------------
/*****************************************************************
** Name        : Using SQL to do File Searches.
**
** Description : Script will use a DOS command to read a directory
**               structure, then parse the output into a permanent
**               table the script creates.
**
**               For SQL 7.0, either replace the table variable
**               @FILE_LIST with a temp table #FILE_LIST, or move
**               the creation of the permanent table FILE_LIST to
**               the beginning of the script and write directly to
**               it. (The table variable @FILE_LIST, being memory
**               resident, was used to increase performance.)
**
** Written By  : Chris Cathers 10/1/04
**
** Uses Tables : FILE_LIST
**
** Parameters  :
**
** Returns     :
**
** Modifications:
**
*****************************************************************/
SET NOCOUNT ON

DECLARE @CMD VARCHAR(255),
        @PATH VARCHAR(255),
        @DOS_RTN VARCHAR(255)

DECLARE @FILE_LIST TABLE (
        M_PATH VARCHAR(255),
        M_DATE VARCHAR(22),
        M_SIZE VARCHAR(16),
        M_FILE VARCHAR(215) )

CREATE TABLE #DOS_RTN (DOS_RTN VARCHAR(2000))

/*+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  Get the list of all files from the home folders
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*/
-- =================================================
-- This is where you enter the path to the drive you
-- want to generate reports on.
SET @CMD = 'DIR "\\jk-user\home$\*.*" /S'
-- =================================================

INSERT INTO #DOS_RTN
EXEC MASTER..XP_CMDSHELL @CMD

-- =================================================
-- The cursor's select statement will strip irrelevant
-- data from the raw DOS return, and only return values
-- that are valid for parsing,
-- such as files and directory paths.
-- =================================================
DECLARE PARSE_DOS_LIST CURSOR FOR
    SELECT LTRIM(DOS_RTN)
    FROM #DOS_RTN
    WHERE SUBSTRING(LTRIM(DOS_RTN), 1, 12) = 'Directory of'
       OR (DOS_RTN NOT LIKE '% BYTES%'
           AND DOS_RTN NOT LIKE '% FILES %'
           AND SUBSTRING(DOS_RTN, 25, 5) = ' ')

OPEN PARSE_DOS_LIST
FETCH NEXT FROM PARSE_DOS_LIST INTO @DOS_RTN
WHILE @@FETCH_STATUS = 0
BEGIN
    IF SUBSTRING(LTRIM(@DOS_RTN), 1, 12) = 'Directory of'
        -- NOTE: the offset 30 skips 'Directory of \\jk-user\home$\';
        -- adjust it to match the length of your own path.
        SET @PATH = SUBSTRING(LTRIM(@DOS_RTN), 30, 255)
    ELSE
        -- DIR detail lines are fixed width: date/time, size, then name
        INSERT INTO @FILE_LIST
        VALUES (@PATH,
                SUBSTRING(@DOS_RTN, 1, 22),   -- date and time stamp
                SUBSTRING(@DOS_RTN, 23, 16),  -- file size
                SUBSTRING(@DOS_RTN, 40, 215)) -- file name
    FETCH NEXT FROM PARSE_DOS_LIST INTO @DOS_RTN
END
CLOSE PARSE_DOS_LIST
DEALLOCATE PARSE_DOS_LIST

-- =================================================
-- The script will create the permanent table FILE_LIST
-- and dump the parsed data into it.
-- =================================================
CREATE TABLE FILE_LIST (
        M_PATH VARCHAR(255),
        M_DATE VARCHAR(22),
        M_SIZE VARCHAR(16),
        M_FILE VARCHAR(215) )
TRUNCATE TABLE FILE_LIST

INSERT INTO FILE_LIST
SELECT * FROM @FILE_LIST

DROP TABLE #DOS_RTN

-- =================================================
-- Take a look at the parsed data!
--
-- SELECT * FROM FILE_LIST
--
-- Left commented for performance
-- =================================================
-------------------------- End Data gathering Script ----------------------------------------------
Here are some reporting scripts that can be quickly adapted to your individual needs. As posted here, the script will do nothing: read the header of each section to determine whether it is the report you want, and if so, uncomment that block of code (highlight the block and press CTRL-SHIFT-R). The table variable at the beginning of the script is common to every block, so don't comment it out or delete it. SQL 7.0 users: do a find-and-replace of @FILE_LIST with #FILE_LIST to make this script 7.0 compliant.
-------------------------- Data Reporting Scripts -------------------------------------------------
/*****************************************************************
** Name        : File Analysis
**
** Description : Various reports that can be run against the
**               directory data that has already been gathered
**               and parsed into the SQL table FILE_LIST.
**
**               To use, uncomment one section at a time, to get
**               the report you want.
**
** Written By  : Chris Cathers 10/4/04
**
** Uses Tables :
**
** Parameters  :
**
** Returns     :
**
** Modifications:
**
*****************************************************************/
USE NORTHWIND
-- =================================================
-- I had dumped the file data into the Northwind database.
-- You can change that to any database you like.
--
-- The table variable @FILE_LIST is used in all the reports
-- below. For SQL 7.0 systems, change this to a temp table.
-- =================================================
DECLARE @FILE_LIST TABLE (
        OWNER VARCHAR(25),
        PATH VARCHAR(255),
        DATESTAMP DATETIME,
        FILE_SIZE VARCHAR(16),
        FILE_NAME VARCHAR(215) )

SET NOCOUNT ON

-- /*************************************************
--  FILES BY SIZE
--
--  Returns the top 10 percent biggest files.
-- *************************************************/
--
-- INSERT INTO @FILE_LIST
-- SELECT TOP 10 PERCENT
--        CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--            ELSE M_PATH
--        END AS OWNER,
--        CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, CHARINDEX('\', M_PATH) + 1, LEN(M_PATH))
--            ELSE 'ROOT'
--        END AS PATH,
--        CASE
--            WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
--            ELSE M_DATE
--        END AS M_DATE,
--        M_SIZE,
--        M_FILE
-- FROM FILE_LIST
-- ORDER BY M_SIZE DESC
--
-- SELECT 'TOP 10%' AS OWNER,
--        'BIGGEST FILES WILL =' AS PATH,
--        '' AS DATE_STAMP,
--        CASE
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000000 AS VARCHAR(16)) + ' Gb'
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000 AS VARCHAR(16)) + ' Mb'
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000 AS VARCHAR(16)) + ' Kb'
--            ELSE CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) AS VARCHAR(16)) + ' BYTES'
--        END AS FILE_SIZE,
--        'THAT CAN BE DROPPED' AS FILE_NAME
-- FROM @FILE_LIST
-- UNION ALL
-- SELECT * FROM @FILE_LIST

-- /*************************************************
--  FILES BY DATE
--
--  This will rank files by date. The WHERE clause can
--  be adapted to stop at a date, or at a total file size
--  on disk (that way a report of the oldest files
--  comprising X bytes of disk space can be generated).
--  In its present form, the report returns the oldest
--  X percent of files.
-- *************************************************/
--
-- INSERT INTO @FILE_LIST
-- SELECT TOP 45 PERCENT
--        CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--            ELSE M_PATH
--        END AS OWNER,
--        CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, CHARINDEX('\', M_PATH) + 1, LEN(M_PATH))
--            ELSE 'ROOT'
--        END AS PATH,
--        CASE
--            WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
--            ELSE M_DATE
--        END AS M_DATE,
--        M_SIZE,
--        M_FILE
-- FROM FILE_LIST
-- ORDER BY M_DATE
--
-- SELECT 'TOP 45%' AS OWNER,
--        'OLDEST FILES WILL =' AS PATH,
--        NULL AS DATE_STAMP,
--        CASE
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000000 AS VARCHAR(16)) + ' Gb'
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000 AS VARCHAR(16)) + ' Mb'
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000 AS VARCHAR(16)) + ' Kb'
--            ELSE CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) AS VARCHAR(16)) + ' BYTES'
--        END AS FILE_SIZE,
--        'THAT CAN BE DROPPED' AS FILE_NAME
-- FROM @FILE_LIST
-- UNION ALL
-- SELECT * FROM @FILE_LIST

-- /*************************************************
--  FILES BY TYPE
--
--  This will return the path and name of all files
--  on the disk with a certain extension. Currently the
--  extension must include the period (.) and cannot be
--  longer than 4 characters. This, of course, can be
--  changed to suit individual needs. The output will
--  also include how much space the given extension
--  consumes.
--  NOTE: A special adaptation of this query, returning
--  multiple file extensions at once, appears at the end
--  of this article.
-- *************************************************/
--
-- DECLARE @FILE_TYPE VARCHAR(5)
-- SET @FILE_TYPE = '.JPG'
--
-- INSERT INTO @FILE_LIST
-- SELECT CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--            ELSE M_PATH
--        END AS OWNER,
--        CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, CHARINDEX('\', M_PATH) + 1, LEN(M_PATH))
--            ELSE 'ROOT'
--        END AS PATH,
--        CASE
--            WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
--            ELSE M_DATE
--        END AS M_DATE,
--        M_SIZE,
--        M_FILE
-- FROM FILE_LIST
-- WHERE RIGHT(M_FILE, LEN(@FILE_TYPE)) = @FILE_TYPE
-- ORDER BY M_SIZE DESC
--
-- SELECT 'FILE OF FILETYPE' AS OWNER,
--        @FILE_TYPE + ' WILL =' AS PATH,
--        NULL AS DATE_STAMP,
--        CASE
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000000 AS VARCHAR(16)) + ' Gb'
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000 AS VARCHAR(16)) + ' Mb'
--            WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000 AS VARCHAR(16)) + ' Kb'
--            ELSE CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) AS VARCHAR(16)) + ' BYTES'
--        END AS FILE_SIZE,
--        'THAT CAN BE DROPPED' AS FILE_NAME
-- FROM @FILE_LIST
-- UNION ALL
-- SELECT * FROM @FILE_LIST

-- /*************************************************
--  LARGEST FOLDER, BY USER
--
--  Ranks users (folders) by the number of files they
--  hold.
--  NOTE: By adjusting the ORDER BY clause, this can be
--  adapted to rank users by size on disk, or even
--  rank users by who has the largest folder with the
--  oldest data!
-- *************************************************/
--
-- INSERT INTO @FILE_LIST
-- SELECT CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--            ELSE M_PATH
--        END AS OWNER,
--        COUNT(M_PATH) AS PATH,
--        MAX(CASE
--                WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
--                ELSE M_DATE
--            END) AS M_DATE,
--        CASE
--            WHEN SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) > 1000000000 THEN CAST(SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) / 1000000000 AS VARCHAR(16)) + ' Gb'
--            WHEN SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) > 1000000 THEN CAST(SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) / 1000000 AS VARCHAR(16)) + ' Mb'
--            WHEN SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) > 1000 THEN CAST(SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) / 1000 AS VARCHAR(16)) + ' Kb'
--            ELSE CAST(SUM(CAST(REPLACE(M_SIZE, ',', '') AS FLOAT)) AS VARCHAR(16)) + ' BYTES'
--        END AS FILE_SIZE,
--        '' AS M_FILE
-- FROM FILE_LIST
-- GROUP BY CASE
--              WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--              ELSE M_PATH
--          END
--
-- SELECT OWNER,
--        DATESTAMP AS MOST_RECENT_DATE,
--        FILE_SIZE AS FOLDER_SIZE,
--        PATH AS FILE_COUNT
-- FROM @FILE_LIST
-- ORDER BY CAST(PATH AS INT) DESC

-- /*************************************************
--  USER LOOKUP, FILE LIST
--
--  Once a specific user or folder has been identified,
--  find all the files and paths in that folder.
-- *************************************************/
--
-- DECLARE @USER_NM VARCHAR(50)
-- SET @USER_NM = 'NSHAEFFER'
--
-- SELECT CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--            ELSE M_PATH
--        END AS OWNER,
--        CASE
--            WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, CHARINDEX('\', M_PATH) + 1, LEN(M_PATH))
--            ELSE 'ROOT'
--        END AS PATH,
--        CASE
--            WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
--            ELSE M_DATE
--        END AS DATESTAMP,
--        M_FILE,
--        M_SIZE
-- FROM FILE_LIST
-- WHERE CASE
--           WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
--           ELSE M_PATH
--       END = @USER_NM
-- ORDER BY CASE
--              WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
--              ELSE M_DATE
--          END DESC
-------------------------- End Data Reporting Scripts -------------------------------------------------
This last bit of code is a specialized report that returns data on multiple extensions at once. It does, however, require two custom functions I have written, FN_PATCOUNT and FN_REPETITIVE_STR_PARSE; I've written a previous article about their functionality and use.
Make your modifications in the section bracketed by (=========== MODIFY THIS STRING ONLY ========), right after the comments at the top. The format is '.ext1, .ext2, .extN...'.
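If you don't have those two functions from the earlier article handy, the minimal stand-ins below match the signatures this script calls. They are sketches of the behavior the script needs, not the originals, so check them against your own copies before relying on them:

-- Stand-in for FN_PATCOUNT: returns the number of items in a delimited
-- list (occurrences of the delimiter plus one), so the WHILE loop in the
-- script below visits every extension. Sketch only; the original may differ.
CREATE FUNCTION DBO.FN_PATCOUNT (@PATTERN VARCHAR(10), @STRING VARCHAR(255))
RETURNS TINYINT
AS
BEGIN
    RETURN (LEN(@STRING) - LEN(REPLACE(@STRING, @PATTERN, ''))) / LEN(@PATTERN) + 1
END
GO

-- Stand-in for FN_REPETITIVE_STR_PARSE: returns the Nth delimited item,
-- trimmed of surrounding spaces. Sketch only; the original may differ.
CREATE FUNCTION DBO.FN_REPETITIVE_STR_PARSE
    (@STRING VARCHAR(255), @DELIM VARCHAR(10), @IDX TINYINT)
RETURNS VARCHAR(255)
AS
BEGIN
    DECLARE @I TINYINT
    SET @I = 1
    -- Discard tokens until the Nth one is at the front of the string
    WHILE @I < @IDX AND CHARINDEX(@DELIM, @STRING) > 0
    BEGIN
        SET @STRING = SUBSTRING(@STRING, CHARINDEX(@DELIM, @STRING) + LEN(@DELIM), 255)
        SET @I = @I + 1
    END
    -- Cut off everything from the next delimiter onward
    IF CHARINDEX(@DELIM, @STRING) > 0
        SET @STRING = SUBSTRING(@STRING, 1, CHARINDEX(@DELIM, @STRING) - 1)
    RETURN LTRIM(RTRIM(@STRING))
END
GO

With these in place, the sample string '.JPG, .IMG, .GIF, .MP3, .JPEG' yields five rows in the script's @TYPE_LIST table.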
-------------------------- Special Multi-Extension Report ----------------------------------------------
/*****************************************************************
** Name        : Extension Report
**
** Description : Will return a list of all files of the given
**               extension types. The file extension string needs
**               to be .(extension) entries separated by commas.
**
**               NOTE: Uses the custom functions FN_PATCOUNT and
**               FN_REPETITIVE_STR_PARSE.
**
** Written By  : Chris Cathers 10/4/04
**
** Uses Tables :
**
** Parameters  :
**
** Returns     :
**
** Modifications:
**
*****************************************************************/
/*************************************************
 FILES BY TYPE
*************************************************/
USE NORTHWIND
SET NOCOUNT ON

DECLARE @FILE_TYPE VARCHAR(255),
        @EXT_IDX TINYINT,
        @EXT_CNT TINYINT

SELECT @FILE_TYPE =
-- =========== MODIFY THIS STRING ONLY ========
       '.JPG, .IMG, .GIF, .MP3, .JPEG'
-- =========== MODIFY THIS STRING ONLY ========
       ,@EXT_CNT = DBO.FN_PATCOUNT(',', @FILE_TYPE),
       @EXT_IDX = 1

DECLARE @TYPE_LIST TABLE (FILE_TYPE VARCHAR(10))

DECLARE @FILE_LIST TABLE (
        OWNER VARCHAR(25),
        PATH VARCHAR(255),
        DATESTAMP DATETIME,
        FILE_SIZE VARCHAR(16),
        FILE_NAME VARCHAR(215) )

-- Split the extension string into one row per extension
WHILE @EXT_IDX <= @EXT_CNT
BEGIN
    INSERT INTO @TYPE_LIST
    SELECT DBO.FN_REPETITIVE_STR_PARSE(@FILE_TYPE, ',', @EXT_IDX)
    SET @EXT_IDX = @EXT_IDX + 1
END

DECLARE Recursive_file_search CURSOR FOR
    SELECT * FROM @TYPE_LIST

OPEN Recursive_file_search
FETCH NEXT FROM Recursive_file_search INTO @FILE_TYPE
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO @FILE_LIST
    SELECT CASE
               WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, 1, CHARINDEX('\', M_PATH) - 1)
               ELSE M_PATH
           END AS OWNER,
           CASE
               WHEN CHARINDEX('\', M_PATH) > 0 THEN SUBSTRING(M_PATH, CHARINDEX('\', M_PATH) + 1, LEN(M_PATH))
               ELSE 'ROOT'
           END AS PATH,
           CASE
               WHEN SUBSTRING(M_DATE, 7, 4) < 1753 THEN '1/1/1753'
               ELSE M_DATE
           END AS M_DATE,
           M_SIZE,
           M_FILE
    FROM FILE_LIST
    WHERE RIGHT(M_FILE, LEN(@FILE_TYPE)) = @FILE_TYPE
    ORDER BY M_SIZE DESC

    FETCH NEXT FROM Recursive_file_search INTO @FILE_TYPE
END
CLOSE Recursive_file_search
DEALLOCATE Recursive_file_search

SELECT 'FILE OF FILETYPE' AS OWNER,
       TL.FILE_TYPE + ' WILL =' AS PATH,
       NULL AS DATE_STAMP,
       CASE
           WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000000 AS VARCHAR(16)) + ' Gb'
           WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000000 AS VARCHAR(16)) + ' Mb'
           WHEN SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) > 1000 THEN CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) / 1000 AS VARCHAR(16)) + ' Kb'
           ELSE CAST(SUM(CAST(REPLACE(FILE_SIZE, ',', '') AS FLOAT)) AS VARCHAR(16)) + ' BYTES'
       END AS FILE_SIZE,
       'THAT CAN BE DROPPED' AS FILE_NAME
FROM @FILE_LIST FL
INNER JOIN @TYPE_LIST TL
        ON TL.FILE_TYPE = RIGHT(FL.FILE_NAME, LEN(TL.FILE_TYPE))
GROUP BY TL.FILE_TYPE
UNION ALL
SELECT * FROM @FILE_LIST
-------------------------- End Multi-Extension Report ----------------------------------------------
Conclusions
As I have demonstrated here, SQL is a very powerful tool that can be adapted to many different situations. A little out-of-the-box thinking never hurts, and SQL is flexible enough to take you where you want to go. All you need to remember, as a DBA, is that computers excel at processing data, that much of what is stored on a computer is data of one form or another, and that SQL is one of the most powerful tools for gathering and processing data!
Sometimes the hardest part is determining how to get the data into SQL so you can manipulate it. In this case, I took a DOS command from a bygone era of computing and integrated it with one of the most powerful modern tools, solving a problem in a way that neither could alone.
Anyone looking to implement disk quotas on an existing drive could use these scripts: pick a quota threshold, then find all the users who exceed it. Anyone with an automated system storing files on a drive, who wants to gather data on that file system automatically, can use them too.
In short, if you have ever wished you could use a SQL query in a Windows search, these scripts are for you!