Find Large Files on OS - Results of DIR to Table
Purpose: To identify files over 40 GB that are in the filepath sprcified and its subdirectories. We had an issue where backups would fail sporadically with an error indicating they were out of space. However, when reviewing the disk in the morning, there was clearly enough space for a backup to complete. The intention with this is to schedule it to run hourly to identify a new large file that shows up that would hinder the backups completion. Put the results of DIR in a table for easy querying.
Alternative Usage: Modify the code to be able to pass in a size value to find out the largest files for a given path. How often have you asked the question, what is using up all the space on my D drive? This would allow you to find that out.
PreRequisites: Create a database named utilities or else change the database name in the script. Create a table named LargeFiles in Utilities or else rename it. The script for this table creation is commented out in the header. After compiling and executing, do a select on utilities.dbo.largefiles to see what it found.
Cautions: see various other articles on the usage of xp_cmdshell
I welcome enhancement suggestions and hope this will start you on the path to accomplishing what you need to by putting the results of the DIR command into a table.
USE [Utilities]
GO
Create Procedure RecordLargeOSFiles @Filepath VarChar(300) AS
--Created: 6/5/2008
--Author: Brandon Hunter
--Purpose: Identify files over 40 GB that are in the filepath and its
-- subdirectories. We had an issue where backups would fail
-- sporadically with an error indicating they were out of space.
-- The intention with this is to schedule it to run hourly to
-- identify a new large file that shows up that would hinder
-- the backups completion.
--Required: Database named Utilities
-- Table named LargeFiles:
-- USE [Utilities]
-- GO
-- /****** Object: Table [dbo].[LargeFiles] Script Date: 06/05/2008 08:31:27 ******/
-- SET ANSI_NULLS ON
-- GO
-- SET QUOTED_IDENTIFIER ON
-- GO
-- SET ANSI_PADDING ON
-- GO
-- CREATE TABLE [dbo].[LargeFiles](
-- [FilePath] [varchar](2000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
-- [FileName] [varchar](500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
-- [FileSize] [bigint] NULL,
-- [FileDate] [datetime] NULL,
-- [InsertDate] [datetime] NULL
-- ) ON [PRIMARY]
--
-- GO
-- SET ANSI_PADDING OFF
--Example: Exec RecordLargeOSFiles 't:\'
BEGIN
Declare @DosCMD VarChar(1000)
--Pull out all filenames with path only
Select @DosCMD = 'dir ' + @FilePath + ' /A-S /O-S /S /B'
--Create Table to hold the FileNames and FullPath
CREATE table #FilePaths (FilePaths varchar(2000))
INSERT INTO #FilePaths Exec master..xp_cmdshell @DosCMD
--Add Filenames column
ALTER TABLE #FilePaths Add [FileName] VarChar(2000)
--Extract the FileName
update #FilePaths
set [FileName] = substring(FilePaths,LEN(FILEPATHS) - CHARINDEX('\',REVERSE(FilePaths)) + 2, LEN(FILEPATHS) )
--Remove Null FilePaths
delete #FilePaths where FilePaths is null
--Pull out all information provide by DOS command DIR
Select @DosCMD = 'dir ' + @FilePath + ' /A-S /O-S /S'
--Create Table to hold the DIR info from OS
CREATE table #DriveLarge (Dirres varchar(2000))
INSERT INTO #DriveLarge Exec master..xp_cmdshell @DosCMD
--Remove Null records
delete #DriveLarge where dirres is null
--Remove Directories
delete #DriveLarge where dirres like '%<DIR>%'
--Remove Summary
delete #DriveLarge where dirres LIKE '%FILE(S)%'
--Remove Volume
delete #DriveLarge where dirres LIKE '%volume%'
--Remove extra spaces multiple times
UPDATE #DRIVELARGE
SET DIRRES = replace(dirres, SPACE(2), SPACE(1))
where dirres not like '%directory%'
UPDATE #DRIVELARGE
SET DIRRES = replace(dirres, SPACE(2), SPACE(1))
where dirres not like '%directory%'
UPDATE #DRIVELARGE
SET DIRRES = replace(dirres, SPACE(2), SPACE(1))
where dirres not like '%directory%'
UPDATE #DRIVELARGE
SET DIRRES = replace(dirres, SPACE(2), SPACE(1))
where dirres not like '%directory%'
UPDATE #DRIVELARGE
SET DIRRES = replace(dirres, SPACE(2), SPACE(1))
where dirres not like '%directory%'
--Put the #DriveLarge data in a new temp table for searching
select substring(dirres,1,19) As FileDate,
substring(dirres,21,CHARINDEX(' ',DIRRES,21)-21) As FileSize,
SUBSTRING(dirres,CHARINDEX(' ',DIRRES,21),LEN(dirres)) As FileName
INTO #FileScan
FROM #DriveLarge
where dirres not like '%directory%' AND dirres <> ' Total Files Listed:'
and dirres not like '%dir(s)%'
--Remove commas from filesize
update #FileScan
Set FileSize = Replace(FileSize,',','')
--Record it in db table
INSERT INTO [Utilities].[dbo].[LargeFiles]
([FilePath]
,[FileName]
,[FileSize]
,[FileDate]
,[InsertDate])
select #FilePaths.FilePaths, #FilePaths.[FileName], Convert(BigInt, FileSize) FileSize, Convert(DateTime, FileDate), GetDate()
from #FilePaths
join #FileScan on Rtrim(Ltrim(#FilePaths.[FileName])) = Rtrim(Ltrim(#FileScan.[FileName]))
where Convert(BigInt, FileSize) > 40000000000 --4GB
END