February 10, 2023 at 4:14 pm
Hello everyone, it's my first post here, and I'm no expert on SQL, so I hope you're able to help, and will forgive my ignorance, I'm keen to learn!
I'm trying to import around 100 txt files into a SQL table, for processing ahead of bringing into PowerBI. I've built a query, and a layout file, and have the data, which I've attached (data sample). When I run the query, I get this error:
(17 rows affected)
ClientName_PerformanceAndLoyalty_2019041250.txt
BULK INSERT dbo.SalesHistory FROM 'C:\SQLData\WRsales\ClientName_PerformanceAndLoyalty_2019041250.txt' WITH(
FORMATFILE = 'C:\SQLData\WRsales\SalesFMT.fmt',
FIRSTROW=1
);
Msg 4862, Level 16, State 1, Line 1
Cannot bulk load because the file "C:\SQLData\WRsales\SalesFMT.fmt" is incomplete or could not be read. Operating system error code (null).
Completion time: 2023-02-10T16:00:40.9714899+00:00
The query I'm trying to run is:
DECLARE @filepath varchar(100) = 'C:\SQLData\WRsales\'
,@pattern varchar(100)= 'Waitrose_PerformanceAndLoyalty_*.txt'
,@tableName varchar(100)='dbo.SalesHistory'
DECLARE @query varchar(1000)
DECLARE @numfiles int
DECLARE @filename varchar(100)
DECLARE @files TABLE (SourceFileName varchar(200) NULL)
SET @query = 'master.dbo.xp_cmdshell "dir ' + @filepath+@pattern + ' /b"'
INSERT @files(SourceFileName)
EXEC (@query)
DECLARE CUR CURSOR FAST_FORWARD FOR
SELECT SourceFileName FROM @files WHERE SourceFileName IS NOT NULL
SET @numfiles = 0
OPEN CUR
FETCH NEXT FROM CUR INTO @Filename
WHILE (@@FETCH_STATUS=0)
BEGIN
print @filename
SET @numfiles+=1
SET @query = ('BULK INSERT ' + @tableName
+ ' FROM ''' + @filepath+@filename + ''' WITH(
FORMATFILE = ''C:\SQLData\WRsales\SalesFMT.fmt'',
FIRSTROW=1
);'
)
PRINT @query
EXEC (@query)
FETCH NEXT FROM CUR INTO @filename
END
CLOSE CUR
DEALLOCATE CUR
Any ideas what I'm doing wrong, or where I should start to look to improve this? Beyond that, I'm hoping to add the filename as a column to each row in the data, so that I can connect the dates to the dataset. The final step will be to add new files to this dataset as they appear.
Any guidance on part or all of this would be so gratfully received!
February 10, 2023 at 4:18 pm
Here is the Create Table Query, incase it should be useful...
USE [DBName]
GO
/****** Object: Table [dbo].[SalesHistory] Script Date: 10/02/2023 16:16:45 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[SalesHistory](
[Store_Code] [numeric](18, 0) NULL,
[Product_Code] [nchar](10) NULL,
[Sales_Quantity] [numeric](18, 0) NULL,
[Sales_Value] [decimal](18, 0) NULL,
[Margin] [decimal](18, 0) NULL,
[Spoiled_Value] [decimal](18, 0) NULL,
[Promo_Sales] [decimal](18, 0) NULL,
[Branch_Loyalty] [decimal](18, 0) NULL,
[Estate_Loyalty] [decimal](18, 0) NULL,
[Branch_Loyalty_Score] [decimal](18, 0) NULL,
[Estate_Loyalty_Score] [decimal](18, 0) NULL,
[Customers_In_Risk] [decimal](18, 0) NULL,
[Reliance] [decimal](18, 0) NULL,
[Total_Sales_Volume_Online] [decimal](18, 0) NULL,
[Total_Sales_Volume_Branch] [decimal](18, 0) NULL,
[t14] [nchar](10) NULL,
[t15] [nchar](10) NULL,
[t16] [nchar](10) NULL,
[t17] [nchar](10) NULL,
[t18] [nchar](10) NULL,
[t19] [nchar](10) NULL
) ON [PRIMARY]
GO
February 10, 2023 at 4:53 pm
Format files are notoriously tricky to get right.
However with your sample file and sample table you can import the data easily using
bulk insert saleshistory from 'c:\temp\sh.txt' with
(
fieldterminator ='|',
rowterminator = '\n'
)
However you will have a problem as you have decimal values in the data
726|780289|1|4|1.87|0|1|||||||0|1|||||||
Your margin here is rounded to 2 due to the column not allowing decimal places
It will be the same for other columns also where the column doesn't allow decimal places.
It may be worth while looking at the data some more and redesigning the table to allow for the correct scale and precision on the decimal columns.
February 10, 2023 at 5:05 pm
Hi, good spot for the decimal places, I've added this into the Margin column. Rounding to 2dp is sufficient for now, and I'll actually be dropping all of the columns to the right of Margin in the final version of this data in any case.
I've got the import to work nicely for a single file:
BULK INSERT dbo.SalesHistory
FROM 'C:\SQLData\WRsales\Client_PerformanceAndLoyalty_2019041252.txt'
WITH
(
FIELDTERMINATOR= '|',
ROWTERMINATOR= '0x0A'
)
I just need to work out how to get it to do 100 files...
February 10, 2023 at 5:14 pm
That’s where xp_dirtree would come in handy
Something like the below would be the way to do it. It mentions csv but you can tweak it to whatever file extension you need.
February 10, 2023 at 8:05 pm
Where are the files located? Sounds like a strange question but you have a file path of 'C:\SQLData\WRsales\' for both the data and format files. Understand that file path is trying to find the path on the C: Drive of the SQL Server. You either need to use a URL with a machine reference or setup a share name where the files are being stored. Since you're using BULK INSERT, the person running the code must also have privs to that share.
There are some pretty cool workarounds to the Privs issues but I don't know enough about your system(s) to be able to help there.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 10, 2023 at 8:07 pm
p.s. And I disagree... getting FORMAT files right isn't really a fright in my book.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 10, 2023 at 8:40 pm
So just follow the instructions for it in an article like this?
Create a Format File (SQL Server) - SQL Server | Microsoft Learn
February 10, 2023 at 9:36 pm
That's a good start.
When I was hot and heavy into doing a shedload of them for a whole bunch of different files, I wrote some T-SQL to read the first row (which contained the column names) and had it create a CREATE TABLE statement from that header and the BCP Format file to support it. After that, things are pretty easy. It even figured out if it was tab or comma delimited and a couple of other things.
I'll see if I can find that code but I've gone through a few machines since then. I don't know if I still have it. It wasn't that difficult to write and just about anyone that can figure out how to import the first line of a file should be able to knock out their own version.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 10, 2023 at 10:25 pm
I wrote some T-SQL to read the first row (which contained the column names) and had it create a CREATE TABLE statement from that header and the BCP Format file to support it. After that, things are pretty easy. It even figured out if it was tab or comma delimited and a couple of other things.
I did the exact same thing with Excel spreadsheet, you sure you didn’t steal my idea 🤣
Far away is close at hand in the images of elsewhere.
Anon.
February 11, 2023 at 12:35 am
Jeff Moden wrote:I wrote some T-SQL to read the first row (which contained the column names) and had it create a CREATE TABLE statement from that header and the BCP Format file to support it. After that, things are pretty easy. It even figured out if it was tab or comma delimited and a couple of other things.
I did the exact same thing with Excel spreadsheet, you sure you didn’t steal my idea 🤣
I have one of those, as well. It also automatically adjusts for changes (adding dates, duplicated column names, etc, as spreadsheets often do) and reads them in auto-magically into an EAV. The number of columns can vary and it also automatically figures out where the row labels are. Fun stuff. We should get together on a video meeting someday and compare notes.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 14, 2023 at 9:00 am
Hi Jeff, thank you so much for your replies, I'm just working through them now!
The files are stored on the C drive of the server at the moment: (the client has a very laborious approval process for direct access to data, so it's taking some time to get sorted! In the meantime, I have 100 sample files on the local disk, which I'm testing on. The end game is to get it all into PowerBI for analysis...
February 14, 2023 at 11:46 am
This was removed by the editor as SPAM
February 14, 2023 at 12:49 pm
Still battling away with this.... latest query is:
DECLARE @filepath varchar(100) = 'C:\SQLData\WRsales\'
,@pattern varchar(100)= 'Waitrose_PerformanceAndLoyalty_*.txt'
,@tableName varchar(100)='dbo.SalesHistory'
DECLARE @query varchar(1000)
DECLARE @numfiles int
DECLARE @filename varchar(100)
DECLARE @files TABLE (SourceFileName varchar(200) NULL)
SET @query = 'master.dbo.xp_cmdshell "dir ' + @filepath+@pattern + ' /b"'
INSERT @files(SourceFileName)
EXEC (@query)
DECLARE CUR CURSOR FAST_FORWARD FOR
SELECT SourceFileName FROM @files WHERE SourceFileName IS NOT NULL
SET @numfiles = 0
OPEN CUR
FETCH NEXT FROM CUR INTO @Filename
WHILE (@@FETCH_STATUS=0)
BEGIN
print @filename
SET @numfiles+=1
SET @query = ('BULK INSERT ' + @tableName
+ ' FROM ''' + @filepath+@filename + ''' WITH(
FORMATFILE = ''C:\SQLData\WRsales\SalesFMT.fmt'',
FIRSTROW=1
);'
)
PRINT @query
EXEC (@query)
FETCH NEXT FROM CUR INTO @filename
END
CLOSE CUR
DEALLOCATE CUR
The error is
(17 rows affected)
Waitrose_PerformanceAndLoyalty_2019041250.txt
BULK INSERT dbo.SalesHistory FROM 'C:\SQLData\WRsales\Waitrose_PerformanceAndLoyalty_2019041250.txt' WITH(
FORMATFILE = 'C:\SQLData\WRsales\SalesFMT.fmt',
FIRSTROW=1
);
Msg 4862, Level 16, State 1, Line 1
Cannot bulk load because the file "C:\SQLData\WRsales\SalesFMT.fmt" is incomplete or could not be read. Operating system error code (null).
Completion time: 2023-02-14T12:43:50.1070035+00:00
Any further help would be greatly appreciated...
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply