September 29, 2021 at 7:00 pm
Hi all,
I have a requirement to read csv files skipping the First X Rows and Last Y Rows using OPENROWSET (preferably) or BULK INSERT.
We don't have control over the incoming files but the patterns are fixed. Ex: File1.CSV has always 3 rows at the top & 10 rows at the end that we need to skip.
I understand there may be other tools (ex: SSIS etc.) that are best suited to handle these types of issues but I am curious to see if other members faced this issue and how you have overcome those issues.
Both these have a LASTROW option but it requires us to know the row count upfront. Ideally, there is an option like LASTROW = -10 which skips the last 10 rows.
Appreciate your help.
Thanks
September 29, 2021 at 9:59 pm
This was removed by the editor as SPAM
September 30, 2021 at 2:55 am
You can do something like the following link contains to "pre-process" a file. If you can use xp_CmdShell (it's NOT the security risk most will say it is if you do it right), it can all be under the control of SQLServer from an SQL Agent Job or whatever.
Another way to do it would be to bulk insert it as whole rows and then pluck what you need. If that's a CSV or TSV file, you export "the good stuff" to a file and then import it as a delimited file directly into a table. Or, you could split it once you've imported the whole rows. Handling stray quotes like what Excel produces when a delimiter is found in a cell is pretty easy to handle, as well. And, I also use the ACE drivers and a little T-SQL dity that will read spreadsheets and auto-magically self-adjust for monly changes as new columns are added, etc, without even having to look at the spreadsheet.
If you don't actually know how to do the things I suggest, it's a bit too much to go through on a thread like this without a sample file to work with.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 30, 2021 at 3:51 am
Hi Jeff,
Thanks for your note. Can you please reshare the link? It didn't come through in your post.
Here are two sample data sets that I made up:
Objective: Has Header. Ignore First Row & Last Five Rows and load the file into SQL
-- Example 1:
month: 9/1/2021 - 9/28/2021
hdr1,hdr2,hdr3,hdr4, hdr5
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
16,17,18,19,20
data filters: filter1, filter2, filter3
report name: dummy report
4 data rows
sales $1000
produced: 9/29/2021
-- Example 2: No header, Inconsistent quotes. Assumption: Format file will be supplied
month: 9/1/2021 - 9/28/2021
1,2,3,4,5,100 state st
6,7,8,9,10, "3755, president drive"
11,12,13,14,15, 200 state st
16,17,18,19,20, 300 state st
data filters: filter1, filter2, filter3
report name: dummy report
4 data rows
sales $1000
produced: 9/29/2021
Thanks!
October 5, 2021 at 6:33 am
This was removed by the editor as SPAM
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply