Load folder full of csv files

  • Do you ever use the tool from dbatools Import-Excel for Powershell?  I tried creating csv's out of them then loading the csv

    to SQL, but some columns contained commas in descriptions and that thru off the loading sequence.  Don't I need to load the ACE driver on the server to use that technique, I was trying to find a fool proof method for loading these xls files. The are pretty standard files maybe 10 columns..

    Thanks

  • Yes, you would need to download and install the ACE drivers.  If you use the quiet mode, you can do it without have to make the mistaken of uninstalling all 32 bit apps.  Having people export their files to CSV is a bit like asking a porta-potty transporter to white-wash your fence.

    No.  I've not used the DBATools for any work with spreadsheets.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I used to try to solve such quest from within TSQL, as some others suggest.

    But nowadays, I keep it out of TSQL using Powershell dbatools !

    #of course I do have error handling and progress bar in place. This is just an example !
    foreach ( $csv in (Get-ChildItem -Path $SourceFolder -Filter '*.csv')) {
    Import-DbaCsv -path $csv -SqlInstance mySQLServer -SqlCredential $cred `
    -Database myDatabaseName -Schema myTableSchema -Table myTableName `
    -ColumnMap $ColumnsMap -NotifyAfter 10 -erroraction Stop ;
    }

    • This reply was modified 3 years, 5 months ago by  Johan Bijnens.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Can you send a more full blown version, and how I get(download) dbatools. I went to the page but didn't see how to download.

    Many thanks!!!!

  • Bruin wrote:

    Can you send a more full blown version, and how I get(download) dbatools. I went to the page but didn't see how to download.

    Many thanks!!!!

    You mean you haven't actually done a "google" on "DBATools", yet?  I suggest that be the next thing you do if you're actually interested and then read all the stuff on the site.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Found it.. it was under Getting Started .. not Downloads..

    Thanks.

  • @jeff

    Had a strange error processing file from server...

    It was the first file that had data...

    StartDT: 15 Jul 2021 08:44:42:777

    Working on file 25 of 89 files: G:\line_speed\F303-Line10.csv...

    Msg 2627, Level 14, State 2, Line 2

    Violation of PRIMARY KEY constraint 'PK__#Staging__060785E055D567F6'. Cannot insert duplicate key in object 'dbo.#Staging'. The duplicate key value is (1900-01-01 00:00:00).

    If I copied all the files back to my local machine and ran it it worked fine....

     

  • The file I'm processing has yesterday's dates...

  • Bruin wrote:

    @Jeff

    Had a strange error processing file from server...

    It was the first file that had data...

    StartDT: 15 Jul 2021 08:44:42:777 Working on file 25 of 89 files: G:\line_speed\F303-Line10.csv... Msg 2627, Level 14, State 2, Line 2 Violation of PRIMARY KEY constraint 'PK__#Staging__060785E055D567F6'. Cannot insert duplicate key in object 'dbo.#Staging'. The duplicate key value is (1900-01-01 00:00:00).

    If I copied all the files back to my local machine and ran it it worked fine....

    Ok... I'm  confused by a couple of things...

    First, you say that "It was the first file that had data".  Are you saying that the first 24 files it tried to load had absolutely no data in them?

    If you go back and look at the code for the creation of the #Staging table, you'll see that the first column has a PK on the the first column, which is the time stamp column for the rows.  There error is saying that it ran into duplicate time stamps in the file.  It's also saying that the time stamp was for the 1st of January 1900, which is the time stamp for a blank or "0" value.

    You going to have to load the file into the #staging table without the PK and then troubleshoot what's going on in the file.  If you want, ZIP the file up and attach it and I'll take a look.

    You also say "If I copied all the files back to my local machine and ran it it worked fine....".  I understandably have no idea why that may be.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I had a file that just 10 records all of them had a date of 7/14/2021 with different timestamps. I could see all of the records in NotePad++ so I tried to delete 3 rows at the bottom and try it... Same Error PK violation..  So I deleted all but the 1st row after the header and it loaded the 1 record. I'm attaching the script I use to dump the data from the SQL table.

    If I copy that same file all 10 records to my local machine and run the process it loads all 10 records..

    $server = "(local)"
    $database = "LineData"
    $tablequery = "SELECT name from sys.tables where name like 'f3%'"

    #Delcare Connection Variables
    $connectionTemplate = "Data Source={0};Integrated Security=SSPI;Initial Catalog={1};"
    $connectionString = [string]::Format($connectionTemplate, $server, $database)
    $connection = New-Object System.Data.SqlClient.SqlConnection
    $connection.ConnectionString = $connectionString

    $command = New-Object System.Data.SqlClient.SqlCommand
    $command.CommandText = $tablequery
    $command.Connection = $connection

    #Load up the Tables in a dataset
    $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
    $SqlAdapter.SelectCommand = $command
    $DataSet = New-Object System.Data.DataSet
    $SqlAdapter.Fill($DataSet)
    $connection.Close()
    # Loop through all tables and export a CSV of the Table Data
    foreach ($Row in $DataSet.Tables[0].Rows)
    {
    $queryData = "SELECT stamp_time,wirespeed FROM [$($Row[0])] where convert(date,stamp_time) = '07/14/2021'"

    #Specify the output location of your dump file
    $extractFile = "g:\line_speed\$($Row[0]).csv"

    $command.CommandText = $queryData
    $command.Connection = $connection

    $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
    $SqlAdapter.SelectCommand = $command
    $DataSet = New-Object System.Data.DataSet
    $SqlAdapter.Fill($DataSet)
    $connection.Close()

    $DataSet.Tables[0] | Export-Csv $extractFile -NoTypeInformation
    }
  • Could the issue be OS related our not running from the root C:\???

    very strange error of the PK on the temp table  .. data looks correct

  • Probably not the operating system.  Have you checked the files in the two different places for things like...

    1. Are the both using the same thing when it comes to Unicode and Ansi, etc? It shouldn't be a problem but at least look to see if both machines are using the same code page.
    2. Are both files using the exact same row terminators?

    On your latest script above, did you run that on both machines and get the same results?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    1. yes
    2. Yes

    3 yes

    It's weird that If I run my PS script on SERVER then copy files down to my local PC they load. If  I copy files from my local PC

    back to SERVER the LOAD fails with PK error

    Not sure how to debug..

    Thanks,

  • The only difference is my Local PC I load off the c:\line_dumps on the SERVER there is a drive defined E: that I write the files from the PS script and load from there.

  • I found the issue it's the define of the Temp Table:

    stamp_time DATETIME2(0) NOT NULL PRIMARY KEY CLUSTERED

    If I just Define as datetime it works fine now...

    THanks.

     

Viewing 15 posts - 31 through 45 (of 45 total)

You must be logged in to reply to this topic. Login to reply