March 26, 2014 at 9:05 am
Normally, I receive text files from vendors in your typical CSV, Pipe, or fixed width formats. Not a problem; however, this time a vendor changed formats and they put a single row of information in multiple rows. Lets see if I can explain well....
Rows 1 and 4 have a digit at the end which signifies they go together, and everything between those rows goes along with it. I need them all on the same row. Rows 2 and 3 (called Grantor, and Grantee) don't have anything to tie them back to the other rows. I could write a proc to tie them all together, but I can't do that until I get it into the db. This has to be done in Sql Server, no .Net support is available (don't know .Net anyway.)
What I did was import to a table with no delimiters, just 1 column. I also added an Identity column with Unique constraint. Now Rows 1 through 4 are numbered, and verified that they match the original text file. I then Created 4 tables. Row 1 data is Document Data so that is seperated out very easily. Row 4 data is Property, also seperated out easily. Rows 2 and 3, same thing, easy to seperate.
Rows 1 and 4 have a number on the end that I can use to join them together into one row. I created a query to pull in the other rows (2 and 3) where the ID number is between the document.ID and the Property.ID numbers. This appears to work, I've checked it a couple dozen times across multiple files from the vendor and haven't found one wrong yet. However... I need some sort of documentation that demonstrates that the flat file comes into SQL Server in the same order it occurs in the table.
Anyone?
Greatly appreciated.
Thanks, Crusty.
March 26, 2014 at 9:18 am
In the past I've been told to specify MAXDOP of 1 if I've needed to import large files and keep the sequence, because SQL Server may decide to run parallel threads, which means that data may not be loaded in the sequence it was stored in the flat file.
I've never had time to prove that one way or the other but I might have to go and have a closer look now.
March 26, 2014 at 9:22 am
Can you go into more detail about MAXDOP? I don't believe I've seen that, or noticed it, in BIDS containers.
March 26, 2014 at 9:37 am
CptCrusty1 (3/26/2014)
Can you go into more detail about MAXDOP? I don't believe I've seen that, or noticed it, in BIDS containers.
In BIDS (which you didn't mention you were using) the only way you can specifiy a MAXDOP will be by executing a SQL script, that has it specified within it. Although I've never played with the settings, I believe BIDS (which I assume you mean SSIS) has MaxConcurrentExecutables and EngineThreads which aren't quite the same.
However, I'm prepared to be corrected by those with more experience in SSIS than myself.
March 26, 2014 at 11:29 am
As Brain mentioned the only way you can specify MAXDOP from within SSIS is to use it with SQL Query. MAXDOP =1 means disabling parallelism.
Check this ..
http://technet.microsoft.com/en-us/library/ms181007(v=sql.105).aspx
You can do this may be .. Disable Max Degree of parallelism on SQL Server before the import and set it back to its original value after the import.
But you need to test this to see if Paralleism setting alters the import order.
--
SQLBuddy
March 26, 2014 at 11:33 am
@Brain... Do you have a friend named Pinky?
Ok, so I figured I'd use an Execute SQL Task to set the param to 1; however, is this going to affect the entire server, or just the instance that I'm using at that time. That could have drastic consequences.
Crusty.
March 26, 2014 at 11:54 am
It will affect the only the query if you specify it as a Query Hint. If you configure that at server level, it affects the whole instance.
--
SQLBuddy
March 26, 2014 at 11:56 am
I can't configure it in a query hint since it's not in a query. I'm using a Data Flow Task in BIDS to import from a Flat File (Flat File Connection to the text file) and pushing straight to a table.
March 26, 2014 at 12:29 pm
CptCrusty1 (3/26/2014)
I can't configure it in a query hint since it's not in a query. I'm using a Data Flow Task in BIDS to import from a Flat File (Flat File Connection to the text file) and pushing straight to a table.
Then you should probably try this approach ..
You can do this may be .. Disable Max Degree of parallelism on SQL Server before the import and set it back to its original value after the import.
One more question .. At what time do you do the load and how long does it take ? And do have admin rights on SQL Server to do this change ?
--
SQLBuddy
March 27, 2014 at 2:25 am
CptCrusty1 (3/26/2014)
I can't configure it in a query hint since it's not in a query. I'm using a Data Flow Task in BIDS to import from a Flat File (Flat File Connection to the text file) and pushing straight to a table.
You'd have to replace the Data Flow Task with a script task, where the MAXDOP could be specified (which would only impact this query). I wouldn't be keen on setting MAXDOP across the entire server.
Another option is to use the Data Flow Task to put the data into a 'staging' table and then write code to load the final destination in the manner you require.
Beyond these suggestions we'd have to see some example data to play with.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply