February 11, 2016 at 9:22 am
Does anyone have any views on what is the best file format to request data to be sent in?
We have a number of processes where files are sent to use via FTP from external entities such as suppliers, and customers.
Invariably, the files will come in a format other than as specified. For example, we specify a certain CSV format, with a specified delimiter etc. It often comes in with the columns in the wrong order but there is little that can be done about that.
I am wondering if it might be better to request the data in xlsx format for example?
February 11, 2016 at 10:08 am
mikefle (2/11/2016)
Does anyone have any views on what is the best file format to request data to be sent in?We have a number of processes where files are sent to use via FTP from external entities such as suppliers, and customers.
Invariably, the files will come in a format other than as specified. For example, we specify a certain CSV format, with a specified delimiter etc. It often comes in with the columns in the wrong order but there is little that can be done about that.
I am wondering if it might be better to request the data in xlsx format for example?
Regarding XLSX: don't do it. The Excel driver has a tendency to 'helpfully' adjust data from Excel spreadsheets based on its guess about what data type it is returning. The flat file option is easier.
On the other hand, if they can send the data in Access format, for example, all of the data typing has been done for you and therefore there are likely to be fewer errors.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
February 11, 2016 at 1:44 pm
Thanks for the warning Phil.
I remember now coming across that problem before and having to change the registry to disable the Jet driver from guessing based on the first x number of rows. Looks like it's one to avoid.
I am going to be very concise when specifying and hope for the best again.
Unfortunately it's an area prone to problems.
February 11, 2016 at 2:03 pm
mikefle (2/11/2016)
Does anyone have any views on what is the best file format to request data to be sent in?We have a number of processes where files are sent to use via FTP from external entities such as suppliers, and customers.
Invariably, the files will come in a format other than as specified. For example, we specify a certain CSV format, with a specified delimiter etc. It often comes in with the columns in the wrong order but there is little that can be done about that.
I am wondering if it might be better to request the data in xlsx format for example?
No, xlsx wont' solve your problem.
Best bet is send back reject messages to the offending party so they get that you need the files as specified in the transmission specs.
Send them a copy of the specs every time too, that helps.
February 11, 2016 at 3:32 pm
mikefle (2/11/2016)
Does anyone have any views on what is the best file format to request data to be sent in?We have a number of processes where files are sent to use via FTP from external entities such as suppliers, and customers.
Invariably, the files will come in a format other than as specified. For example, we specify a certain CSV format, with a specified delimiter etc. It often comes in with the columns in the wrong order but there is little that can be done about that.
I am wondering if it might be better to request the data in xlsx format for example?
If your data providers cannot produce a simple flat-file correctly and consistently then all hope might be lost...
If you want to make sure data is good before you even try loading it, ask them to start providing XML files with an accompanying XSD. The XSD defines the format of the XML file including elements, attributes, types, max lengths, default values, etc. A nice side benefit of XML is that element-order (i.e. elements map to database columns) is not important. Also, new elements can be added to the XML schema without breaking existing imports and you get all the nice benefits I mentioned about guaranteed valid file structure and data types.
Like I said though, if they cannot provide a simple flat-file properly then XML, XLSX, pretty much requesting anything else might be a lost cause. You may just want to build a very dynamic file loader that can deal with column-order and data-type problems gracefully. You can do this kind of dynamic loading in scripting languages like PowerShell or programming languages like C# (stand-alone or within SSIS) or even raw T-SQL with some surrounding command-line tools but whichever the technology you select you'll be doing a lot of custom coding to deal with the ground moving under your feet because your data providers are not competent at producing data files.
Best case is to start trying to manage the data providers, working with them more closely, and putting incentives and deterrents into place in an attempt to get them to improve their deliveries.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply