Time to shift off of configurations and environments and talk more about SSIS packages and best practices. If you haven’t been following this series and are just joining in, we are at day 20 of the 31 Days of SSIS. Now that is out of the way, let’s move on.
Today, we are going to discuss a recent best practice that I’ve been promoting surrounding the use of stored procedures as source objects for data flows. Or to put it more succinctly that in SSIS packages data sources should use table-values functions instead of stored procedures for the SQL Command text.
Best Practice Roots
A couple months ago, I came across the following error message:
SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0×80004005.
An OLE DB record is available. Source: “Microsoft OLE DB Provider for SQL Server” Hresult: 0×80004005 Description: “Protocol error in TDS stream”.
SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component “OLE DB Source” (7) returned error code 0xC0202009. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
The odd thing about this error is that it only occurred when executing the SSIS package through DTEXEC or SQL Agent. Running the package in BIDS presented no issue and the package was always successful. Redeploy the package and the same issue occurred.
After some research, I found a post on Protocol error in TDS stream. From there I went to the post Stored procedures are not parameterized views by Adam Machanic (Blog | @AdamMachanic). And finally, I checked out Jamie Thompson’s (Blog | @JamieT) post SSIS: Using stored procedures inside an OLE DB Source component.
To summarize what these posts talk about, stored procedures do not have a defined output schema. The output can change depending on the parameters supplied. Furthermore, the metadata that SSIS uses may not be based on the execution path that you intend to us when SSIS package executes.
Output Contract
When SSIS packages are built there needs to be a reasonable expectation that the SQL command text for the data source will provide the columns that the SSIS package requires. It would be great if stored procedures could be designed in a fashion that allowed for one and only one output. This isn’t the case and there are no indications that this will change and probably no true need for it to change.
There is, though, a defined metadata output for table-valued functions. As they say in the Highlander, “there can be only one.” Whether the table-valued function is inline or multi-line, there is always a defined set of columns that will be returned. The SSIS data source knows specifically what will be returned from the function.
It’s still true that someone can change the table-valued function and cause similar issues to occur after deployment. There is the insurance that side effects from table changes will not negatively affect the execution of the SSIS package with the obscure message above that seems more ominous that it actually is.
Best Practice Wrap-Up
Having read this, you may be wondering if this is much ado about nothing. If I hadn’t uncovered this issue a few months back I might be thinking the same time. I’ve used SSIS since SQL Server 2005 was released and that was the first time I recall having this issue.
There was time between now and when I originally encountered this issue. With one of my clients there was a major release between when I first encountered the error and now. Would you be surprise to find out that we ran into the same issue three more times? Working on these issues after a release is no picnic and had I followed the advice I give now these issues likely would not have happened.
Guess what I recommended so that we wouldn’t deal with these again in the future? Now, that you’ve heard my recommendation, what do you think?
Related posts: