June 22, 2015 at 9:25 am
Hi Everyone
I have a strange error that I am getting randomly on some environments but not on others that I can not figure out. It almost feels like a bug inside SSIS.
To give a bit of background, I have an SSIS package that takes a bunch of data from numerous SQL Server DBs and loads it into a staging SQL Server environment, which then takes that data and inserts/updates it into a PostgreSQL server. The SSIS package is deployed onto the staging environment into the integration services catalog and executed via a Job.
The insert into PostgreSQL is done with a data flow task using OLE DB Source (for SQL Server) and ADO.NET Source (for PostgreSQL). Then the data is merge joined and a conditional split is done for any records that do not exist in the destination. The data is inserted into PostgreSQL using an ADO.NET Destination connection. That part works beautifully.
The part which is causing me an ulcer is the update of the records which exist in PostgreSQL which may have changed in the source. That is done using an execute SQL task through a linked server. The query is executed on the SQL Server and PostgreSQL would be the linked server.
The are about 7 tables which I am ETLing over and the SQL tasks which update the changed data could be triggered at the same time for all 7 tables, which is causing the problem I believe. If I change the package so that those tasks execute one after another then I do not get the issue.
Let's look at two simple tables that I am generally getting the error on. The update on those two tables seem to execute in parallel the most which is why those error out regularly.
CREATE TABLE [dbo].[floor](
[global_fac_id] [varchar](100) NOT NULL,
[floor_id] [int] NOT NULL,
[floor_desc] [varchar](1000) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[unit](
[global_fac_id] [varchar](100) NULL,
[unit_id] [int] NOT NULL,
[unit_desc] [varchar](1000) NOT NULL
) ON [PRIMARY]
On the Execute SQL Tasks I use the expression builder to build the query so that it is dynamic per environment. The dynamic part is the ODBC DSN (@[User::postgres_odbc_dsn]) for the PostgreSQL which can be configured to hit different servers.
"UPDATE fd
SET fd.floor_desc = fs.floor_desc
From [" + @[User::postgres_odbc_dsn] +"].[dbname].[public].[floor] fd
INNER Join [floor] fs ON fd.org_floor_id = fs.floor_id
and fd.global_fac_id = fs.global_fac_id
Where fd.floor_desc <> fs.floor_desc"
When I evaluate the expression, and run the query manually inside SSMS it always runs fine. But when I run the package it throws an error. This is what I get:
Update PostgreSQL Floor: Error: OLE DB Provider "MSDASQL" for linked server "LOAD-ETL-DB01" returned message "ERROR: column "floor_desc" of relation "unit" does not exist: Error while executing query".
Similarly, sometimes the table and column are flipped:
Update PostgreSQL Unit: Error: OLE DB Provider "MSDASQL" for linked server "LOAD-ETL-DB01" returned message "ERROR: column "unit_desc" of relation "floor" does not exist: Error while executing query".
Now the error "column_name" of relation "table_name" means that the column does not exist in the table. Which is fine, because unit_desc certainly does not exist in table floor and vice versa. But where is SSIS getting that query from? It seems like when SSIS is running and evaluation the expression the two different update queries are getting mixed up together somehow. It makes absolutely no sense.
Has anyone ever encounter an issue like this? I would love to figure out what is causing this because it is driving me crazy.
I have been googling this for days trying to find something similar without much use. The only thing I was able to find that is something similar is https://connect.microsoft.com/SQLServer/feedback/details/332372/ssis-variable-expressions-dont-always-evaluate . However that bug report was for SQL Server 2008R2 and lower. I am running this on SQL Server 2012 SP1 so I would assume that issue would have been resolved in 2012 version since it's marked as complete for 2008.
Thanks in advance for your help.
June 22, 2015 at 12:10 pm
Instead of creating an expression, I would create a variable and maybe make sure that it has been correctly populated before using it in a SQL Task.
June 23, 2015 at 8:21 am
Do you mean create a variable and populate the linked server DSN within the variable?
That's what I ended up doing along with changing it to be serialized rather then in parallel. This seems to have fixed the issue but I still don't understand the overlying problem. That's what's bugging me.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply