July 11, 2010 at 7:25 am
Hi,
Would need some advice and guidelines on how to maintain & execute the stored procedures.
Let's start with a scenario
In a sales & production table, there have some attributes are sharing the its data to each other table. In current practice, sales & production tables has its own SSIS package to load the data.
So, my question is that how do I handle the stored procedures to execute if these 2 tables got any new data coming in?
Hope this scenario is clear to you all. Please advise how should I handle and execute the Stored Proc automatically, instead of run it manually.
Thanks!
July 11, 2010 at 8:51 am
This scenario isn't clear at all. What do you mean attributes sharing? ARAIK, Attributes are not a term that is defined as a particular thing in SQL Server. Do you mean lookup tables, or data in table that is used by two different groups of tables?
In terms of SSIS packages, why wouldn't you execute the stored procedures from the packages when they ran?
July 11, 2010 at 9:00 am
Hi,
Sorry for confusing.
Okay, I mean that the column fields in Sales is required some value from Production table to do the calculation. And now, the calculation is processed in Stored Procedures.
The problem that I cant include the stored procedure into its own package (i.e. Sales Package & Production Package) due to the data load may at different time.
Hope this is clear enough for you to understand my problem. Please advise. Thank you!
July 11, 2010 at 9:02 am
It's not clear, and part of it is language. That's OK, but what I'm wondering, is how things are linked and what is the timing here.
Can you give an example, using specific tables/fields/packages to explain what you mean?
July 11, 2010 at 9:17 am
Okay, let's we talk about the tables now.
1. 2 tables : one is Sales table, the other one is Production Cost table
The relationship between these 2 tables are:
From Sales table, it has its own sales quantity, sales amount by its product type. As for Production Cost table, it has product type and its original cost per unit.
So, you can see that if we want to see the profit of a specific product, we can use [ Sales.Sales_amt - (ProductionCost.Cost * Sales.Sales_qty)] , am i right?
[In summary, here we can see the amount of our original cost vs our sales in Product Type A]
2. As in SSIS , there are 2 different existing projects(one is for Sales table, another one is for Production Cost table) ,which is used for data load.
The problem that I face is that I cant include the my stored procedure (sp_SalesProfit) into any one of the SSIS project (either to be in Sales project or Production Cost project).
So, is there any other method to automate the stored procedure, rather than execute it manually?
Please advise.
Thank you!
July 11, 2010 at 9:21 am
I don't see a place you're storing the profit. Is that correct?
If you aren't storing it, why do you need to calculate it after a load? Usually a client requests this, and then wouldn't the client run the stored proc?
If you are storing it, then why not add it to both packages? Does it hurt if you recalculate some values when only one side has changed? Wouldn't this ensure that no matter when you looked at the data it would be correct? If Production hasn't changed, then recalculating things will only change those items that have changed in sales, and vice versa.
July 11, 2010 at 9:25 am
Hi,
Yes, i need to store it in both tables.
I may test it out by include the stored procedure into these 2 SSIS projects. Yet, is there any other solution of doing this? like triggers or functions ? or any others? Apologizes for asking so many questions, as I'm new in this.
Please advise. Thanks!
July 11, 2010 at 9:33 am
Trigger will fire for each batch update. So depending on how SSIS updates the tables, you might end up slowing the package.
If you have a datechanged for each row, you can use that to run the stored procedure for those rows that have changed since some date.
Ultimately, is this a long process? Does it take time/resources to run? How many rows? If it's not millions, or tens of millions, or run every few minutes, I'd just recalculate both sides.
July 11, 2010 at 9:38 am
Thanks for sharing...
It's manufacturing industry, probably will goes up to millions or even billions.
I think trigger will cause the worst performance compared to recalculate both sides. But, would like to know if there's any links or sample for reference on triggers?
And, does the SQL Server has any built-in function that can fit into this situation?
July 11, 2010 at 9:48 am
I would agree triggers sound bad.
You have a process here, and you are storing data, not calculating on the fly, so likely there isn't something built into SQL Server for this.
I would modify the SSIS process perhaps to make this run smoother and recalculate the changes only. Either take a look at the way you change data, and perhaps modify/calculate inline in the SSIS package and then just send the updated values for everything in SSIS.
July 11, 2010 at 10:17 am
Why would you need to store the result of the calculation in a table (even during import)?
Wouldn't it be easier to use a view to display the profit or a join to the product table for the insert statement?
If you have a fairly normalized database design you shouldn't have any issues importing the sales data...
What do you mean by
Yes, i need to store it in both tables.
?
You've mentioned that this stuff is new for you on one side vs. dealing with million even billion rows on the other side. If that's the case I strongly recommend to get someone in helping you to describe the process and to code the solution.
Sie note: The easiest way to explain what you're looking for is by showing code snippets. So, please take a look at the first article referenced in my signature on how to provide sample data and show us what your data look like before the import, the data you want to import and how the table(s) need to look like after importing the data.
July 12, 2010 at 10:16 pm
Hi Lutz,
View table seems like not practical in my case.
In my stored proc, it used to calculate the value
1. by its own table's columns
2. update the cost column. Scenario: if there's no production cost on Product A in Jan 2010, then I need to get the production cost from previous/latest month's data(e.g. Dec 2009) and stored in production cost in Jan 2010. This is used to calculate the sales unit, and the nett profit. Hope this is not confusing you.
3. Another type of calculation is pull data from Table A , B and calculate with Table C and store it into Table C by its condition and specific key.
These 3 types of calculation is not consistence,therefore, i think it's not feasible to use the method of having view table. What do you think? Please advise.
Thank you!
July 12, 2010 at 10:24 pm
Hi Steve,
It's not feasible to run the package in both SSIS packages, it'll slow down the performance. It takes around 10-30mins to finish the package(for around 8k records). Is this the normal processing time?
How to do calculation in the SSIS package instead of using Stored Procedures?
And, how should i validate the data before insert into the database, to avoid the duplication happened?
Please advise. Thank you!
July 13, 2010 at 9:34 am
cold_blue (7/12/2010)
Hi Lutz,View table seems like not practical in my case.
In my stored proc, it used to calculate the value
1. by its own table's columns
2. update the cost column. Scenario: if there's no production cost on Product A in Jan 2010, then I need to get the production cost from previous/latest month's data(e.g. Dec 2009) and stored in production cost in Jan 2010. This is used to calculate the sales unit, and the nett profit. Hope this is not confusing you.
3. Another type of calculation is pull data from Table A , B and calculate with Table C and store it into Table C by its condition and specific key.
These 3 types of calculation is not consistence,therefore, i think it's not feasible to use the method of having view table. What do you think? Please advise.
Thank you!
I don't see anything right now that would prevent using a view (or maybe a function).
To further help you I'd need some test data to play with. therefore, I'd like to ask again for some ready to use sample data.
July 13, 2010 at 10:08 am
cold_blue (7/12/2010)
Hi Steve,It's not feasible to run the package in both SSIS packages, it'll slow down the performance. It takes around 10-30mins to finish the package(for around 8k records). Is this the normal processing time?
How to do calculation in the SSIS package instead of using Stored Procedures?
And, how should i validate the data before insert into the database, to avoid the duplication happened?
Please advise. Thank you!
There's no way to know if 10-30 minutes is good or bad. Depends on hardware, code, etc. You'd have to test. How long does it take to run the proc? Or are you saying you have a child package that calculates and a parent that loads?
SSIS packages can call stored procedures. SSIS can do validation/lookups/cleaning. It can also do the calculations before inserting the data.
It feels like you don't have enough experience here to really understand what is happening or what SQL is capable of. I really would suggest that you ask for a couple days of consulting time from someone to help you build a better system and learn somet higns.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply