Best practices opinion on: Table that needs to be re-formatted via a view or copied to a new location via a trigger

  • I have a table that is in the best format for my app but needs (or at least should) to be converted for use in reporting. This is a pretty static table, with MAYBE 5 inserts or updates a month. It holds the rules for how to pay commissions to people.

    I can manipulate the data in a view to get a more ideal output (pulling the CSV values out and unpivoting them to rows), which obviously will need to be computed each time the view is read.
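
    As a rough idea, here is a minimal sketch of that view, assuming a hypothetical source table dbo.CommissionRules with an int key RuleID and a CSV column SalesNumbers (the real table and column names will differ). SQL Server 2008 has no STRING_SPLIT, so the CSV is shredded with the XML trick:

        -- Sketch only: split the comma-separated SalesNumbers column into one row
        -- per value. Assumes the values are XML-safe (no unescaped &, <, >).
        CREATE VIEW dbo.vCommissionRules_Unpivoted
        AS
        SELECT  r.RuleID,
                LTRIM(RTRIM(s.n.value('.', 'varchar(50)'))) AS SalesNumber
        FROM    dbo.CommissionRules AS r
        CROSS APPLY ( SELECT CAST('<x>' + REPLACE(r.SalesNumbers, ',', '</x><x>')
                                  + '</x>' AS XML) AS doc ) AS x
        CROSS APPLY x.doc.nodes('/x') AS s(n);
        GO

    With only ~30 source rows, re-shredding the CSV on every read of the view should be cheap; the cost only starts to matter when the result is joined to large transaction sets.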

    Alternatively, I can put a trigger on the table for insert/update/delete that keeps a separate table of the manipulated data up to date.
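
    A comparable sketch of the trigger route, reusing the same hypothetical names plus an assumed reporting table dbo.CommissionRules_Expanded. Because the source is ~30 rows and changes only a handful of times a month, the simplest safe approach is to rebuild the whole reporting table on any change rather than merging individual rows:

        -- Sketch only: keep the reporting table in sync with the source table
        -- by rebuilding it whenever anything is inserted, updated, or deleted.
        CREATE TRIGGER dbo.trg_CommissionRules_Sync
        ON dbo.CommissionRules
        AFTER INSERT, UPDATE, DELETE
        AS
        BEGIN
            SET NOCOUNT ON;

            DELETE FROM dbo.CommissionRules_Expanded;

            INSERT INTO dbo.CommissionRules_Expanded (RuleID, SalesNumber)
            SELECT RuleID, SalesNumber
            FROM   dbo.vCommissionRules_Unpivoted;  -- the view sketched above
        END;
        GO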

    The source table is about 30 rows now, but will be several thousand after the 'unpivot'.

    I have yet to actually do this so I just wanted some advice.

    The sole purpose of this table/view (the results table/view) is to be joined to production sales info to produce commissions, based on a very complicated scheme of product category, location, profit margin (scaled commission on margins), and so on, 14 categories in all.

    I have only just begun learning how to optimize my SQL code.

    FYI, MSSQL 2008

    Thanks!

  • Can you provide the DDL (CREATE TABLE statements) for the table(s), some sample data that is representative of the problem you are trying to solve, and the expected results?

    All of this will help us help you find a solution to your problem.

  • Thanks for the response, Lynn. I did some math, and unfortunately normalizing this data is not an option.

    This table functions as a filter on location, branch, route, delivery method, product category, brand, item number, customer number, other-sales-staff exclusions, and a few more. There is a column called salesnumbers that can contain 100+ sales numbers, a product category column that can have a hundred entries, and each of the other columns can also hold dozens of entries.

    Basically, what I am saying is that normalizing this data takes a ~30 row table and produces a hundred million rows (!!!). I suspect that even my current performance problem of "csvcolumn like '%' + salesnumber + '%'" is many times faster than hitting a 100,000,000+ row table, regardless of proper indexing. What also hurts is that I have about 140,000 rows of sales transactions to compare this table against.
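
    For what it's worth, one small variant of that LIKE predicate that can help correctness is padding both sides with the delimiter, so that sales number 12 cannot match a CSV entry of 123 or 412. Illustration only; the table and column names here are assumptions, and it assumes no spaces around the commas:

        -- Illustration only: delimiter-padded version of the LIKE match.
        SELECT  t.TransactionID, r.RuleID
        FROM    dbo.SalesTransactions AS t
        INNER JOIN dbo.CommissionRules AS r
                ON ',' + r.SalesNumbers + ',' LIKE
                   '%,' + CAST(t.SalesNumber AS varchar(20)) + ',%';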

    My current effort has been to treat each possible entry for each column as a table and inner join on csvcolumn like '%' + salesnumber + '%' or on charindex(salesnumber, csvcolumn) > 0 (I tried both ways, with similar performance), producing a row for each possibility. The very first join produces 30,000 rows, the next one takes that to 2M rows, the next to 6M rows, and by the time I have half the necessary joins, my 8-core 3.4 GHz server with 32 GB of RAM and a 10-disk 15,000 RPM RAID 10 array cannot complete the query in 10 minutes.

    I know this is probably clear as mud without actual data, but my math is sound and I realize that this ~30 row table is superior to the many-million-row alternative. I can hit my current table with the 'like' or 'charindex' method in about 1 minute. I think I will just have my current SQL write the results to a table, have users hit that results table, and schedule the script to run in my daily batch process.
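
    That last idea could look something like the sketch below: a scheduled step (for example a SQL Agent job in the daily batch) that empties and repopulates a results table that users and reports query directly. The dbo.CommissionResults table and the matching query are placeholders for the real script:

        -- Sketch only: refresh the results table once per day instead of
        -- recomputing the matches on every report run.
        BEGIN TRANSACTION;

        DELETE FROM dbo.CommissionResults;

        INSERT INTO dbo.CommissionResults (TransactionID, RuleID)
        SELECT  t.TransactionID, r.RuleID
        FROM    dbo.SalesTransactions AS t
        INNER JOIN dbo.CommissionRules AS r
                ON charindex(CAST(t.SalesNumber AS varchar(20)), r.SalesNumbers) > 0;  -- existing matching logic goes here

        COMMIT TRANSACTION;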

    I may also look at normalizing the easy columns that only have a few possible entries and leaving the 3 really wide columns as is. I would still have to deal with some 'likes', though.

  • Is it possible to post your current script? That way we can get a better idea of what you are trying to achieve and what might be done to improve the performance.
