Business Constraints

  • I have to design an accounting system and I am stucked at the point where i need to design some complex business rules.

    To make it simplier, take the following example.

    I have two classical tables Invoice and InvoiceLines

    CREATE TABLE Invoice (InvoiceID INT, Total decimal)

    CREATE TABLE InvoiceLine(InvoiceLineID INT, InvoiceID INT, Amount decimal)

    I have a procedure, like, InsertInvoiceLine(...)

    How can I make sure that after the InsertInvoiceLine is executed the Invoice.Total is always equal to SUM(InsertInvoiceLine.Amount) ?

    I am not sure that my questions is 100% correct, but currently I have problems with corrupted data - like in production, sometimes the invoice total is different then the sum of all the invoice lines. How can I stop this happening? How can I throw an error if this business rule is not respected? Also, consider that I have hundred of thousands of invoices, so not all the constraints are acceptable.

    What would you recommend?

    Thanks,

    Stelian.

  • First of all, I would not store the total of the Invoice Lines in the Invoice Header record. This can be computed as needed when the header record is read.

    Second, it appears (without seeing the code of the stored procedure) that you are inserting one row at a time into the Invoice Line Item table. This procedure should be able to update the Header record as each Line Item is entered.

    You mention corruption of your data, can you explain how you think this is occuring?

  • stelianx (3/10/2010)


    ...in production, sometimes the invoice total is different then the sum of all the invoice lines. How can I stop this happening?

    Get SQL Server to do the hard work of maintaining the totals for you, automatically. Guaranteed never to be wrong.

    Demonstration script:

    -- For demonstration purposes

    USE tempdb;

    GO

    -- Required connection settings

    SET ANSI_NULLS,

    ANSI_PADDING,

    ANSI_WARNINGS,

    ARITHABORT,

    CONCAT_NULL_YIELDS_NULL,

    QUOTED_IDENTIFIER

    ON;

    SET NUMERIC_ROUNDABORT

    OFF;

    GO

    -- Drop demo objects if they exist

    IF OBJECT_ID(N'dbo.InvoiceTotal', N'V')

    IS NOT NULL

    DROP VIEW dbo.InvoiceTotal;

    IF OBJECT_ID(N'dbo.InvoiceLine', N'U')

    IS NOT NULL

    DROP TABLE dbo.InvoiceLine;

    IF OBJECT_ID(N'dbo.Invoice', N'U')

    IS NOT NULL

    DROP TABLE dbo.Invoice;

    GO

    -- Header table

    CREATE TABLE

    dbo.Invoice

    (

    invoice_id INTEGER NOT NULL PRIMARY KEY,

    other_columns CHAR(1024) NOT NULL DEFAULT (SPACE(1024)),

    );

    GO

    -- Detail table

    CREATE TABLE

    dbo.InvoiceLine

    (

    invoice_id INTEGER NOT NULL

    CONSTRAINT [FK dbo.InvoiceLine dbo.Invoice invoice_id]

    FOREIGN KEY

    REFERENCES dbo.Invoice,

    line_id INTEGER NOT NULL,

    amount MONEY NOT NULL,

    CONSTRAINT [PK dbo.InvoiceLine invoice_id, line_id]

    PRIMARY KEY CLUSTERED (invoice_id, line_id)

    WITH (FILLFACTOR = 100)

    );

    GO

    -- Add 10,000 invoices

    INSERT dbo.Invoice

    (invoice_id)

    SELECT TOP (10000)

    ROW_NUMBER() OVER (ORDER BY (SELECT 0))

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3;

    GO

    -- Add 8 lines per invoice, total 80,000 lines

    INSERT dbo.InvoiceLine

    (invoice_id, line_id, amount)

    SELECT TOP (80000)

    (ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1) / 8 + 1,

    (ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1) % 8 + 1,

    CONVERT(MONEY, RAND(CHECKSUM(NEWID())) * 10000)

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3;

    GO

    -- Create a view of all invoice totals

    CREATE VIEW

    dbo.InvoiceTotal

    WITH SCHEMABINDING

    AS

    SELECT INV.invoice_id,

    amount = SUM(LINE.amount),

    line_count = COUNT_BIG(*)

    FROM dbo.Invoice INV

    JOIN dbo.InvoiceLine LINE

    ON LINE.invoice_id = INV.invoice_id

    GROUP BY

    INV.invoice_id;

    GO

    -- Materialize the view

    CREATE UNIQUE CLUSTERED INDEX [CUQ dbo.InvoiceTotal invoice_id]

    ON dbo.InvoiceTotal (invoice_id);

    GO

    -- Demonstrate using the view

    SELECT invoice_id, amount, line_count

    FROM dbo.InvoiceTotal WITH (NOEXPAND)

    ORDER BY

    invoice_id;

    GO

    -- Object drops

    IF OBJECT_ID(N'dbo.InvoiceTotal', N'V')

    IS NOT NULL

    DROP VIEW dbo.InvoiceTotal;

    IF OBJECT_ID(N'dbo.InvoiceLine', N'U')

    IS NOT NULL

    DROP TABLE dbo.InvoiceLine;

    IF OBJECT_ID(N'dbo.Invoice', N'U')

    IS NOT NULL

    DROP TABLE dbo.Invoice;

    GO

    Paul

  • Lynn Pettis


    I would not store the total of the Invoice Lines in the Invoice Header record. This can be computed as needed when the header record is read.

    This is a common de-normalization. Computing invoice totals on-the-fly is unlikely to scale well.

    Lynn Pettis


    It appears that you are inserting one row at a time into the Invoice Line Item table. This procedure should be able to update the Header record as each Line Item is entered.

    This is also unlikely to scale, and introduces the potential for deadlocks. Worse, the logic must be repeated wherever a line item might be modified.

    Lynn Pettis


    You mention corruption of your data, can you explain how you think this is occuring?

    My guess is that the current design relies either on triggers or code similar to that suggested above.

    Indexed views are the natural solution, as shown in my demonstration code.

    Paul

  • Thanks guys for your help.

    I came with the invoice example just for reference, to simplify the scenario, but probably I should give you the real example.

    As my first remark - yes I have the the Invoice / InvoiceTotal tables - it is designed by updating totals with triggers and it's working quite well.

    My real scenario is however here:

    - I have to keep track of 2 (or multiple) accounting systems, meaning that the same invoice should be represented differently for different systems. As simple example, an invoice for 3 days of rental of cost of 100 of a car will look like:

    System1:

    ----

    InvoiceLine: 100

    System2:

    -----

    InvoiceLine, day1: 33.33

    InvoiceLine, day3: 33.33

    InvoiceLine, day3: 33.34

    So, first system expects the whole amount as one line, while the other system is splitting the amount per day of rental.

    In my database, whenever a new InvoiceLine is inserted / updated / deleted, I should mirror this operation in the 2nd accounting system and the golden rule is that both accounting systems should always calculate the same amounts

    However, some programming errors can occur, like rounding error and the 2nd system can calculate 99,99 instead of 100.00. And then the customer complains. The rules to respect are very complex, so I cannot guarantee that the accounting will calculate correctly in 100% of cases - however, how can I make sure that an error is thrown if the 2 accounting systems do not calculate the same amounts? I would prefer users to see errors, rather then loosing money...

  • stelianx (3/13/2010)


    it is designed by updating totals with triggers and it's working quite well.

    I used to use triggers for this before there was an alternative. Indexed views are superior for many reasons, including performance. Triggers are vulnerable to being accidentally disabled, to logic errors (particularly for multi-row operations), and concurrency issues. I would encourage you to use indexed views where possible for future development work.

    stelianx (3/13/2010)


    My real scenario is however here: I have to keep track of 2 (or multiple) accounting systems, meaning that the same invoice should be represented differently for different systems.

    The problem is now very different from the original one posted, and I am not going to write another test rig and script. If you provide good-quality set-up code, I might post a solution. Please include all relevant detail this time, including whether the tables concerned exist in the same database or not.

    Paul

  • If you need to figure out where you have different results in your two tables then just use the great sample code Paul provided, join it to your invoice header table and select the rows with a difference <> 0.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Paul White (3/13/2010)


    Lynn Pettis


    I would not store the total of the Invoice Lines in the Invoice Header record. This can be computed as needed when the header record is read.

    This is a common de-normalization. Computing invoice totals on-the-fly is unlikely to scale well.

    Lynn Pettis


    It appears that you are inserting one row at a time into the Invoice Line Item table. This procedure should be able to update the Header record as each Line Item is entered.

    This is also unlikely to scale, and introduces the potential for deadlocks. Worse, the logic must be repeated wherever a line item might be modified.

    Lynn Pettis


    You mention corruption of your data, can you explain how you think this is occuring?

    My guess is that the current design relies either on triggers or code similar to that suggested above.

    Indexed views are the natural solution, as shown in my demonstration code.

    Paul

    First, yes I understand it is a common means of denormalization and I use that when appropriate. However, in the simple scheme of things, it really comes down to "It Depends". How is the total being used in and by the system.

    Also, I'd have to run some tests, but an in-line TVF and a cross apply may be scalable enough for this particular requirement and not need to use an indexed view with it attendant use of additional disk space (I know, disk is cheap, but it also depends on what the business can afford).

    Of course, all of this is mute as this had nothing to do with the real problem.

    I had to write code years ago that solved a similar problem, unfortunately it was in COBOL, not SQL. I had break down tax amounts to the individual taxing authorities and ran into similar problems regarding rounding. The difference, I couldn't kick out errors, I had to incorporate logic to ensure that the break down matched the overall amount.

    This one may take a little thinking to help solve. It will help if the OP posts a test rig for us to work with along with the expected results.

  • Lynn Pettis (3/13/2010)


    Also, I'd have to run some tests, but an in-line TVF and a cross apply may be scalable enough for this particular requirement and not need to use an indexed view with it attendant use of additional disk space (I know, disk is cheap, but it also depends on what the business can afford).

    An in-line user-defined table-valued function and APPLY is exactly equivalent to in-lining the code in the stored procedure you mentioned. The part that does not scale well is the cost of firing triggers and running complex code to maintain the totals. It is also quite easy to produce deadlocks using this approach.

    As anyone who has ever written one of these 'manual' de-normalization strategies knows, it is just hard work to ensure that everything stays consistent, while maintaining good concurrency, and avoiding deadlocks.

    Indexed views absolutely guarantee consistency, with very little effort.

    The disk space used by the indexed view comes to 20 bytes per row.

    Storing the sum and count in the header table would require 16 bytes per row.

    The incremental cost is therefore 4 bytes per row. 😛

    No-brainer.

    Paul

  • First, read what I wrote, I would have to do some testing to know if the in-line TVF would scale well or not.

    Second, it really depends on how someone creates the indexed view. If they create the view properly, primary key and computed value only, then yes. Problem I've seen is that most people create the indexed view over all the columns of the base table and that is what I was actually talking about but didn't articlate properly.

    Now, back to the Air Force v Army hockey game. We are currently tied 1 - 1 in the 2nd period.

  • Lynn Pettis


    First, read what I wrote, I would have to do some testing to know if the in-line TVF would scale well or not.

    :laugh: I did read what you wrote!

    You might have to do some testing, but I have direct experience all the approaches mentioned so far in many large-scale environments.

    Your original recommendation - to calculate the total from the base data every time it is required - quite obviously will not scale.

    Lynn Pettis


    Second, it really depends on how someone creates the indexed view. If they create the view properly, primary key and computed value only, then yes. Problem I've seen is that most people create the indexed view over all the columns of the base table and that is what I was actually talking about but didn't articlate properly.

    What would be the point of that? That just creates a full copy of the table! A bizarre suggestion.

    I have already shown how to do it correctly in the code provided, and explained why it is correct in detail.

    If you wish to continue trying to salvage an untenable position, post some code to back up your statements. Otherwise, you are just hand-waving.

    Paul

  • What is untenable about saying I'd need to do testing? Seems to me that's says I'm not sure and would do more investigating before commiting one way or the other.

    Also, I am also speaking from experience. Most people when first creating indexed views tend to go to the extreme and build indexed views over the entire base table or tables instead of just the coluns that are needed. It is through discussions like this one they learn better methods.

    Once I have done some testing I'll post my findings so that everyone can see the differences. This how we all learn, including myself.

  • Lynn Pettis


    I'm not sure and would do more investigating before commiting one way or the other.

    I find it hard to credit that any reasonably competent person could believe that summing the detail records every time will scale well.

    Does selecting a pre-computed aggregate not seem obviously better to you?

    Nevertheless, should you fail to find the time to produce a test rig, I will happily oblige 🙂

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply