Scalability Question

  • Hi, All

    That was actually interview question I got and I'm dieing to learn what is Answer B could be. I really appreciate any links and you feedback.

    Q.You design db schema to serve existing 100 clients, but you know that 100 more clients will come soon and each of them will require more columns

    for their own biz, how you will design you schema to preserve this scalability and extensibility to add new columns in future.

    --------------------------------

    A. You can add new columns to db tables upon each request, preserving all constraints, etc... (my answer)

    B. ???

    Thanks

    Mario

  • Possible answer #1

    You gotta be kidding me. There are actually clients where we would need to add columns that would be specific to a client? The business rules, and how they have been implemented, need to be revisited.

    That answer may not get you the job, however!

    Possible answer #2

    Can this table be normalized? From what you are describing, there may be a need to re-visit the existing design.

    That answer might. Although you are telling them that their baby is ugly!

    Possible answer #3

    Entity Attribute Value. Although this model may not end up being very scalable.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • I think I'd first verify the sanity of the question, then combine the shock of Michael's #1 and the logic of #2:

    You've got to be kidding me. The design needs to be updated (i.e.: fixed) to allow additional clients to be added without having to create new columns. Normalize the structure so that adding a client involved adding a row to a table instead of adding columns.

  • Thanks SSCarpal,

    Wow!!!, adding rows instead of columns, can you expand this topic a bit, if new client need to add extra info for given transaction, like Sex_ID for customer who place order, how we can put this into the row?

    Thanks

    Mario

  • Gosh, be really careful folks. I'm no purist when it comes to EAVs but there are some serious problems with them to consider. First, putting such an EAV together isn't "normalization". EAVs actually violate the First Normal Form and causes all the same problems that such denormalization would present.

    I'll agree that it's seemingly very convenient to simply add rows to add "logical columns" but it's not very efficient when it comes to lookup criteria and it usually also means "duplication" where one person uses "Cust" for a column name while another uses "Customer", etc, etc. As much as I loath XML, even XML can be indexed so that it almost looks normalized. EAVs can't be. And neither will allow any form of easy/efficient Declared Referential Integrity.

    My recommendation would be to consider something besides either an EAV or XML for this. Even just adding columns in a denormalized fashion could be better. Another possible solution is to simply add "sister" tables to "add columns per client".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hence my comment that EAV's mat not be very scalable.

    Right now, there is a set of EAV's in our current system. The original intent was to solve the problem of user defined data (not my design, BTW).

    They worked great when 10 users were logged into the system, and the DB was small. Now they are the most problematic set of tables in the database.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Agreed. I should have stated that I was emphasizing the problem of scalability that you brought up.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mario17 (11/14/2014)


    Thanks SSCarpal,

    Wow!!!, adding rows instead of columns, can you expand this topic a bit, if new client need to add extra info for given transaction, like Sex_ID for customer who place order, how we can put this into the row?

    Thanks

    Mario

    There is not enough information here to provide some real answers. Keep that in mind!

    Think about this from the logical, not the physical, perspective. Customers can have many attributes. If the initial design was complete adding new attributes to a customer is something that wouldn't need to occur often.

    But it does happen.

    Use addresses as an easy example. There may be billing addresses, shipping addresses and so forth. And there may be multiple addresses of each type.

    You would not keep adding new columns to the customer table. That would get ugly very fast. A possibly better design would be to create a customer address table.

    In that case, when something changes, you are adding rows to the address table, and possibly new values to the address type look-up table.

    To add to Jeff's concern about an EAV design, this works well for certain situations on a small scale. It can rapidly turn into a performance and data integrity nightmare.

    In my current system, this was originally implemented to allow customers to store custom data. One used it to store department, one used it for supervisor. It was only originally attached to the orders and purchase order table. Now, it's attached to 25 different tables and it contains millions of rows. It can't be tuned. There is no way, without writing a LOT of code, to validate the data being entered.

    Did you get the position?

    I think you attempted to provide a black and white answer to a very gray question. If I was the interviewer, I would have expected a lot more questions as the answer to my question!

    I think they were trying to see if you could analyze the problem properly, and provide a set of possible solutions.

    In a good interview, you will find a set of yes and no questions, but you will also get a set of "it depends" questions. These are where you get the opportunity to display you strengths. Or, where you get the opportunity to fall flat on your face!

    To me, the answers to the yes/no questions are not as important as the answers to the gray questions, especially for an entry to mid level position.

    Good luck!

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply