Refactoring an Entity Attribute Value schema

  • Apologies in advance, as I know this has been discussed a million times already but I think our requirement is slightly different from the norm in that we have a well-defined, static set of attributes up front and our values are rows in a table. But feel free to ignore if you've had enough of this subject already!

    Our scenario is Notes.

    We have a fixed number of entities (each with their own table) such as Person, Service, Organisation, Contract, Invoice, etc.

    We also have a table for Notes - this has quite a few columns in order to support existing functionality.

    Each instance of an entity (e.g. a Person or a Service etc.) can have any number of Notes.

    Mercifully, each specific Note can only relate to one instance of an entity - a given Note can only be referenced by a single Person, Service, etc.

    Currently, we have implemented this via two columns on the Note table:

    EntityID - the PK value of a row in the Person, Service, Organisation, etc. table.

    ContentTypeID - 1 for a Person, 2 for a Service, 3 for an Organisation, etc.

    The main problem with this approach is the inability to establish a FK from EntityID to the Person, Service, Organisation, etc. tables. There might also be a performance implication too.

    The problem with an alternative is that we have over 30 tables in the database that need to have Notes. My understanding is that there are two options for refactoring:

    Add 30+ nullable columns to the Note table; PersonID, ServiceID, OrganisationID, etc. These could be sparse and we could add indexes for only the non-null values, plus FKs to the Person, Service etc. tables. But that's a lot of columns!

    Add an "Entity Note" table for all the entities needing a Note. e.g. Person_Note, Service_Note, Organisation_Note, etc. These would simply contain pairs of https://omegle.onl/ PersonID and NoteID values (or ServiceID and NoteID etc.) But that's a lot of tables!

    I'm assuming that the recommendation to "use XML" wouldn't work here because the "values" part of our existing EAV model isn't simply a value, such as 'red', 20/01/1935 or 27.55 for example - it's an entire row in a table.

    So I'm starting to think we should just stick with the existing EntityID / ContentTypeID approach and hold our noses whenever we come across it.

    But if you know of a better approach, I'd be very interested to hear it. Thanks.

  • The right answer would indeed be to have <Entity>Note tables. I wouldn't make those tables just an association table to Note table w/ a NoteID -- I would actually put the Note varchar/nvarchar column in those tables. Other than easier migration, what is the value of a standalone generic Notes table? Is there a common use case where users need to search for some string in Notes across all entities?

    If necessary to insulate apps from the change, at least temporarily, you could create a unioned Note view that simulates & replaces the existing Note table. If possible,  would recommend modifying procedures/apps to use the real tables.

    A lot of tables isn't really that much of a downside -- it should be relatively easy to generate the DDL and insert scripts for all of them.

  • If you want referential integrity between all those tables and 1 notes table, it will need to be done by a trigger instead of DRI through FKs.

    Much like temporal tables, though, I'd be seriously tempted to make one notes table per entity table so that they can easily be customized to fit the particular needs of each entity.  That would also easily support DRI.

    You could even have a "distribution view" named the same as the original notes table (which would go away) to keep from driving your Developers nuts with this after-the-fact design change.  That view would have an INSTEAD OF trigger on it to do the distribution to the correct table(s).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The problem starts with the approach of thinking about tables, foreign keys and other technical details - it's too low level. Think about the business use case for those notes. What do they use the notes for? Are all columns applicable to all notes? Would the business like to associate more than one note per person? Would there be a need for temporal notes on some entities, with a Valid Timeline to determine validity?

    The problem is not described on the business level, but as a technical exercise. If you have the business requirements, it's easier to solve though.

    That said, the least you could do is refactor the cases where you don't need all columns on all entity types, to separate subtypes. If you find that some subtypes only reference one table, well, that solves that problem and reduces the main issue.

     

  • This is a basic design mistake that many folks make.  OTLT (One True Lookup Table)  is typically a very poor construct and far too often causes one big set of blocking and locking in a busy system.

    I certainly would recommend a set of separate notes tables that correspond to each entity.   But, it depends.

    Here are a few business rules that you may need to answer:

    Is there a need for notes to be viewed or reported on by different groups of users, and restricted from other groups and users?  As an example, a manager can see all notes, and users can only see some notes?  Also, can customers see these notes?

    Adding a  created and modified column would create somewhat of an audit trail.  Do you need to create a full audit trail for the notes that tracks every modification?

    Is there a "related" notes requirement?  This one was a strange requirement that I ran into a while back.  How it worked was Team A was working 0n one part of the "order", Team B was working on another part of the order.  The notes entered by one team was not accessible to the other, and there needed to be a history of the notes that each team was entering.

     

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • That is a classic supertype/subtyte setup.  (Similar to sys.objects being a supertype in SQL Server to subtypes of table, procedure, etc.)

    But you're missing the "supertype" table.   That is, you should have a table above {Person | Service | Organisation | Contract | Invoice | ...}  -- let's call it "bigdog" for now -- that contains the PK and the attributes (columns) common to all other types.  The individual "littledog" tables -- Person, etc. -- have the PK and only columns specific to them. (Like sys.tables vs sys.objects.)

    The FK will then always point back to the supertype table ("bigdog") regardless of type.  Since all the parent keys are in one table, type is irrelevant to that FK.

    The only real rework I see is adding INSTEAD OF triggers to the subtype/"littledog" tables to add rows to the "bigdog" table and the "littledog" table.

    Or, technically a better model, create a FK on the "littledog" table pointing to the "bigdog" table.  Then always first add a row to the supertype table before the subtype table.  You would want to ultimately do that anyway.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply