Entity-Attribute-Value is evil ? how to avoid...

  • Eric M Russell - Tuesday, December 18, 2018 10:04 AM

    I don't think we should shape and force a RDMS like SQL Server to act something like a EAV or document database. We've got more data storage options inside SQL Server today than we did 10 years ago. Having both RowStore and ColumnStore living side by side in SQL Server can make a lot of sense. But really, if you're wanting to store data as documents, blobs, key values, or graph relationships, there are other database platforms better suited for the purpose.

    As always, I'll have to say "It Depends".  I used to think that it was a sin to store blobs in a database until I did some due diligence on a database where they were storing call recordings as a type of highly compressed WAV file.  I was in the process of tearing it all down but wanted to verify that the files did actually exist at the path they had included on the same row.  That's when I discovered that 10% of the files were missing and another 10% were corrupt.  And so I left the WAV files in the database and partitioned the table they're in so I wasn't backing up more than once things that would never change.  And, yeah... I verified that the 10% corrupt items weren't corrupt in the database.

    Now... if I could just get them to let me delete the damned WAV files from disk, we'd be all set.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Tuesday, December 18, 2018 11:49 AM

    Eric M Russell - Tuesday, December 18, 2018 10:04 AM

    I don't think we should shape and force a RDMS like SQL Server to act something like a EAV or document database. We've got more data storage options inside SQL Server today than we did 10 years ago. Having both RowStore and ColumnStore living side by side in SQL Server can make a lot of sense. But really, if you're wanting to store data as documents, blobs, key values, or graph relationships, there are other database platforms better suited for the purpose.

    As always, I'll have to say "It Depends".  I used to think that it was a sin to store blobs in a database until I did some due diligence on a database where they were storing call recordings as a type of highly compressed WAV file.  I was in the process of tearing it all down but wanted to verify that the files did actually exist at the path they had included on the same row.  That's when I discovered that 10% of the files were missing and another 10% were corrupt.  And so I left the WAV files in the database and partitioned the table they're in so I wasn't backing up more than once things that would never change.  And, yeah... I verified that the 10% corrupt items weren't corrupt in the database.

    Now... if I could just get them to let me delete the damned WAV files from disk, we'd be all set.

    A decade ago, I also maintained a SQL Server database that serviced a call center, and we also contained the audio inside the table. This was SQL Server 2000 and I think VARBINARY for the WAV data. If I had to build a similar application from scratch today, I'd probably use DocumentDB for the call notes and WAV attachments and SQL Server or Azure SQL for everything else. The benefit of DocumentDB is extensible schema, high concurrency, cheap storage, and elastic scalability.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Tuesday, December 18, 2018 12:13 PM

    Jeff Moden - Tuesday, December 18, 2018 11:49 AM

    Eric M Russell - Tuesday, December 18, 2018 10:04 AM

    I don't think we should shape and force a RDMS like SQL Server to act something like a EAV or document database. We've got more data storage options inside SQL Server today than we did 10 years ago. Having both RowStore and ColumnStore living side by side in SQL Server can make a lot of sense. But really, if you're wanting to store data as documents, blobs, key values, or graph relationships, there are other database platforms better suited for the purpose.

    As always, I'll have to say "It Depends".  I used to think that it was a sin to store blobs in a database until I did some due diligence on a database where they were storing call recordings as a type of highly compressed WAV file.  I was in the process of tearing it all down but wanted to verify that the files did actually exist at the path they had included on the same row.  That's when I discovered that 10% of the files were missing and another 10% were corrupt.  And so I left the WAV files in the database and partitioned the table they're in so I wasn't backing up more than once things that would never change.  And, yeah... I verified that the 10% corrupt items weren't corrupt in the database.

    Now... if I could just get them to let me delete the damned WAV files from disk, we'd be all set.

    A decade ago, I also maintained a SQL Server database that serviced a call center, and we also contained the audio inside the table. This was SQL Server 2000 and I think VARBINARY for the WAV data. If I had to build a similar application from scratch today, I'd probably use DocumentDB for the call notes and WAV attachments and SQL Server or Azure SQL for everything else. The benefit of DocumentDB is extensible schema, high concurrency, cheap storage, and elastic scalability.

    We're not having any concurrency problems, we store the data on less expensive disks in different file groups, we don't need an "extensible" schema, and when we need to scale, we just bolt on another drive and allocate it.  The call notes are handled in a similar fashion.  Not a challenge... interested in your take on it... how does DocumentDB benefit such an un-needy system?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 3 posts - 46 through 47 (of 47 total)

You must be logged in to reply to this topic. Login to reply