SQL Server Central
Featured Contents
Question of the Day
The Voice of the DBA
 

Relationally Divided over EAV

This editorial was originally published on Aug 26, 2015. It is being re-run as Steve is out of town.

The EAV (Entity-Attribute-Value) data model gets bad press in the world of the relational database, and with some justification. I'd read a lot of the evidence against it and even published on Simple-talk a tale of EAV gone wrong that is not for the faint-hearted. I regarded EAV designs largely as an "anti-pattern" arising from misguided attempts to transpose the world of object-orientation and loosely-typed languages directly into the relational model.

If EAV design has an "avoid at all costs" reputation among many, the fact remains that sometimes you just can't know in advance all the required attributes, and in such cases there are few alternatives. Peter Larsson, in his EAV session at the recent SQL Rally event in Amsterdam, described one such case: a database application to return medical insurance documentation, where there was simply no way to predict, over time, exactly what sort of data might need to be stored.

When Peter arrived on the project, the database contained 20 million rows, and one of the most important document search algorithms took 1 minute to return its data. When the table reached 1 billion rows, which would happen quite quickly at current growth rates, they estimated, with a high degree of confidence, that the same document search would take 134 days.

Peter then explained – and proved – that after fixing some flaws in their database design, and in their inefficient search algorithm, the same document search query, on a billion rows, returned its data in milliseconds. I nearly fell off my chair. The remainder of the session was going to explain how he did it, and it's safe to say he had his audience's attention.

I wondered briefly if by "tweaking the database design" he really meant "replace it with a proper relational model", but no. He had applied some sensible normalization but the model was a hybrid, with an EAV table containing unique attribute-value pairs alongside the normalized tables. In the EAV table, each row comprises three columns: an entity, an attribute (or characteristic) of that entity, and a value for that attribute.
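To make the three-column shape concrete, here is a minimal sketch of a generic EAV table, built with Python's standard sqlite3 module. It is not Peter's schema: the table name document_eav, the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
eav_db = sqlite3.connect(":memory:")
eav_db.execute(
    """
    CREATE TABLE document_eav (
        entity    TEXT NOT NULL,  -- which thing the row describes
        attribute TEXT NOT NULL,  -- which characteristic of that thing
        value     TEXT NOT NULL,  -- the value of that characteristic
        PRIMARY KEY (entity, attribute, value)
    )
    """
)

# Hypothetical sample data: each document carries a different set of attributes,
# which is the situation that pushes designs toward EAV in the first place.
eav_db.executemany(
    "INSERT INTO document_eav (entity, attribute, value) VALUES (?, ?, ?)",
    [
        ("doc-1", "policy_type", "dental"),
        ("doc-1", "country", "SE"),
        ("doc-2", "policy_type", "vision"),
        ("doc-2", "language", "en"),
        ("doc-2", "country", "NL"),
    ],
)


def attributes_of(entity):
    """Return one entity's attribute/value pairs as a dictionary."""
    cursor = eav_db.execute(
        "SELECT attribute, value FROM document_eav WHERE entity = ?",
        (entity,),
    )
    return dict(cursor.fetchall())


doc_two = attributes_of("doc-2")
```

Adding a new kind of attribute here means inserting a row rather than altering the table, which is the flexibility an EAV design buys and the reason querying it efficiently takes more care.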

He explained that the key to efficient querying of such a model was a technique called relational division. Let's say you live in a dorm with a number of other students, who each own a random number of socks of various colors. Your socks are red, green and blue and you want to know which other students have a matching set of socks. In relational division, you query the table (the dorm) for each entity (student) whose attributes (socks) include all of your values (colors). The dorm is the dividend and your socks are the divisor. If student A owns yellow, black and green socks, the quotient is zero (not fulfilled), because student A has no red or blue socks. However, if student B has yellow, red, black, blue, purple and green socks then the quotient is 1, since student B's socks include your whole set of red, green and blue.
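The sock example can be sketched in a few lines. This uses one common way of writing relational division, a GROUP BY with a HAVING count, run through Python's standard sqlite3 module. It is an assumption for illustration, not necessarily the algorithm Peter presented, and the socks table and the students_with_all function are hypothetical names.

```python
import sqlite3

# The dorm (dividend): one row per student per sock color they own.
dorm = sqlite3.connect(":memory:")
dorm.execute(
    """
    CREATE TABLE socks (
        student TEXT NOT NULL,
        color   TEXT NOT NULL,
        PRIMARY KEY (student, color)
    )
    """
)
dorm.executemany(
    "INSERT INTO socks (student, color) VALUES (?, ?)",
    [
        ("A", "yellow"), ("A", "black"), ("A", "green"),
        ("B", "yellow"), ("B", "red"), ("B", "black"),
        ("B", "blue"), ("B", "purple"), ("B", "green"),
    ],
)


def students_with_all(colors):
    """Return the students who own every color in `colors` (the divisor)."""
    wanted = sorted(set(colors))
    if not wanted:
        raise ValueError("the divisor needs at least one color")
    placeholders = ", ".join("?" for _ in wanted)
    # Keep only rows matching a wanted color, then keep the students
    # for whom every wanted color was found.
    query = f"""
        SELECT student
        FROM socks
        WHERE color IN ({placeholders})
        GROUP BY student
        HAVING COUNT(DISTINCT color) = ?
        ORDER BY student
    """
    return [row[0] for row in dorm.execute(query, (*wanted, len(wanted)))]


matches = students_with_all(["red", "green", "blue"])
```

Student A matches only green, so the count falls short of three and A is excluded; student B matches all three and is returned. Extra colors such as B's yellow or purple do not matter, which makes this division "with remainder".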

With the right algorithm, Peter proved that relational division can be highly efficient. Coupled with a clever approach to indexing, to facilitate "ordered index scans", and sensible statistics management, he was able to achieve formidable results – at least, I think we can call 134 days to a few milliseconds, for a billion rows, formidable.

Clearly EAV models are "difficult". They can, and frequently do, cause poor performance and maintenance problems as the database grows. However, I also appreciated the lesson in why you should rarely form closed opinions on 'good' or 'bad' practices in database and query design. Sometimes there is no alternative to an EAV design, and the techniques do exist to make them work in a relational world.

Cheers,

Tony.

Further reading: Check out Peter's The E, the A and the V PDF

Tony Davis

Join the debate, and respond to today's editorial on the forums

 
 Featured Contents
SQLServerCentral Article

Choosing a File Format - Data Engineering with Fabric

John Miner from SQLServerCentral

In this next article, learn about the different file formats and which work well inside your Microsoft Fabric Lakehouse.

External Article

The full agenda for the Redgate Summit New York is live!

Additional Articles from Redgate

The full agenda for the Redgate Summit New York is live! Join us to stay ahead of the curve and gain valuable insights from industry experts like Bob Ward (Microsoft), Mri Pandit (Navy Federal Credit Union), Erik Darling (Darling Data), Steve Jones (Redgate Software), and many more.

External Article

Auditing SQL Server – Part 2 – Hardware Audit

Additional Articles from SimpleTalk

This is the second part of my series on auditing SQL Server. In the first part, I discussed basic server discovery and documentation. It covered some items to check at the hardware level and configuration items, but this section goes into hardware auditing in more detail.

Blog Post

From the SQL Server Central Blogs - Updating dbatools–Fixing the Certificate Failure

Steve Jones - SSC Editor from The Voice of the DBA

I was trying to update my dbatools install to test something and got this error. I fixed it with a little help. The Fix The short answer from Chrissy...

Blog Post

From the SQL Server Central Blogs - Classifications and sensitivity labels in Microsoft Purview

James Serra from James Serra's Blog

I see a lot of confusion on how classifications and sensitivity labels work in Microsoft Purview. This blog will help to clear that up, but I first must address...

The Unicorn Project

Site Owners from SQLServerCentral

In The Unicorn Project, we follow Maxine, a senior lead developer and architect, as she is exiled to the Phoenix Project, to the horror of her friends and colleagues, as punishment for contributing to a payroll outage. She tries to survive in what feels like a heartless and uncaring bureaucracy and to work within a system where no one can get anything done without endless committees, paperwork, and approvals.

 

 Question of the Day

Today's question (by Steve Jones - SSC Editor):

 

Versions Supporting Developing User CLR Functions

In which versions of SQL Server can I develop a user CLR function?


 

 

 Yesterday's Question of the Day (by Steve Jones - SSC Editor)

DBCC TRACESTATUS

I have run this code in my SQL Server 2022 session:

DBCC TRACEON (2528) 
DBCC TRACEON (3205, -1)
GO
-- run some queries here
GO
DBCC TRACEOFF (2528)

These are the only traceflags changed on the instance.

Now I run this:

DBCC TRACESTATUS

What is returned?

Answer: 1 row for TF3205 enabled globally

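Explanation: This returns 1 row. Trace flag 2528 was turned on for the session and then turned off, and a trace flag that has been turned off is not returned by DBCC TRACESTATUS. That leaves only trace flag 3205, which was enabled globally with the -1 argument. Ref: DBCC TRACESTATUS - https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-tracestatus-transact-sql?view=sql-server-ver16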

Discuss this question and answer on the forums

 

 

 

Database Pros Who Need Your Help

Here are a few of the new posts today on the forums. To see more, visit the forums.


SQL Server 2017 - Administration
Dont want to restart Tempdb not able to shrink datafiles in ag primary - HI All, Dont want to restart Tempdb not able to shrink datafiles in always on server primary is there any alternative way we can reduce the tempdb files and shrink the space from them . can we run this below command to clear the space from tempdb data files without restart and effect always on […]
SQL Server 2016 - Development and T-SQL
How to get a distinct value from my data set - Hi, I need help with writing a script that will allow me to pull a distinct value from column A, depending on the value in column B. Here is an example of the data set: Column A | Column B 5000          |   1 5000          |   1 5001  […]
How To Approach Adding A primary Key To An Existing Table - In our vendor created/managed DB we have a table that is like an audit table the vendor uses for tracking down issues. It's something you can turn on or off via the application that uses the DB.  The table has NO Primary Key. It does have a date stamp like column so we've used it […]
Deadlocks with UPDATE statements using serializable transaction isolation level - We are seeing frequent deadlocks occurring due to a particular stored procedure that is using the SERIALIZABLE transaction isolation level. The stored procedure is essentially trying to ensure that the same reference number (concatenated from multiple fields) is never returned more than once. CREATE PROCEDURE dbo.sp_GenerateNextNumber ( @SequenceKey nvarchar(10), @ReferenceNumber nvarchar(25) = NULL OUTPUT ) […]
Development - SQL Server 2014
View works for me ...but doesn't return results for a user in SSMS but no errors - Hi I have this view to check if a job is running:   SELECT job.NAME ,job.job_id ,job.originating_server ,activity.run_requested_date ,DATEDIFF(SECOND, activity.run_requested_date, GETDATE()) AS Elapsed FROM msdb.dbo.sysjobs_view job JOIN msdb.dbo.sysjobactivity activity ON job.job_id = activity.job_id JOIN msdb.dbo.syssessions sess ON sess.session_id = activity.session_id JOIN ( SELECT MAX(agent_start_date) AS max_agent_start_date FROM msdb.dbo.syssessions ) sess_max ON sess.agent_start_date = sess_max.max_agent_start_date WHERE […]
SQL Server 2019 - Development
where is my commit size in my pkg - hi, while trying to give our operations guys as much info as possible about a separate ssis issue /post we have with losing connections locally and/or run times of 60 x normal when vpn'd, i looked for my commit size under vs 2022 with ssis data tool 16.0.5397.1.   i thought it was supposed to show […]
error in both ssis and ssms - something about losing connections - new error - hi for about 4 or 5 days now, i've been seeing various connections (to our dw server) issues in ssis (excel to sql) under vs 2022 AND SSMS.  the ssis error is shown below.  in ssms it looks like this  ...    The connection is broken and recovery is not possible. The connection is marked […]
SQL Azure - Administration
How to Migrate SQL Logins Between Two Azure SQL Instances Without sp_helprevlogi - Hello everyone, I am currently working on migrating SQL logins between two Azure SQL instances. Unfortunately, I am facing several limitations and would appreciate your help in finding an effective solution. Using sp_helprevlogin: This stored procedure is not supported in Azure SQL. Using dbatools: The dbatools option is not viable in my current environment. HASHED […]
General
AWS RDS MSSQL - Since AWS RDS MSSQL does not support logon trigger, is there any alternative way to implement same in RDS MSSQL. In existing logic, I created a logon trigger which allows only some selected group to logon through windows authentication, also restrict some applications from accessing mssql. I want to implement the same in AWS RDS […]
Anything that is NOT about SQL!
Microsoft Dataverse - So having seen the term dataverse thrown around I pictured it as some sort of marketing buzzword. Recently I've been looking at the MS Power Platform and this term pops up all over the place. So I googled it. Wikipedia talked of an open source product written in Java and I wondered if I may […]
SQL Server 2022 - Administration
Error fetching data via Linked Server - Hi, SS ver: SS 2017, SE; Oracle: 19.23 SE We have a nightly job that connects from our Sql server env to an Oracle database via Linked server that we have created for this purpose. This job had been running fine for the past several years - it pulls  data from multiple tables. However, the […]
Use Polybase with ODBC to create external table - I'm trying to use the installed Polybase service on an  SQL 2019 server to create an external table by using and ODBC  DSN. The connection of the DSN is to a fairly  exotic  BBj server that hosts 3 databases. Somehow I just do not seem to get the proper syntax  for creating the external table. […]
SQL Server 2022 - Development
Dark mode, other color schemes - All, if you are like me and do not care for the built-in color schemes, try www.sqlshades.com.  I just downloaded the EXE and instead of me taking an hour or two to find the right colors and without able to change the docked windows colors, I installed it, restarted SSMS and already love the dark […]
Issues adding and updating a column - Hi all   I'm hoping someone will able to say "you're an idiot because....." on this one.   We download a database but we have to add a column to a table and then update it. The code to add/update is as follows: IF NOT EXISTS ( SELECT * FROM UK_Health_Dimensions_New.INFORMATION_SCHEMA.COLUMNS c WHERE c.TABLE_SCHEMA= 'ODS' […]
how can vs see SSIS under my regular user id but not my admin? - Hi, we run vs 2022.   I'm stumped how when i run VS as admin i cant see ssis after hitting create new project unless i want to import ssis or tabular.   but under my regular id i can see new ssis, import ssis, import tabular and new ssrs after hitting crate new project.   its been […]
 

 

©2019 Redgate Software Ltd, Newnham House, Cambridge Business Park, Cambridge, CB4 0WZ, United Kingdom. All rights reserved.