A better way than multiple subqueries?

Question

A better way than multiple subqueries?

Mark Harley

SSCrazy

Points: 2259
More actions
May 22, 2018 at 11:34 am

#342594

I have a table that contains a single record for each of the (several thousand) devices we manage, and a second table that contains lists of properties and associated values for those devices. In the second table multiple records will exist for each property, the only differences being an ID field, the value, and a datetime stamp. I've been trying to figure out the best way to pull back all records from the first table and all the values from the most recent of all property records for each device. I started with a query on the first table containing many subqueries (select top 1... order by...) for each property, but this doesn't really strike me as very efficient. Is there a better way?

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 246960 More actions · Answer 1

Mark Harley - Tuesday, May 22, 2018 11:34 AM
I have a table that contains a single record for each of the (several thousand) devices we manage, and a second table that contains lists of properties and associated values for those devices. In the second table multiple records will exist for each property, the only differences being an ID field, the value, and a datetime stamp. I've been trying to figure out the best way to pull back all records from the first table and all the values from the most recent of all property records for each device. I started with a query on the first table containing many subqueries (select top 1... order by...) for each property, but this doesn't really strike me as very efficient. Is there a better way?

You should know the drill by now: please provide sample data in consumable format and desired results, based on that data.

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Mark Harley SSCrazy Points: 2259 More actions · Answer 2

tbl_devices
device_uid device_id    device_name
162813C1-5BAE-E711-80C4-0CC47A39511D    1 device_1
163513C1-5BAE-E711-80C4-0CC47A39511D    2 device_2
C53A13C1-5BAE-E711-80C4-0CC47A39511D 3 device_3

tbl_device_properties
property_id        device_uid property_type_id    property_value        last_updated
1 162813C1-5BAE-E711-80C4-0CC47A39511D    1 3.0 2018-02-03
2 163513C1-5BAE-E711-80C4-0CC47A39511D    8 1.5 2018-01-24
3 162813C1-5BAE-E711-80C4-0CC47A39511D    1 2.0 2017-09-11
4 162813C1-5BAE-E711-80C4-0CC47A39511D    1 1.1 2016-07-22
5 C53A13C1-5BAE-E711-80C4-0CC47A39511D    5 6 2017-12-27
6 C53A13C1-5BAE-E711-80C4-0CC47A39511D    5 3 2017-01-07

I was hoping for a more theoretical discussion initially, but here's a quick mock up of the setup.

As for desired results, I'm just pilling a list of the devices and their associated properties.
select device_id, device_uid, (select top 1 property_value from tbl_device_properties where property_type_id = 1 and device_uid = d.device_uid order by last_updated desc) as 'Property 1' .... from tbl_devices d ...

Luis Cazares SSC Guru Points: 183706 More actions · Answer 3

Use a cross tabs query using the MAX() function to pivot the table as needed. Here's an article describing how to do so: Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - SQLServerCentral

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Phil Parkin SSC Guru Points: 246960 More actions · Answer 4

Mark Harley - Tuesday, May 22, 2018 11:54 AM
tbl_devices
device_uid device_id    device_name
162813C1-5BAE-E711-80C4-0CC47A39511D    1 device_1
163513C1-5BAE-E711-80C4-0CC47A39511D    2 device_2
C53A13C1-5BAE-E711-80C4-0CC47A39511D 3 device_3
tbl_device_properties
property_id        device_uid property_type_id    property_value        last_updated
1 162813C1-5BAE-E711-80C4-0CC47A39511D    1 3.0 2018-02-03
2 163513C1-5BAE-E711-80C4-0CC47A39511D    8 1.5 2018-01-24
3 162813C1-5BAE-E711-80C4-0CC47A39511D    1 2.0 2017-09-11
4 162813C1-5BAE-E711-80C4-0CC47A39511D    1 1.1 2016-07-22
5 C53A13C1-5BAE-E711-80C4-0CC47A39511D    5 6 2017-12-27
6 C53A13C1-5BAE-E711-80C4-0CC47A39511D    5 3 2017-01-07
I was hoping for a more theoretical discussion initially, but here's a quick mock up of the setup.
As for desired results, I'm just pilling a list of the devices and their associated properties.
select    device_id,    device_uid,    (select top 1 property_value from tbl_device_properties where property_type_id = 1 order by last_updated desc) as 'Property 1' ....

That is not consumable ... in this context, 'consumable' means something that another person can paste into SSMS and execute.
On the theoretical side, there are several ways to approach it.
One is to create a CTE which selects the latest rows only from the second table and join it to the first.
Another is to use APPLY against the second table to fetch only the latest row.

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Mark Harley SSCrazy Points: 2259 More actions · Answer 5

I take your point on the sample not being consumable.

Thank you both for your suggestions. I'll give them a try.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

You could use something like this:
CREATE TABLE [dbo].[Devices]( DeviceUid UNIQUEIDENTIFIER NOT NULL PRIMARY KEY NONCLUSTERED , DeviceId INT NOT NULL , DeviceName VARCHAR(24) NOT NULL );

INSERT INTO [dbo].[Devices]([DeviceUid],[DeviceId],[DeviceName])
VALUES
('162813C1-5BAE-E711-80C4-0CC47A39511D',1,'device_1')
,('163513C1-5BAE-E711-80C4-0CC47A39511D',2,'device_2')
,('C53A13C1-5BAE-E711-80C4-0CC47A39511D',3,'device_3');
GO

CREATE TABLE [dbo].[DeviceProperties](
PropertyId INT NOT NULL PRIMARY KEY NONCLUSTERED
, DeviceUid UNIQUEIDENTIFIER NOT NULL
, PropertyTypeId INT NOT NULL
, PropertyValue SQL_VARIANT NOT NULL
, DatelastUpdated DATETIME NOT NULL
);

INSERT INTO [dbo].[DeviceProperties]([PropertyId],[DeviceUid],[PropertyTypeId],[PropertyValue],[DatelastUpdated])
VALUES
(1,'162813C1-5BAE-E711-80C4-0CC47A39511D',1,3.0,'2018-02-03')
,(2,'163513C1-5BAE-E711-80C4-0CC47A39511D',8,1.5,'2018-01-24')
,(3,'162813C1-5BAE-E711-80C4-0CC47A39511D',1,2.0,'2017-09-11')
,(4,'162813C1-5BAE-E711-80C4-0CC47A39511D',1,1.1,'2016-07-22')
,(5,'C53A13C1-5BAE-E711-80C4-0CC47A39511D',5,6,'2017-12-27')
,(6,'C53A13C1-5BAE-E711-80C4-0CC47A39511D',5,3,'2017-01-07');
GO

SELECT
[d].[DeviceUid]
, [d].[DeviceId]
, [d].[DeviceName]
, [pv].[PropertyTypeId]
, [pv].[PropertyValue]
FROM
[dbo].[Devices] AS [d]
INNER JOIN (SELECT
                [rn] = ROW_NUMBER() OVER (PARTITION BY [dp].[DeviceUid], [dp].[PropertyTypeId] ORDER BY [dp].[DatelastUpdated] DESC)
                , [dp].[DeviceUid]
                , [PropertyTypeId]
                , [dp].[PropertyValue]
              FROM
                [dbo].[DeviceProperties] AS [dp]) AS [pv]
    ON ([d].[DeviceUid] = [pv].[DeviceUid] AND [pv].[rn] = 1);
GO

--DROP TABLE [dbo].[Devices];
--DROP TABLE [dbo].[DeviceProperties];
--GO

sgmunson SSC Guru Points: 110640 More actions · Answer 7

This is what I came up with, but I suspect it's effectively just a combination of original code and Lynn's....
CREATE TABLE #Devices ( DeviceUid uniqueidentifier NOT NULL PRIMARY KEY NONCLUSTERED, DeviceId int NOT NULL, DeviceName varchar(24) NOT NULL, UNIQUE CLUSTERED ( DeviceId ASC, DeviceUid ASC ) ); INSERT INTO #Devices (DeviceUid, DeviceId, DeviceName) VALUES ('162813C1-5BAE-E711-80C4-0CC47A39511D',1,'device_1'), ('163513C1-5BAE-E711-80C4-0CC47A39511D',2,'device_2'), ('C53A13C1-5BAE-E711-80C4-0CC47A39511D',3,'device_3'); GO

CREATE TABLE #DeviceProperties ( PropertyId int NOT NULL PRIMARY KEY NONCLUSTERED, DeviceUid uniqueidentifier NOT NULL, PropertyTypeId int NOT NULL, PropertyValue sql_variant NOT NULL, DatelastUpdated datetime NOT NULL, UNIQUE CLUSTERED ( DatelastUpdated DESC, PropertyID ASC ) ); INSERT INTO #DeviceProperties (PropertyId, DeviceUid, PropertyTypeId, PropertyValue, DatelastUpdated) VALUES (1, '162813C1-5BAE-E711-80C4-0CC47A39511D', 1, 3.0, '2018-02-03'), (2, '163513C1-5BAE-E711-80C4-0CC47A39511D', 8, 1.5, '2018-01-24'), (3, '162813C1-5BAE-E711-80C4-0CC47A39511D', 1, 2.0, '2017-09-11'), (4, '162813C1-5BAE-E711-80C4-0CC47A39511D', 1, 1.1, '2016-07-22'), (5, 'C53A13C1-5BAE-E711-80C4-0CC47A39511D', 5, 6, '2017-12-27'), (6, 'C53A13C1-5BAE-E711-80C4-0CC47A39511D', 5, 3, '2017-01-07'); GO

SELECT D.DeviceUid, D.DeviceId, D.DeviceName, PV.PropertyTypeId, PV.PropertyValue, PV.DatelastUpdated FROM #Devices AS D OUTER APPLY ( SELECT TOP (1) DP.DeviceUid, DP.PropertyTypeId, DP.PropertyValue, DP.DatelastUpdated FROM #DeviceProperties AS DP WHERE DP.DeviceUid = D.DeviceUid ORDER BY DP.DatelastUpdated DESC ) AS PV; GO

DROP TABLE #Devices; DROP TABLE #DeviceProperties;

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

jcelko212 32090 SSCrazy Eights Points: 9303 More actions · Answer 8

Mark Harley - Tuesday, May 22, 2018 11:34 AM

What you have here is called an EAV design, and it's a total nightmare. Google it will find a fair number of articles on it. You also don't seem to know the difference between a row and a record, a column and a field, or what constitutes a valid table (the use of the "tbl_" prefix is another design flaw called a Tibble and is the subject of several humor pieces). You also didn't bother to post DDL, which is just plain rude.

You might want to use something other than RDBMS. If you really are stuck with this kind of disaster.

Please post DDL and follow ANSI/ISO standards when asking for help.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

jcelko212 32090 - Thursday, May 24, 2018 9:41 AM
Mark Harley - Tuesday, May 22, 2018 11:34 AM
What you have here is called an EAV design, and it's a total nightmare. Google it will find a fair number of articles on it. You also don't seem to know the difference between a row and a record, a column and a field, or what constitutes a valid table (the use of the "tbl_" prefix is another design flaw called a Tibble and is the subject of several humor pieces). You also didn't bother to post DDL, which is just plain rude.
You might want to use something other than RDBMS. If you really are stuck with this kind of disaster.

You know, you shouldn't bash the EAV model of database design. I personally know of one person that was adamantly anti EAV and has changed his tune and built a product on such a model. Also, it is used extensively by MS in its compliance software system. EAV is a tool, used appropriately it is good, used inappropriately it is evil.

ScottPletcher SSC Guru Points: 101098 More actions · Answer 10

Typically the best way to get great response time for such queries is to cluster the lookup table on the lookup keys.
Unfortunately, when the primary lookup key is a guid, that would lead to massive fragmentation of the table.

You might consider an intermediate table that translates the guid to a bigint for use in lookup / related tables. You could use a SEQUENCE to assign those numbers (for example). That would allow you to cluster the other tables by the bigint, but still relate them to the guid. That avoids a sort to determine the last value, and it becomes a super-fast single key lookup.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".