A better way than multiple subqueries?

  • I have a table that contains a single record for each of the (several thousand) devices we manage, and a second table that contains lists of properties and associated values for those devices.  In the second table multiple records will exist for each property, the only differences being an ID field, the value, and a datetime stamp.  I've been trying to figure out the best way to pull back all records from the first table and all the values from the most recent of all property records for each device.  I started with a query on the first table containing many subqueries (select top 1... order by...) for each property, but this doesn't really strike me as very efficient.  Is there a better way?

  • Mark Harley - Tuesday, May 22, 2018 11:34 AM

    I have a table that contains a single record for each of the (several thousand) devices we manage, and a second table that contains lists of properties and associated values for those devices.  In the second table multiple records will exist for each property, the only differences being an ID field, the value, and a datetime stamp.  I've been trying to figure out the best way to pull back all records from the first table and all the values from the most recent of all property records for each device.  I started with a query on the first table containing many subqueries (select top 1... order by...) for each property, but this doesn't really strike me as very efficient.  Is there a better way?

    You should know the drill by now: please provide sample data in consumable format and desired results, based on that data.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • tbl_devices
    device_uid                                                            device_id    device_name
    162813C1-5BAE-E711-80C4-0CC47A39511D    1                 device_1
    163513C1-5BAE-E711-80C4-0CC47A39511D    2                 device_2
    C53A13C1-5BAE-E711-80C4-0CC47A39511D   3                 device_3

    tbl_device_properties
    property_id        device_uid                                                            property_type_id    property_value        last_updated
    1                        162813C1-5BAE-E711-80C4-0CC47A39511D    1                            3.0                           2018-02-03
    2                        163513C1-5BAE-E711-80C4-0CC47A39511D    8                            1.5                           2018-01-24
    3                        162813C1-5BAE-E711-80C4-0CC47A39511D    1                             2.0                          2017-09-11
    4                        162813C1-5BAE-E711-80C4-0CC47A39511D    1                            1.1                           2016-07-22
    5                        C53A13C1-5BAE-E711-80C4-0CC47A39511D    5                            6                              2017-12-27
    6                         C53A13C1-5BAE-E711-80C4-0CC47A39511D    5                           3                              2017-01-07

    I was hoping for a more theoretical discussion initially, but here's a quick mock up of the setup.

    As for desired results, I'm just pilling a list of the devices and their associated properties.

    select    device_id,
                  device_uid,
                  (select top 1 property_value from tbl_device_properties where property_type_id = 1 and device_uid = d.device_uid order by last_updated desc) as 'Property 1'
    ....
    from tbl_devices d
    ...

  • Use a cross tabs query using the MAX() function to pivot the table as needed. Here's an article describing how to do so: Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - SQLServerCentral

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Mark Harley - Tuesday, May 22, 2018 11:54 AM

    tbl_devices
    device_uid                                                            device_id    device_name
    162813C1-5BAE-E711-80C4-0CC47A39511D    1                 device_1
    163513C1-5BAE-E711-80C4-0CC47A39511D    2                 device_2
    C53A13C1-5BAE-E711-80C4-0CC47A39511D   3                 device_3

    tbl_device_properties
    property_id        device_uid                                                            property_type_id    property_value        last_updated
    1                        162813C1-5BAE-E711-80C4-0CC47A39511D    1                            3.0                           2018-02-03
    2                        163513C1-5BAE-E711-80C4-0CC47A39511D    8                            1.5                           2018-01-24
    3                        162813C1-5BAE-E711-80C4-0CC47A39511D    1                             2.0                          2017-09-11
    4                        162813C1-5BAE-E711-80C4-0CC47A39511D    1                            1.1                           2016-07-22
    5                        C53A13C1-5BAE-E711-80C4-0CC47A39511D    5                            6                              2017-12-27
    6                         C53A13C1-5BAE-E711-80C4-0CC47A39511D    5                           3                              2017-01-07

    I was hoping for a more theoretical discussion initially, but here's a quick mock up of the setup.

    As for desired results, I'm just pilling a list of the devices and their associated properties.

    select    device_id,
                  device_uid,
                  (select top 1 property_value from tbl_device_properties where property_type_id = 1 order by last_updated desc) as 'Property 1'
    ....

    That is not consumable ... in this context, 'consumable' means something that another person can paste into SSMS and execute.
    On the theoretical side, there are several ways to approach it. 
    One is to create a CTE which selects the latest rows only from the second table and join it to the first.
    Another is to use APPLY against the second table to fetch only the latest row.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I take your point on the sample not being consumable.  

    Thank you both for your suggestions.  I'll give them a try.

  • You could use something like this:

    CREATE TABLE [dbo].[Devices](
      DeviceUid UNIQUEIDENTIFIER NOT NULL PRIMARY KEY NONCLUSTERED
      , DeviceId INT NOT NULL
      , DeviceName VARCHAR(24) NOT NULL
    );

    INSERT INTO [dbo].[Devices]([DeviceUid],[DeviceId],[DeviceName])
    VALUES
      ('162813C1-5BAE-E711-80C4-0CC47A39511D',1,'device_1')
     ,('163513C1-5BAE-E711-80C4-0CC47A39511D',2,'device_2')
     ,('C53A13C1-5BAE-E711-80C4-0CC47A39511D',3,'device_3');
    GO

    CREATE TABLE [dbo].[DeviceProperties](
      PropertyId INT NOT NULL PRIMARY KEY NONCLUSTERED
      , DeviceUid UNIQUEIDENTIFIER NOT NULL
      , PropertyTypeId INT NOT NULL
      , PropertyValue SQL_VARIANT NOT NULL
      , DatelastUpdated DATETIME NOT NULL
    );

    INSERT INTO [dbo].[DeviceProperties]([PropertyId],[DeviceUid],[PropertyTypeId],[PropertyValue],[DatelastUpdated])
    VALUES
       (1,'162813C1-5BAE-E711-80C4-0CC47A39511D',1,3.0,'2018-02-03')
      ,(2,'163513C1-5BAE-E711-80C4-0CC47A39511D',8,1.5,'2018-01-24')
      ,(3,'162813C1-5BAE-E711-80C4-0CC47A39511D',1,2.0,'2017-09-11')
      ,(4,'162813C1-5BAE-E711-80C4-0CC47A39511D',1,1.1,'2016-07-22')
      ,(5,'C53A13C1-5BAE-E711-80C4-0CC47A39511D',5,6,'2017-12-27')
      ,(6,'C53A13C1-5BAE-E711-80C4-0CC47A39511D',5,3,'2017-01-07');
    GO

    SELECT
      [d].[DeviceUid]
      , [d].[DeviceId]
      , [d].[DeviceName]
      , [pv].[PropertyTypeId]
      , [pv].[PropertyValue]
    FROM
      [dbo].[Devices] AS [d]
      INNER JOIN (SELECT
                    [rn] = ROW_NUMBER() OVER (PARTITION BY [dp].[DeviceUid], [dp].[PropertyTypeId] ORDER BY [dp].[DatelastUpdated] DESC)
                    , [dp].[DeviceUid]
                    , [PropertyTypeId]
                    , [dp].[PropertyValue]
                  FROM
                    [dbo].[DeviceProperties] AS [dp]) AS [pv]
        ON ([d].[DeviceUid] = [pv].[DeviceUid] AND [pv].[rn] = 1);
    GO

    --DROP TABLE [dbo].[Devices];
    --DROP TABLE [dbo].[DeviceProperties];
    --GO

  • This is what I came up with, but I suspect it's effectively just a combination of original code and Lynn's....
    CREATE TABLE #Devices (
        DeviceUid uniqueidentifier NOT NULL PRIMARY KEY NONCLUSTERED,
        DeviceId int NOT NULL,
        DeviceName varchar(24) NOT NULL,
        UNIQUE CLUSTERED (
            DeviceId ASC,
            DeviceUid ASC
            )
    );
    INSERT INTO #Devices (DeviceUid, DeviceId, DeviceName)
        VALUES    ('162813C1-5BAE-E711-80C4-0CC47A39511D',1,'device_1'),
                ('163513C1-5BAE-E711-80C4-0CC47A39511D',2,'device_2'),
                ('C53A13C1-5BAE-E711-80C4-0CC47A39511D',3,'device_3');
    GO

    CREATE TABLE #DeviceProperties (
        PropertyId int NOT NULL PRIMARY KEY NONCLUSTERED,
        DeviceUid uniqueidentifier NOT NULL,
        PropertyTypeId int NOT NULL,
        PropertyValue sql_variant NOT NULL,
        DatelastUpdated datetime NOT NULL,
        UNIQUE CLUSTERED (
            DatelastUpdated DESC,
            PropertyID ASC
            )
    );
    INSERT INTO #DeviceProperties (PropertyId, DeviceUid, PropertyTypeId, PropertyValue, DatelastUpdated)
        VALUES    (1, '162813C1-5BAE-E711-80C4-0CC47A39511D', 1, 3.0, '2018-02-03'),
                (2, '163513C1-5BAE-E711-80C4-0CC47A39511D', 8, 1.5, '2018-01-24'),
                (3, '162813C1-5BAE-E711-80C4-0CC47A39511D', 1, 2.0, '2017-09-11'),
                (4, '162813C1-5BAE-E711-80C4-0CC47A39511D', 1, 1.1, '2016-07-22'),
                (5, 'C53A13C1-5BAE-E711-80C4-0CC47A39511D', 5, 6, '2017-12-27'),
                (6, 'C53A13C1-5BAE-E711-80C4-0CC47A39511D', 5, 3, '2017-01-07');
    GO

    SELECT
        D.DeviceUid,
        D.DeviceId,
        D.DeviceName,
        PV.PropertyTypeId,
        PV.PropertyValue,
        PV.DatelastUpdated
    FROM #Devices AS D
        OUTER APPLY (
            SELECT TOP (1)
                DP.DeviceUid,
                DP.PropertyTypeId,
                DP.PropertyValue,
                DP.DatelastUpdated
            FROM #DeviceProperties AS DP
            WHERE DP.DeviceUid = D.DeviceUid
            ORDER BY DP.DatelastUpdated DESC
            ) AS PV;
    GO

    DROP TABLE #Devices;
    DROP TABLE #DeviceProperties;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Mark Harley - Tuesday, May 22, 2018 11:34 AM

    What you have here is called an EAV design, and it's a total nightmare. Google it will find a fair number of articles on it. You also don't seem to know the difference between a row and a record, a column and a field, or what constitutes a valid table (the use of the "tbl_" prefix is another design flaw called a Tibble and is the subject of several humor pieces). You also didn't bother to post DDL, which is just plain rude.

    You might want to use something other than RDBMS. If you really are stuck with this kind of disaster.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Thursday, May 24, 2018 9:41 AM

    Mark Harley - Tuesday, May 22, 2018 11:34 AM

    What you have here is called an EAV design, and it's a total nightmare. Google it will find a fair number of articles on it. You also don't seem to know the difference between a row and a record, a column and a field, or what constitutes a valid table (the use of the "tbl_" prefix is another design flaw called a Tibble and is the subject of several humor pieces). You also didn't bother to post DDL, which is just plain rude.

    You might want to use something other than RDBMS. If you really are stuck with this kind of disaster.

    You know, you shouldn't bash the EAV model of database design.  I personally know of one person that was adamantly anti EAV and has changed his tune and built a product on such a model.  Also, it is used extensively by MS in its compliance software system.  EAV is a tool, used appropriately it is good, used inappropriately it is evil.

  • Typically the best way to get great response time for such queries is to cluster the lookup table on the lookup keys.
    Unfortunately, when the primary lookup key is a guid, that would lead to massive fragmentation of the table.

    You might consider an intermediate table that translates the guid to a bigint for use in lookup / related tables.  You could use a SEQUENCE to assign those numbers (for example).  That would allow you to cluster the other tables by the bigint, but still relate them to the guid.  That avoids a sort to determine the last value, and it becomes a super-fast single key lookup.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply