How to resolve one to many linkage

  • CREATE TABLE [dbo].[Client](
    [ID] [int] NOT NULL,
    [Full_Name] [char](50) NULL
    PRIMARY KEY CLUSTERED
    (
    [ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
    CREATE TABLE [dbo].[Subscription](
    [ID] [int] NOT NULL,
    [ClientID] [int] NOT NULL,
    [StartDate] [datetime] NULL,
    [EndDate] [datetime] NULL
    PRIMARY KEY CLUSTERED
    (
    [ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    SET ANSI_PADDING OFF
    GO

    Insert into [dbo].[Client] (ID, Full_Name) values (100,'John Smith')

    Insert into [dbo].[Subscription] (ID, ClientID, StartDate, EndDate)
    Values
    (2120,100,'2013-08-29 00:00:00.000','2014-08-29 00:00:00.000')
    ,(6956,100,'2014-09-16 00:00:00.000','2015-09-16 00:00:00.000')
    ,(13899,100,'2015-09-16 00:00:00.000','2016-09-16 00:00:00.000')
    ,(21496,100,'2016-09-16 00:00:00.000','2017-09-16 00:00:00.000')

    The only way to link these two tables is Client.ID = Subscription.ClientID. It gives me four records instead of one. The desirable result to to get link to the latest record ID # 21496. Any suggestions? Thank you.

  • Good job with the table DDL and sample data.  You didn't show us your query that didn't work, though.  I suspect that what you need to do is number the rows in a CTE using ROW_NUMBER, partitioned by Client.ID and ordered by Subscription.ID descending.  Then simply select the rows where the row number is 1.

    John

  • How about using a CROSS APPLY:

    SELECT *
    FROM Client C
      CROSS APPLY (SELECT TOP 1 *
           FROM Subscription ca
          WHERE ca.ClientID = C.ID
          ORDER BY StartDate DESC) S;

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • John Mitchell-245523 - Wednesday, November 15, 2017 9:06 AM

    Good job with the table DDL and sample data.  You didn't show us your query that didn't work, though.  I suspect that what you need to do is number the rows in a CTE using ROW_NUMBER, partitioned by Client.ID and ordered by Subscription.ID descending.  Then simply select the rows where the row number is 1.

    John

    My query is simple:
    Select,
    c.Full_Name,
    s.StartDate,
    s.EndDate
    from Client as c inner join Subscription as s on c.ID = s.ClientID

  • Thom A - Wednesday, November 15, 2017 9:10 AM

    How about using a CROSS APPLY:

    SELECT *
    FROM Client C
      CROSS APPLY (SELECT TOP 1 *
           FROM Subscription ca
          WHERE ca.ClientID = C.ID
          ORDER BY StartDate DESC) S;

    Thank you, Thom A and John Mitchell. It works! I do appreciate it. Now I can move forward.

  • There are two main approaches, both of which have already been suggested.  Which one performs better depends on the density of the child table (subscriptions) with respect to the parent table (clients) and to a lesser extent on the number of records in the parent table.  Both approaches are improved by having an appropriate index on the secondary table.  While your table has an index, it is not the appropriate one for this particular query.  Your index needs to be on ClientID and StartDate (at a minimum).

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen - Wednesday, November 15, 2017 9:21 AM

    There are two main approaches, both of which have already been suggested.  Which one performs better depends on the density of the child table (subscriptions) with respect to the parent table (clients) and to a lesser extent on the number of records in the parent table.  Both approaches are improved by having an appropriate index on the secondary table.  While your table has an index, it is not the appropriate one for this particular query.  Your index needs to be on ClientID and StartDate (at a minimum).

    Drew

    Indeed. With the small amount of data there, the performance of the CROSS APPLY was marginally better over the CTE. At least on my instance. It would be worth, rkordonsky 63916, if you look at the CTE to see which one performs better for you.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • drew.allen - Wednesday, November 15, 2017 9:21 AM

    There are two main approaches, both of which have already been suggested.  Which one performs better depends on the density of the child table (subscriptions) with respect to the parent table (clients) and to a lesser extent on the number of records in the parent table.  Both approaches are improved by having an appropriate index on the secondary table.  While your table has an index, it is not the appropriate one for this particular query.  Your index needs to be on ClientID and StartDate (at a minimum).

    Drew

    Unfortunately this is what I have. I have not being with the company long enough to start making changes. But I will remember your excellent advise and wiliness to help. Thank you, Drew. I do appreciate it.

  • Thom A - Wednesday, November 15, 2017 9:23 AM

    drew.allen - Wednesday, November 15, 2017 9:21 AM

    There are two main approaches, both of which have already been suggested.  Which one performs better depends on the density of the child table (subscriptions) with respect to the parent table (clients) and to a lesser extent on the number of records in the parent table.  Both approaches are improved by having an appropriate index on the secondary table.  While your table has an index, it is not the appropriate one for this particular query.  Your index needs to be on ClientID and StartDate (at a minimum).

    Drew

    Indeed. With the small amount of data there, the performance of the CROSS APPLY was marginally better over the CTE. At least on my instance. It would be worth, rkordonsky 63916, if you look at the CTE to see which one performs better for you.

    I am doing it at this very moment. But amount of data is not big so I do not see much difference in both methods performance.

  • rkordonsky 63916 - Wednesday, November 15, 2017 9:26 AM

    drew.allen - Wednesday, November 15, 2017 9:21 AM

    There are two main approaches, both of which have already been suggested.  Which one performs better depends on the density of the child table (subscriptions) with respect to the parent table (clients) and to a lesser extent on the number of records in the parent table.  Both approaches are improved by having an appropriate index on the secondary table.  While your table has an index, it is not the appropriate one for this particular query.  Your index needs to be on ClientID and StartDate (at a minimum).

    Drew

    Unfortunately this is what I have. I have not being with the company long enough to start making changes. But I will remember your excellent advise and wiliness to help. Thank you, Drew. I do appreciate it.

    Without the index, I believe that the CTE/ROW_NUMBER() version will tend to perform better, but please do test both versions.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • There is another possibility, something like this:
    WITH LatestSubscriptions
    AS
    (
      SELECT
         s.ClientID
      ,   StartDate = MAX(s.StartDate)
      ,   Enddate = MAX(s.EndDate)
      FROM  dbo.Subscription s
      GROUP BY s.ClientID
    )
    SELECT *
    FROM
       dbo.Client    c
    JOIN LatestSubscriptions ls ON c.ID = ls.ClientID;

    As others have mentioned, only through testing and appropriate indexing will you determine which performs best.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin - Wednesday, November 15, 2017 10:51 AM

    There is another possibility, something like this:
    WITH LatestSubscriptions
    AS
    (
      SELECT
         s.ClientID
      ,   StartDate = MAX(s.StartDate)
      ,   Enddate = MAX(s.EndDate)
      FROM  dbo.Subscription s
      GROUP BY s.ClientID
    )
    SELECT *
    FROM
       dbo.Client    c
    JOIN LatestSubscriptions ls ON c.ID = ls.ClientID;

    As others have mentioned, only through testing and appropriate indexing will you determine which performs best.

    Thank you, Phil. I will try it.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply