Many-to-many Junction table opinions?

  • Hello,

    So I was hoping to get some opinions on a design issue I was wondering about (pros, cons, considerations, etc.).

    Here's the background:

    One of the systems I support handles requisitions and the client would like to be able to link certain requisitions together. It doesn't really matter why. This seems to me to be a typical many-to-many relationship issue except that instead of the many-to-many relationship being between 2 different types of things it is between the same type of thing. So I need to be able to join the Requisitions table back to itself.

    My initial thought was to use a Junction table with 2 Requisition ID columns in it - like this:

    CREATE TABLE linkedRequisitions

    ( requisitionID INT NOT NULL,

    linkedRequisitionID INT NOT NULL

    )

    But then I figured I'd have to insert 2 records for each relationship - one with requisition A in the first column and requisition B in the second column, then one with requisition A in the second column and requisition B in the first column (flip-flopped). This would get exponentially worse as the group of linked requisitions grew (i.e., 6 records for 3 linked requisitions, 12 records for 4 linked requisitions, etc.)

    My next idea was to create a table like this:

    CREATE TABLE linkedRequisitions

    ( groupID INT IDENTITY(1,1) NOT NULL,

    requisitionID INT NOT NULL

    )

    This would create one record for each requisition in the group and a group ID. This would make it easier for larger groups of linked requisitions but isn't what I have seen for implementing many-to-many relationships in DBs.

    Ultimately, I will need to be able to easily grab all linked requisitions' data based on a single requisition's ID.

    I'd be interested in seeing what people thought about these alternatives.

    Thanks,

    George

  • George,

    In the first part of your post it sounds like you want to implement a simple recursive relationship, i.e., parent-child.

    CREATE TABLE [Relation](

    requisitionID [int] NOT NULL,

    parentRequisitionId [int] NOT NULL,

    CONSTRAINT [PK_Relation]

    PRIMARY KEY CLUSTERED (requisitionID ASC, parentRequisitionId ASC))

    GO

    ALTER TABLE [Relation]

    WITH CHECK ADD CONSTRAINT [FK_Requisition_Relation]

    FOREIGN KEY(parentRequisitionId)

    REFERENCES [Requisition] (requisitionID)

    GO

    Where it gets confusing is the second part, which begins "But then I figured..."

    What you describe (in a denormalized view) are two records which represent the same thing.

    requisitionId | parentRequisitionId

    A | B

    B | A

    In a typical recursive associative relationship, the second record would violate the primary key, and it is logically the same thing, i.e. "a relationship between 'A' and 'B'".

    Can you better explain the need for the second record? If it has logical value, then the PK would be something else, presumeably the following:

    CONSTRAINT [PK_Relation] PRIMARY KEY CLUSTERED (requisitionID ASC))

    JQAllen

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • JQAllen (2/28/2012)


    George,

    In the first part of your post it sounds like you want to implement a simple recursive relationship, i.e., parent-child.

    CREATE TABLE [Relation](

    requisitionID [int] NOT NULL,

    parentRequisitionId [int] NOT NULL,

    CONSTRAINT [PK_Relation]

    PRIMARY KEY CLUSTERED (requisitionID ASC, parentRequisitionId ASC))

    GO

    ALTER TABLE [Relation]

    WITH CHECK ADD CONSTRAINT [FK_Requisition_Relation]

    FOREIGN KEY(parentRequisitionId)

    REFERENCES [Requisition] (requisitionID)

    GO

    Where it gets confusing is the second part, which begins "But then I figured..."

    What you describe (in a denormalized view) are two records which represent the same thing.

    requisitionId | parentRequisitionId

    A | B

    B | A

    In a typical recursive associative relationship, the second record would violate the primary key, and it is logically the same thing, i.e. "a relationship between 'A' and 'B'".

    Can you better explain the need for the second record? If it has logical value, then the PK would be something else, presumeably the following:

    CONSTRAINT [PK_Relation] PRIMARY KEY CLUSTERED (requisitionID ASC))

    JQAllen

    I am not speaking up for one solution or another, and I agree that in a "typical" recursive relationship you would not need to store both unless there was a distinct difference between AB and BA, but how would this violate the PK you have defined above?

    requisitionId | parentRequisitionId

    A | B

    B | A

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The first PK makes a composite of two requisitions unique. So AB and BA are the same record, therefore violating the PK.

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • PRIMARY KEY CLUSTERED (requisitionID ASC, parentRequisitionId ASC))

    The PK above means that AB and BA cannot both be records in that table.

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • CONSTRAINT [PK_Relation] PRIMARY KEY CLUSTERED (requisitionID ASC))

    However, for the second PK, it does not violate the key, since "B" is one unique record and "A" is the other, i.e., "BA" and the other being "AB". With those two records, you would have to attempt an insert of "BC" in order to violate the PK.

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • My question is, what is the significance of having both "AB" and "BA"? I'm assuming from your original post that it is important to allow for that.

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • JQAllen (2/28/2012)


    PRIMARY KEY CLUSTERED (requisitionID ASC, parentRequisitionId ASC))

    The PK above means that AB and BA cannot both be records in that table.

    Ummm, maybe logically speaking, but not in terms of a table in SQL Server.

    USE tempdb

    GO

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'Relation')

    AND type IN (N'U') )

    DROP TABLE Relation

    GO

    CREATE TABLE [Relation]

    (

    requisitionID [int] NOT NULL,

    parentRequisitionId [int] NOT NULL,

    CONSTRAINT [PK_Relation] PRIMARY KEY CLUSTERED (requisitionID ASC, parentRequisitionId ASC)

    )

    GO

    INSERT INTO dbo.Relation

    (requisitionID, parentRequisitionId)

    VALUES (1, 2),

    (2, 1) ;

    GO

    SELECT *

    FROM dbo.Relation

    GO

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • My apologies for miscommunicating. I'll ask it a different way. How does that recursion not satisfy your requirements?

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • Heya JQAllen,

    Sorry about the delay...I got pulled into a meeting.

    The relationship isn't a parent-child one...it's one of two equivalents (siblings??). So requisition A isn't the parent of B and requisition B isn't the parent of A...they just need to be marked as linked in the database.

    So that's why I was wondering how to handle it in the Junction table...if I only put in one record, for example:

    RequisitionID1 | RequisitionID2

    -------------------------------------------

    Requisition A | Requisition B

    Then if I want to find the linked requisition for 'B' I will never know if the requisition ID is in column 1 or in column 2, so I would have to do a sql statement like this:

    SELECT *

    FROM LinkedRequisitionTable

    Where RequisitionID1 = ID#

    OR RequisitionID2 = 'B'

    Then I'd have to use a case statement to grab the other requisition ID based on which column I find my requisition ID in. However, if I insert 2 records ... one for each combination...then I can just do a SQL statement like this:

    SELECT RequsitionID2

    FROM LinkedRequisitionTable

    Where RequisitionID1 = 'B'

    Does that make sense?

    Thanks for your input!

    George

  • If you wanted to eventually create a stored procedure that would find related requisitions, regardless of which column your target value might be in, would using a UNION statement fulfill your requirements? Below is an example of what I mean by this.

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'RelatedRequisition')

    AND type IN (N'U') )

    DROP TABLE [RelatedRequisition]

    GO

    CREATE TABLE [RelatedRequisition]

    (

    requisitionId [int] NOT NULL,

    requisitionId2 [int] NOT NULL,

    CONSTRAINT [PK_RelatedRequisition] PRIMARY KEY CLUSTERED (requisitionID ASC, requisitionId2 ASC)

    )

    GO

    INSERT INTO dbo.[RelatedRequisition]

    (requisitionId, requisitionId2)

    VALUES (1, 2),(2, 1),(1, 3),(3, 1),(1, 4),(4, 1) ;

    GO

    DECLARE @Input int

    SET @Input = 1

    SELECT requisitionID

    FROM dbo.[RelatedRequisition]

    WHERErequisitionId2 = @Input

    UNION

    SELECT requisitionId2

    FROM dbo.[RelatedRequisition]

    WHERErequisitionId = @Input

    [font="System"]Jay Quincy Allen, Managing Partner, Relational Models LLC[/font]

  • Hi George

    I wonder if you could clarify something for me please?

    say I create 5 requisitions

    req1

    req2

    req3

    req4

    req5

    now, I decide that i need to link "req5" with "req2"...can I also link "req5" with "req3"?

    also .....I decide that i need to link "req5" with "req2"...but "req2" is already linked to "req1"...is that allowed?

    perhaps it would help if you could provide some sample data that demonstrates the combinations of links that you are expecting and also what your expected output from a query on these links needs to provide.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • You know, after reading everyone's questions and suggestions and thinking about them last night the answer came to me. I think my second table design will work better for several reasons. The design would be something like:

    CREATE TABLE RequisitionGroupTypes

    (

    TypeID INT IDENTITY (1,1) NOT NULL,

    TypeDescription VARCHAR(150) NOT NULL,

    Other columns as needed...

    )

    CREATE TABLE LinkedRequisitions

    ( GroupID INT IDENTITY (1,1) NOT NULL,

    RequisitionID INT NOT NULL,

    RequisitionGroupType INT NOT NULL

    REFERENCES RequisitionGroupTypes(TypeID)

    )

    1. Livingston asked whether various requisitions will be able to be linked with each other in various patterns...right now that's not the intention, but who knows if the client will want that in the future...they probably will. Right now they will only be linked in pairs. but I can foresee them also being linked in other ways. This way the GroupType can be used to distinguish groupings/pairings based on the grouping types. That way req. A and B can be linked together for one reason yet Req. A, C and D can be linked together for a different reason and the links won't be confused.

    2. I will only have to search one column, the RequisitionID column, to determine whether a specific requisition is part of a linked requisition group or not. This is important because the requisitions are billed differently based on whether they are paired or not. Whereas with the other table design I'd have to search 2 columns.

    I was basically just wondering if there was some universally accepted "best practice" for joining records in a table back to itself when the records are treated as equals (siblings).

    I appreciate everyone's input on this, it has helped me think through the issue.

    George

  • There are a few people that come to mind who spend a lot of time on this site that could walk you through solving this in a way that will scale. However they would be deserving of consulting fees because what you're doing is considered advanced and they may not be willing to jump in and give this one away for free.

    That said, if you want to do this one on your own and get this one "right" from the outset get this book: Joe Celko's Trees and Hierarchies in SQL for Smarties, Second Edition (The Morgan Kaufmann Series in Data Management Systems)

    It will describe a data model that will fit your use-case and you can be sure you have something that will handle all the corner cases and scale.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks for the info opc.three!

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply