Introduction
Imagine you want to sort a dataset by something that isn’t stored in the database. Let me give you an example to better imagine the problem: I have a table with songs (i.e. song title, artist, etc.) that has a column called Song_ID as its primary key. I want to create playlists with some songs from that table.
To do that I create another table that contains the main data for these playlists including the names of the playlists. It also has a column as its primary key which is called Playlist_ID. I could easily create a table, say Playlist_Contents that consists of Playlist_ID and Song_ID to model the many-to-many-relationship between songs and playlists. But I like the songs within each playlist to be played in a specific order but there are no criteria I could use for sorting them. Instead the sequence of the songs will be determined by some criteria unknown to the database. My ER diagram would look like this:
Suppose the tables have the following contents:
Playlist_ID | Playlist_Name |
1 | All time favourites |
2 | Christmas songs |
Song_ID | Song | Artist |
10 | My girl | Temptations |
20 | Good vibrations | Beach Boys |
30 | Clocks | Coldplay |
40 | Velouria | Pixies |
50 | Day tripper | Beatles |
60 | Wonderwall | Oasis |
70 | Temptation | Heaven 17 |
80 | Mercy | Duffy |
Now I'd like to put 6 of the songs into the playlist "My favourite songs". I want to play the songs in this order:
- Good vibrations - Beach Boys
- Day tripper - Beatles
- Wonderwall - Oasis
- Clocks - Coldplay
- Velouria - Pixies
- My girl - Temptations
I can simply insert 6 rows into Playlist_Contents but I need to store the order as well. The table Playlist_Contents has to be extended to do this.
First attempt
A straightforward solution would be to add a column to Playlist_Contents that stores the position of the song in the playlist.
Playlist_ID | Song | Position |
---|---|---|
1 | 20 | 1 |
1 | 50 | 2 |
1 | 60 | 3 |
1 | 30 | 4 |
1 | 40 | 5 |
1 | 10 | 6 |
But what if you want to edit the playlist, i.e. adding, deleting or moving songs? You can easily add a song at the end of a playlist or remove the last song. All other operations lead to multiple updates, esp. if you want the position to hold contiguous numbers starting with 1 onwards:
-- add "Mercy" right after "Good vibrations: UPDATE Playlist_Contents SET Position = Position + 1 WHERE Position >= 2 INSERT Playlist_Contents VALUES (1, 80, 2) -- remove "Clocks" from the playlist: DELETE FROM Playlist_Contents WHERE Playlist_ID = 1 AND Song_ID = 30 UPDATE Playlist_Contents SET Position = Position - 1 WHERE Position > 4 -- move "Velouria" to the beginning of the list: UPDATE Playlist_Contents SET Position = Position + 1 WHERE Position >= 1 UPDATE Playlist_Contents SET Position = 1 WHERE Song_ID = 40 UPDATE Playlist_Contents SET Position = Position - 1 WHERE Position > 5
Maybe some playlists get quite long, so performance could become an issue. So using a position column seems to be quite inefficient.
A better solution
What about alternatives? How can we reduce the necessary modifications to a minimum? My solution was to not have a position column in the Playlist_Contents table at all. Instead I tried to use a linked list.
CREATE TABLE Playlist_Contents( Playlist_ID int NOT NULL, Song_IDint NOT NULL, Predecessor int NULL, CONSTRAINT PK_Playlist_Contents PRIMARY KEY CLUSTERED ( Playlist_ID ASC, Song_ID ASC )) GO ALTER TABLE Playlist_Contents WITH CHECK ADD CONSTRAINT FK_Playlist_Contents_Self FOREIGN KEY (Playlist_ID, Predecessor) REFERENCES Playlist_Contents (Playlist_ID, Song_ID) ALTER TABLE Playlist_Contents CHECK CONSTRAINT FK_Playlist_Contents_Self CREATE UNIQUE NONCLUSTERED INDEX AK_Playlist_Contents ON Playlist_Contents ( Playlist_ID ASC, Predecessor ASC ) ALTER TABLE Playlist_Contents WITH CHECK ADD CONSTRAINT CK_Playlist_Contents_Predecessor CHECK (Song_ID!=Predecessor) ALTER TABLE Playlist_Contents CHECK CONSTRAINT CK_Playlist_Contents_Predecessor GO
In order to do that I created a column “Predecessor” that is designed to point to the Song_ID of the preceding song (within the same playlist, of course). The first song does not have a predecessor, so I allowed NULL for this column. Since no two songs can have the same predecessor it must be unique within each playlist. Additionally a song’s predecessor must not point to itself. The playlist from above would now look like this:
Playlist_ID | Song | Predecessor |
---|---|---|
1 | 20 | NULL |
1 | 50 | 20 |
1 | 60 | 50 |
1 | 30 | 60 |
1 | 40 | 30 |
1 | 10 | 40 |
In order to have a position again I created a view which dynamically generates the position number. It is based on a recursive CTE:
CREATE VIEW [dbo].[vPlaylist_Contents] AS WITH List AS ( SELECTPlaylist_ID, 1 AS Position, Song_ID, Predecessor FROMPlaylist_Contents WHERE Predecessor IS NULL UNION ALL SELECTp.Playlist_ID, l.Position + 1, p.Song_ID, p.Predecessor FROMPlaylist_Contents p INNER JOIN List l ONp.Playlist_ID = l.Playlist_ID AND p.Predecessor = l.Song_ID ) SELECT * FROM List GO
Modifying data in a linked list
How can the sequence of songs in a playlist be changed? If we want to insert a new song or change a song’s position in the list, we just name the predecessor of the song or put NULL in it. Of course we cannot really execute the inserts, updates and deletes as they are. So we need a trigger to perform the necessary additional modifications on the table.
This trigger will be an instead-of-trigger for all three types of data modification. The trigger has to determine the additional modifications and execute those in a single statement in order to fulfill all constraints we have on the table, i.e. the primary key, the foreign key, the unique index, and the check constraint. Since we use the view to read the data we create the trigger on the view.
CREATE TRIGGER [dbo].[tr_vPlaylist_Contents] ON [dbo].[vPlaylist_Contents] INSTEAD OF INSERT,DELETE,UPDATE AS BEGIN SET NOCOUNT ON; WITH Modifications AS ( SELECTPlaylist_ID, Song_ID, Predecessor FROMinserted UNION ALL SELECTd.Playlist_ID, d.Song_ID, d.Predecessor FROMinserted i FULL OUTER JOIN deleted d ONi.Playlist_ID = d.Playlist_ID AND i.Song_ID = d.Song_ID WHEREi.Playlist_ID IS NULL UNION ALL SELECTp.Playlist_ID, p.Song_ID, i.Song_ID FROMPlaylist_Contents p INNER JOIN inserted i ONp.Playlist_ID = i.Playlist_ID AND (p.Predecessor = i.Predecessor OR (p.Predecessor IS NULL AND i.Predecessor IS NULL)) UNION ALL SELECTp.Playlist_ID, p.Song_ID, d.Predecessor FROMPlaylist_Contents p INNER JOIN deleted d ONp.Playlist_ID = d.Playlist_ID AND p.Predecessor = d.Song_ID ) MERGE INTOPlaylist_Contents AS targettable USINGModifications AS sourcetable ONtargettable.Playlist_ID = sourcetable.Playlist_ID AND targettable.Song_ID = sourcetable.Song_ID WHEN MATCHED AND targettable.Predecessor = sourcetable.Predecessor OR (targettable.Predecessor IS NULL AND sourcetable.Predecessor IS NULL) THEN DELETE WHEN MATCHED AND (targettable.Predecessor != sourcetable.Predecessor) OR (targettable.Predecessor IS NULL AND sourcetable.Predecessor IS NOT NULL) OR (targettable.Predecessor IS NOT NULL AND sourcetable.Predecessor IS NULL) THEN UPDATE SET targettable.Predecessor = sourcetable.Predecessor WHEN NOT MATCHED BY TARGET THEN INSERT (Playlist_ID, Song_ID, Predecessor) VALUES (sourcetable.Playlist_ID, sourcetable.Song_ID, sourcetable.Predecessor); END
It uses a MERGE statement to perform all necessary modifications at once. The CTE produces all rows that have to be modified. The rows whose primary key values do not match a row in the table will be inserted. The rows which do match a row in the table but have a different predecessor value will be updated. Finally rows where all column values match a row in the table will be deleted.
Testing the solution
Finally let’s try to insert, update and delete some data in the table to test the trigger.
-- add "Mercy" right after "Good vibrations: BEGIN TRAN INSERT vPlaylist_Contents (Playlist_ID, Song_ID, Predecessor) VALUES (1, 80, 20) SELECT * FROM vPlaylist_Contents ROLLBACK -- remove "Clocks" from the playlist: BEGIN TRAN DELETE FROM vPlaylist_Contents WHERE Playlist_ID = 1 AND Song_ID = 30 SELECT * FROM vPlaylist_Contents ROLLBACK -- move "Velouria" to the beginning of the list: BEGIN TRAN UPDATE vPlaylist_Contents SET Predecessor = NULL WHERE Song_ID = 40 SELECT * FROM vPlaylist_Contents ROLLBACK
The difference between this solution and the first attempt is the number of rows that have to be modified. In this solution we have an upper bound of 2 for inserts, 2 for deletes and 3 for updates. These figures are independent of the number of rows in the linked list whereas the figures in our first attempt increase as the number of songs in the playlist increases.
Actually I use this construction in an application that modifies one row at a time. The trigger works fine for some statements that modify multiple rows but not for all. The solution can be extended to allow all possible multiple row modifications but that involves much more effort to make things work.
Restrictions
Apart from multiple row modifications this implementation does not allow to change the primary key values of a row, i.e. you cannot replace a song by another one or move a song from one playlist to another. If you try to do that the trigger will fail due to constraint violations or missing uniqueness within the MERGE statement. To avoid error messages like that you can simply test for changed primary key values at the beginning of the trigger. If moving songs between playlists or replacements of songs within one statement are required you could extend the table by a surrogate primary key and use this new PK in the statement inside the trigger instead.
Don't use hierarchyid for linked lists
A linked list like this can also be regarded as a degenerated tree structure that contains one branch only. I was thinking of using a hierarchyid column in the beginning but discarded this idea quickly. Using a hierarchyid column doesn't reduce the number of updates. It is about the same as with a position column. Also a hierarchy column would use much more space than a predecessor column of type int.