‘Shortest path’ is by far the most feature of SQL Graph for
now. What does this even mean?
‘Shortest path’ is the term accorded to the shortest
distance between any two points, referred to as nodes in graph databases. The
algorithm that helps you find the shortest distance between node A and node B
is called the Shortest Path Algorithm.
Let us go back to the movie database. We have two people,
say Amrish Puri and Harrison Ford. Amrish wants to meet Harrison Ford. He has
not acted with Ford, he may have a few connections in common – or people who
know him. Or people who know him who know him. This is one way to get an
introduction. Or, let us say you are interviewing for a job. You want to see if
someone in your network works at that place – so that you can get an idea of
what the job or the company is like. So you go on linkedin – do a search for the
company, look under ‘people’, and it tells you if anyone in your network is
there, or someone is 2 levels away, or 3. Those numbers are what we get from
the shortest path feature.
Aside from social media examples, what are the specific uses
for this feature? Below are a few ways you can put this to use –
1 Find the path to person you want access to from a large
organizational chart
2 Find connections between specific tables in an ERDiagram (yes that is graph
data too)
3 Find connection between two resources in a data center graph model
4 Find which store is closest to customer or various applications related to
geography
You can even make a graph data model of characters in a complex novel and
explore relationships that way.
And so on.
In this I will illustrate the examples of shortest path with
the movie db:
Below illustrates connections
from Harrison Ford to all other actors in the database
SELECT STRING_AGG(toActor.PersonName, '->') WITHIN GROUP (GRAPH PATH) AS FriendConnections,
LAST_VALUE(toActor.PersonName) WITHIN GROUP (GRAPH PATH) AS FriendName,
COUNT(toActor.PersonName) WITHIN GROUP (GRAPH PATH) AS levels
FROM
PersonNode AS fromActor,
CoActorLink FOR PATH AS f,
PersonNode FOR PATH AS toActor
WHERE
MATCH(SHORTEST_PATH((toActor<-(f)-)+fromActor))
AND fromActor.PersonName = 'Harrison Ford'
The results show us how many levels away the person is as
well, and who are the people on the path if they are more than one level away.
We can filter this of course to find only people who are
only one hop away, or two levels away.
SELECT PersonName, Friends
FROM (
SELECT
Person1.Personname AS PersonName,
STRING_AGG(Person2.Personname, '->') WITHIN GROUP (GRAPH PATH) AS Friends,
COUNT(Person2.Personname) WITHIN GROUP (GRAPH PATH) AS levels
FROM
PersonNode AS Person1,
CoActorLink FOR PATH AS fo,
PersonNode FOR PATH AS Person2
WHERE MATCH(SHORTEST_PATH(Person1(-(fo)->Person2){1,3}))
AND Person1.Personname = 'Harrison Ford'
) Q
WHERE Q.levels = 1
If we want connections between two specific people, we can do
it as below.
SELECT PersonName, Friends, levels
FROM (
SELECT
Person1.Personname AS PersonName,
STRING_AGG(Person2.Personname, ‘->’) WITHIN GROUP (GRAPH PATH) AS Friends,
LAST_VALUE(Person2.Personname) WITHIN GROUP (GRAPH PATH) AS LastNode,
COUNT(Person2.Personname) WITHIN GROUP (GRAPH PATH) AS levels
FROM
PersonNode AS Person1,
CoActorLink FOR PATH AS fo,
PersonNode FOR PATH AS Person2
WHERE MATCH(SHORTEST_PATH(Person1(-(fo)->Person2)+))
AND Person1.Personname = ‘Harrison Ford’
) AS Q
WHERE Q.LastNode = ‘Tom Cruise’
In following posts I will explore more specific, every day
examples of this.