December 17, 2018 at 9:43 am
I have memory-optimized tables in SQL Server 2014 to handle a graph problem. The tables can be queried concurrently for multiple sessions, for as many as 50 concurrent sessions. A Node table contain the node IDs (80,000 nodes). A subset of nodes have measurements on them (about 10,000 of them). An Edge table contains the connectivity of the nodes - about 100,000 records. There are about 200 edges of interest at any given point in time. A C++ application calculates a weighting factor of each measured node on each edge. The results of the C++ application are eventually stored in a memory optimized table as well. In summary, here's what I have:
Table Node with 10,000 records with measurements (per SessionID for a total of about 500,000 records).
Table Edge with 200 records (per SessionID for a total of 10,000 records)
Table Weight with 200 Edges x 10,000 Nodes = 2 million records (per SessionID for a total of 100 million records)
I need to update the Weight table to store in a separate column the product of each Node measurement and the weight. For that, the update statement is:
update W
set Product = W.WeightFactor * N.Mesurement
from Weight W inner join Node N on
N.SessionID = W.SessionID and N.NodeID = W.NodeID
where W.SessionID = @sessionID
Then I need to update the Edge table to store in separate column the sum of the products in the Weight table for each Edge ID. For that, the query is as follows:
update E
set Total = X.Total
from ( select EdgeID, Total = sum(Product) from Weight where SessionID = @sessionID group by EdgeID ) X inner join Edge E on
E.SessionID = @sessionID and
E.EdgeID = X.EdgeID
What indexes should I create for these tables, hash or range? Clearly, all joins are equality joins, so at first I thought of creating hash indexes. The problem I ran into was that the population of the Edge, Node and Weight tables with hash indexes was taking too long creating the indexes. So I switched to nonclustered indexes which were faster at creation time but slower at retrieval.
There's a cleanup job that deletes data as sessions are closed.
December 17, 2018 at 10:09 am
Try these and see what improvements or otherwise they make:CREATE INDEX IX_Edge_SessionId_EdgeID ON dbo.Edge(SessionId, EdgeID);
CREATE INDEX IX_Node_SessionId_NodeId ON dbo.Node(SessionId, NodeId);
CREATE INDEX IX_Weight_SessionId_NodeId ON dbo.Weight(SessionId, NodeId);
CREATE INDEX IX_Weight_SessionId_EdgeID ON dbo.Weight(SessionId, EdgeID) INCLUDE (Product);
You will have to create them with an ALTER TABLE command.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply