Dictionary Lookup

Question

Dictionary Lookup

etl2016

Default port

Points: 1459
More actions
August 31, 2021 at 7:46 pm

#3925582

Hi
I have hundred million ID, NAME pairs stored in SQL Server as a table that is accessed by C# code, which has a responsibility to lookup NAMEs corresponding to million IDs at a time. C# code has these million IDs stored in an array, before invoking SQL module.
Columns of the SQL table:
ID VARCHAR (500) -- this is indexed on
NAME VARCHAR (500)
Could you please share your thoughts on what is the most performant method to do these million lookups?
The approach I am thinking of, is - through C# sqlDataClient feature, create a #temp table in SQL consisting of million IDs ( the "input") Then, JOIN the temp table with these million inputs to main table to create a result temp table. And, fetch this final result table holding NAMEs (output) corresponding to the IDs passed (input) out of SQL into C#, by appending NAMES into an array of results.
Is there a more performant approach? Alternatively, does SQL Server support Dictionary Lookup sort of built-in feature, to avoid JOINs?
thank you

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Zidar SSC Enthusiast Points: 126 More actions · Answer 1

Good idea would be to leave "the lookup" to the SQL server, then pass the results to C# code. Something like:

SELECT ID, NAme FROM DirectoryTable 
WHERE ID = '12345' -- you can supply any value that works for you

The problem is what i just typed works for somebody who sits in front of SSMS (SQL editor). The real thing would be to write a stored procedure, or a user define function with parameter(s). Then C# must provide code to pass given ID to the procedure/function, and to catch the response of the procedure/function. That way, most of the work is done by SQL Server, and C# code just communicates with SQL, receives results and displays them for the user to see.

C# is not my area of expertise, someone else will have to help you with that part. However, SQL part is "my area of expertise" and I can help with that part. However, you need to provide table design and some test data before that. You know, CREATE TABLE... then INSERT INTO...

Cheers

Zidar

Zidar's Theorem: The best code is no code at all...

ratbak Ten Centuries Points: 1381 More actions · Answer 2

Do you really have 500-character IDs? Such a long ID seems unusual and non-ideal. At 500 bytes, only a few rows will fit in an 8K database page. If so, at least make sure these don't get treated as nvarchar -- type conversions and/or storage could further exacerbate the problem.

A bigint can hold 18 quintillion different values, and the number of possible 128-bit guids (32 characters) is almost incomprehensibly larger. A compact indexable ID is going to be your friend here.

Are these ID/Name pairs semi-static/slowly-changing? i.e., assuming this is a repeated process and not one-time-only, would you query the same sets of IDs each time (allowing for additions/deletions)? Or do they vary from run-to-run? If they are consistent, you might be able to add something like a BatchID to facilitate processing w/o having to push the ID's the application is looking for back to SQL and then SQL Server having to join to a temp table to locate those IDs.

ScottPletcher SSC Guru Points: 101118 More actions · Answer 3

(1) Cluster the main table on ID: a nonclus index is worthless here.

(2) Cluster the lookup table on ID. The join of the two tables should now become a MERGE join, which will be most efficient here.

(3) Evaluate page-compressing the main table. That is, exec sys.sp_estimate_data_compression_savings ..., 'PAGE', and if it yields good results, then page compress the main table.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Chris Wooding SSCarpal Tunnel Points: 4558 More actions · Answer 4

You don't need to create a temp table and join to it. You can have the query in a stored procedure that has a parameter which is a "user defined table type". The C# code can populate this parameter from the array of values you want to query. The stored procedure would do a join between this table type parameter and the SQL table that you use for lookups.

In SQL;

-- Create the user-defined table type.
create type [dbo].[udt_MyTableType] as table (
[ID] [varchar](500) not null
)

-- Create the stored procedure
create procedure [dbo].[MyProcedure] @InputTable dbo.udt_MyTableType readonly
as
begin
    begin try
        set nocount on;
        
        select lt.[ID], lt.[Value]
          from LookupTable lt
          inner join @InputTable i on i.[ID] = lt.[ID]

        return 0;
    end try

    begin catch
        declare @ErrStr varchar(5000), @ErrMsg varchar(4000), @ErrSeverity int, @ErrNo int;
        select  @ErrMsg = error_message(), @ErrSeverity = error_severity(), @ErrNo = error_number();
        set @ErrStr = error_procedure()
                    + '. Error = ' + rtrim(ltrim(str(isnull(@ErrNo, 0), 10, 0)))
                    + ': ' + isnull(@ErrMsg, 'Msg is null');
        print @ErrStr;
        raiserror(@ErrStr, @ErrSeverity, 1);
        return -1;
    end catch
end

In C#;

DataTable lookupTable = new DataTable();
lookupTable.Columns.Add("ID", typeof(string));

// Loop through your list of values doing the following

DataRow lookupRow = lookupTable.NewRow();
lookupRow["ID"] = currentID;
lookupTable.Rows.Add(lookupRow);

// End loop

var sqlConnection = new SqlConnection(<your connection details here>)
var sqlCmd = new SqlCommand();
sqlConnection.Open();
SqlTransaction tran = sqlConnection.BeginTransaction(IsolationLevel.ReadCommitted);
sqlCmd.Connection = sqlConnection;
sqlCmd.CommandType = CommandType.StoredProcedure;
sqlCmd.CommandText = "MyProcedure";
sqlCmd.Parameters.Add(new SqlParameter("@InputTable", lookupTable));
sqlCmd.Transaction = tran;
SqlDataAdapter adapter = new SqlDataAdapter();
adapter.SelectCommand = sqlCmd;
adapter.Fill(<locale information>);
tran.Commit();

This reply was modified 3 years, 3 months ago by Chris Wooding.
This reply was modified 3 years, 3 months ago by Chris Wooding.