How to compare strings by letters in certain places

  • Suppose there were tables which pre-split the words so there's an established correct baseline for comparisons.  Also there could be a script with nested loops that executes different procedures dynamically passing in combinations of test names and random tests.  Something like this

    loop 50 letter groupings
    loop 100 letter groupings
    loop 500 letter groupings

    /* ProcStartTestbench(input UDT, output a tracking ID) */
    /* sp_execsql ... (input UDT, input test group X, input test group Y, input tracking ID) */
    /* ProcEndTestbench(input tracking ID) */

    end 500 grp
    end 100 grp
    end 50 grp

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • I did a little work coming up this good test data. Here is what I cam up with...

    It'll generate 100K (adjustable) rows with both 5 char and 6 char arrays (also adjustable {up to 16}).  each array has a randomly generated number of elements (between 10 and 50 {adjustable up to 256}). Each "word" is also randomly generated.

    On my test machine, I can generate all 100K rows in ~4 seconds.

    USE tempdb;
    GO

    IF OBJECT_ID('tempdb.dbo.TestData', 'U') IS NOT NULL
    BEGIN DROP TABLE tempdb.dbo.TestData; END;
    GO

    CREATE TABLE tempdb.dbo.TestData (
    rid int NOT NULL PRIMARY KEY CLUSTERED,
    array_count_5 tinyint NOT NULL,
    word_array_5 varchar(MAX) NOT NULL,
    array_count_6 tinyint NOT NULL,
    word_array_6 varchar(MAX) NOT NULL
    );
    GO

    WITH
    cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),-- 10
    cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),-- 100
    cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n1 b),-- 10,000
    cte_Tally (n) AS (
    SELECT TOP (100000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM
    cte_n3 a CROSS JOIN cte_n3 b-- 100,000,000
    )
    INSERT tempdb.dbo.TestData (rid, array_count_5, word_array_5, array_count_6, word_array_6)
    SELECT
    rid = t.n,
    al.array_count_5,
    wa5.word_array_5,
    al.array_count_6,
    wa6.word_array_6
    FROM
    cte_Tally t
    CROSS APPLY ( VALUES (
    ABS(CHECKSUM(NEWID())) % 40 + 10,
    ABS(CHECKSUM(NEWID())) % 40 + 10
    ) ) al (array_count_5, array_count_6)
    CROSS APPLY (
    SELECT
    word_array_5 = STRING_AGG(w.word, ', ')
    FROM
    (
    SELECT TOP (al.array_count_5)
    ROW_NUMBER() OVER (ORDER BY t1.n)
    FROM
    ( VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1) ) t1 (n)
    CROSS APPLY ( VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1) ) t2 (n)
    ) t (n)
    CROSS APPLY (
    SELECT
    word = STRING_AGG(ltr.letter, '') --WITHIN GROUP (ORDER BY t1.n)
    FROM
    ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(4),(15),(16) ) t3 (n)
    CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 96 + (t.n / t.n))) ) ltr (letter)
    WHERE
    t3.n < 5--<<<<<<
    ) w
    ) wa5

    CROSS APPLY (
    SELECT
    word_array_6 = STRING_AGG(w.word, ', ')
    FROM
    (
    SELECT TOP (al.array_count_6)
    ROW_NUMBER() OVER (ORDER BY t1.n)
    FROM
    ( VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1) ) t1 (n)
    CROSS APPLY ( VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1) ) t2 (n)
    ) t (n)
    CROSS APPLY (
    SELECT
    word = STRING_AGG(ltr.letter, '') --WITHIN GROUP (ORDER BY t1.n)
    FROM
    ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(4),(15),(16) ) t3 (n)
    CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 96 + (t.n / t.n))) ) ltr (letter)
    WHERE
    t3.n < 6--<<<<<<
    ) w
    ) wa6;
    GO

    -- DROP TABLE tempdb.dbo.TestData;

     

    • This reply was modified 4 years, 3 months ago by  Jason A. Long.
    • This reply was modified 4 years, 3 months ago by  Jason A. Long.
  • Here are my previous scripts converted to iTVFs...

    USE tempdb;
    GO
    SET QUOTED_IDENTIFIER ON;
    GO
    SET ANSI_NULLS ON;
    GO

    CREATE FUNCTION dbo.find_all_match_positions_JL
    /* ===================================================================
    08/18/2020 JL, Created: will identify all of the position/character matches and tell you what the positions are and the matched characters.
    =================================================================== */
    --===== Define I/O parameters
    (
    @array_6 varchar(MAX),
    @array_5 varchar(MAX),
    @min_match_num int
    )
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    WITH
    cte_a6 AS (
    SELECT
    val_6 = CONVERT(char(6), TRIM(ss6.value)),
    t6.n,
    lp6.letter_in_pos
    FROM
    STRING_SPLIT(@array_6, ',') ss6
    CROSS APPLY (VALUES (1),(2),(3),(4),(5),(6) ) t6 (n)
    CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss6.value), t6.n, 1)) ) lp6 (letter_in_pos)
    ),
    cte_a5 AS (
    SELECT
    val_5 = CONVERT(char(5), TRIM(ss5.value)),
    t5.n,
    lp5.letter_in_pos
    FROM
    STRING_SPLIT(@array_5, ',') ss5
    CROSS APPLY (VALUES (1),(2),(3),(4),(5) ) t5 (n)
    CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss5.value), t5.n, 1)) ) lp5 (letter_in_pos)
    )
    SELECT
    a6.val_6,
    a5.val_5,
    char_pos = CONVERT(varchar(20), STRING_AGG(a6.n, ',') WITHIN GROUP (ORDER BY a6.n)),
    match_chars = CONVERT(varchar(20), STRING_AGG(a6.letter_in_pos, '') WITHIN GROUP (ORDER BY a6.n))
    FROM
    cte_a6 a6
    JOIN cte_a5 a5
    ON a6.n = a5.n
    AND a6.letter_in_pos = a5.letter_in_pos
    GROUP BY
    a6.val_6,
    a5.val_5
    HAVING
    COUNT(1) >= @min_match_num;
    GO
    USE tempdb;
    GO
    SET QUOTED_IDENTIFIER ON;
    GO
    SET ANSI_NULLS ON;
    GO

    CREATE FUNCTION dbo.match_string_positions_JL1
    /* ===================================================================
    08/18/2020 JL, Created: First version with the inline tally tables.
    =================================================================== */
    --===== Define I/O parameters
    (
    @array_5 varchar(MAX),
    @array_6 varchar(MAX),
    @search_positions varchar(12)

    )
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    WITH
    cte_a6 AS (
    SELECT
    val_6 = CONVERT(char(6), TRIM(ss6.value)),
    match_on = STRING_AGG(lp6.letter_in_pos, '') WITHIN GROUP (ORDER BY t6.n)
    FROM
    STRING_SPLIT(@array_6, ',') ss6
    CROSS APPLY (VALUES (1),(2),(3),(4),(5),(6) ) t6 (n)
    CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss6.value), t6.n, 1)) ) lp6 (letter_in_pos)
    WHERE
    t6.n IN (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp)
    GROUP BY
    ss6.value
    ),
    cte_a5 AS (
    SELECT
    val_5 = CONVERT(char(5), TRIM(ss5.value)),
    match_on = STRING_AGG(lp5.letter_in_pos, '') WITHIN GROUP (ORDER BY t5.n)
    FROM
    STRING_SPLIT(@array_5, ',') ss5
    CROSS APPLY (VALUES (1),(2),(3),(4),(5) ) t5 (n)
    CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss5.value), t5.n, 1)) ) lp5 (letter_in_pos)
    WHERE
    t5.n IN (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp)
    GROUP BY
    ss5.value
    )
    SELECT
    a6.val_6,
    --a6.match_on,
    a5.val_5
    --a5.match_on
    FROM
    cte_a6 a6
    JOIN cte_a5 a5
    ON a6.match_on = a5.match_on;
    GO
    USE tempdb;
    GO
    SET QUOTED_IDENTIFIER ON;
    GO
    SET ANSI_NULLS ON;
    GO

    CREATE FUNCTION dbo.match_string_positions_JL2
    /* ===================================================================
    08/18/2020 JL, Created: Second version without the inline tally tables.
    =================================================================== */
    --===== Define I/O parameters
    (
    @array_5 varchar(MAX),
    @array_6 varchar(MAX),
    @search_positions varchar(12)

    )
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    WITH
    cte_a6 AS (
    SELECT
    val_6 = CONVERT(varchar(20), TRIM(ss6.value)),
    match_on = STRING_AGG(lp6.letter_in_pos, '') WITHIN GROUP (ORDER BY sp6.n)
    FROM
    STRING_SPLIT(@array_6, ',') ss6
    CROSS APPLY (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp) sp6 (n)
    CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss6.value), sp6.n, 1)) ) lp6 (letter_in_pos)
    GROUP BY
    ss6.value
    ),
    cte_a5 AS (
    SELECT
    val_5 = CONVERT(varchar(20), TRIM(ss5.value)),
    match_on = STRING_AGG(lp5.letter_in_pos, '') WITHIN GROUP (ORDER BY sp5.n)
    FROM
    STRING_SPLIT(@array_5, ',') ss5
    CROSS APPLY (SELECT TRY_CONVERT(int, sp.value) FROM STRING_SPLIT(@search_positions, ',') sp) sp5 (n)
    CROSS APPLY ( VALUES (SUBSTRING(TRIM(ss5.value), sp5.n, 1)) ) lp5 (letter_in_pos)
    GROUP BY
    ss5.value
    )
    SELECT
    a6.val_6,
    --a6.match_on,
    a5.val_5
    --a5.match_on
    FROM
    cte_a6 a6
    INNER MERGE JOIN cte_a5 a5
    ON a6.match_on = a5.match_on;
    GO

    Testing to follow...

     

     

  • Just to get an idea of what patterns exist in the TestData table...

    SELECT 
    f.char_pos,
    COUNT(1)
    FROM
    dbo.TestData td
    CROSS APPLY dbo.find_all_match_positions_JL(td.word_array_6, td.word_array_5, 2) f
    GROUP BY
    f.char_pos
    ORDER BY
    COUNT(1) DESC;

    Results...

    /*
    char_posoccurance_count
    2,3119002
    2,4115004
    3,4112494
    1,3112334
    3,5112289
    1,4111446
    1,5111083
    1,2111015
    2,5110999
    4,5106304
    1,2,45374
    2,3,54660
    1,2,54635
    1,2,34589
    1,4,54449
    2,4,54443
    2,3,44441
    3,4,54151
    1,3,44124
    1,3,53140
    1,3,4,5329
    1,2,4,5256
    1,2,3,4232
    2,3,4,5213
    1,2,3,5119
    */
  • Using my standard test harness... (query options: Discard results after execution and use DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS; between executions)

    SET NOCOUNT ON;
    GO
    -- DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
    -- EXEC master.dbo.sp_DBCC_DropCleanBuffers_FreeProcCache;
    PRINT(CONCAT(CHAR(13), CHAR(10), N' wait a moment...', CHAR(13), CHAR(10)));
    WAITFOR DELAY '00:00:01';
    GO
    -- SET STATISTICS XML ON;
    GO
    DECLARE @_clock_start DATETIME2(7) = SYSDATETIME(), @_test_name NVARCHAR(70) = N'match_string_positions_JL1';
    PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(CONCAT(N' Start Time: ', @_clock_start, N' Test Name: ', ISNULL(NULLIF(@_test_name, N' '), N'Test Query 1'),
    REPLICATE(N' ', 100)), 148), N'¦', CHAR(13), CHAR(10), N'+', REPLICATE(N'-', 148), N'+')); SET STATISTICS IO ON;
    -- ____________________________________________________________________________________________________________________________________________
    -- ?????? place tsql here ????????????????????????????????????????????????????????????

    SELECT
    td.rid,
    td.array_count_5,
    --td.word_array_5,
    td.array_count_6,
    --td.word_array_6,
    f.val_6,
    f.val_5
    FROM
    dbo.TestData td
    CROSS APPLY dbo.match_string_positions_JL1(td.word_array_6, td.word_array_5, '1,2,4') f;

    -- ?????? place tsql here ????????????????????????????????????????????????????????????
    -- ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
    DECLARE @_clock_stop DATETIME2(7) = SYSDATETIME(); SET STATISTICS IO OFF; PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(STUFF(CONCAT(N' Finish Time: ', @_clock_stop, N' Duration: ', DATEDIFF(mcs, @_clock_start, @_clock_stop)
    / 1000000.0, N' secs. ', DATEDIFF(mcs, @_clock_start, @_clock_stop) / 1000.0, N' ms.', REPLICATE(N' ', 100)), 76, 3, N''), 148), N'¦
    +', REPLICATE(N'-', 148), N'+'));
    GO
    -- SET STATISTICS XML OFF;
    GO
    -- DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
    -- EXEC master.dbo.sp_DBCC_DropCleanBuffers_FreeProcCache;
    PRINT(CONCAT(CHAR(13), CHAR(10), N' wait a moment...', CHAR(13), CHAR(10)));
    WAITFOR DELAY '00:00:01';
    GO
    -- SET STATISTICS XML ON;
    GO
    DECLARE @_clock_start DATETIME2(7) = SYSDATETIME(), @_test_name NVARCHAR(70) = N'match_string_positions_JL2';
    PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(CONCAT(N' Start Time: ', @_clock_start, N' Test Name: ', ISNULL(NULLIF(@_test_name, N' '), N'Test Query 2'),
    REPLICATE(N' ', 100)), 148), N'¦', CHAR(13), CHAR(10), N'+', REPLICATE(N'-', 148), N'+')); SET STATISTICS IO ON;
    -- ____________________________________________________________________________________________________________________________________________
    -- ?????? place tsql here ????????????????????????????????????????????????????????????

    SELECT
    td.rid,
    td.array_count_5,
    --td.word_array_5,
    td.array_count_6,
    --td.word_array_6,
    f.val_6,
    f.val_5
    FROM
    dbo.TestData td
    CROSS APPLY dbo.match_string_positions_JL2(td.word_array_6, td.word_array_5, '1,2,4') f;

    -- ?????? place tsql here ????????????????????????????????????????????????????????????
    -- ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
    DECLARE @_clock_stop DATETIME2(7) = SYSDATETIME(); SET STATISTICS IO OFF; PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(STUFF(CONCAT(N' Finish Time: ', @_clock_stop, N' Duration: ', DATEDIFF(mcs, @_clock_start, @_clock_stop)
    / 1000000.0, N' secs. ', DATEDIFF(mcs, @_clock_start, @_clock_stop) / 1000.0, N' ms.', REPLICATE(N' ', 100)), 76, 3, N''), 148), N'¦
    +', REPLICATE(N'-', 148), N'+'));
    GO
    -- SET STATISTICS XML OFF;
    GO
    -- DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
    -- EXEC master.dbo.sp_DBCC_DropCleanBuffers_FreeProcCache;
    PRINT(CONCAT(CHAR(13), CHAR(10), N' wait a moment...', CHAR(13), CHAR(10)));
    WAITFOR DELAY '00:00:01';
    GO
    -- SET STATISTICS XML ON;
    GO
    DECLARE @_clock_start DATETIME2(7) = SYSDATETIME(), @_test_name NVARCHAR(70) = N'MatchStringPositions_JM';
    PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(CONCAT(N' Start Time: ', @_clock_start, N' Test Name: ', ISNULL(NULLIF(@_test_name, N' '), N'Test Query 3'),
    REPLICATE(N' ', 100)), 148), N'¦', CHAR(13), CHAR(10), N'+', REPLICATE(N'-', 148), N'+')); SET STATISTICS IO ON;
    -- ____________________________________________________________________________________________________________________________________________
    -- ?????? place tsql here ????????????????????????????????????????????????????????????

    SELECT
    td.rid,
    td.array_count_5,
    --td.word_array_5,
    td.array_count_6,
    --td.word_array_6,
    f.*
    FROM
    dbo.TestData td
    CROSS APPLY dbo.MatchStringPositions_JM(td.word_array_6, td.word_array_5, ', ', '1,2,4') f


    -- ?????? place tsql here ????????????????????????????????????????????????????????????
    -- ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
    DECLARE @_clock_stop DATETIME2(7) = SYSDATETIME(); SET STATISTICS IO OFF; PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(STUFF(CONCAT(N' Finish Time: ', @_clock_stop, N' Duration: ', DATEDIFF(mcs, @_clock_start, @_clock_stop)
    / 1000000.0, N' secs. ', DATEDIFF(mcs, @_clock_start, @_clock_stop) / 1000.0, N' ms.', REPLICATE(N' ', 100)), 76, 3, N''), 148), N'¦
    +', REPLICATE(N'-', 148), N'+'));
    GO
    -- SET STATISTICS XML OFF;
    GO

    Results...

            wait a moment...
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦ Start Time: 2020-08-18 15:19:44.5871630 Test Name: match_string_positions_JL1 ¦
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    Table 'Worktable'. Scan count 100, logical reads 1376, physical reads 0, read-ahead reads 6422, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'TestData'. Scan count 1, logical reads 5876, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦ Finish Time: 2020-08-18 15:19:51.5247878 Duration: 6.937624 secs. 6937.624000 ms. ¦
    +----------------------------------------------------------------------------------------------------------------------------------------------------+

    wait a moment...
    Warning: The join order has been enforced because a local join hint is used.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦ Start Time: 2020-08-18 15:19:53.6341631 Test Name: match_string_positions_JL2 ¦
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    Table 'Worktable'. Scan count 100, logical reads 1382, physical reads 0, read-ahead reads 6422, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'TestData'. Scan count 1, logical reads 5876, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦ Finish Time: 2020-08-18 15:19:59.9624268 Duration: 6.328263 secs. 6328.263000 ms. ¦
    +----------------------------------------------------------------------------------------------------------------------------------------------------+

    wait a moment...
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦ Start Time: 2020-08-18 15:20:02.1349730 Test Name: MatchStringPositions_JM ¦
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    Table 'Worktable'. Scan count 1640, logical reads 296452, physical reads 0, read-ahead reads 6422, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'TestData'. Scan count 1, logical reads 5876, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦ Finish Time: 2020-08-18 15:20:08.4787760 Duration: 6.343803 secs. 6343.803000 ms. ¦
    +----------------------------------------------------------------------------------------------------------------------------------------------------+

    Completion time: 2020-08-18T15:20:08.4337812-04:00

    • This reply was modified 4 years, 3 months ago by  Jason A. Long.
    • This reply was modified 4 years, 3 months ago by  Jason A. Long.
  • Steve Collins- I didn't see a "function-ized" version of your code in the thread. If I missed it, point it out and I'll include it in another round of testing.

    • This reply was modified 4 years, 3 months ago by  Jason A. Long.
    • This reply was modified 4 years, 3 months ago by  Jason A. Long.
  • Jeff Moden wrote:

    I've finally had a chance to look at all that more carefully and, man, that has a VERY interesting execution plan.  With a pre-split table, I might be able to do some interesting things as a "fuzzy lookup".  It won't be as good as a Havenstein "Distance Check" but this might even lead to that.

    Like I said, this is one of the reasons I love this community... so many ideas, so little time.

    The best "fuzzy lookup" I've come across to date is the "Double Metaphone" which is like a soundex on steroids.

    When you mention Havenstein "Distance Check", I'm assuming that you mean "Levenshtein Distance"... As far as I know, "Havenstein Distance" is for calculating physical distance on a spherical surface.

     

  • Jason A. Long wrote:

    Jeff Moden wrote:

    I've finally had a chance to look at all that more carefully and, man, that has a VERY interesting execution plan.  With a pre-split table, I might be able to do some interesting things as a "fuzzy lookup".  It won't be as good as a Havenstein "Distance Check" but this might even lead to that.

    Like I said, this is one of the reasons I love this community... so many ideas, so little time.

    The best "fuzzy lookup" I've come across to date is the "Double Metaphone" which is like a soundex on steroids.

    When you mention Havenstein "Distance Check", I'm assuming that you mean "Levenshtein Distance"... As far as I know, "Havenstein Distance" is for calculating physical distance on a spherical surface.

    That's correct.  I certainly did get then names mixed up.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff - I'm looking and comparing my second function to your code to see if I can spot the reason(s) for the difference in execution plans and, in particular, the big difference in scan counts and logical reads... (Still a bit of a mystery)

    I know it's unrelated but I'm curious to know why you are changing the collation to "Latin1_General_BIN" in the final join.

    I ran your code with it and without it and it doesn't appear to be having an impact... at least not with the CE/CL versions my environment is running. That said, I've never know you to do something like this without a compelling reason...

  • Jason A. Long wrote:

    Steve Collins- I didn't see a "function-ized" version of your code in the thread. If I missed it, point it out and I'll include it in another round of testing.

    Why is this based on functions and not procedures?  For my part I'd like to switch to dynamic SQL because I think it can be super fast 🙂

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Steve Collins wrote:

    Jason A. Long wrote:

    Steve Collins- I didn't see a "function-ized" version of your code in the thread. If I missed it, point it out and I'll include it in another round of testing.

    Why is this based on functions and not procedures?  For my part I'd like to switch to dynamic SQL because I think it can be super fast 🙂

    I posted my 100K row data generator (post #3781346) and test harness (post #3781394). The time to beat is in the 6.3 second neighborhood...

Viewing 11 posts - 31 through 40 (of 40 total)

You must be logged in to reply to this topic. Login to reply