Parallelising a Scalar Function Call within SSMS

Question

Parallelising a Scalar Function Call within SSMS

AndrewJacksonZA

SSCarpal Tunnel

Points: 4143
More actions
April 11, 2019 at 7:36 am

#3115063
Hello
Scenario:
I have a script that is the following:
```
INSERT INTO
    StagingTable2 (LotsOfColumns,
                   FunctionResult
                  )
SELECT
    s1.LotsOfColumns,
    MyScalarFuntion(s1.Col1, s1.Col2) AS FunctionResult
FROM
    StagingTable1 AS s1
```
That takes very long to run as it runs in serial. The function only uses those two columns, nothing from any other table. No rows are dependant on any other rows. The way that I parallelised this was to add a column to StagingTable1 with the first character of a string, open up 26 tabs in SSMS, and run the following:
```
INSERT INTO
    StagingTable2 (LotsOfColumns,
                   FunctionResult
                  )
SELECT
    s1.LotsOfColumns,
    MyScalarFuntion(s1.Col1, s1.Col2) AS FunctionResult
FROM
    StagingTable1 AS s1 WITH (NOLOCK)
WHERE
    s1.FirstCharOfString = 'A'
```
This cut the time down drastically, as expected.
Question:
How can I parallelise this function call within SSMS without opening up multiple tabs and calling each letter individually, or using something like SSIS or Alteryx to do what is effectively the same thing? This would be *SO* useful for investigations on large-for-us data sets like I am currently doing..
Thank you
Andrew

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Eirikur Eiriksson SSC Guru Points: 182912 More actions · Answer 1

Quick thought, the best way would be to inline the function's code in the query. This would enable the server to use a parallel execution plan and also remove the execution overhead of the scalar function.

😎

Can you post the full code of the function?

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 2

Unless you're planning to move to the most recent release of Sql Server, you cannot. A good alternative is to rework the UDF as an inline table-valued function. Jeff Moden has an article on this if I can find it...

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

AndrewJacksonZA SSCarpal Tunnel Points: 4143 More actions · Answer 3

Thank you for your replies. I'm taking over from a contractor who put this Levenshtein matching function in place https://stackoverflow.com/questions/36492073/sql-full-text-search-result-priority/36613504#36613504 (literally this function, he copied the code and renamed the two functions to fit into our naming schema.)

If SQL Server has built-in better and faster matching functions, I'm also open to hearing it please.

This reply was modified 5 years, 8 months ago by AndrewJacksonZA.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

It doesn't but there are a few TSQL equivalents around. I've written a Levenshtein workalike which is quite fast, but I'm having some trouble finding it!

Edit - found it: https://www.sqlservercentral.com/forums/topic/how-to-use-scalar-function-without-while

This reply was modified 5 years, 8 months ago by ChrisM@Work.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Eirikur Eiriksson SSC Guru Points: 182912 More actions · Answer 5

ChrisM@Work wrote:

It doesn't but there are a few TSQL equivalents around. I've written a Levenshtein workalike which is quite fast, but I'm having some trouble finding it! Edit - found it: https://www.sqlservercentral.com/forums/topic/how-to-use-scalar-function-without-while

I remember that one, did one about year earlier but at work but not certain that I posted it on SSC

😎

Will dig around and see if I can find it

Eirikur Eiriksson SSC Guru Points: 182912 More actions · Answer 6

Quick update, this is the most efficient code I've come up with so far, beats the bashes out of anything I've seen so far

😎

DECLARE @STRING01   NVARCHAR(100)   = N'levenshtein distance sql'--N'giff' -- ;
DECLARE @STRING02   NVARCHAR(100)   = N'levenshtein sql server' -- N'sitting' -- N'gxaff' --;

;WITH T(N) AS (SELECT X.N FROM ( VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
,MAX_STRING_LEN (MXSTRLEN) AS
(
    SELECT
        CASE WHEN LEN(@STRING01)>= LEN(@STRING02) THEN LEN(@STRING01)
        ELSE LEN(@STRING02)
    END AS MXSTRLEN
)
,NUMS(N) AS 
(
    SELECT TOP((SELECT MSL.MXSTRLEN FROM MAX_STRING_LEN MSL)) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
    FROM T T1 CROSS JOIN T T2
)
,STRING_SET_SIMPLE AS 
(
    SELECT
        NM.N
       ,SUBSTRING(@STRING01,NM.N,1) AS CHAR01
       ,SUBSTRING(@STRING02,NM.N,1) AS CHAR02
    FROM NUMS NM
)
,COMPARE_STRINGS AS
(
    SELECT
        SSS.N 
       ,SSS.CHAR01
       ,SSS.CHAR02
       ,NULLIF((CHARINDEX(SUBSTRING(@STRING01,SSS.N,1),@STRING02,SSS.N) - SSS.N),SSS.N) AS CMP01
       ,NULLIF((CHARINDEX(SUBSTRING(@STRING02,SSS.N,1),@STRING01,SSS.N) - SSS.N),SSS.N) AS CMP02
    FROM    STRING_SET_SIMPLE   SSS
)
SELECT
    (SUM(CASE
        WHEN CS.CMP01 < 0 THEN 1.0 
        ELSE 0
    END)
    + SUM(CASE
        WHEN CS.CMP02 < 0 THEN 1.0 
        ELSE 0
    END)) / 2.0  AS C2SUM
    ,SUM(CASE
        WHEN CS.CMP01 < 0 THEN 1.0 
        ELSE 0
    END) AS C1SUM
    ,SUM(CASE
        WHEN CS.CMP02 < 0 THEN 1.0 
        ELSE 0
    END)  AS C2SUM
FROM    COMPARE_STRINGS CS;