User Defined Function With CTE

  • Calling for TSQL help

    I need help altering a User Defined Function. WHat I need it to do, is take a string of characters and count the number of specific character appears in a specific location. The specific location changes. When I first arrived it was done by cursor...so I updated it to this

    CREATE FUNCTION [dbo].[ufn_SubDomScore_MC] ( @pInput VARCHAR(1000), @SUBDOMAIN VARCHAR(100), @Grade VARCHAR(5), @Subject VARCHAR(5), @form VARCHAR(5), @Qtype VARCHAR(5), @pSearchChar CHAR(1) )

    RETURNS INT

    BEGIN

    DECLARE @GetItemNoCURSOR

    DECLARE @vCount INT

    -- Get position numbers of item in string to count

    DECLARE @ITEMNO TABLE (

    ITEMNO INT NOT NULL)

    INSERT INTO @ITEMNO

    SELECT

    ITEMNO

    FROM

    WK_SDT_QUESTIONS

    WHERE

    SUBDOMAIN = @SUBDOMAIN

    AND grade = @grade

    AND [subject] = @Subject

    AND form = @form

    AND qtype = @qtype

    DECLARE @COUNT TABLE (VCOUNT VARCHAR(5))

    INSERT @COUNT (VCOUNT)

    SELECT SUBSTRING(@pInput, ITEMNo, 1)

    FROM @ITEMNO

    SELECT @vCount = (SELECT COUNT(*) FROM @COUNT WHERE VCOUNT = '+')

    RETURN @vCount

    END

    Which did increase the speed, however, I think using a CTE will really knock this out of the park. I attempted to do with this code

    CREATE FUNCTION [dbo].[ufn_SubDomScore2_MC] ( @pInput VARCHAR(1000), @SUBDOMAIN VARCHAR(100), @Grade VARCHAR(5), @Subject VARCHAR(5), @form VARCHAR(5), @Qtype VARCHAR(5), @pSearchChar CHAR(1) )

    RETURNS INT

    BEGIN

    DECLARE @vCount INT

    SET @vCount = 0;

    DECLARE @N VARCHAR(5);

    WITH GETITEMNO (ITEMNO) AS

    (SELECT

    CAST(ITEMNO AS INT)

    FROM

    WK_SDT_QUESTIONS

    WHERE

    SUBDOMAIN = @SUBDOMAIN

    AND grade = @grade

    AND [subject] = @Subject

    AND form = @form

    AND qtype = @qtype

    )

    SELECT @N = SUBSTRING(@pInput, ITEMNO, 1) FROM GETITEMNO

    BEGIN

    IF @N = @pSearchChar

    SET @vCount = @vCount + 1

    END

    RETURN @vCount

    END

    However, it appears to only bringing back the count from the last line of the CTE. Is there a way to do this with a CTE and if so, what am I missing?

    The DDL for the WK_SDT_QUESTION table is as follows:

    CREATE TABLE [dbo].[WK_SDT_QUESTIONS](

    [QUESTION_ID] [bigint] IDENTITY(1,1) NOT NULL,

    [grade] [varchar](5) NULL,

    [subject] [varchar](5) NULL,

    [form] [varchar](5) NULL,

    [qtype] [varchar](5) NULL,

    [ITEMNO] [varchar](10) NULL,

    [DROPPED] [varchar](5) NULL,

    [ORGANIZER] [varchar](50) NULL,

    [SUBDOMAIN] [varchar](50) NULL

    ) ON [User]

    GO

    I attached 100 rows of data for the table.

    Thanks

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • the function body can be replaced with a single select.

    CREATE FUNCTION [dbo].[ufn_SubDomScore_MC] (

    @pInput VARCHAR(1000), @SUBDOMAIN VARCHAR(100),

    @Grade VARCHAR(5), @Subject VARCHAR(5),

    @form VARCHAR(5), @Qtype VARCHAR(5),

    @pSearchChar CHAR(1) )

    RETURNS INT

    as

    begin

    declare @VCOUNT int

    SELECT @VCOUNT = count(*)

    FROM

    WK_SDT_QUESTIONS

    WHERE

    SUBDOMAIN = @SUBDOMAIN

    AND grade = @grade

    AND [subject] = @Subject

    AND form = @form

    AND qtype = @qtype

    and substring( @pInput, ITEMNO, 1 ) = '+'

    return @VCOUNT

    end

  • Antonio

    Thanks, I'll give it a whirl and compare the response time. I'll post back once I get the results

    MD

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • Antonio,

    This saved about 25% of the time. The SP that uses the function was able to produce 7M rows in 4 Hours.

    Thanks

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • Marvin,

    I'm a bit confused and I'm obviously missing something.... I looked at the 100 rows of test data you attached... Without me wading through all the code you posted, what is it that you're actually trying to do (brief but clear description)? You said you created 7M rows... can you describe the process verbally instead of me reading the code you don't want? Thanks.

    P.S. 4 Hours to gen 7M rows is a bit slow... that's why I'm asking for a clearer definition.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff

    What I have is a stored proc that passes a string of "+" and "-" which represents whether a student received a positive score on a particular question. The string is usually 40 characters. Each character in that string can represent a different skill that was assessed, however, they are not all in a row. Say, on a math test, the first character of the string might represent geometry, the second character addition and subtraction, the third might be geometry again. Normally there are around 4 to 20 different skills represented in the string.

    The table that I posted corresponds to the various skills with the itemno field being the exact location in the string that applies to that certian skill. So what the UDF is doing is taking determining which skill each character in the string represents and rather or not the student received a positive or negative score for that question. After all is said it done, the return provides the raw score for each skill to the stored proc to merge with the other information.

    Hope this definition makes since. 4 Hours was fast, as when I first arrived it was accommplished with a cursor which took about 5 to 6 hours to process the same amount of data.

    Hope this explanation helps.

    Thanks

    Marvin

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • Jeff

    Now you made me think, why even use a UDF. I could just parse the string in a CTE and do a join there. That would eliminate the call to the UDF which should speed things up.

    I'll let you know how it turns out.

    Thanks

    Marvin

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • Heh... post the other data from the other table and let's see if we can get it down to a couple of minutes, eh? 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here is the assessment table

    CREATE TABLE [dbo].[LD_ASSESSMENT](

    [CODEOD] [varchar](50) NULL,

    [CODEOGOD] [varchar](50) NULL,

    [CODEPT] [varchar](50) NULL,

    [CODEOGPT] [varchar](50) NULL,

    [YEAR] [varchar](50) NULL,

    [GRADE] [varchar](50) NULL,

    [LITHO] [varchar](50) NULL,

    [LNAME] [varchar](50) NULL,

    [FNAME] [varchar](50) NULL,

    [MI] [varchar](50) NULL,

    [STUD_ID] [varchar](50) NULL,

    [IDSOURCE] [varchar](50) NULL,

    [MC_SUO] [varchar](50) NULL,

    [BDATE] [varchar](50) NULL,

    [SEX] [varchar](50) NULL,

    [ETHNIC] [varchar](50) NULL,

    [PARTICIPATE] [varchar](50) NULL,

    [ACADEMIC_YR] [varchar](50) NULL,

    [TEST_TYPE] [varchar](50) NULL,

    [EXC] [varchar](50) NULL,

    [EXC_REASON] [varchar](50) NULL,

    [MIG] [varchar](50) NULL,

    [ELIG] [varchar](50) NULL,

    [BAS] [varchar](50) NULL,

    [LEP] [varchar](50) NULL,

    [LEP_ACCOM] [varchar](50) NULL,

    [LEP_PARA] [varchar](50) NULL,

    [LEP_ENGL] [varchar](50) NULL,

    [LEP_INT] [varchar](50) NULL,

    [LEP_BILNG] [varchar](50) NULL,

    [LEP_CUE] [varchar](50) NULL,

    [LEP_ORAL] [varchar](50) NULL,

    [LEP_ONLINE] [varchar](50) NULL,

    [LEP_TECH] [varchar](50) NULL,

    [LEP_DICT] [varchar](50) NULL,

    [LEP_EXTEND] [varchar](50) NULL,

    [LEP_OTH] [varchar](50) NULL,

    [LEP_PSP] [varchar](50) NULL,

    [LEP_EXIT] [varchar](50) NULL,

    [EXT] [varchar](50) NULL,

    [EXT_BEFORE] [varchar](50) NULL,

    [EXT_DURING] [varchar](50) NULL,

    [EXT_AFTER] [varchar](50) NULL,

    [EXT_INTSES] [varchar](50) NULL,

    [EXT_SUMMER] [varchar](50) NULL,

    [IEP] [varchar](50) NULL,

    [P504] [varchar](50) NULL,

    [DISABLTY] [varchar](50) NULL,

    [ACMOD_U] [varchar](50) NULL,

    [ACMOD_C] [varchar](50) NULL,

    [ACM_PARA] [varchar](50) NULL,

    [ACM_INT] [varchar](50) NULL,

    [ACM_ORAL] [varchar](50) NULL,

    [ACM_CUE] [varchar](50) NULL,

    [ACM_TECH] [varchar](50) NULL,

    [ACM_ONLINE] [varchar](50) NULL,

    [ACM_DIC] [varchar](50) NULL,

    [ACM_AUD] [varchar](50) NULL,

    [ACM_OTH] [varchar](50) NULL,

    [ACM_EXTEND] [varchar](50) NULL,

    [GIFT_COG] [varchar](50) NULL,

    [GIFT_NONCOG] [varchar](50) NULL,

    [GIFT_POOL] [varchar](50) NULL,

    [GIFT_TLT] [varchar](50) NULL,

    [GIFT_SRV] [varchar](50) NULL,

    [LUNCH] [varchar](50) NULL,

    [VOC_SEQ] [varchar](50) NULL,

    [VOC_NOT] [varchar](50) NULL,

    [EXCWP] [varchar](50) NULL,

    [EXCWP_REASON] [varchar](50) NULL,

    [EXCAP] [varchar](50) NULL,

    [EXCAP_REASON] [varchar](50) NULL,

    [ORRS_RD] [varchar](50) NULL,

    [ORRS_MA] [varchar](50) NULL,

    [ORRS_SC] [varchar](50) NULL,

    [ORRS_SS] [varchar](50) NULL,

    [ORRS_AH] [varchar](50) NULL,

    [ORRS_PL] [varchar](50) NULL,

    [ORWS_RD] [varchar](50) NULL,

    [ORWS_MA] [varchar](50) NULL,

    [ORWS_SC] [varchar](50) NULL,

    [ORWS_SS] [varchar](50) NULL,

    [ORWS_AH] [varchar](50) NULL,

    [ORWS_PL] [varchar](50) NULL,

    [MCRS_RD] [varchar](50) NULL,

    [MCRS_MA] [varchar](50) NULL,

    [MCRS_SC] [varchar](50) NULL,

    [MCRS_SS] [varchar](50) NULL,

    [MCRS_AH] [varchar](50) NULL,

    [MCRS_PL] [varchar](50) NULL,

    [MCAT_RD] [varchar](50) NULL,

    [MCAT_MA] [varchar](50) NULL,

    [MCAT_SC] [varchar](50) NULL,

    [MCAT_SS] [varchar](50) NULL,

    [MCAT_AH] [varchar](50) NULL,

    [MCAT_PL] [varchar](50) NULL,

    [SSSE_RDE] [varchar](50) NULL,

    [SSSE_MAE] [varchar](50) NULL,

    [SSSE_SCE] [varchar](50) NULL,

    [SSSE_SSE] [varchar](50) NULL,

    [SSSE_AHE] [varchar](50) NULL,

    [SSSE_PLE] [varchar](50) NULL,

    [SS_RDE] [varchar](50) NULL,

    [SS_MAE] [varchar](50) NULL,

    [SS_SCE] [varchar](50) NULL,

    [SS_SSE] [varchar](50) NULL,

    [SS_AHE] [varchar](50) NULL,

    [SS_PLE] [varchar](50) NULL,

    [SS_OWE] [varchar](50) NULL,

    [TOTRD] [varchar](50) NULL,

    [TOTMA] [varchar](50) NULL,

    [TOTSC] [varchar](50) NULL,

    [TOTSS] [varchar](50) NULL,

    [TOTAH] [varchar](50) NULL,

    [TOTPL] [varchar](50) NULL,

    [OR_TASK] [varchar](50) NULL,

    [OR_WR1] [varchar](50) NULL,

    [WR_SCORE] [varchar](50) NULL,

    [WP_FIN] [varchar](50) NULL,

    [WP_SCORE] [varchar](50) NULL,

    [AP_FIN] [varchar](50) NULL,

    [RDFORM] [varchar](50) NULL,

    [RD_MC] [varchar](50) NULL,

    [RD_OR] [varchar](50) NULL,

    [MAFORM] [varchar](50) NULL,

    [MA_MC] [varchar](50) NULL,

    [MA_OR] [varchar](50) NULL,

    [SCFORM] [varchar](50) NULL,

    [SC_MC] [varchar](50) NULL,

    [SC_OR] [varchar](50) NULL,

    [SSFORM] [varchar](50) NULL,

    [SS_MC] [varchar](50) NULL,

    [SS_OR] [varchar](50) NULL,

    [AHFORM] [varchar](50) NULL,

    [AH_MC] [varchar](50) NULL,

    [AH_OR] [varchar](50) NULL,

    [PLFORM] [varchar](50) NULL,

    [PL_MC] [varchar](50) NULL,

    [PL_OR] [varchar](50) NULL,

    [NRT_RDLG_RSP] [varchar](50) NULL,

    [NRT_MA_RSP] [varchar](50) NULL,

    [EPAS_EN] [varchar](50) NULL,

    [EPAS_MA] [varchar](50) NULL,

    [EPAS_RD] [varchar](50) NULL,

    [EPAS_SC] [varchar](50) NULL,

    [EPAS_COMPOSITE] [varchar](50) NULL,

    [WOFORM] [varchar](50) NULL,

    [WO_MC] [varchar](50) NULL,

    [SSID] [varchar](50) NULL

    ) ON [User]

    And this is the current stored proc

    (the snippet for reading. I do a union all for all of the different assessments

    SELECT

    LEFT(CODEOGOD, 3) AS 'District_Code',

    CASE

    WHEN EXCWP_REASON = '06' THEN 'ALT ASMNT'

    WHEN EXCAP_REASON = '06' THEN 'ALT ASMNT'

    ELSE 'KCCT'

    END 'Test_Description',

    LD_ASSESSMENT.[YEAR] + '-06-30' AS 'Assessment_School_Year_Date',

    'READING ' +

    CAST(LD_ASSESSMENT.[GRADE] AS VARCHAR(2)) + ' ' +

    CASE

    WHEN RDFORM IS NULL THEN ''

    ELSE CAST(RDFORM AS VARCHAR(10))

    END

    +' ' + 'SPRING' AS 'Item_Description',

    LD_ASSESSMENT.[YEAR] + '-04-23' AS 'Test_Date',

    SSID AS 'Student_ID',

    'ORGANIZER ID -' + CAST(b.ORGANIZER_ID AS VARCHAR(10)) AS 'Item_Response_Description',

    NULL AS 'Alpha_Value',

    [dbo].[ufn_Organizer_MC](RD_MC, b.ORGANIZER, b.grade, b.subject, b.form, b.qtype, '+') AS 'Numeric_Value',

    'N' AS 'Level_Of_Aggregation',

    LD_ASSESSMENT.[YEAR] + '-06-30' AS 'School_Year',

    CASE

    WHEN LEN(CODEOGOD) >= 4 THEN RIGHT(CODEOGOD, 3)

    ELSE '000'

    END 'Location_Code',

    NULL AS 'Achievement_Level',

    NULL AS 'Staff_ID'

    FROM LD_SDT

    LEFT OUTER JOIN WK_SDT_ORGANIZER B ON

    LD_ASSESSMENT.GRADE = b.grade

    AND b.FORM = LD_ASSESSMENT.RDFORM

    AND b.Subject = 'RD'

    AND b.qtype = 'MC'

    WHERE LD_ASSESSMENT.GRADE IN (SELECT GRADE FROM WK_ASSESSMENT_GRADE_05 WHERE SUBJECT_CD ='RD')

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • Jeff

    I just did the same with a CTE instead of a call to a UDF. Here is the snippet.

    WITH GETRDMCSCORE (SSID, GRADE, RD_MC_SCORE, organizer) AS

    (SELECT b.SSID, a.GRADE, COUNT(b.RD_MC) AS RD_MC_SCORE, a.ORGANIZER

    FROM

    WK_SDT_QUESTIONS a

    LEFT OUTER JOIN LD_SDT b ON

    a.form = b.rdFORM

    AND a.GRADE = b.GRADE

    WHERE a.subject = 'RD'

    AND a.qtype = 'MC'

    and substring(b.RD_MC, CAST(a.ITEMNO AS INT), 1 ) = '+'

    GROUP BY b.SSID, a.ORGANIZER, a.grade)

    SELECT

    LEFT(CODEOGOD, 3) AS 'District_Code',

    CASE

    WHEN EXCWP_REASON = '06' THEN 'ALT ASMNT'

    WHEN EXCAP_REASON = '06' THEN 'ALT ASMNT'

    ELSE 'KCCT'

    END 'Test_Description',

    LD_ASSESSMENT.[YEAR] + '-06-30' AS 'Assessment_School_Year_Date',

    'READING ' +

    CAST(LD_ASSESSMENT.[GRADE] AS VARCHAR(2)) + ' ' +

    CASE

    WHEN RDFORM IS NULL THEN ''

    ELSE CAST(RDFORM AS VARCHAR(10))

    END

    +' ' + 'SPRING' AS 'Item_Description',

    LD_ASSESSMENT.[YEAR] + '-04-23' AS 'Test_Date',

    LD_ASSESSMENT.SSID AS 'Student_ID',

    'ORGANIZER ID -' + CAST(b.ORGANIZER_ID AS VARCHAR(10)) AS 'Item_Response_Description',

    NULL AS 'Alpha_Value',

    c.RD_MC_SCORE AS 'Numeric_Value',

    'N' AS 'Level_Of_Aggregation',

    LD_ASSESSMENT.[YEAR] + '-06-30' AS 'School_Year',

    CASE

    WHEN LEN(CODEOGOD) >= 4 THEN RIGHT(CODEOGOD, 3)

    ELSE '000'

    END 'Location_Code',

    NULL AS 'Achievement_Level',

    NULL AS 'Staff_ID'

    FROM LD_SDT

    LEFT OUTER JOIN WK_SDT_ORGANIZER B ON

    LD_ASSESSMENT.GRADE = b.grade

    AND b.FORM = LD_ASSESSMENT.RDFORM

    AND b.Subject = 'RD'

    AND b.qtype = 'MC'

    LEFT OUTER JOIN GETRDMCSCORE c ON

    LD_ASSESSMENT.SSID = c.SSID

    AND c.ORGANIZER = b.ORGANIZER

    WHERE LD_ASSESSMENT.GRADE IN (SELECT GRADE FROM WK_ASSESSMENT_GRADE_05 WHERE SUBJECT_CD ='RD')

    The results were outstanding, it returned 543165 rows in a little over 9 Minutes. At that rate, 7M rows should return in less than two hours which is half the time of calling the UDF 7M times.

    Thanks for prompting me to look outside of the UDF

    Marvin

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • FROM LD_SDT

    LEFT OUTER JOIN WK_SDT_ORGANIZER B ON

    LD_ASSESSMENT.GRADE = b.grade

    AND b.FORM = LD_ASSESSMENT.RDFORM

    AND b.Subject = 'RD'

    AND b.qtype = 'MC'

    WHERE LD_ASSESSMENT.GRADE IN (SELECT GRADE FROM WK_ASSESSMENT_GRADE_05 WHERE SUBJECT_CD ='RD')

    ... and therein lies the source of the performance problem... a correlated subquery on the table called LD_ASSESSMENT. The outer query (that you've not included) has a join to LD_ADDESSMENT and you are using it in these inner unioned queries as a correlation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff

    Actually, I was mocking up the query. I guess I missed that one. LD_SDT was changed to LD_ASSESSMENT but I guess my search and replace missed that one. So in reality they are both one in the same.

    Marvin Dillard
    Senior Consultant
    Claraview Inc

  • Marvin,

    So, is the CTE a good enough solution for you?

    Shifting gears a bit... its pretty clever how they put the test question assessment list together so that it works against a string of +/- answers. It's a PITA, but it's still clever, none the less.

    One of the things I'm not quickly seeing in the code, is a filter to ensure you don't pick up any dropped questions.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff

    The CTE is good enough, heck, to go from 6 to 7 hours down to two is a vast improvement. There is a mechanism to ID the dropped questions, but I haven't added it yet but will to the final code I present on Tuesday.

    Thanks for all of your help

    Marvin

    Marvin Dillard
    Senior Consultant
    Claraview Inc

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply