Stored Procedure Performance

  • cbrammer1219 (6/27/2014)


    So let me ask what is the best approach for setting up a Data mart as it looks like I am going to be the one to do this, and I want to do it right??

    Uh, sorry, but that goes WAY beyond a forum question!!! I have known projects that took MULTIPLE YEARS to construct a reporting system. Even the simplest of "data marts" can easily be several man-months of effort to get right.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I guess I should have asked differently, what are some good resources for reading to do this?

  • You might start with this

    http://sqlblog.com/blogs/stacia_misner/archive/2010/09/13/28742.aspx

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • there are no pk or fk in the tables

    Don't care about pk/fk specifically. What indexes currently exist on these tables? Could you add a computed column(s) and create an index on it(them)?

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Here are the indexes on these tables, so how do I eliminate CONFLICT_EXPLICIT?

  • Do you mean convert_implicit?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • yes..The only thing I am converting is the date, it is of course a string in the text file and when I substring the text file I nvarchar it and then write to the calltrace table as RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(SUBSTRING(calldata,0,7),'-',''),'%',''),'+',''),'/','-'))) + '-' + CONVERT(nvarchar(4),YEAR(GetDate())) because the date in the text file only has 'mmdd' and need to convert that to a date for SSRS datepicker to compare between startdate and enddate.

  • When you do the conversion that is an explicit conversion.

    If you allow SQL Server do the conversion, that is an implicit conversion.

    Common places for that to happen is in joins and predicates.

    So, to fix it, manually convert the data in the joins to the same types. Or change the data type in the tables to match data types.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • cbrammer1219 (6/27/2014)


    It is even messier than that, I created a ETL package that cleaned it up even more, but after importing all of the files I am processing, found there was even more things that needed to be done to the data. There is no data validation on any of the data entry screens because, the users weren't trained properly and didn't know how to move forward in the data entry screens, so Management allowed the data validation to be removed, and now this has caused extremely messy data as you can imagine, so another developer and myself are looking at creating a datamart to do our reporting off, but as you can imagine, they want everything now, but don't understand the nightmare they have created. "Oh forgot to mention this is 2012 SQL."

    All of that seems to mean that you're large-and-incharge of what the tables might look like. I recommend that you create a persisted computed column that removes all the non-digit junk they've handed you in the data and then index that column along with some (possibly) INCLUDEd columns but do include the PK column and make the index UNIQUE for even better performance. You won't have to do all that nasty removal on every criteria that you write after that. And, if the number if digits is less than 7, simply have the persisted computed column contain a NULL.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Looking back at the original query again, you might also want the persisted computed column to return a NULL if the first digit is a "0" or if the first 5 digits are "1's".

    As for the datamart stuff. I find that usually overkill for simple reporting requirements. Some intelligent additions of things like persisted computed columns and some good code are all that you need for such reporting.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • cbrammer1219 (6/28/2014)


    yes..The only thing I am converting is the date, it is of course a string in the text file and when I substring the text file I nvarchar it and then write to the calltrace table as

    RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(SUBSTRING(calldata,0,7),'-',''),'%',''),'+',''),'/','-'))) + '-' + CONVERT(nvarchar(4),YEAR(GetDate()))

    because the date in the text file only has 'mmdd' and need to convert that to a date for SSRS datepicker to compare between startdate and enddate.

    If you insist on storing a date as an NVARCHAR, you'll continue to have that problem and several other problems that haven't yet happened, but will.

    Once again, make a persisted computed column that does all the cleanup and the addition of the current year but have it convert that end result to a DATETIME. After that, your sorting and enduser formatting will become trivial.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Could you provide any sort of example of what you mean, like I said I'm new to SQL development. I have always had a competent DBA that had things setup the right way.

  • cbrammer1219 (6/28/2014)


    Could you provide any sort of example of what you mean, like I said I'm new to SQL development. I have always had a competent DBA that had things setup the right way.

    Ugh! I feel your pain. With the idea of teaching a man to fish, the first step would be to become familiar with what you're working with. The tool to do that is Books Online and, while in SSMS, press the {f1} key to get there. If someone neglected to install it, then an internet search for PERSISTED COMPUTED COLUMNS SQL SERVER would get you there.

    First of all, let's talk a bit about what a PERSISTED COMPUTED COLUMN actually is. By the name, you might have guessed that it contains a calculation and you'd be correct. You can't actually insert data into a computed column. "Normal" computed columns have two problems, though. The first is, you can't index a non-persisted column. The second is that every time you read a row with non-persisted computed columns in them, they all get recalculated and that can make for some wicked slow code kind of like the original code that you're trying to speed up.

    A PERSISTED Computed Column is like a single column mini data wharehouse. The column actually materializes (which allows you to put an index on it) when you insert a new row. If you change what the computed column is based on, then it updates but it doesn't need to recalculate just because you read from it.

    Ok... what forumula to use? Computed Columns require a scalar result from a single expression. We not only want to clean out the characters that you replaced in your code but, because humans are involved, we want to clean out any non-digit characters. We also want to apply some basic rules for what is considered to be a "valid" phone number. While we could probably get it all done in the single scalar expression that a Computed Column must contain, but it's far easier to build it in a function. That would also allow us to easily apply all of the same rules to other columns, as well.

    With that in mind, you should build a function similar to this one or just use this one if it meets your requirements.

    CREATE FUNCTION dbo.PhoneNumberDigitsOnly

    /***************************************************************

    Purpose;

    Strips all non-digit characters and then returns a NULL if any

    of the following are true. Otherwise, returns the cleaned

    phone number consisting of only digits.

    1. Phone number starts with a zero.

    2. Phone mumber starts with 11111.

    3. Phone number is less than 7 digits.

    Note: This is necessarily a Scalar Function to support certain

    Persisted Computed Columns. You'll also find the WHILE

    loop in this particular function pretty tough to beat

    because of the short ciruiting it does.

    Revision History:

    Rev 00 - Jeff Moden - 28 Jun 2014

    ***************************************************************/

    --===== Declare the I/O for this function

    (@pPhoneNumber VARCHAR(30))

    RETURNS VARCHAR(30) WITH SCHEMABINDING AS

    BEGIN

    --===== See if there are ANY non-numeric-digit

    -- characters in the string. Remember the position

    -- of the first one (will be zero if there are none)

    DECLARE @Position INT;

    SELECT @Position = PATINDEX('%[^0-9]%',@pPhoneNumber)

    ;

    --===== Strip any non-numeric-digit characters by

    -- replacing them with nothing. This will short

    -- circuit if there are none.

    WHILE @Position > 0

    SELECT @pPhoneNumber = STUFF(@pPhoneNumber,@Position,1,'')

    ,@Position = PATINDEX('%[^0-9]%',@pPhoneNumber)

    ;

    --===== Apply some other rules that make for bad numbers.

    -- If any of the rules apply, return a NULL. If none

    -- of the rules apply, return the cleaned phone number.

    SELECT @pPhoneNumber =

    CASE

    WHEN @pPhoneNumber LIKE '0%'

    OR @pPhoneNumber LIKE '11111%'

    OR LEN(@pPhoneNumber) < 7

    THEN NULL

    ELSE @pPhoneNumber

    END

    ;

    RETURN @pPhoneNumber;

    END

    ;

    Up next, a demonstration. We first need a table and some data to test on. As is normal for me, details are in the code.

    --===== Create the test table.

    -- Note that this is NOT a part of the solution!

    -- We're just creating a table for demonstration.

    CREATE TABLE #MyHead

    (

    SomeID INT IDENTITY(1,1)

    ,Contact_Phone VARCHAR(30)

    )

    ;

    --===== Now we'll populate the table with some "good"

    -- and some "bad" phone numbers according to the

    -- various rules.

    INSERT INTO #MyHead

    (Contact_Phone)

    SELECT '(123) 456-7890' UNION ALL

    SELECT '0(937) 250-8988' UNION ALL

    SELECT '111118391520' UNION ALL

    SELECT '9375151212' UNION ALL

    SELECT '123456' UNION ALL

    SELECT '(012) 345=6789'

    ;

    --===== This shows what we have right now.

    SELECT * FROM #MyHead

    ;

    If you're not running this code as we're going along here, then you're not bagging what I'm raking. 😛

    And now for the demo of how to add the PERSISTED COMPUTED COLUMN...

    --===== This is how we would add a PERSISTED

    -- COMPUTED COLUMN for each phone number

    -- column in the various tables and this

    -- is what you'd do your phone number joins

    -- and criteria on.

    ALTER TABLE #MyHead

    ADD Cleaned_Contact_Phone AS

    (

    dbo.PhoneNumberDigitsOnly(Contact_Phone)

    ) PERSISTED

    ;

    --===== This is what the table looks like now.

    -- Notice the NULLs in the new column. Those

    -- are for the phone numbers that don't validate

    -- according to the rules you gave.

    SELECT * FROM #MyHead

    ;

    You should add an index to the new computed column(s) to help your queries. I'd do that for you but a single column index probably isn't the right thing to do in this case and I don't know your code nor your data nor your tables well enough to hazard a good guess on that.

    Anyway, once you have that type of thing done to all the columns that need it, your criteria would go from this....

    WHERE RTRIM(REPLACE(REPLACE(REPLACE(REPLACE(a.CONTACT_PHONE, '(', ''), ')', ''), ' ', ''), '-', '')) <> ''

    AND a.Default_list_no = '00'

    AND LEN(REPLACE(REPLACE(REPLACE(REPLACE(a.CONTACT_PHONE, '(', ''), ')', ''), ' ', ''), '-', '')) > 6

    AND REPLACE(REPLACE(REPLACE(REPLACE(a.CONTACT_PHONE, '(', ''), ')', ''), ' ', ''), '-', '') NOT LIKE '0%'

    AND REPLACE(REPLACE(REPLACE(REPLACE(a.CONTACT_PHONE, '(', ''), ')', ''), ' ', ''), '-', '') NOT LIKE '1111%'

    ... to this ...

    WHERE Cleaned_Contact_Phone > '' --NOT NULL and NOT BLANK

    AND a.Default_list_no = '00'

    ... and actually stand a chance of using an index. Because the column is persisted, it also alleviates all of the time that you spend on every query by totally eliminating all of the calculations you had to do for every row in the table.

    Some things to study for in Books Online (affectionately known as "BoL"):

    Scalar Functions (should normally be avoided like the plague but are required here)

    While Loops (should normally be avoided like the plague but works very well in this case especially with the built in short circuit)

    PATINDEX function

    STUFF function

    CASE function

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s. Just to be sure, you would need to change the column names in the code that creates the PERSISTED COMPUTED COLUMN to match whatever column it is that you're working with. Yeah... I know that's pretty obvious but I've seen lots worse happen. Gotta make sure. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is flat out a problem:

    RTRIM(REPLACE(REPLACE(REPLACE(REPLACE(a.CONTACT_PHONE

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 16 through 30 (of 53 total)

You must be logged in to reply to this topic. Login to reply