November 21, 2017 at 10:51 am
Hi everyone,
I am trying to improve query performance. I have encapsulated the following query into a stored procedure and made some structural changes, added indexes, etc. (I inherited a poorly designed DB), but I feel I can do more and am looking for advice. Here is the proc:
IF OBJECT_ID ( 'dbo.MA_GET_LOAN_FUNDED_MARKET', 'P' ) IS NOT NULL
DROP PROCEDURE dbo.MA_GET_LOAN_FUNDED_MARKET;
GO
CREATE PROCEDURE dbo.MA_GET_LOAN_FUNDED_MARKET
@LoanFundedMarket nvarchar(50),
@AppVolMarket nvarchar(50)
AS
BEGIN
SET NOCOUNT ON;
SELECT DISTINCT Cast(currentYear.respondent AS BIGINT) AS respondent,
appvol,
c.institution_name AS Institution_Name,
marketsharebyno,
approved,
denied,
funded,
approvedbyapp,
fundedbyapp,
fundedbyapproved,
currentYear.volfundeddollar,
MarketShareDollar = CASE
WHEN @LoanFundedMarket > 0 THEN
currentYear.volfundeddollar /
@LoanFundedMarket
ELSE 0
END,
GrowthByVolFunded = CASE
WHEN priorYear.volfundeddollar > 0 THEN
(
currentYear.volfundeddollar - priorYear.volfundeddollar ) / priorYear.volfundeddollar
ELSE 0
END,
Cast(currentYear.respondent AS BIGINT) AS respondent,
Rank()
OVER (
ORDER BY appvol DESC) AS ranking
FROM (SELECT Cast(instdtnew_2016.respondent AS BIGINT) AS respondent,
Sum(noofapp) AS appVol,
instdtnew_2016.agency,
Sum(Isnull(approved, 0)) AS approved,
Sum(denied) AS Denied,
Sum(funded) AS Funded,
ApprovedByApp = CASE
WHEN Sum(noofapp) > 0 THEN
Sum(approved) / Sum(Cast(
noofapp AS FLOAT))
ELSE 0
END,
FundedByApp = CASE
WHEN Sum(noofapp) > 0 THEN
Sum(funded) / Sum(Cast(
noofapp AS FLOAT))
ELSE 0
END,
FundedByApproved = CASE
WHEN Sum(approved) > 0 THEN
Sum(funded) / Sum(
Cast(approved AS FLOAT))
ELSE 0
END,
MarketSharebyNo = CASE
WHEN @AppVolMarket > 0 THEN
Sum(noofapp) / @AppVolMarket
ELSE 0
END,
Sum(volfundeddollar) AS VolFundedDollar,
MarketShareDollar = CASE
WHEN @LoanFundedMarket > 0 THEN (
Sum(volfundeddollar) / @LoanFundedMarket )
ELSE 0
END
FROM instdtnew_2016
INNER JOIN ma_respondent_lookup
ON instdtnew_2016.respondent =
ma_respondent_lookup.respondent
AND instdtnew_2016.agency =
ma_respondent_lookup.agency
WHERE (( instdtnew_2016.state IN ( 2, 1, 5, 4,
6, 8, 9, 11,
10, 12, 13, 15,
19, 16, 17, 18,
20, 21, 22, 25,
24, 23, 26, 27,
29, 28, 30, 37,
38, 31, 33, 34,
35, 32, 36, 39,
40, 41, 42, 72,
44, 45, 46, 47,
48, 49, 51, 50,
53, 55, 54, 56 ) ))
GROUP BY instdtnew_2016.respondent,
instdtnew_2016.agency) currentYear
LEFT JOIN (SELECT Cast(instdtnew_2015.respondent AS BIGINT) AS respondent,
instdtnew_2015.agency AS Agency,
Sum(Cast(volfundeddollar AS FLOAT)) AS
VolFundedDollar
FROM instdtnew_2015
WHERE (( instdtnew_2015.state IN ( 2, 1, 5, 4,
6, 8, 9, 11,
10, 12, 13, 15,
19, 16, 17, 18,
20, 21, 22, 25,
24, 23, 26, 27,
29, 28, 30, 37,
38, 31, 33, 34,
35, 32, 36, 39,
40, 41, 42, 72,
44, 45, 46, 47,
48, 49, 51, 50,
53, 55, 54, 56 ) ))
GROUP BY instdtnew_2015.respondent,
instdtnew_2015.agency) priorYear
ON Cast(currentYear.respondent AS BIGINT) = Cast(
priorYear.respondent AS BIGINT)
AND currentYear.agency = priorYear.agency
INNER JOIN (SELECT Cast(a.respondent AS BIGINT) AS respondent,
a.agency,
CASE
WHEN a.cds_id IS NOT NULL
AND b.charter_type = 'Federal'
AND Upper(a.institution_name) NOT LIKE
'%CREDIT UNION%'
AND Upper(a.institution_name) NOT LIKE '%CU'
THEN
a.institution_name + 'FCU'
WHEN a.cds_id IS NOT NULL
AND b.charter_type = 'State'
AND Upper(a.institution_name) NOT LIKE
'%CREDIT UNION%'
AND Upper(a.institution_name) NOT LIKE '%CU'
THEN
a.institution_name + 'CU'
ELSE a.institution_name
END AS institution_name
FROM [dbo].[ma_vw_respondents] a
LEFT OUTER JOIN header b
ON a.cds_id = b.cds_recnum) c
ON Cast(currentYear.respondent AS BIGINT) = Cast(
c.respondent AS BIGINT)
AND c.agency = currentYear.agency
ORDER BY marketsharebyno DESC
END
GO
---------------------------------------------------------
Thoughts, comments, advice appreciated. Anything I might be missing here?
TIA,
PMD
November 22, 2017 at 5:51 am
First things first, let's format your SP in a readable form so that anyone interested in helping can easily read the code.
SELECT DISTINCT Cast(currentYear.respondent AS BIGINT) AS respondent,
appvol,
c.institution_name AS Institution_Name,
marketsharebyno,
approved,
denied,
funded,
approvedbyapp,
fundedbyapp,
fundedbyapproved,
currentYear.volfundeddollar,
MarketShareDollar = CASE
WHEN @LoanFundedMarket > 0 THEN
currentYear.volfundeddollar /
@LoanFundedMarket
ELSE 0
END,
GrowthByVolFunded = CASE
WHEN priorYear.volfundeddollar > 0 THEN
(
currentYear.volfundeddollar -
priorYear.volfundeddollar )
/
priorYear.volfundeddollar
ELSE 0
END,
Cast(currentYear.respondent AS BIGINT) AS respondent,
Rank()
OVER (
ORDER BY appvol DESC ) AS ranking
FROM (SELECT Cast(instdtnew_2016.respondent AS BIGINT) AS respondent,
instdtnew_2016.agency,
Sum(noofapp) AS appVol,
Sum(Isnull(approved, 0)) AS approved,
Sum(denied) AS Denied,
Sum(funded) AS Funded,
ApprovedByApp = CASE
WHEN Sum(noofapp) > 0 THEN
Sum(approved) / Sum(Cast(
noofapp AS FLOAT))
ELSE 0
END,
FundedByApp = CASE
WHEN Sum(noofapp) > 0 THEN
Sum(funded) / Sum(Cast(
noofapp AS FLOAT))
ELSE 0
END,
FundedByApproved = CASE
WHEN Sum(approved) > 0 THEN
Sum(funded) / Sum(
Cast(approved AS FLOAT))
ELSE 0
END,
MarketSharebyNo = CASE
WHEN @AppVolMarket > 0 THEN
Sum(noofapp) / @AppVolMarket
ELSE 0
END,
Sum(volfundeddollar) AS VolFundedDollar,
MarketShareDollar = CASE
WHEN @LoanFundedMarket > 0 THEN (
Sum(volfundeddollar) / @LoanFundedMarket )
ELSE 0
END
FROM instdtnew_2016
INNER JOIN ma_respondent_lookup
ON instdtnew_2016.respondent =
ma_respondent_lookup.respondent
AND instdtnew_2016.agency =
ma_respondent_lookup.agency
WHERE (( instdtnew_2016.state IN ( 2, 1, 5, 4,
6, 8, 9, 11,
10, 12, 13, 15,
19, 16, 17, 18,
20, 21, 22, 25,
24, 23, 26, 27,
29, 28, 30, 37,
38, 31, 33, 34,
35, 32, 36, 39,
40, 41, 42, 72,
44, 45, 46, 47,
48, 49, 51, 50,
53, 55, 54, 56 ) ))
GROUP BY instdtnew_2016.respondent,
instdtnew_2016.agency) currentYear
LEFT JOIN (SELECT Cast(instdtnew_2015.respondent AS BIGINT) AS respondent,
instdtnew_2015.agency AS Agency,
Sum(Cast(volfundeddollar AS FLOAT)) AS VolFundedDollar
FROM instdtnew_2015
WHERE (( instdtnew_2015.state IN ( 2, 1, 5, 4,
6, 8, 9, 11,
10, 12, 13, 15,
19, 16, 17, 18,
20, 21, 22, 25,
24, 23, 26, 27,
29, 28, 30, 37,
38, 31, 33, 34,
35, 32, 36, 39,
40, 41, 42, 72,
44, 45, 46, 47,
48, 49, 51, 50,
53, 55, 54, 56 ) ))
GROUP BY instdtnew_2015.respondent,
instdtnew_2015.agency) priorYear
ON Cast(currentYear.respondent AS BIGINT) = Cast(
priorYear.respondent AS BIGINT)
AND currentYear.agency = priorYear.agency
INNER JOIN (SELECT Cast(a.respondent AS BIGINT) AS respondent,
a.agency,
CASE
WHEN a.cds_id IS NOT NULL
AND b.charter_type = 'Federal'
AND Upper(a.institution_name) NOT LIKE
'%CREDIT UNION%'
AND Upper(a.institution_name) NOT LIKE '%CU'
THEN
a.institution_name + 'FCU'
WHEN a.cds_id IS NOT NULL
AND b.charter_type = 'State'
AND Upper(a.institution_name) NOT LIKE
'%CREDIT UNION%'
AND Upper(a.institution_name) NOT LIKE '%CU'
THEN
a.institution_name + 'CU'
ELSE a.institution_name
END AS institution_name
FROM [dbo].[ma_vw_respondents] a
LEFT JOIN header b
ON a.cds_id = b.cds_recnum) c
ON Cast(currentYear.respondent AS BIGINT) = Cast(
c.respondent AS BIGINT)
AND c.agency = currentYear.agency
ORDER BY marketsharebyno DESC;
November 22, 2017 at 6:23 am
See that DISTINCT? Figure out which join(s) are causing the need for that and fix it. That will almost certainly be the root of your performance problem. If you want deeper help on such a performance problem, please see the second link under "Helpful Links" for what else we need to help here.
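For example, a quick check along these lines (a sketch, using the table names from the posted proc) will show whether the lookup table is one of the row multipliers; if it returns rows, that join is a likely reason the DISTINCT was needed:
SELECT respondent, agency, COUNT(*) AS dup_count
FROM dbo.ma_respondent_lookup
GROUP BY respondent, agency
HAVING COUNT(*) > 1;
The same kind of check against the ma_vw_respondents/header join will tell you whether the "c" derived table is the other culprit.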
--Jeff Moden
Change is inevitable... Change for the better is not.
November 22, 2017 at 6:26 am
It looks like you have manual partitioning (a table per year) and your report is a comparison between these two years, which means you have a fair amount of data to play with.
How much data do you have in these tables, and how much data is returned when you execute this SP?
Here are a few things you can start working on.
1. Did you check which part of the query is taking the most time? Just from looking at your code, you are using CAST in the join predicates, which may be hurting you. You should join the tables without casting, and if you don't have an index on the join columns, I would suggest creating one (see the index sketch at the end of this post).
2. Instead of doing the aggregation and the CASE logic at the same time, it is usually better to pre-aggregate and then apply the CASE expressions, something like this:
SELECT
    A.respondent
    , A.agency
    , Calc.ApprovedByApp
    , Calc.FundedByApp
    , Calc.FundedByApproved
    , Calc.MarketSharebyNo
    , Calc.MarketShareDollar
FROM
(
    SELECT
        FT16.respondent
        , FT16.agency
        , ISNULL(SUM(noofapp), 0)         AS noofapp
        , ISNULL(SUM(approved), 0)        AS approved
        , ISNULL(SUM(denied), 0)          AS denied
        , ISNULL(SUM(funded), 0)          AS funded
        , ISNULL(SUM(volfundeddollar), 0) AS volfundeddollar
    FROM instdtnew_2016 AS FT16
         INNER JOIN ma_respondent_lookup lkp
                 ON FT16.respondent = lkp.respondent
                AND FT16.agency = lkp.agency
    WHERE FT16.state IN ( 2, 1, 5, 4, 6, 8, 9, 11,
                          10, 12, 13, 15, 19, 16, 17, 18,
                          20, 21, 22, 25, 24, 23, 26, 27,
                          29, 28, 30, 37, 38, 31, 33, 34,
                          35, 32, 36, 39, 40, 41, 42, 72,
                          44, 45, 46, 47, 48, 49, 51, 50,
                          53, 55, 54, 56 )
    GROUP BY
        FT16.respondent, FT16.agency
) A
CROSS APPLY
(
    SELECT
        ApprovedByApp     = CASE WHEN A.noofapp > 0 THEN A.approved * 1.0 / A.noofapp ELSE 0 END,
        FundedByApp       = CASE WHEN A.noofapp > 0 THEN A.funded * 1.0 / A.noofapp ELSE 0 END,
        FundedByApproved  = CASE WHEN A.approved > 0 THEN A.funded * 1.0 / A.approved ELSE 0 END,
        MarketSharebyNo   = CASE WHEN @AppVolMarket > 0 THEN A.noofapp * 1.0 / @AppVolMarket ELSE 0 END,
        MarketShareDollar = CASE WHEN @LoanFundedMarket > 0 THEN A.volfundeddollar / @LoanFundedMarket ELSE 0 END
) Calc
(Note: multiply by 1.0 before dividing, otherwise integer division truncates the ratios to 0.)
3. You are also using DISTINCT, which means your data is getting duplicated by one of the joins. Did you check which part of your query is generating the duplicate rows? It looks like the last derived table in your query might be causing it. Try to get rid of those duplicates at the source and remove the DISTINCT from the SELECT.
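Coming back to point 1, an index along these lines is one place to start (a sketch only; the key and INCLUDE columns are my guess from the joins, filter and aggregates in the posted query, so adjust them to the real workload):
CREATE NONCLUSTERED INDEX IX_instdtnew_2016_respondent_agency
    ON dbo.instdtnew_2016 (respondent, agency, state)
    INCLUDE (noofapp, approved, denied, funded, volfundeddollar);

CREATE NONCLUSTERED INDEX IX_ma_respondent_lookup_respondent_agency
    ON dbo.ma_respondent_lookup (respondent, agency);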
November 22, 2017 at 6:40 am
I'd probably move this stuff:
FT16.state IN ( 2, 1, 5, 4, 6, 8, 9, 11,
                10, 12, 13, 15, 19, 16, 17, 18,
                20, 21, 22, 25, 24, 23, 26, 27,
                29, 28, 30, 37, 38, 31, 33, 34,
                35, 32, 36, 39, 40, 41, 42, 72,
                44, 45, 46, 47, 48, 49, 51, 50,
                53, 55, 54, 56 )
into a temp table, load it and do a join on it instead of an IN clause. Test it.
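Something along these lines (a minimal sketch; the temp table name is made up, and each derived table in the proc would then join to it instead of repeating the IN list):
CREATE TABLE #FilterState (state INT NOT NULL PRIMARY KEY);

INSERT INTO #FilterState (state)
VALUES (2), (1), (5), (4), (6), (8), (9), (11), (10), (12), (13), (15), (19), (16),
       (17), (18), (20), (21), (22), (25), (24), (23), (26), (27), (29), (28), (30),
       (37), (38), (31), (33), (34), (35), (32), (36), (39), (40), (41), (42), (72),
       (44), (45), (46), (47), (48), (49), (51), (50), (53), (55), (54), (56);

-- then, inside each derived table:
--   INNER JOIN #FilterState fs ON fs.state = instdtnew_2016.state
-- in place of the WHERE ... state IN ( ... ) list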
Calculations like this on columns in filtering clauses (WHERE, HAVING, ON) are going to absolutely kill performance:
ON Cast(currentYear.respondent AS BIGINT) = Cast(priorYear.respondent AS BIGINT)
If those are different data types, to fix performance, you may need to change the structure.
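If respondent really is numeric data stored in a character column, one structural option looks like this (a sketch and an assumption on my part; it presumes every existing value converts cleanly, and dependent views/procs should be checked first):
-- align the types once so the joins can compare respondent directly, with no per-row CAST
ALTER TABLE dbo.instdtnew_2015 ALTER COLUMN respondent BIGINT NULL; -- match the column's current NULLability
ALTER TABLE dbo.instdtnew_2016 ALTER COLUMN respondent BIGINT NULL;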
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 22, 2017 at 8:14 am
Definitely would help to have the DDL (CREATE TABLE statement) for each of the tables used in the query. Would also help to see the execution plan for the query, uploaded as a .sqlplan file.
It looks like there are quite a few things that could be changed, but we don't have the information needed to really help at the moment.
November 22, 2017 at 8:16 am
Thank you all for your responses. All have been very helpful. And forgive me, I lost my formatting somewhere between copy and paste; I will ensure it's formatted better in the future. I will take the comments into consideration and see if that helps.
PMD
November 22, 2017 at 8:20 am
PrettyMegaDBA - Wednesday, November 22, 2017 8:16 AM
The formatting thing has been a problem ever since they "improved" the forum software. :angry:
--Jeff Moden
Change is inevitable... Change for the better is not.
November 22, 2017 at 8:25 am
Jeff Moden - Wednesday, November 22, 2017 8:20 AM
You can't blame this one on the forum software. The code was not included in the proper SQL Code tags.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
November 22, 2017 at 8:36 am
Thanks, Jeff. And go easy on me, J. Drew; I don't often post scripts for assistance, and as I've mentioned, I inherited this code and DB. Clearly, there are quite a few issues with the code, some DB structure related and some just poor coding. But I knew the experts here could advise. I will try to do better in the future, no need to bash.
PMD
November 22, 2017 at 9:02 am
PrettyMegaDBA - Wednesday, November 22, 2017 8:36 AM
I'm sure he wasn't trying to bash. Jeff pointed out that since the new forum software was implemented, we all have trouble with formatting. However, if the code is not included in the SQL Code tags, all formatting is lost, and that has always happened.
This is something that happens a lot, especially to people who are new; we dislike it, but we understand it. As long as you try to do better next time, we can all remain friendly here.
Also, this is a helpful guide for posting good performance-related questions: How to Post Performance Problems - SQLServerCentral
November 22, 2017 at 9:11 am
Ok, I hope I am doing this correctly. I am providing the execution plan as .xml, and the DDL of the tables and view used in the queries. There are no indexes, no PKs, and no FKs. Before you all go nuts, I've already smacked the hands. They know it's crap, I know it's crap, we know it's crap. But as I mentioned, I've inherited it this way and, unfortunately, it's in production. Here is some additional background info:
1. The data is loaded into a DB yearly from a .csv provided by the gooberment (yes, I spelled it that way, because they're goobers 😉).
2. I already know it is poorly designed and have addressed those issues with the team.
3. The DB will be redesigned at the beginning of the year.
4. In the meantime, the current DB is in production, is not very large, and doesn't have too many users.
Here are my row counts and timings:
--18 secs to return 6761 rows (before encapsulated in proc)
--15 secs to return 6761 rows (after encapsulation)
--13 secs to return 6761 rows (by using bigint instead of float - making ddl change on table)
Thanks again for all of your very helpful suggestions!
November 22, 2017 at 9:13 am
Sorry all, I provided the execution plan as a .sqlplan file, not .xml.
No hard feelings here, and thanks for the tips on how to properly post perf issues.
PMD
November 22, 2017 at 9:19 am
PrettyMegaDBA - Wednesday, November 22, 2017 9:11 AM
Yow!
I feel your pain. I'll try to get a moment to look at the plan, but I'd say that, in addition to the code changes I suggested, you need to focus on clustered indexes first. The plan is probably a bit of a waste of time at the moment with no indexes on the tables; it's going to be all scans.
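Something like this would be a starting point (a sketch; the key columns are my assumption based on the joins and GROUP BY in the posted query, so pick whatever matches the real access patterns and uniqueness):
CREATE CLUSTERED INDEX CIX_instdtnew_2016 ON dbo.instdtnew_2016 (respondent, agency);
CREATE CLUSTERED INDEX CIX_instdtnew_2015 ON dbo.instdtnew_2015 (respondent, agency);
CREATE CLUSTERED INDEX CIX_ma_respondent_lookup ON dbo.ma_respondent_lookup (respondent, agency);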
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 22, 2017 at 9:23 am
Grant Fritchey - Wednesday, November 22, 2017 9:19 AM
Thanks so much!