Boost TSQL query code

  • Hello comunity

    I need to run the following TSQL query to return a balance, but this script take a very long time to run, someone have an ideia to optimize :

    select distinct

    ml.conta as 'Integração',

    (select top 1 descricao from pc where pc.conta=ml.conta and pc.ano=2008) as 'Tipo Conta',

    CL.NOME as 'Cliente',

    ml.u_subconta as 'Subconta',

    DEBMES,CRDMES,SLDMES,

    DEBANT,CRDANT,SLDACCANT,

    DEBACUM,CRDACUM,SLDACUM

    from ml (NOLOCK) inner join

    pc (nolock) on pc.conta=ml.conta inner join

    cl (nolock) on ml.u_subconta=cast(cl.no as char) inner join

    (select

    conta,

    u_subConta,

    sum(case when mes >= 0 and mes <=12 then edeb else 0 end) DEBMES,

    sum(case when mes >= 0 and mes <=12 then ecre else 0 end) crdMES,

    sum(case when mes >= 0 and mes <=12 then EDEB-ecre else 0 end) SLDMES,

    sum(case when mes <= 0 then EDEB else 0 end) DEBANT,

    sum(case when mes <= 0 then ecre else 0 end) CRDANT,

    sum(case when mes <= 0 then edeb-ecre else 0 end) SLDACCANT ,

    sum(case when mes <= 12 then EDEB else 0 end) DEBACUM,

    sum(case when mes <= 12 then ecre else 0 end) CRDACUM,

    sum(case when mes <= 12 then edeb-ecre else 0 end) SLDACUM

    from ml where year(ml.data)=2008

    group by conta, u_subConta) MLL on mll.conta+mll.u_subconta = ml.conta+ml.u_subconta

    where ml.mes between 0 and 12 and year(ml.data)=2008 and

    ( ml.conta LIKE '21%' or ml.conta LIKE '269%' )

    UNION ALL

    select distinct

    ml.conta as 'Integração',

    (select top 1 descricao from pc where pc.conta=ml.conta and pc.ano=2008) as 'Tipo Conta',

    ' ' as 'Cliente',

    ' ' AS 'Subconta',

    DEBMES,CRDMES,SLDMES,

    DEBANT,CRDANT,SLDACCANT,

    DEBACUM,CRDACUM,SLDACUM

    from ml (NOLOCK) inner join

    pc (nolock) on pc.conta=ml.conta inner join

    cl (nolock) on ml.u_subconta=cast(cl.no as char) inner join

    (select CONTA,

    sum(case when mes >= 0 and mes <=12 then edeb else 0 end) DEBMES,

    sum(case when mes >= 0 and mes <=12 then ecre else 0 end) crdMES,

    sum(case when mes >= 0 and mes <=12 then EDEB-ecre else 0 end) SLDMES,

    sum(case when mes <= 0 then EDEB else 0 end) DEBANT,

    sum(case when mes <= 0 then ecre else 0 end) CRDANT,

    sum(case when mes <= 0 then edeb-ecre else 0 end) SLDACCANT ,

    sum(case when mes <= 12 then EDEB else 0 end) DEBACUM,

    sum(case when mes <= 12 then ecre else 0 end) CRDACUM,

    sum(case when mes <= 12 then edeb-ecre else 0 end) SLDACUM

    from ml where year(ml.data)=2008 GROUP BY CONTA

    ) MLL on mll.conta= ml.conta

    where ml.mes between 0 and 12 and year(ml.data)=2008 and

    ( ml.conta LIKE '21%' or ml.conta LIKE '269%' )

    order by 1,4

    Many thanks

    Luis Santos

  • Here is my thoughts from a quick scan of your code:

    If the following index does not exist, I would create it:

    create index IDX_Balance on dbo.ml (

    data asc,

    mes asc,

    conta asc

    ) include (

    edeb,

    ecre

    )

    I would then change:

    where

    ml.mes between 0 and 12

    and year(ml.data)=2008

    and (ml.conta LIKE '21%'

    or ml.conta LIKE '269%')

    to:

    where

    ml.mes between 0 and 12

    and ml.data >= dateadd(yyyy,datediff(yyyy,0,ml.data),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,ml.data) + 1,0)

    and (ml.conta LIKE '21%'

    or ml.conta LIKE '269%')

    There is probably more that could be done to optimize your query, but we would need to have the DDL for the table(s) involved, some sample data (preferably made up test data), and what the results of the query should be based on that data.

    This article is a great read on how to ask for help: http://www.sqlservercentral.com/articles/Best+Practices/61537/

    😎

  • Found another WHERE clause that should be changed:

    where year(ml.data)=2008

    to:

    where ml.data >= dateadd(yyyy,datediff(yyyy,0,ml.data),0) and ml.data < dateadd(yyyy,datediff(yyyy,0,ml.data) + 1,0)

    This would use the index I suggested in my previous post as well.

    😎

  • Hello Lynn

    thanks for your reply.

    I test my tsql code with some index that you mencioned, but the maximum performance for this TSQL is about 6,30 minutes. Is too much

    Also, i run the database engine tuning advisor, and i create some indexes, but it´s run also slower.

    Is it possible to create a temporary table with a SQL cursor to generate all the data , do you think that is better or not.

    Also, if you think that is better , could you give some help to create them.

    Many thanks again

    Luis Santos

  • [font="Verdana"]

    select distinct ml.conta as 'Integração'

    ,Max(pc.descricao) As 'Tipo Conta'

    ,CL.NOME as 'Cliente'

    ,ml.u_subconta as 'Subconta'

    ,DEBMES, CRDMES, SLDMES

    ,DEBANT, CRDANT, SLDACCANT

    ,DEBACUM, CRDACUM, SLDACUM

    from ml (NOLOCK) inner join pc (nolock) on pc.conta=ml.conta

    inner join cl (nolock) on ml.u_subconta=cast(cl.no as char)

    inner join (select conta

    ,u_subConta

    ,sum(case when mes >= 0 and mes <=12 then edeb else 0 end) DEBMES

    ,sum(case when mes >= 0 and mes <=12 then ecre else 0 end) crdMES

    ,sum(case when mes >= 0 and mes <=12 then EDEB-ecre else 0 end) SLDMES

    ,sum(case when mes <= 0 then EDEB else 0 end) DEBANT

    ,sum(case when mes <= 0 then ecre else 0 end) CRDANT

    ,sum(case when mes <= 0 then edeb-ecre else 0 end) SLDACCANT

    ,sum(case when mes <= 12 then EDEB else 0 end) DEBACUM

    ,sum(case when mes <= 12 then ecre else 0 end) CRDACUM

    ,sum(case when mes <= 12 then edeb-ecre else 0 end) SLDACUM

    from ml where year(ml.data)=2008

    group by conta, u_subConta

    ) MLL on mll.conta+mll.u_subconta = ml.conta+ml.u_subconta

    where ml.mes between 0 and 12

    and year(ml.data)=2008

    and (ml.conta LIKE '21%' or ml.conta LIKE '269%')

    Group by ml.conta as 'Integração'

    ,CL.NOME as 'Cliente'

    ,ml.u_subconta as 'Subconta'

    ,DEBMES, CRDMES, SLDMES

    ,DEBANT, CRDANT, SLDACCANT

    ,DEBACUM, CRDACUM, SLDACUM

    Add the above mentioned changed to your query as well as the changes suggested by prev poster and try.

    conform on this.

    Mahesh

    [/font]

    MH-09-AM-8694

  • - do you rebuild the indexes on a regular basis ?

    (if you have SQL2005 EE, you can do that online !)

    - Check if you really need the 'distinct' in every select.

    - do you have indexes to support your joins ?

    - do you have indexes to support your where clauses ?

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • luissantos (4/28/2008)


    Hello Lynn

    thanks for your reply.

    I test my tsql code with some index that you mencioned, but the maximum performance for this TSQL is about 6,30 minutes. Is too much

    Also, i run the database engine tuning advisor, and i create some indexes, but it´s run also slower.

    Is it possible to create a temporary table with a SQL cursor to generate all the data , do you think that is better or not.

    Also, if you think that is better , could you give some help to create them.

    Many thanks again

    Luis Santos

    You said you created the indexes but you didn't indicate if you also made the T-SQL changes I suggested. If you didn't, I don't think creating indexes will be of any use. When you have code in a where clause like this

    and year(ml.data)=2008

    , SQL can't make use of the indexes so it usually reverts to table scans.

    As I asked earlier, without the DDL for the tables, some sample data (preferably fake data), and what the results of the query should be based on the sample dta, I can't do much more.

    Please take the time to read the following article, it provides very good info on how best to ask for help and get the results you are looking for:

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    😎

  • Also suggest reviewing this join condition as derived columns cannot use an index.

    Original:

    on mll.conta+mll.u_subconta = ml.conta+ml.u_subconta

    Suggested:

    on mll.conta = ml.conta

    and mll.u_subconta = ml.u_subconta

    SQL = Scarcely Qualifies as a Language

  • I believe every one has hit all the hot spots... Carl's last post should be a big help, as well. The problem is that the derived table in the FROM clauses are the equivalent of aggregated views and because they are being joined to on an aggregated column, the entire derived table must be resolved to see if the join criteria qualifies.

    As well as all the other good suggestions about how to properly write criteria to find a year and still use and index, how to group by, and how to join without adding two columns together, might I suggest you lookup INDEXED VIEWS in Books Online? It seems like this would be the perfect thing to do what you want with some good amount of speed.

    If that's too troublesome, then at least summarize the identical derived tables into a temp table where it can be used more than once without recalculation and can also be indexed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s. For future reference, it's almost always going to be a huge performance drain to try to join to aggregated columns like you have. Preaggregation using a temp table or indexed view is almost always required to recover any semblence of performance when doing such a thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hello again

    Thanks for everybody that have make suggestions to help me. I write my TSQL in another way:

    select distinct

    ml.conta as 'Integração',

    /*(select top 1 descricao from pc where pc.conta=ml.conta and pc.ano >= dateadd(yyyy,datediff(yyyy,0,pc.ano),0)

    and pc.ano < dateadd(yyyy,datediff(yyyy,0,pc.ano) + 1,0)) as 'Tipo Conta',*/

    pc.descricao as 'Tipo conta',

    CL.NOME as 'Cliente',

    ml.u_subconta as 'Subconta',

    DEBMES,CRDMES,SLDMES,

    DEBANT,CRDANT,SLDACCANT,

    DEBACUM,CRDACUM,SLDACUM

    from ml (NOLOCK) inner join

    pc (nolock) on pc.conta=ml.conta inner join

    cl (nolock) on ltrim(rtrim(ml.u_subconta))=cast(cl.no as char) and cl.estab =0 inner join

    (select

    conta,

    u_subConta,

    sum(case when mes >= 0 and mes <=12 then edeb else 0 end) DEBMES,

    sum(case when mes >= 0 and mes <=12 then ecre else 0 end) crdMES,

    sum(case when mes >= 0 and mes <=12 then EDEB-ecre else 0 end) SLDMES,

    sum(case when mes <= 0 then EDEB else 0 end) DEBANT,

    sum(case when mes <= 0 then ecre else 0 end) CRDANT,

    sum(case when mes <= 0 then edeb-ecre else 0 end) SLDACCANT ,

    sum(case when mes <= 12 then EDEB else 0 end) DEBACUM,

    sum(case when mes <= 12 then ecre else 0 end) CRDACUM,

    sum(case when mes <= 12 then edeb-ecre else 0 end) SLDACUM

    from ml where ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20081231') + 1,0)

    group by conta, u_subConta) MLL on mll.conta= ml.conta and mll.u_subconta = ml.u_subconta

    where

    ml.mes between 0 and 12

    and ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20081231') + 1,0)

    and (ml.conta LIKE '21%'

    or ml.conta LIKE '269%')

    UNION ALL

    select distinct

    ml.conta as 'Integração',

    /*(select top 1 descricao from pc where pc.conta=ml.conta and pc.ano >= dateadd(yyyy,datediff(yyyy,0,pc.ano),0)

    and pc.ano < dateadd(yyyy,datediff(yyyy,0,pc.ano) + 1,0)) as 'Tipo Conta',*/

    pc.descricao as 'Tipo Conta',

    ' ' as 'Cliente',

    ' ' AS 'Subconta',

    DEBMES,CRDMES,SLDMES,

    DEBANT,CRDANT,SLDACCANT,

    DEBACUM,CRDACUM,SLDACUM

    from ml (NOLOCK) inner join

    pc (nolock) on pc.conta=ml.conta inner join

    cl (nolock) on ltrim(rtrim(ml.u_subconta))=cast(cl.no as char) inner join

    (select CONTA,

    sum(case when mes >= 0 and mes <=12 then edeb else 0 end) DEBMES,

    sum(case when mes >= 0 and mes <=12 then ecre else 0 end) crdMES,

    sum(case when mes >= 0 and mes <=12 then EDEB-ecre else 0 end) SLDMES,

    sum(case when mes <= 0 then EDEB else 0 end) DEBANT,

    sum(case when mes <= 0 then ecre else 0 end) CRDANT,

    sum(case when mes <= 0 then edeb-ecre else 0 end) SLDACCANT ,

    sum(case when mes <= 12 then EDEB else 0 end) DEBACUM,

    sum(case when mes <= 12 then ecre else 0 end) CRDACUM,

    sum(case when mes <= 12 then edeb-ecre else 0 end) SLDACUM

    from ml where ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0) GROUP BY CONTA

    ) MLL on mll.conta= ml.conta

    where

    ml.mes between 0 and 12

    and ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    and (ml.conta LIKE '21%'

    or ml.conta LIKE '269%')

    order by 1,4

    It´s run on 2,45 seconds.

    Also, i have a question :

    when i write ml.year=2008, it´s because "2008" is a variable that iám introduced before running the script.

    Do i must write the extended date like

    and ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    an also, i' am not understand very well the result of :

    ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0) and ml.data< dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    Do i must on my variable declare the entire date , for example:

    Inicial date : 01012008

    Final date : 31122008

    Many thanks again for all of us

    Luis Santos

  • luissantos (4/30/2008)


    an also, i' am not understand very well the result of :

    ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0) and ml.data< dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    Lynn... I'm thinking that one is all yours 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/30/2008)


    luissantos (4/30/2008)


    an also, i' am not understand very well the result of :

    ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0) and ml.data< dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    Lynn... I'm thinking that one is all yours 😀

    Geez, thanks. I learned this date stuff from you, you know. I'll do my best, but jump in if I start going off in the wrong direction.

    😎

  • Also, i have a question :

    when i write ml.year=2008, it´s because "2008" is a variable that iám introduced before running the script.

    Do i must write the extended date like

    and ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    an also, i' am not understand very well the result of :

    ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0) and ml.data< dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    Do i must on my variable declare the entire date , for example:

    Inicial date : 01012008

    Final date : 31122008

    Luis,

    First, have you read the article I asked you to read yet? Real important, it will give you excellent guidance on how to ask for help and get the results you want. With that, I see that we now have a little more information as shown above. Based on your original post, it was not know how the 2008 was passed to the query. This leads to another question, is this provided by the user as just the year, or is a complete date provided?

    I will go on the assumpation (bad as that my be) that this is provided only as the year. Before I go on, however, I will show you some date math I learned from Jeff and hopefully some background on why it works (at this time, any way).

    If you do the following query in SSMS:

    select cast(0 as datetime) -- 0 is a zero

    you will get:

    1900-01-01 00:00:00.000

    as a return value.

    Now some simple date math.

    To remove the time portion (i.e. make it 00:00:00.000) from getdate(), try the following:

    select getdate(), dateadd(dd,datediff(dd,0,getdate()),0)

    you will get:

    2008-04-30 21:39:37.0932008-04-30 00:00:00.000

    This part, datediff(dd,0,getdate()), gives you the number of days between the zero date (1900-01-01) and getdate() (or you can substitute a date-valued column in a query).

    This part, dateadd(dd,datediff(dd,0,getdate()),0), then adds the number of days back to the zero date, returning your current day, with a zerod time.

    To get the first of the year given any date (I'll use getdate()):

    select getdate(), dateadd(yyyy, (datediff(yyyy, 0, getdate())), 0)

    returns:

    2008-04-30 21:45:05.5772008-01-01 00:00:00.000

    The datediff provides you with the number of years between the zero date and the date provided. This is then added back to the zero date giving you the 1st of the year.

    To get the first of the following year, you simply add one to the value of the datediff function,like this:

    select getdate(), dateadd(yyyy, (datediff(yyyy, 0, getdate())) + 1, 0)

    and you get this:

    2008-04-30 21:48:01.6402009-01-01 00:00:00.000

    This can be done to calculate the beginning date of a month, and so on. I would suggest experimenting with the functions and see what you can learn. If you have questions, just pop on over to SSC, and post a question in a new thread.

    Back to this:

    and ml.data >= dateadd(yyyy,datediff(yyyy,0,'20080101'),0)

    and ml.data < dateadd(yyyy,datediff(yyyy,0,'20080131') + 1,0)

    When querying date ranges, I have found that using BETWEEN, is not always good. A better way to query a date range is to start at the lower bound using >= and to use the date following the date range in question with a = and < operators. When you do this, year(ml.data) = 2008, SQL won't be able to use the index anymore, resulting in a table scan (possibly an index scan, but it would have to scan the entire index).

    If you are passing in the year (as an integer or string (I'd convert a string to an int before using it, just makes it cleaner)). you can modify what I have above slightly to achieve the same result (how it works I will leave for you to figure out as an exercise):

    ...

    declare @DateIn smallint

    set @DateIn = 2008

    ...

    and ml.data >= dateadd(yyyy,@DateIn - 1900, 0)

    and ml.data < dateadd(yyyy,(@DateIn - 1900) + 1, 0)

    Hope this helps, and if you have any questions, ask, I'm sure you will get a response from someone.

    😎

  • Nice job, Lynn!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply