Query Optimization

  • Hello everyone. Please critique the follow query and give me your suggestions on how to optimize it. The query below was written to retrieve summary information for a report. The problem I am having is that instead of summing the balance, the query lists each item to be summed. The PO_BALANCE column alias below is an example of each item being listed. The only way I was able to get the desired results was to write the sub-query PO_AMOUNT. The problem I am having with this sub-query is that it causes the entire query to run extremely slow when executed. I know there is a better way. Please review this query and have a look at the results from the query in the attachment.

    Thanks for your help.

    SELECT

    ROW_NUMBER() OVER(ORDER BY MAX(gltrxdet.journal_ctrl_num)DESC)AS RowNumber,

    journal_type,

    apply_to_num,

    amt_paid_to_date,

    DATEADD(dd,gltrx.date_applied-657072,'1/1/1800')AS Date_Applied,

    po_ctrl_num,

    CONVERT(VARCHAR(10),DATEADD(dd,apvohdr.date_applied-657072,'1/1/1800'),101)AS

    Voucher_Apply_Date,

    LTRIM(RTRIM((SUBSTRING(description,8,50)))) as Vendor,

    [description],

    account_code,

    SUM(balance)AS PO_Balance, /*This doesn't sum*/

    (SELECT SUM(balance) FROM gltrxdet /*This gives me the desired results*/

    WHERE description like + po_ctrl_num +'%'and account_code = '2700000') AS PO_AMOUNT,

    (SELECT SUM(amt_paid_to_date) FROM apvohdr

    WHERE po_ctrl_num = LTRIM(RTRIM((SUBSTRING(description,1,5))))) AS Payment

    FROM apvohdr

    INNER JOIN gltrxdet

    ON po_ctrl_num = LTRIM(RTRIM((SUBSTRING(description,1,5))))

    INNER JOIN gltrx

    ON gltrxdet.journal_ctrl_num = gltrx.journal_ctrl_num

    WHERE po_ctrl_num BETWEEN '00000' AND '99999' and account_code = '2700000' and po_ctrl_num !='2700'

    AND po_ctrl_num = '76429'

    GROUP BY

    [description],

    account_code,

    gltrx.journal_ctrl_num,

    journal_type,

    gltrx.date_applied,

    apvohdr.date_applied,

    amt_paid_to_date,

    po_ctrl_num,

    apply_to_num

  • The simple "sum" function won't work here because you have to group by unique values in order to get the rest of the query to work.

    If you can post the actual execution plan for the query (as an .sqlplan file), we can probably help optimize the sub-query version.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The query execution plan is attached.

    Thanks for you help.

  • Try dropping the sum() / group by.

    Then use outer apply to get the total.

    That should boost it by a heck of a margin.

  • The execution plan tells a bunch of the story of what's going on here. It's scanning the same clustered indexes over and over, and doing a lot of work two or more times.

    Can you provide the table definitions for the three tables it's pulling data from?

    Also, your Where clause that you posted seems to be kind of kludged together, since it has a number of redundant rules on column po_ctrl_num. What do you need it to really do?

    Should be easy enough to simplify this so it queries more effectively, but I need table definitions before I can start on it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The definitions for the tables are attached.

    Thank GSquared.

  • Sorry. Wasn't clear enough. What I meant was I need the create scripts for them. Then I can create copies in a test database, and rewrite the query on those.

    It makes it a lot easier if you provide those. Can sometimes cope without them, but makes answering more difficult and time consuming.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I'm sorry about that. Here's a script with a test db, the tables, test data, and the procedure.

    Thanks.

  • Thanks, that helps.

    I've started re-writing, but I'm struggling with the relationships between these tables.

    There's a connection from spvohdr_all to gltrxdet by the PO control number, which is denormalized in gltrxdet inside the description column. You're using LTRIM, RTRIM, and SUBSTRING on that, but in the sample data you provide, all it needs is LEFT(description, 5). LEFT instead of SUBSTRING can make a simply huge performance difference, because it can take advantage of index seeks if the indexes to support it exist. Will that work, or are there more complexities, like left-padding of some numbers?

    What I need to know is the business definition of the relationship between gltrxdet and gltrx_all.

    Also, you have a sub-query to sum a value from apvohdr_all, which seems to indicate there are multiple rows per PO Control Number. If that's true, then you have to split the query of that table out into its own query, so that you aren't joining every row from a multi-row table to every row from another multi-row table, and then running aggregates on the resulting semi-cartesian join.

    It looks like it's meant to be a dataset of applied payments from apvohdr_all, and some details about transactions from the other two, connected by a PO control number. But the end result isn't really an aggregation, even if it has sums of various columns in it. But it's not clear that it should even be a single dataset.

    Can you clarify the intent of the query a bit?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for the suggestions. The purpose of this procedure is to reconcile expense account records in the General Ledger (gltrx_all/gltrxdet tables) against voucher payments (apvohdr table). The General Ledger consists of the gltrx_all and gltrxdet tables with the gltrx_all table listing the type of journal entry, and the gltrxdet table listing the debit and credit details of the entry. The procedure is supposed to help our Finance department determine if balances need to be paid to any expense account listed in the General Ledger. All of the payments made against journal entries are held in the apvohdr table.

    The relationships between the tables are as follows:

    gltrx_all: this table is the header for a general ledger record in our financial system and this table is joined to the gltrxdet table with the journal_ctrl_num.

    gltrxdet: this table references the header record in the gltrx_all table using the journal_ctrl_num.

    apvohdr: this table didn't have a natural link to either of the general ledger tables so in order for me to join this table with the gltrxdet table, I had to use SUBSTRING to extract the po_ctrl_num from the description column in the gltrxdet table.

    I hope this makes sense, if not I'll try to elaborate more.

    Thanks for the help

  • Try this, see if it does what you need. It seems to work on the sample data you gave me, but that's pretty limited.

    SELECT ROW_NUMBER() OVER (ORDER BY MAX(LedgerDetail.journal_ctrl_num) DESC) AS Row_Number,

    journal_type,

    apply_to_num,

    amt_paid_to_date,

    DATEADD(dd, LedgerHeader.date_applied - 657072, '1/1/1800') AS Date_Applied,

    po_ctrl_num,

    Voucher_Apply_Date,

    LTRIM(RTRIM((SUBSTRING(description, 8, 50)))) AS Vendor,

    account_code,

    SUM(balance) AS PO_Balance,

    amt_paid_to_date AS PaidToDate

    FROM dbo.gltrx_all AS LedgerHeader

    INNER JOIN dbo.gltrxdet AS LedgerDetail

    ON LedgerHeader.journal_ctrl_num = LedgerDetail.journal_ctrl_num

    CROSS APPLY (SELECT apply_to_num,

    po_ctrl_num,

    CONVERT(VARCHAR(10), DATEADD(dd,

    MAX(date_applied)

    - 657072,

    '1/1/1800'), 101) AS Voucher_Apply_Date,

    MAX(amt_paid_to_date) AS amt_paid_to_date

    FROM dbo.apvohdr_all

    WHERE po_ctrl_num = LEFT(description, 5)

    GROUP BY apply_to_num,

    po_ctrl_num) AS VoucherPayments

    WHERE journal_type IN ('po', 'ap', 'adj', 'cr', 'gj', 'pr-m')

    AND po_ctrl_num BETWEEN '00000' AND '99999'

    AND account_code = '2700000'

    AND po_ctrl_num != '2700'

    GROUP BY journal_type,

    VoucherPayments.apply_to_num,

    VoucherPayments.amt_paid_to_date,

    DATEADD(dd, LedgerHeader.date_applied - 657072, '1/1/1800'),

    VoucherPayments.po_ctrl_num,

    Voucher_Apply_Date,

    LTRIM(RTRIM((SUBSTRING(description, 8, 50)))),

    account_code,

    VoucherPayments.amt_paid_to_date ;

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thank you so much GSquared!! The procedure ran less than 1 second when previously it took over a minute to run.

  • Does it get the right results?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/3/2011)


    Does it get the right results?

    Who cares - it ran in one second!!! :w00t:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • It's doing what is supposed to do.

    Thanks again!!

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply