Better way to write this?

  • A Report writer sent this query to me asking to either make it a view that they could JOIN to or create some sort of an ETL process that would load the data (approx. 425K rows) into a table before the daily reporting process begins.

    Currently this query runs in about 37 seconds and have about 150,000 reads (not too, too bad) so I was wondering if anyone has a better way of pulling data for queries that "re-join" back on the same table repeatedly. I've done some index tuning and for the most part I feel this has been optimized "enough"...the main points of contention in the SQL is on the JOINS (or at least appears to be)

    FYI - the reason for the numerous LEFT JOINS back to the same table (different alias) is because that's how the 3rd part application looks back up to find the parent of the account. Fun, fun, fun.

    The question I'm asking is whether or not any of the experts out there have a better way of getting around all these joins? In some queries, these guys are writing 25-table JOINS!

    Here's the result of the STATISTICS IO

    Table 'StsAccounts'. Scan count 8, logical reads 16, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'EAccounts'. Scan count 5, logical reads 4505, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'NAAccounts'. Scan count 10, logical reads 13580, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'BSPrimary'. Scan count 39875, logical reads 130325, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SELECT

    bpc.acctid,

    bpc.fleetnumber,

    nac.ContactNameDBA,

    asamc.StatusDescription,

    asagc.StatusDescription,

    CASE WHEN asagc.priority > asamc.Priority THEN asagc.StatusDescription

    ELSE asamc.StatusDescription

    END,

    CASE WHEN (CASE WHEN asagc.priority > asamc.Priority THEN asagc.StatusDescription

    ELSE asamc.StatusDescription

    END) LIKE 'Active%' THEN 'Active'

    ELSE 'InActive'

    END,

    bpd.acctid,

    bpd.accountname,

    nad.ContactNameDBA,

    asamd.StatusDescription,

    asagd.StatusDescription,

    CASE WHEN asagd.priority > asamd.Priority THEN asagd.StatusDescription

    ELSE asamd.StatusDescription

    END,

    CASE WHEN bpd.acctid IS NULL THEN NULL

    WHEN (CASE WHEN asagd.priority > asamd.Priority THEN asagd.StatusDescription

    ELSE asamd.StatusDescription

    END) LIKE 'Active%' THEN 'Active'

    ELSE 'InActive'

    END,

    bpch.acctId,

    asamch.StatusDescription,

    asagch.StatusDescription,

    CASE WHEN asagch.priority > asamch.Priority THEN asagch.StatusDescription

    ELSE asamch.StatusDescription

    END,

    CASE WHEN (CASE WHEN asagch.priority > asamch.Priority THEN asagch.StatusDescription

    ELSE asamch.StatusDescription

    END) LIKE 'Active%' THEN 'Active'

    ELSE 'InActive'

    END,

    ea.acctid,

    ea.PrimaryAccountNumber,

    ea.AbbrCardNumber,

    asamea.StatusDescription,

    asagea.StatusDescription,

    CASE WHEN asagea.priority > asamea.Priority THEN asagea.StatusDescription

    ELSE asamea.StatusDescription

    END,

    CASE WHEN (CASE WHEN asagea.priority > asamea.Priority THEN asagea.StatusDescription

    ELSE asamea.StatusDescription

    END) LIKE 'Active%' THEN 'Active'

    ELSE 'InActive'

    END

    FROM

    dbo..BSPrimary AS bpc

    LEFT JOIN dbo..BSPrimary AS bpd

    ON bpc.acctId = bpd.parent01AID

    AND bpd.AccountType = '0'

    AND bpd.accountLevel = '1'

    LEFT JOIN dbo..BSPrimary AS bpch

    ON ISNULL(bpd.acctId, bpc.acctId) = bpch.parent01AID

    AND bpch.AccountType = '1'

    AND bpch.AccountLevel = '0'

    LEFT JOIN dbo..NAAccounts AS nac

    ON bpc.acctId = nac.parent02AID

    AND nac.addresstype = '0'

    LEFT JOIN dbo..NAAccounts AS nad

    ON bpd.acctId = nad.parent02AID

    AND nad.addresstype = '0'

    LEFT JOIN dbo.EAccounts AS ea

    ON bpch.acctId = ea.parent01AID

    LEFT JOIN dbo.StsAccounts AS asamc

    ON bpc.ccinhparent125AID = asamc.acctId

    LEFT JOIN dbo.StsAccounts AS asagc

    ON bpc.SystemStatus = asagc.acctId

    LEFT JOIN dbo.StsAccounts AS asamd

    ON bpd.ccinhparent125AID = asamd.acctId

    LEFT JOIN dbo.StsAccounts AS asagd

    ON bpd.SystemStatus = asagd.acctId

    LEFT JOIN dbo.StsAccounts AS asamch

    ON bpch.ccinhparent125AID = asamch.acctId

    LEFT JOIN dbo.StsAccounts AS asagch

    ON bpch.SystemStatus = asagch.acctId

    LEFT JOIN dbo.StsAccounts AS asamea

    ON ea.ccinhparent125AID = asamea.acctId

    LEFT JOIN dbo.StsAccounts AS asagea

    ON ea.SystemStatus = asagea.acctId

    WHERE

    bpc.AccountType = '0'

    AND bpc.AccountLevel = '0'

    I was going to put DDL and sample data but in all honesty, that's just not feasible for this example (it would take me hours to parse out the data/DDL to allow for a working example) :w00t:

    Any suggestions from anyone having experience with these types of crazy queries?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • Only 1 daily use and only 37 secs I wouldn't bother much with it.

    Truncate reporting table, reload the data. With good indexing in place just reuse that loaded table in the rest of the ETLs.

    The only way you can speed this up is to drop the 2nd left join on the big table.

    This can be done by using some sort of or in the join. Obviously this only works if you can always and only get 1 match in the left join and not 2 (the 2 ors).

    Select union all select also works nice because you can tune both queries to the max like with covering indexes.

    But with that many self joins on 2 different tables it could be a hell of a query to write. My guess is that you'd only need to do this one the really big table with 1 gazillion reads.

    In theory you could go from 40K scans to possibly only 3. But I would need to take a lot longer look at this.

  • Yeah that's kind of what I was thinking at one point but was hoping for a "magical" solution that someone might have out there based upon a similar situation 🙂

    It does run fairly fast and has pretty low reads and will run only once a day...

    However, because of the backend DB's we're stuck with and how it has to look up the information from the same table until it "rolls up" to the parent, I've found much worse queries that this one out there.

    Still hoping for some CTE magic or somethin' 😉

    Thanks!

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • I would still like further info on the indexes. Please post the results of this query run from that db (a subset of what I use to analyze index usage/needs):

    SELECT

    GETDATE() AS capture_date,

    DB_NAME(mid.database_id) AS Db_Name,

    OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,

    mid.equality_columns, mid.inequality_columns, mid.included_columns,

    migs.*,

    mid.statement, mid.object_id, mid.index_handle

    FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)

    LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON

    mig.index_handle = mid.index_handle

    LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON

    migs.group_handle = mig.index_group_handle

    WHERE

    1 = 1

    AND mid.database_id = DB_ID()

    AND mid.object_id IN (OBJECT_ID(N'BSPrimary'), OBJECT_ID(N'EAAccounts'), OBJECT_ID(N'NAAccounts'), OBJECT_ID(N'StsAccounts'))

    ORDER BY

    Db_Name, Table_Name, equality_columns, inequality_columns

    SELECT

    ius2.row_num, DB_NAME() AS db_name, o.name AS table_name, i.name AS index_name,

    FILEGROUP_NAME(i.data_space_id) AS filegroup_name,

    (SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days,

    dps.row_count,

    ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads,

    ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,

    ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,

    ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,

    ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update

    FROM sys.indexes i WITH (NOLOCK)

    INNER JOIN sys.objects o WITH (NOLOCK) ON

    o.object_id = i.object_id

    LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON

    dps.object_id = i.object_id AND

    dps.index_id = i.index_id

    LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON

    ius.database_id = DB_ID() AND

    ius.object_id = i.object_id AND

    ius.index_id = i.index_id

    LEFT OUTER JOIN (

    SELECT

    database_id, object_id, MAX(user_scans) AS user_scans,

    ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans

    FROM sys.dm_db_index_usage_stats WITH (NOLOCK)

    WHERE

    database_id = DB_ID()

    --AND index_id > 0

    GROUP BY

    database_id, object_id

    ) AS ius2 ON

    ius2.database_id = DB_ID() AND

    ius2.object_id = i.object_id

    WHERE

    o.name IN ( N'BSPrimary', N'EAAccounts', N'NAAccounts', N'StsAccounts' )

    ORDER BY

    db_name, table_name, CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END, index_name

    Edit: Removed actual production table name from comments.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • To reiterate, the main purpose of me posting this was to see if there was a better way to accomplish these multiple table self joins, not particularly getting it "faster", however, I am open to anything/everything that makes things run smoother! So appreciate anyone willing to take a look at it.

    Scott, the results are attached 🙂

    Edit: forgot to actually "upload" the attachments

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • Don't see anything magical to let you get rid of any of the lookup joins :-).

    There typically won't be, so all you can do is tune the lookups as much as possible.

    The indexes do indeed look very good overall.

    However, this index definitely needs changed:

    NAccounts.idx_ContactNameDBA ::to:: ( parent02AID, addresstype, ContactNameDBA )

    You can review the "Missing" index info from SQL for possible adjustments to the NAccounts indexes, while never taking SQL's lists as more than a guide for a review of course.

    The StsAccounts clustered index very likely should be changed to ::

    ( AcctID [, identity_column if any] )

    if the current clustered index is a unique value, like an identity; the current clustered index then becomes a nonclusted index.

    For BSPrimary, you just need to determine if it's worth building a covering index for this query.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Actual execution plan? I could look at that tonight or over the weekend.

  • Lynn, the actual execution plan is the first first attachment in my original post

    BTW - what is up with SSC today not delivering notifications for posts???

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • I get them just fine.

  • MyDoggieJessie (12/21/2012)


    Lynn, the actual execution plan is the first first attachment in my original post

    BTW - what is up with SSC today not delivering notifications for posts???

    I thought it was just me. I have noticed it being very sporadic the last 2-3 days. Some threads are fine and others I get no emails. And Steve is out for the rest of the year. If there is a bug it is likely going to be around for at least a couple weeks.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Ninja's_RGR'us (12/21/2012)


    I get them just fine.

    I get them, they're just coming in several hours after-the-fact, last night it was after 10pm CST before I got post notifications from 4-5pm earlier.

    Just been noticing the entire site being a tad slower that usual

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (12/21/2012)


    Ninja's_RGR'us (12/21/2012)


    I get them just fine.

    I get them, they're just coming in several hours after-the-fact, last night it was after 10pm CST before I got post notifications from 4-5pm earlier.

    Just been noticing the entire site being a tad slower that usual

    Maybe it is because so many people are out for the holiday already and the system doesn't know what to do when there isn't much traffic. 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply