Need help with tuning a join query

  • I need help with tuning this query. This below takes about 55 minutes to run :

    User_Login_test is a VIEW - the view structure is union to multiple tables.

    SELECT COUNT(DISTINCT ul.SessionID) AS [Number of Logins],

    COUNT(DISTINCT ul.UserID) AS [Unique CANS to Login]

    FROM User_Login_test ul

    INNER JOIN Site_User su ON

    su.SiteID = ul.SiteID

    AND su.UserID = ul.UserID

    AND su.UserTypeID IN (1,2)

    LEFT OUTER JOIN User_Login_AcctLookup j ON

    j.SiteID = ul.SiteID

    AND j.UserID = ul.UserID

    WHERE ul.LoginDate

    BETWEEN '2014-01-11 00:00:00' AND '2015-04-30 23:59:59'

    AND j.Subfirm = '001'

    AND (ul.SessionIdent IS NULL OR ul.SessionIdent IN ('O',' '))

    Any help is appreciated !

    Thanks,

  • Prem-321257 (5/13/2015)


    I need help with tuning this query. This below takes about 55 minutes to run :

    User_Login_test is a VIEW - the view structure is union to multiple tables.

    SELECT COUNT(DISTINCT ul.SessionID) AS [Number of Logins],

    COUNT(DISTINCT ul.UserID) AS [Unique CANS to Login]

    FROM User_Login_test ul

    INNER JOIN Site_User su ON

    su.SiteID = ul.SiteID

    AND su.UserID = ul.UserID

    AND su.UserTypeID IN (1,2)

    LEFT OUTER JOIN User_Login_AcctLookup j ON

    j.SiteID = ul.SiteID

    AND j.UserID = ul.UserID

    WHERE ul.LoginDate

    BETWEEN '2014-01-11 00:00:00' AND '2015-04-30 23:59:59'

    AND j.Subfirm = '001'

    AND (ul.SessionIdent IS NULL OR ul.SessionIdent IN ('O',' '))

    Any help is appreciated !

    Thanks,

    Hi and welcome to SSC. With only the information you posted there is absolutely nothing anybody can do to help. We will need to see the actual execution plan for this query at the very least. For more detailed info we will need to see the table and index definitions for all the tables involved. We also will want to see the definition for your view and those tables.

    Check out this article for details about what to post for performance problems. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • 55 minutes to run? Good grief!

    Well, without seeing the structure of the underlying view (which is likely the cause of quite a few of your performance woes) there are some general suggestions I could make.

    First off, with regards to view design:

    1. You may want to add WITH(NOLOCK) to your FROM statements, if they're not already there and you can tolerate the possibility of a dirty read.

    a. As above, you may want to do this for your queries, too.

    2. Multiple joins (or filters) really are what Views are for, but they can still affect performance. Try checking to see if there's a bad cached plan for the SELECT it uses.

    3. Also try and run the View SELECT in a regular query window with the execution plan on. It might also be worth SET STATISTICS IO ON and SET STATISTICS TIME ON at the beginning (with corresponding OFFs at the end) to give you a bit more insight as to where things are lagging.

    4. It's possible you might be able to add filtered indexes to some of the underlying tables to speed up the View's SELECT which would, in turn, speed up this monster.

    For your query:

    1. Would it be worth trying to break that behemoth up into smaller statements? That's a pretty big date range and a LEFT OUTER is definitely going to slow things down.

    2. While I don't necessarily see anything here, I've definitely gotten some benefits from pre-filtering my data sets in CTEs and replacing some JOINs with CROSS APPLYs, though I'm not sure it'd do you any good here.

    Hope those are some good starting points for you!

    I think for anyone else to help, you're probably going to need to give a bit more information. How many tables are joined in the view? What kind of indexes and keys are on the underyling tables or the view itself? Any other, more specific information you could provide (without giving away too much business-specific stuff) would be helpful.

  • Thanks for the quick response !!

    The problem is with the VIEW in the query. Here is the view structure

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER VIEW [dbo].[login_test] AS

    SELECT * FROM DB_arch.dbo.login_CUR

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M01

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M02

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M3

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M4

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M5

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M6

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M7

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M8

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M9

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M10

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M11

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2007_M12

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M1

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M2

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M3

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M4

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M5

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M6

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M7

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M8

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M9

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M10

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M11

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch.dbo.login_2008_M12

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M1

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M2

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M3

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M4

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M5

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M6

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M7

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M8

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M9

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M10

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M11

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2009.dbo.login_2009_M12

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M1

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M2

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M3

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M4

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M5

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M6

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M7

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M8

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M9

    UNION ALL SELECT *,NULL,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M10

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M11

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2010.dbo.login_2010_M12

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M1

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M2

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M3

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M4

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M5

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M6

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M7

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M8

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M9

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M10

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M11

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2011.dbo.login_2011_M12

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2012.dbo.login_2012_M1

    UNION ALL SELECT *,NULL,NULL FROM DB_arch_2012.dbo.login_2012_M2

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M3

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M4

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M5

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M6

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M7

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M8

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M9

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M10

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M11

    UNION ALL SELECT * FROM DB_arch_2012.dbo.login_2012_M12

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M1

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M2

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M3

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M4

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M5

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M6

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M7

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M8

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M9

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M10

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M11

    UNION ALL SELECT * FROM DB_arch_2013.dbo.login_2013_M12

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M1

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M2

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M3

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M4

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M5

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M6

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M7

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M8

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M9

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M10

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M11

    UNION ALL SELECT * FROM DB_arch_2014.dbo.login_2014_M12

    UNION ALL SELECT * FROM DB_arch_2015.dbo.login_2015_M1

    UNION ALL SELECT * FROM DB_arch_2015.dbo.login_2015_M2

    UNION ALL SELECT * FROM DB_arch_2015.dbo.login_2015_M3

    UNION ALL SELECT * FROM DB_arch_2015.dbo.login_2015_M4

    GO

  • First off, with regards to view design:

    1. You may want to add WITH(NOLOCK) to your FROM statements, if they're not already there and you can tolerate the possibility of a dirty read.

    a. As above, you may want to do this for your queries, too.

    That is absolutely awful advice. The NOLOCK hint is NOT a magic go faster button. It has its place but it needs to be considered carefully and a full understanding of what that hint really does is critical. Dirty reads are only 1 possibility and even those are largely misunderstood.

    There are plenty of other articles on this topic available but this one does a nice job of summarizing all the issues. http://blogs.sqlsentry.com/aaronbertrand/bad-habits-nolock-everywhere/[/url]

    My guess is we will find some missing indexes, stale statistics, nonSARGable predicates etc. This should all happen long before we start throwing around NOLOCKS.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • 1) The LEFT OUTER JOIN is functioning like an INNER JOIN because of the WHERE condition on a column from that table. For better performance, move the WHERE condition into the join. Thus:

    ...

    INNER JOIN User_Login_AcctLookup j ON --LEFT OUTER JOIN if that is really needed

    j.SiteID = ul.SiteID

    AND j.UserID = ul.UserID

    AND j.Subfirm = '001'

    ...

    WHERE ul.LoginDate >= '20140111'

    AND ul.LoginDate < '20150501'

    AND (ul.SessionIdent IS NULL OR ul.SessionIdent IN ('O',' ')) --proper coding here, well done!

    ...

    2) As always, best table clustering is the key to performance. Based solely on this one query (which naturally would need further investigation/confirmation), the tables should be clustered as below. If they are instead clustered by an identity, I can almost guarantee that is not the best clustering. If it's by another column, then the existing clustering might be better.

    User_Login_test: LoginDate

    Site_User: ( SiteID, UserID )

    User_Login_AcctLookup ( SiteID, UserID )

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Thanks for the tips!

    I changed the outer join to inner join and then using >= and <= operator instead of between. The execution time is now less then 10 minutes.

    I am thinking of rewriting this query into CTE expression.

  • DOH, just saw this:

    User_Login_test is a VIEW

    So obviously clustering would not apply to it, but to the underlying tables.

    It's still fundamentally true that identity is usually NOT the best clustering key for most tables. Review the underlying tables for better choices if they are clustered by ident. I can give you scripts and otherwise assist with that if you want me to.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Can you check your data types? Specifically, check your join columns:

    su.SiteID = ul.SiteID

    j.SiteID = ul.SiteID

    j.UserID = ul.UserID

    I'm hoping that each pair have matching data types. If a pair doesn't match, then the resulting implicit cast will certainly slow things down quite a bit.

  • Prem-321257 (5/13/2015)


    Thanks for the tips!

    I changed the outer join to inner join and then using >= and <= operator instead of between. The execution time is now less then 10 minutes.

    I am thinking of rewriting this query into CTE expression.

    10 minutes is surely better than 55 minutes but, unless we're talking about 10's or 100's of millions of rows, that certainly seems like a long time for a reasonably simple select query. Did you take a look at the article Sean suggested? If you can post the query plan for this query, and perhaps provide some DDL for the underlying tables, I bet we can help make this query much faster, perhaps get it down to seconds or miliseconds.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • It does have matching parameter.

    SiteID - int

    UserID - varchar(32)

    If we remove the user_login_test view in the query and replace it with one of the permanent table - it runs in few seconds. The performance of the query seems to be related with the View structure user_login .

  • Prem-321257 (5/14/2015)


    It does have matching parameter.

    SiteID - int

    UserID - varchar(32)

    If we remove the user_login_test view in the query and replace it with one of the permanent table - it runs in few seconds. The performance of the query seems to be related with the View structure user_login .

    Well, you were asked for the DDL for the view and underlying tables. Please remember, we can't see what you see unless you show it to us.

  • Lynn's exactly right - We can't know what your table structure is unless you give us the CREATE TABLE statements for them. The mismatched data types in your join means that the values have to be CASTed to a matching data type in order to join the tables. This is undoubtedly taking taking a lot of time to run.

    The best way I know of to handle this is to look at the data in the tables and then change the inappropriate data type to an appropriate one. Be aware, though, that this will have implications for the rest of the system. We have no way of knowing what's reading from or writing to your tables, but we can say that they'll be affected. Be careful with this and test it thoroughly in a test database before you make the change in production.

  • I'd strongly recommend getting the execution plan and understanding how SQL Server is interpreting your T-SQL. That will drive you towards making positive choices in optimizing the code.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply