Logic to check for empty table before joining

  • Hi All,

    I have a requirement which requires me to display report as per the dimension access to the user trying to pull the report. Just to give an example.. My dimension "Dimension.Sites" can have more than 20 sites. But a user has allowed access to only 5 sites. Accordingly they need to have the information presented to them when they request a report.

    Sites that the user has access to I am getting though a security module to my temp table #Sites(SiteKey) (It got records only if user has limited access. In case of full access this table wont contain any record) and I have written below query to have join to my fact table.

    SELECT SiteKey, SiteName, ...FROM fact.Sales f
    INNER JOIN Dimension.Sites s ON f.SiteKey = s.SiteKey
    WHERE (@SiteFullAccessFlag = 1 OR
    (s.SiteKey IN (Select SiteKey FROM #Sites)))

    @SiteFullAccessFlag = 1 in case of full access on the site dimension.

    The problem with this approach is performance. As my fact table has huge set of rows and its getting joined with multiple dimensions hence performance is very poor.

    When I remove "s.SiteKey IN (Select SiteKey FROM #Sites)" from the query performance improves by 50%. Fact and dimension got index created on all the key column.

    So, need your suggestion if there is any better approach to write this join. I cant directly join with this temp table as they may or may not have security filtered records. What I am trying to achive is join fact with temp table if got record else join with actual sites dimension.

    Hope I have put my query in appropriate way 🙂

    Thanks,
    Anjan Wahwar

    __________________________________________
    ---------------------------------------------------
    Save our mother Earth. Go Green !!!

  • Instead of using an IN clause, use another JOIN to retrieve that data. You would have to build a dynamic statement to perform a count before you tried to JOIN, requiring two different queries. That's an option, but not always a good one. Also, to help with performance, capturing and post the actual execution plan makes it easier to understand how your structures and query are working within the optimizer.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank Grant. I will try your suggestion. Just wanted to make sure I' doing it in a correct way.

    __________________________________________
    ---------------------------------------------------
    Save our mother Earth. Go Green !!!

  • Just wondering why you would populate said temp table with 0 records when the user has access to ALL sites.   If you simply copied all the PK values for the site dimension into your temp table when a user has access to all, and only the sites they have access to when their access is limited, into such temp table; and then index that temp table; I'm not sure why you'd have a problem.   Also, have you tried the alternate form of an IN, which is to use EXISTS ?   Sometimes that performs better.

    Post an execution plan as a .sqlplan file and someone here can probably help you figure out where the problem is.

    EDIT: grammar fix for clarity.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Be sure to create a unique clustered key / primary clustered key on SiteKey in the #Sites tables.  With only ~20+ rows it may not matter, but it can't hurt.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Or something like this as only one of the queries will return data:


    SELECT
      SiteKey
      , SiteName
      , ...
    FROM
      fact.Sales f
      INNER JOIN Dimension.Sites s
        ON f.SiteKey = s.SiteKey
    WHERE
      @SiteFullAccessFlag = 1
    UNION ALL
    SELECT
      SiteKey
      , SiteName
      , ...
    FROM
      fact.Sales f
      INNER JOIN Dimension.Sites s
        ON f.SiteKey = s.SiteKey
      INNER JOIN #Sites ss
        ON f.SiteKey = ss.SiteKey;

  • Anjan Wahwar - Thursday, December 7, 2017 10:41 PM

    Hello Anjan,
    How are you doing sir?
    You can use an OUTER JOIN as follows -


    declare @SiteFullAccessFlag int = 1
    declare @dimsite  table (sitekey int, site varchar(20))
    declare @fact   table (id int, sitekey int)
    declare @siteaccess table (sitekey int)

    insert into @dimsite(sitekey, site)
    values
    (1, 'site 1'),
    (2, 'site 2'),
    (3, 'site 3'),
    (4, 'site 4'),
    (5, 'site 5')

    insert into @fact (id, sitekey)
    values
    (1, 1),
    (2, 2),
    (3, 3),
    (4, 4),
    (5, 5)

    insert into @siteaccess (sitekey)
    values
    (1),
    (2)

    select * from @dimsite
    select * from @fact
    select * from @siteaccess

    /* query */
    select *
    from @fact as f
    inner join @dimsite as d
    on f.sitekey = d.sitekey
    left join @siteaccess as a
    on f.sitekey = a.sitekey
    where
    (@SiteFullAccessFlag = 1) or
    (@SiteFullAccessFlag = 0 and a.sitekey is not null)

    /* In the case where the user has full access, if
    you can change the logic to populate the
    @SiteAccess table with ALL sitekeys instead of
    keeping it blank, then the query would be a
    simple INNER JOIN, which I guess would be faster
    because a WHERE would not be needed

    e.g. query */
    select *
    from @fact as f
    inner join @dimsite as d on f.sitekey = d.sitekey
    inner join @siteaccess as a on f.sitekey = a.sitekey

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply