Takes a very long time to query a table/view

  • There’s table with over 6.5 million records (it has about 20 columns) in our data warehouse. ‘SELECT TOP 100 PERCENT * FROM TABLENAME’ takes over 10 minutes. Also, there is a view which reads data from this table with a couple of join tables. It takes about an hour to query the data from this view.

     

    We have done our due diligence by running ‘Index Tuning Wizard’ and creating couple of indexes (cluster and non clustered indexes) on couple of columns such as client_id, etc We also followed ‘Efficient Index Design’  concepts and ‘Efficient Query Design’ concepts when creating view and indexes.

     

    Since this table has historical data (data warehouse), we can’t CREATE UNIQUE CLUSTERED INDEX, there are no unique records as well.

     

    I tried to index the view, since there are quite a few requirements indexing views such as it can’t have outer joins, I have given-up that idea as well. The next option is to create a table from the view, then we have to populate this table from the view every time data gets added to the base tables (the tables that the view is based on). We don’t want to use triggers to keep it up-to-date as data gets added on batch mode. What are the other options do I have?

     

    The other questions is, how can we increase the performance of this view/table? Is the time it takes right now to retrieve data is reasonable?

     

    Any help would be appreciated..

     

    Thanks,

    Chandi

  • Could you post the view create statement and the create statements for your indexes?

    But just for my own edification, how long do you expect the server to take to deliver the 6.5 million rows? 


    And then again, I might be wrong ...
    David Webb

  • This recent article http://www.sqlservercentral.com/columnists/jBulinckx/indexedviewswithouterjoins.asp showed how to do outer joins with and indexed view.

    If you replace your SELECT * statement with explicit column names do you get any improvement in performance?

  • Well, I can certainly try that, but it wouldn't make a difference for us. Users access this view through MS Access as an 'ODBC' import table, and it times out. Also, we create Cognos cubes from this view, it takes 6 hours. In both cases we have no control over how we access the view, in fact Cognos reads the data column by column I think…

     

    Thanks,

    Chandi

  • Here is the view and the index script. Most of the joins are lookup values, ie for '01' we want to display 'Jan' in the view.

    /*  The View  */

    CREATE VIEW dbo.ClaimTableforCube WITH SCHEMABINDING AS

     SELECT dbo.ClaimsTable.CLIENT_ID,    dbo.dp_care_level_code.TD_CODE_LONG_DESCRIPTION AS CI_Care_Level_Code_Description,

       dbo.ALL_PROVIDERS.PR_NAME, dbo.ClaimsTable.CI_SERVICE_YEAR,

       dbo.ClaimsTable.CI_SERVICE_MONTH, dbo.ClaimsTable.CI_DAYS_OF_SERVICE_AMOUNT,

       dbo.ClaimsTable.CI_HOURS_OF_SERVICE_AMOUNT, dbo.ClaimsTable.CI_PAID_AMOUNT,

       dbo.calendar_to_fiscal.fiscal, dbo.ALL_CLIENTS.CL_SEX_CODE, dbo.Month_Name.MonthName, H_Unit_1.[Hunit Name] AS ProviderHUnit,

       H_Unit2_1.[Hunit name] AS ProviderHUnit2, (CASE dbo.ClaimsTable.CI_CLIENT_CONTRIBUTION_AMT WHEN 999.99 THEN 0.00

        ELSE dbo.ClaimsTable.CI_CLIENT_CONTRIBUTION_AMT

          END) AS CI_CLIENT_CONTRIBUTION_AMT, dbo.[SERVICE_TYPE_CODE_LOOKUP].TD_CODE_LONG_DESCRIPTION AS service_type,

       dbo.[org_code_lookup].TD_CODE_LONG_DESCRIPTION AS org_type, dbo.ALL_CLIENTS.CL_BIRTH_DATE, DATEPART(yyyy,

       dbo.ALL_CLIENTS.CL_BIRTH_DATE) AS DOB_Year, DATEPART(m, dbo.ALL_CLIENTS.CL_BIRTH_DATE) AS DOB_Month,

       H_Unit_2.[Hunit Name] AS ClientHUnit, H_Unit2_2.[Hunit name] AS ClientHUnit2,

       tblPostalCodes_2.PostalCode AS Client_PC, '05/03/2005' AS DateofExtract,

       (CASE WHEN [CI_Service_Month] >= DATEPART(m, dbo.ALL_CLIENTS.CL_BIRTH_DATE)

       THEN dbo.ClaimsTable.CI_SERVICE_YEAR - DATEPART(yyyy, dbo.ALL_CLIENTS.CL_BIRTH_DATE)

       ELSE dbo.ClaimsTable.CI_SERVICE_YEAR - DATEPART(yyyy, dbo.ALL_CLIENTS.CL_BIRTH_DATE) - 1 END) AS Age

     FROM dbo.[org_code_lookup] 

    -- lookup Units (1 and 2)

    RIGHT OUTER JOIN dbo.H_Unit H_Unit_2

         RIGHT OUTER JOIN dbo.ClaimsTable

          LEFT OUTER JOIN dbo.H_Unit2 H_Unit2_2 ON dbo.ClaimsTable.CI_HEALTH_UNIT2_ID = H_Unit2_2.Hunit ON

           H_Unit_2.Hunit = dbo.ClaimsTable.CI_HEALTH_UNIT_ID ON 

    --lookup Organization

    dbo.[org_code_lookup].TD_ID = dbo.ClaimsTable.CI_ORGANIZATION_CODE

       -- lookup Month

    LEFT OUTER JOIN dbo.Month_Name ON dbo.ClaimsTable.CI_SERVICE_MONTH = dbo.Month_Name.MonthCode

    --lookup Year

       LEFT OUTER JOIN dbo.calendar_to_fiscal ON dbo.ClaimsTable.CI_SERVICE_MONTH = dbo.calendar_to_fiscal.[month] AND

          dbo.ClaimsTable.CI_SERVICE_YEAR = dbo.calendar_to_fiscal.[year]

         --lookup Level Of Care

    LEFT OUTER JOIN dbo.dp_care_level_code ON dbo.ClaimsTable.CI_CARE_LEVEL_CODE = dbo.dp_care_level_code.TD_ID

        -- Providers

     LEFT OUTER JOIN   dbo.ALL_PROVIDERS

            --lookup Providers by postal code ( within the ALL_PROVIDERS lookup table)

    LEFT OUTER JOIN dbo.tblPostalCodes tblPostalCodes_1 ON dbo.ALL_PROVIDERS.PR_POSTAL_CODE_ADDRESS = tblPostalCodes_1.PostalCode

            --lookup Providers by Units 1 and 2 ( within the ALL_PROVIDERS lookup table)LEFT OUTER JOIN dbo.H_Unit H_Unit_1 ON dbo.ALL_PROVIDERS.PR_HEALTH_UNIT_ID = H_Unit_1.Hunit

            LEFT OUTER JOIN dbo.H_Unit2 H_Unit2_1 ON dbo.ALL_PROVIDERS.PR_HEALTH_UNIT2_ID = H_Unit2_1.Hunit ON

           dbo.ClaimsTable.PROVIDER_ID = dbo.ALL_PROVIDERS.PROVIDER_ID

    --lookup Services by service code   

    LEFT OUTER JOIN   dbo.[SERVICE_TYPE_CODE_LOOKUP] ON  dbo.ClaimsTable.CI_SERVICE_TYPE_CODE = dbo.[SERVICE_TYPE_CODE_LOOKUP].TD_ID

    --lookup Clients by postal code and Client ID     

    FULL OUTER JOIN   dbo.tblPostalCodes tblPostalCodes_2

           RIGHT OUTER JOIN dbo.ALL_CLIENTS ON tblPostalCodes_2.PostalCode = dbo.ALL_CLIENTS.CL_POSTAL_CODE_ADDRESS ON

            dbo.ClaimsTable.CLIENT_ID = dbo.ALL_CLIENTS.CLIENT_ID

     WHERE     (dbo.ClaimsTable.PROVIDER_ID <> '12345') 

     

     

     

     / *  Clustered Index  */

    CREATE UNIQUE CLUSTERED INDEX ClaimTableforCube _UNIQUE ON ClaimTableforCube (client_id, provider_pc,CI_PAID_AMOUNT, CI_Service_Month)

     

    Note :

    ClaimTable - Claim Table which has about 6,692,500 records

    CL_ALL_CLIENT - Master Client table which has about 600,000 clients

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

  • Do you really need all those outer joins?  You're doing a lot more than just looking up matching values with this.   


    And then again, I might be wrong ...
    David Webb

  • Yes, we need all the joins. We look up Organization, Units(1 and 2) ,  Month and Year, 'Level Of Care', providers by postal codes and Units, Services by service code, Clients by postal code and Client ID. So, they all are lookup joins.

    Chandi

  • Yes, you need the joins to do the lookups, but do they have to be OUTER joins?  Outer joins may return more data than you need and hamper the optimizer's ability to use indexes efficiently.


    And then again, I might be wrong ...
    David Webb

  • So, what's your solusion? INNER JOIN? and what are they?

    Thanks,

    Chandi

  • If you don't need all the rows from the lookup tables returned, whether you find a match or not, you shouldn't be using OUTER joins. 

    Here's a clip from Books Online:

     

    Join conditions can be specified in either the FROM or WHERE clauses; specifying them in the FROM clause is recommended. WHERE and HAVING clauses can also contain search conditions to further filter the rows selected by the join conditions.

    Joins can be categorized as:

    • Inner joins (the typical join operation, which uses some comparison operator like = or <&gt. These include equi-joins and natural joins.

      Inner joins use a comparison operator to match rows from two tables based on the values in common columns from each table. For example, retrieving all rows where the student identification number is the same in both the students and courses tables.

    • Outer joins. Outer joins can be a left, a right, or full outer join.

      Outer joins are specified with one of the following sets of keywords when they are specified in the FROM clause:

      • LEFT JOIN or LEFT OUTER JOIN

        The result set of a left outer join includes all the rows from the left table specified in the LEFT OUTER clause, not just the ones in which the joined columns match. When a row in the left table has no matching rows in the right table, the associated result set row contains null values for all select list columns coming from the right table.

      • RIGHT JOIN or RIGHT OUTER JOIN.

        A right outer join is the reverse of a left outer join. All rows from the right table are returned. Null values are returned for the left table any time a right table row has no matching row in the left table.

      • FULL JOIN or FULL OUTER JOIN.

        A full outer join returns all rows in both the left and right tables. Any time a row has no match in the other table, the select list columns from the other table contain null values. When there is a match between the tables, the entire result set row contains data values from the base tables.

    • Cross joins.

      Cross joins return all rows from the left table, each row from the left table is combined with all rows from the right table. Cross joins are also called Cartesian products.

    For example, here is an inner join retrieving the authors who live in the same city and state as a publisher:

    USE pubsSELECT a.au_fname, a.au_lname, p.pub_nameFROM authors AS a INNER JOIN publishers AS p   ON a.city = p.city   AND a.state = p.stateORDER BY a.au_lname ASC, a.au_fname ASC

    The tables or views in the FROM clause can be specified in any order with an inner join or full outer join; however, the order of tables or views specified when using either a left or right outer join is important. For more information about table ordering with left or right outer joins, see Using Outer Joins.


    And then again, I might be wrong ...
    David Webb

  • I've read this many many times. But, it won't work for me. We need all the data

    from 'ClaimsTable' table regardless if we find a value in the lookup table or not.

     

  • Maybe I've missed something but shouldn't the lookup table have all possible values. I can imagine having a states table with only 42 states in it because they are not all used...

    Maybe you should make sure that all the lookup values are in the table and then use an inner join (or maybe I'm completly off with this topic... sorry if it's the case).

    If you're not too sure about the performance boost, try rerunning the query with only inner joins (where possible), just to see the difference in speed.

  • Loosely speaking an INNER join requires there to be a matching record in both tables within the join.

    If the codes in your main table always exist in your lookup table then you can use inner joins.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply