Recursive CTE vs UNION ALL in a VIEW

  • I tried that but, because of RECIPE.ANTL * DOP.ANT I was getting an arithmetic overflow.

  • Hi Jonathan, still a lot of logical reads.

    I've tried this:

    SET STATISTICS TIME ON
    SET STATISTICS IO ON
    SELECT *
    FROM dbo.vRecipe

    SELECT *
    FROM dbo.vRecipe3 -- with indexed view

    SELECT *
    FROM dbo.Recipe -- Original, with UNION ALL

    SET STATISTICS IO OFF
    SET STATISTICS TIME OFF

    The results:

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (626387 rows affected)

    Table 'DBDOPSK'. Scan count 1138331, logical reads 2290520, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBRVARE'. Scan count 512506, logical reads 1040233, physical reads 5, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 4, logical reads 5305402, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBOPSK'. Scan count 1, logical reads 447, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBVARE'. Scan count 1, logical reads 43, physical reads 0, read-ahead reads 40, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 20734 ms, elapsed time = 23744 ms.

    (626387 rows affected)

    Table 'DBDOPSK'. Scan count 1138330, logical reads 2290291, physical reads 7, read-ahead reads 17, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBRVARE'. Scan count 512506, logical reads 1040233, physical reads 17, read-ahead reads 97, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 4, logical reads 5305402, physical reads 133, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'vDBDOPSK_Sumant'. Scan count 1, logical reads 40, physical reads 1, read-ahead reads 38, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBOPSK'. Scan count 1, logical reads 447, physical reads 80, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBVARE'. Scan count 1, logical reads 43, physical reads 14, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 19063 ms, elapsed time = 23572 ms.

    (626387 rows affected)

    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 110, logical reads 3589, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBDOPSK'. Scan count 72, logical reads 16468, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBRVARE'. Scan count 42, logical reads 2322, physical reads 1, read-ahead reads 64, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'DBOPSK'. Scan count 6, logical reads 2682, physical reads 3, read-ahead reads 8, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3234 ms, elapsed time = 12401 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    As you can see, last one is the fastest. Probably I should leave it as it, I thought that recursive was going to be faster but or I was wrong or I'm missing something.

  • I also think they are logically different queries and produce different results. I ran the original query and it returned over 400k rows (in over 20 minutes). The rCTE version only returned 78k rows.

  • Not sure why you are focusing on logical reads.  You do realize that those are reads from memory, and depending on how much data you are accessing, that number could be high.

     

  • No, I think that's because I had some problems executing the script for exporting data (it was too huge). In my databases both views produce the same result.

  • Yes, but the amount of logical reads is too high and if you compare with the UNION ALL version, you can see the difference.

  • Mauricio_ wrote:

    No, I think that's because I had some problems executing the script for exporting data (it was too huge). In my databases both views produce the same result.

    Well, there's not the right data to test it from your dump. The rCTE version takes about 1 second on my machine and the original version takes over 20 minutes.

    Normally people try to improve performance by removing rCTE functions (not adding them) as they are generally less efficient.

  • So, from your point of view, in this case it's better to use UNION ALL even when some levels are not used than to use a recursive CTE?. I know CTEs can be less efficient, I really thought that in this case wouldn't be the case. Thanks for your advice, Jonathan.

  • Mauricio_ wrote:

    So, from your point of view, in this case it's better to use UNION ALL even when some levels are not used than to use a recursive CTE?. I know CTEs can be less efficient, I really thought that in this case wouldn't be the case. Thanks for your advice, Jonathan.

    You can never be sure without testing and experimenting. So far it looks like you can't get the same performance from the rCTE version. It doesn't look too far off the original version though, but if you've done all you can think of with that method and there's already a better (faster) method then you need to throw it away and try something else.

  • Which comes back to your sample database.  I haven't had a chance to download it yet, but from the discussion I am wondering if the results from your original code are correct based on the data in the sample data.  Could you validate that the results are correct?  If so, it is quite possible some of us may play with your sample data and see what we maybe able to do to improve the performance.  All we need is to be confident that the results of the original code are correct so we have something to test against, both in performance and results returned.  Performance doesn't matter if the results are wrong.

  • you also need to consider what the usage of the view is going to be. if majority of access will use a filter which you stated is faster than the union then maybe using it is still the best option even if the unfiltered is slower.

  • I will answer Frederico and Lynn. First, I've checked that output is the same, at least in my test database (not the one that I've uploaded, which has only a few tables). The use of the view, at least in the stored procedures I could find, it's as JOIN (INNER and LEFT). Trying to optimize one of them I've noted that using the old view, performance takes 6 seconds (for only 33 rows). With the new view (and many other changes in the stored procedure, to be honest), it takes less than 1 second.

  • Mauricio_ wrote:

    I will answer Frederico and Lynn. First, I've checked that output is the same, at least in my test database (not the one that I've uploaded, which has only a few tables). The use of the view, at least in the stored procedures I could find, it's as JOIN (INNER and LEFT). Trying to optimize one of them I've noted that using the old view, performance takes 6 seconds (for only 33 rows). With the new view (and many other changes in the stored procedure, to be honest), it takes less than 1 second.

     

    Are you saying that the results for the original query, using the data available in the uploaded database, is not correct?  That is what I asked you to verify.  If it isn't returning the correct results based on the data available then it is of no value if anyone wants to try and improve the original query.

     

  • You can try this link. Both views (vRecipe and Recipe) have 626387 rows: https://www.dropbox.com/s/pgdjg9pel9z2fbc/TEST_DB.zip?dl=0

  • I ran your original view on the backup you've just provided and it returned all rows (626387) in 1.4 seconds.

    select * into #a from dbo.RECIPE
     SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 0 ms.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.
    Table 'DBRVARE'. Scan count 210, logical reads 12264, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'DBDOPSK'. Scan count 360, logical reads 85824, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'DBOPSK'. Scan count 30, logical reads 6114, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (626387 rows affected)

    (1 row affected)

    SQL Server Execution Times:
    CPU time = 4315 ms, elapsed time = 1414 ms.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 0 ms.

    I don't think you'll be able to improve on that by much.

    An odd thing though. When I created the following schema bound view with an index:

    CREATE VIEW [dbo].[vwDBDOPSKSumant]
    WITH SCHEMABINDING
    AS
    SELECT DOP.NR,
    SUM(ISNULL(DOP.ANT,0)) SUMANT
    ,COUNT_BIG(*) as z
    FROM dbo.DBDOPSK DOP
    WHERE DOP.NR IS NOT NULL
    GROUP BY DOP.NR
    GO
    CREATE UNIQUE CLUSTERED INDEX IX_vwDBDOPSKSumant ON [vwDBDOPSKSumant](NR)

    The same query that ran in 1.4 seconds slows down to just about 10 minutes. This is because it then takes 534363 ms to compile:

    SQL Server parse and compile time: 
    CPU time = 0 ms, elapsed time = 0 ms.
    SQL Server parse and compile time:
    CPU time = 530219 ms, elapsed time = 534363 ms.
    Table 'DBRVARE'. Scan count 210, logical reads 12264, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'DBDOPSK'. Scan count 360, logical reads 85824, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'DBOPSK'. Scan count 30, logical reads 6114, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (626387 rows affected)

     

Viewing 15 posts - 46 through 60 (of 60 total)

You must be logged in to reply to this topic. Login to reply