CXConsumer wait type issue

Question

Post reply

CXConsumer wait type issue

saum70

Ten Centuries

Points: 1002
More actions
September 24, 2023 at 10:48 am

#4292985

hello,
my query is select * into #TmpA from tblA (nolock). tblA contains 116000000 records.
This query sometime takes 6 minutes and some time takes less 2 minutes. I observed CXConsumer wait types mostly and sometimes CXPacket. On observing the execution plan, parallel execution is taking place.
kindly guide what can be done in this case
Regards,
Saumik Vora

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Site Owners SSC Guru Points: 80411 More actions · Answer 1

Thanks for posting your issue and hopefully someone will answer soon.

This is an automated bump to increase visibility of your question.

kaj SSCommitted Points: 1619 More actions · Answer 2

I don't think there is a pre-baked recipe for you, but FWIW, Brent Ozar has a few recommandations, which you can read about in this blog post:

https://blog.quest.com/demystifying-cxpacket-and-cxconsumer-wait-types-in-sql-server/

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 3

Why are you selecting everything into a temp table? You should filter that down to just the columns you need - and only the rows that are needed.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Jo Pattyn SSC-Dedicated Points: 32623 More actions · Answer 4

Jo Pattyn

SSC-Dedicated

Points: 32623

September 26, 2023 at 9:08 am

#4295340

You could try the maxdop hint to avoid parallellism

Dennis Jensen SSChasing Mays Points: 606 More actions · Answer 5

Jo Pattyn why on earth would you want to avoid parallellism as that is in every case that I am aware a majorly good thing that helps a Query complete faster provided the hardware on which the database resides is up to snuff and if they are dealing with that much data in a single table I surely hope their servers can properly handle parallellism.

Note: However Jo Pattyn (or anyone), if there is actually a solid reason for this please educate me as I am more than willing to learn something knew.

Also saum70 I would reiterate Jeffrey Williams why are you creating a temporary table that has all the data found within the primary table as that is completely redundant and a horrible use of resources. The purpose of a temp table is usually to work with a smaller subset of the primary table and NOT its entire contents.

saum70 Ten Centuries Points: 1002 More actions · Answer 6

Hello,

@Jeffery and @dennis, The reason i am creating temp table with all the columns is because it will be used multiple times in the stp and all the columns will be referred in the final output. also there is code to find max of serial no against each ordernumber (one ordernumber will have multiple rows).

So instead of referring the physical table again and again, temp table is created to avoid IO performance issue.

It was also observed that the mentioned query when executed in sql query analyzer takes 1.5 mins where when called from front end application (through stp) it takes avg 6 mins

What can be done in this case

Regards,

Saumik

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 7

By copying all rows and all columns into a temp table - you remove any possibility of utilizing indexes on the source table. So every time you query that temp table you have to read all rows.

You are not avoiding any IO issues by doing this - in fact, you are most likely increasing the amount of IO needed to perform the tasks because you are constantly performing a full scan of that table every time it is referenced.

Without seeing all of the code - I really can't recommend anything to solve your issues. I would guess there are many opportunities to optimize the code, but no way that can be done without seeing that code.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Dennis Jensen SSChasing Mays Points: 606 More actions · Answer 8

A Corollary to what Jeffrey Williams said -- perhaps you might need additional Indexes upon your parent table.

However, copying that table into a temp table simply does just that, it copies the table into the temporary harddrive (or SSD) storage and as such hitting that temp table is actually the same thing as hitting that parent table of course minus any of the indexes that parent table might have unless you duplicate those indexes as well which again is pointless since you already have the table and its indexes on the harddrive.

Lastly I concur with Jeffrey Williams about us not being able to help you because you have presented no code or metrics analysis of the query. So we have no way of helping you diagnose your current probleme let alone specifically help you solve your current problem. You are asking us to pull the proverbial rabbit out of the proverbial hat and while many might think we are workers of magic that is simply an illusion created by others. 😉

This reply was modified 2 years, 4 months ago by Dennis Jensen.

saum70 Ten Centuries Points: 1002 More actions · Answer 9

Hi,

thanks for replying back. due to policies i will not be able to provide actual code but i can replicate with dummy tables. Following is the details for your ref

tblA --- master table with 18900000 rows. Primary key is OrderNumber

tblB -- details table with 116000000. Primary key is OrderNumber, TranSerialNo, MessageCode

tblD -- master table with 36k rows. Primary key is DealerCode

code in stp is as below

select * into #TmpA from tblA(nolock)

select * into #TmpB from tblB(nolock)

create index #idxA on #TmpA(nOrderNumber)

create index #idxB on #TmpA(nOrderNumber, nTranSerialNo)

select DealerId into #TmpD from tblD(Nolock)

select a.OrderNumber,Max(b.TranSerialNo)

into #TmpLatestOrder

from #TmpA a

inner join #TmpB b

on a.OrderNumber = b.OrderNumber

where a.ReferenceNum=0

group by a.OrderNumber

final output query

Select all columns #TmpA and #TmpB from

#TmpA a

inner join #TmpLatestOrder b

on a.OrderNumber = b.OrderNumber

inner join #TmpB c

on b.OrderNumber = c.OrderNumber

and b.TranSerialNo=c.TranSerialNo

inner join #TmpD d

on a.nUserId = d.DealerId

Hope above is helpful

Regards,

Saumik

frederico_fonseca SSCoach Points: 16306 More actions · Answer 10

how many rows of tbl A (18900000 total) have the ReferenceNum = 0 ?

and why aren't you filtering them when doing the insert into the temp table?

CreateIndexNonclustered SSChasing Mays Points: 658 More actions · Answer 11

Dennis Jensen wrote:

Jo Pattyn why on earth would you want to avoid parallelism as that is in every case that I am aware a majorly good thing

Saum is specifically reporting parallelism waits, attributing it as a problem. whether or not it is actually a problem, not enough information has been provided. Given that Saum is doing a full table scan without any filters and inserting into a temporary heap, it is pretty unlikely that parallelism would be beneficial.

Parallelism is not always good. Excessive parallelism in some cases can make a query run longer than it would if it ran serial. In very common cases, excess parallelism uses up CPU resources that could otherwise be used by something else in queries that do not benefit from the parallelism.

@saum, I don't really have anything to say about your performance problem, but would strongly recommend against using select * into for anything. This is long running DDL and creates schema locks that are not released until the entire statement is complete. If you want to be flippant, you can do a select top 0 into #temp and then use dynamic SQL to do the insert into select from, but to me it is an infinitesimally small cost to just write the SQL that you are going to write one time and then reuse for however long the process is in place.

Dennis Jensen SSChasing Mays Points: 606 More actions · Answer 12

Okay looking at what you shared here would be a more efficient way of doing it -- although you would need to test the second SELECT to be sure it was catching only what you wanted but I am pretty sure without testing it that it should be really close. Then again if it is off I am sure someone else will chime in to correct any mistakes I made.

-- tblA master table with 18,900,000 rows   Primary key is OrderNumber

-- tblB details table with 116,000,000 rows Primary key is OrderNumber, TranSerialNo, MessageCode

-- tblD master table with 36,000 rows       Primary key is DealerCode

SELECT *
  INTO #tmpA
  FROM tblA(NOLOCK) AS [a]
 INNER JOIN tblD(NOLOCK) AS [d]
    ON [d].[DealerId] = [a].[nUserId]
 WHERE [a].[ReferenceNum] = 0;
GO

CREATE INDEX #idxA
    ON #tmpA( nOrderNumber );
GO

CREATE INDEX #idxB
    ON #tmpA( nOrderNumber, nTranSerialNo );
GO

-- This also assumes that #tmpA will now be smaller than tblB
-- Now I believe the ".[TranSerialNo] = (" will work in the
-- following but if it does not then change it to 
-- ".[TranSerialNo] IN ("
SELECT *
  INTO #tmpB
  FROM #tmpA AS [a]
 INNER JOIN tblB(NOLOCK) AS 
    ON .[OrderNumber]  = [a].[OrderNumber]
   AND .[TranSerialNo] = ( SELECT MAX( TranSerialNo ) AS TranSerialNo
                                FROM tblB(NOLOCK) AS [b2]
                               WHERE [b2].[OrderNumber] = [a].[OrderNumber]
                               GROUP [b2].[OrderNumber]
                            );
GO

-- Personally I would select all the items from #tmpA and
-- then only those items in #tmpB that are not already
-- contained within #tmpA. However, the following does what
-- yours did which means you will have duplicated column
-- names which is usually a bad thing
SELECT [a].*
  FROM #TmpA AS [a]
 UNION ALL
SELECT .*
  FROM #TmpB AS ;
GO