Columnstore Query Reverting to Row Mode with CROSS JOIN

Question

Columnstore Query Reverting to Row Mode with CROSS JOIN

Mitch McBride

SSC Journeyman

Points: 75
More actions
May 20, 2015 at 7:54 pm

#300799

I have two inline selects against a table with a nonclustered columnstore on SQL 2014 (12.0.2000). Both execute in batch mode and when I inner-join the two, they continue to execute in batch mode. When I cross join them, one executes in row mode. Any ideas on how to avoid this? Below is some SQL to simulate the issue.
Thanks,
Mitch
-- The purpose of this script is to demonstrate that
-- two queries against a columnstore index that each execute in batch mode
-- will continue to execute in batch mode when inner joined.
-- However, one of the queries will execute in row mode when cross-joined.
-- Create function to return 0 to n rows
IF OBJECT_ID('dbo.IntCount') IS NOT NULL
DROP FUNCTION dbo.IntCount;
GO
CREATE FUNCTION dbo.IntCount(@n AS BIGINT) RETURNS TABLE AS RETURN
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
SELECT TOP (@n) n FROM Nums ORDER BY n;
GO
-- Create Employee table
IF OBJECT_ID('dbo.Employee') IS NOT NULL
DROP TABLE dbo.Employee;
GO
CREATE TABLE dbo.Employee
(
EmployeeID INT,
GroupID INT
)
-- Populate with 1000 employees in 250 groups of 4
INSERT INTO
dbo.Employee
SELECT
n AS EmployeeID,
CAST((n - 1) / 4 AS INT) + 1 GroupID
FROM
dbo.IntCount(1000)
CREATE CLUSTERED INDEX IX_Employee_CL ON dbo.Employee (EmployeeID)
CREATE NONCLUSTERED INDEX IX_Employee_GroupID ON dbo.Employee (GroupID)
-- Create EmployeeReview table
IF OBJECT_ID('dbo.EmployeeReview') IS NOT NULL
DROP TABLE dbo.EmployeeReview;
GO
CREATE TABLE dbo.EmployeeReview
(
ReviewDate DATE,
EmployeeID INT,
IsPositive INT
)
-- Populate with 1000000 employee reviews between Jan 1, 2015 and Jan 31, 2015
-- A review can either be IsPositive = 0 or IsPositive = 1
INSERT INTO
dbo.EmployeeReview
SELECT
DATEADD(DAY, RAND(CHECKSUM(NEWID())) * 31, CAST('2015-01-01' AS DATE)) AS ReviewDate,
(n - 1) % 1000 + 1 AS EmployeeID,
ROUND(RAND(CHECKSUM(NEWID())) + RAND(CHECKSUM(NEWID())) / 3, 0) AS IsPositive
FROM
dbo.IntCount(1000000)
-- Create Columnstore index on all EmployeeReview columns
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_EmployeeReview_CS ON dbo.EmployeeReview
(
ReviewDate,
EmployeeID,
IsPositive
)
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
-- Rank Employees for each day by % of Positive reviews against a population of Groups
-- This query's execution plan uses Batch Execution Mode for both subselects
--
-- SQL Server Execution Times:
-- CPU time = 500 ms, elapsed time = 62 ms.
SELECT
Employees.ReviewDate,
Employees.EmployeeID,
Employees.[% of Positive],
SUM(CASE WHEN Employees.[% of Positive] < Groups.[% of Positive] THEN 1 ELSE 0 END) + 1 [Rank]
FROM
(
SELECT
ReviewDate,
EmployeeReview.EmployeeID,
CAST(SUM(IsPositive) AS DECIMAL(19,5)) / CAST(SUM(1) AS DECIMAL(19,5)) AS [% of Positive]
FROM
dbo.EmployeeReview JOIN
dbo.Employee ON EmployeeReview.EmployeeID = Employee.EmployeeID
WHERE
GroupID = 1
GROUP BY
ReviewDate,
EmployeeReview.EmployeeID
) Employees
JOIN
(
SELECT
ReviewDate,
GroupID,
CAST(SUM(IsPositive) AS DECIMAL(19,5)) / CAST(SUM(1) AS DECIMAL(19,5)) AS [% of Positive]
FROM
dbo.EmployeeReview JOIN
dbo.Employee ON EmployeeReview.EmployeeID = Employee.EmployeeID
GROUP BY
ReviewDate,
GroupID
) Groups ON Employees.ReviewDate = Groups.ReviewDate
GROUP BY
Employees.ReviewDate,
Employees.EmployeeID,
Employees.[% of Positive]
-- Rank Employees across the entire month by % of Positive reviews against a population of Groups
-- This query's execution plan uses
-- Batch Execution Mode for the first subselect BUT
-- **Row** Execution Mode for the second subselect
--
-- SQL Server Execution Times:
-- CPU time = 1329 ms, elapsed time = 466 ms.
SELECT
Employees.EmployeeID,
Employees.[% of Positive],
SUM(CASE WHEN Employees.[% of Positive] < Groups.[% of Positive] THEN 1 ELSE 0 END) + 1 [Rank]
FROM
(
SELECT
EmployeeReview.EmployeeID,
CAST(SUM(IsPositive) AS DECIMAL(19,5)) / CAST(SUM(1) AS DECIMAL(19,5)) AS [% of Positive]
FROM
dbo.EmployeeReview JOIN
dbo.Employee ON EmployeeReview.EmployeeID = Employee.EmployeeID
WHERE
GroupID = 1
GROUP BY
EmployeeReview.EmployeeID
) Employees
CROSS JOIN
(
SELECT
GroupID,
CAST(SUM(IsPositive) AS DECIMAL(19,5)) / CAST(SUM(1) AS DECIMAL(19,5)) AS [% of Positive]
FROM
dbo.EmployeeReview JOIN
dbo.Employee ON EmployeeReview.EmployeeID = Employee.EmployeeID
GROUP BY
GroupID
) Groups
GROUP BY
Employees.EmployeeID,
Employees.[% of Positive]
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 398916 More actions · Answer 1

Check the execution plan. It really comes down to have it's being resolved. If you read the FAQ from Microsoft it outlines places where you can revert to row mode processing, all of which will be in the execution plan. Also, check the estimated cost of the plan. If it's below your Cost Threshold for Parallelism you're going to get a row mode execution.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Mitch McBride SSC Journeyman Points: 75 More actions · Answer 2

The removal of the join predicate when using the cross join creates an inner loop join, which is not support by batch mode processing. How would I be able to get the cross join to use an inner hash join or use some other plan that support batch mode? An INNER HASH JOIN hint fails even when I create a dummy static value to use as a join predicate.

Grant Fritchey SSC Guru Points: 398916 More actions · Answer 3

I'm not sure. I don't have a good answer to that.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Eric M Russell SSC Guru Points: 125565 More actions · Answer 4

If you feel confident that this particular query should always leverage the non-clustered columnstore index, then have you tried an index like so?

WITH (INDEX (IX_EmployeeReview_CS))

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Grant Fritchey SSC Guru Points: 398916 More actions · Answer 5

Eric M Russell (5/21/2015)
If you feel confident that this particular query should always leverage the non-clustered columnstore index, then have you tried an index like so?
WITH (INDEX (IX_EmployeeReview_CS))

But the issue is if it goes to Batch mode or stays in Row mode. We're trying for batch mode here.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Mitch McBride SSC Journeyman Points: 75 More actions · Answer 6

I found a successful, albeit kludgy, workaround. Add a dummy column to EmployeeReview table populated with 1. A join on it behaves like a cross join, but is executed like an inner join. Not an ideal implementation, but it works.

SELECT

Employees.EmployeeID,

Employees.[% of Positive],

SUM(CASE WHEN Employees.[% of Positive] < Groups.[% of Positive] THEN 1 ELSE 0 END) + 1 [Rank]

FROM

(

SELECT

JoinKey,

EmployeeReview.EmployeeID,

CAST(SUM(IsPositive) AS DECIMAL(19,5)) / CAST(SUM(1) AS DECIMAL(19,5)) AS [% of Positive]

FROM

dbo.EmployeeReview JOIN

dbo.Employee ON EmployeeReview.EmployeeID = Employee.EmployeeID

WHERE

GroupID = 1

GROUP BY

JoinKey,

EmployeeReview.EmployeeID

) Employees

JOIN

(

SELECT

JoinKey,

GroupID,

CAST(SUM(IsPositive) AS DECIMAL(19,5)) / CAST(SUM(1) AS DECIMAL(19,5)) AS [% of Positive]

FROM

dbo.EmployeeReview JOIN

dbo.Employee ON EmployeeReview.EmployeeID = Employee.EmployeeID

GROUP BY

JoinKey,

GroupID

) Groups ON Employees.JoinKey = Groups.JoinKey

GROUP BY

Employees.EmployeeID,

Employees.[% of Positive]