June 23, 2017 at 3:57 pm
I need a query that sums the number of occurrences of a value and group them into ranges. For example, in AdventureWorks the employee table, I can run the following query
select organizationlevel, count(organizationlevel) from humanresources.employee
group by organizationlevel
organizationlevel count(organizationlevel)
NULL 0
1 6
2 27
3 66
4 190
What I want to do is return the following output on the column organization level. You see I want a sum of the values with the number of occurrences between 1 and 30 (there are 2 of those (level 1 and level 2), and a sum of the values with the number of occurrences between 31 and 70 (there is 1, level 3) and so on.
range
between 1 and 30 2
between 31 and 70 1
> 70 1
I need this query to run on columns when I am optimizing storedprocedures that may result in parameter sniffing. (some values may result in a good execution plan while other get a terrible execution plan. Can anyone help with this?
June 23, 2017 at 5:29 pm
ceciliarenebaker - Friday, June 23, 2017 3:57 PMI need a query that sums the number of occurrences of a value and group them into ranges. For example, in AdventureWorks the employee table, I can run the following queryselect organizationlevel, count(organizationlevel) from humanresources.employee
group by organizationlevel
organizationlevel count(organizationlevel)
NULL 0
1 6
2 27
3 66
4 190What I want to do is return the following output on the column organization level. You see I want a sum of the values with the number of occurrences between 1 and 30 (there are 2 of those (level 1 and level 2), and a sum of the values with the number of occurrences between 31 and 70 (there is 1, level 3) and so on.
range
between 1 and 30 2
between 31 and 70 1
> 70 1
I need this query to run on columns when I am optimizing storedprocedures that may result in parameter sniffing. (some values may result in a good execution plan while other get a terrible execution plan. Can anyone help with this?
I see what you're saying. Here's some DDL to create table and some test data to mimic your HumanResources.Employees table.
IF OBJECT_ID('dbo.Employees', 'u') IS NOT NULL DROP TABLE dbo.Employees;
CREATE TABLE dbo.Employees (
ID Integer not null identity (1, 1),
constraint Employees_PK primary key (ID),
OrganizationLevel Integer);
WITH cteLevelsAndCounts AS (
SELECT Level, Number
FROM (VALUES(1, 6),
(2, 27),
(3, 66),
(4, 190)
) x (Level, Number)
)
INSERT INTO dbo.Employees(OrganizationLevel)
SELECT Level
FROM cteLevelsAndCounts d
CROSS APPLY util.dbo.TallyN(d.Number);
Now we can query the counts and bucket them into a range.
WITH cteRanges AS (
SELECT Mn, Mx
FROM (VALUES(1, 30),
(31, 70),
(71, 9999)
) x (Mn, Mx)
),
cteData AS (
SELECT OrganizationLevel, RC = COUNT(*)
FROM dbo.Employees
GROUP BY OrganizationLevel
)
SELECT r.Mn, r.Mx, COUNT(d.RC), SUM(d.RC)
FROM cteData d
CROSS APPLY cteRanges r
WHERE d.RC >= r.Mn
AND d.RC < r.Mx
GROUP BY r.Mn, r.Mx
ORDER BY r.Mn;
Because you already have your table, you'll want to change the cteData query to read from your own table.
The cteRanges CTE could be a physical table if you want to use the set of ranges in multiple places and want a centralized list of values.
The cteData does your counts. The outer query reads the employee counts and buckets each count into one of the ranges.
You said you wanted the sum of values in each range, but the expected data showed the counts. So, I figured I'd include both to be sure.
You should be able to use the technique on almost any query in the cteData CTE. I hope this helps.
June 27, 2017 at 1:09 pm
Thank you so much for your help.
June 27, 2017 at 1:38 pm
ceciliarenebaker - Tuesday, June 27, 2017 1:09 PMThank you so much for your help.
I'm glad it worked for you. Thanks for the feedback.
June 27, 2017 at 2:45 pm
Dang! Ed beat me to it.
I would add - for optimal performance you want this index in place:
CREATE INDEX nc_poc_employees_level ON dbo.employees(OrganizationLevel);
-- Itzik Ben-Gan 2001
June 27, 2017 at 7:15 pm
Alan.B - Tuesday, June 27, 2017 2:45 PMDang! Ed beat me to it.I would add - for optimal performance you want this index in place:
CREATE INDEX nc_poc_employees_level ON dbo.employees(OrganizationLevel);
If that's the low cardinality that I think it will be, it would hurt INSERT performance possibly by a lot. Fortunately, it appears to be on an employee table which isn't likely to need to take a lot of inserts, especially in a batch.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2017 at 12:55 pm
Jeff Moden - Tuesday, June 27, 2017 7:15 PMAlan.B - Tuesday, June 27, 2017 2:45 PMDang! Ed beat me to it.I would add - for optimal performance you want this index in place:
CREATE INDEX nc_poc_employees_level ON dbo.employees(OrganizationLevel);
If that's the low cardinality that I think it will be, it would hurt INSERT performance possibly by a lot. Fortunately, it appears to be on an employee table which isn't likely to need to take a lot of inserts, especially in a batch.
Agreed. I often forget to mention the impact on data modifications when adding a new index. I'm spoiled and live in the data warehouse world where we don't have to think much about inserts, updates and deletes
-- Itzik Ben-Gan 2001
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy