March 24, 2014 at 12:50 am
might be this will be handy and more readable
with cte as (
SELECT DISTINCT t1.DealerID, t2.tire,
TotalRepairs = SUM(1) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = SUM(1) OVER(PARTITION BY t1.DealerID, t2.tire),
PercentOfTotalRepairOrders = ROUND((SUM(1.0) OVER(PARTITION BY t1.DealerID, t2.tire)/SUM(1.0) OVER (PARTITION BY t1.DealerID)) *100,0)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2
ON t2.RepairId = t1.RepairId
)
SELECT * FROM cte
WHERE tire IS NOT NULL
ORDER BY PercentOfTotalRepairOrders DESC
since you are using Sql 2012 you can use Lead or Lag function to calculate Total percent by Dealer Group and you can get the desire result
Every rule in a world of bits and bytes, can be bend or eventually be broken
MyBlog About Common dialog control
A Visualizer for viewing SqlCommand object script [/url]
March 24, 2014 at 8:09 am
ChrisM@home (3/23/2014)
This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:
DROP TABLE dbo.Table1
CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);
INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16);
DROP TABLE dbo.Table2
CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));
INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'),
(7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'),
(2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');
and a query which works:
SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs,
PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairs
FROM (
SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire,
Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC)
FROM (
SELECT
t1.DealerID, t2.tire,
TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId
) d
WHERE tire IS NOT NULL
) e
WHERE Sector < 5
GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotal
ORDER BY DealerID, RepairsOfThisTire DESC, tire;
The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?
March 24, 2014 at 8:17 am
tfendt (3/24/2014)
...The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?
Exactly. What do they want if the number of rows is not divisible by 5. Modification of the query to accommodate the decision should be straightforward.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 24, 2014 at 8:18 am
tfendt (3/24/2014)
ChrisM@home (3/23/2014)
This query works only because of the sequence of RepairId in table2, which already orders most-used tyres ahead of least-used tyres, probably by accident. If table2 really is artificially ordered in this way, you're ok, otherwise you will get incorrect results. Here's a copy of Lyn's dataset slightly adjusted to show this:
DROP TABLE dbo.Table1
CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);
INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16);
DROP TABLE dbo.Table2
CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));
INSERT INTO dbo.Table2 VALUES (9,'225/65R17 102T'),
(7,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(8,'235/60R18 102V'),
(2,'205/55R16 89H'),(1,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V');
and a query which works:
SELECT DealerID, tire, RepairsOfThisTire, TotalRepairs,
PercentOfTotal = (RepairsOfThisTire*100.00)/TotalRepairs
FROM (
SELECT DealerID, TotalRepairs, tire, RepairsOfThisTire,
Sector = NTILE(5) OVER(PARTITION BY DealerID ORDER BY RepairsOfThisTire DESC, tire DESC)
FROM (
SELECT
t1.DealerID, t2.tire,
TotalRepairs = COUNT(*) OVER (PARTITION BY t1.DealerID),
RepairsOfThisTire = COUNT(*) OVER(PARTITION BY t1.DealerID, t2.tire)
FROM dbo.Table1 t1
LEFT JOIN dbo.Table2 t2 ON t2.RepairId = t1.RepairId
) d
WHERE tire IS NOT NULL
) e
WHERE Sector < 5
GROUP BY DealerID, TotalRepairs, tire, RepairsOfThisTire--, PercentOfTotal
ORDER BY DealerID, RepairsOfThisTire DESC, tire;
The problem with this query is it returns data for dealer id 3. The business rule I was given was that they only want to see the top 80%. In the case of dealer id 3 the query is showing the top 100%. In this example the top 80% cannot be calculated since all the orders are the same tire. The main issue I see and need to get clarification from the business on is what happens if the top tire is > 80%?
I do like this query however. I am not 100% sure on how it works though. How does it know to select the top 80%? Why NTILE(5)?
March 24, 2014 at 8:23 am
Have a look at the NTILE function in Books Online (SQL Server help).
NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 24, 2014 at 8:36 am
ChrisM@Work (3/24/2014)
Have a look at the NTILE function in Books Online (SQL Server help).NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.
I was aware it split the data into 5 groups I was just unaware how it knows to select 80%. So 5 groups 20% in each group?
So from what I understand is the data is first partitioned by DealerID and then split into 5 groups within each DealerID based on RepairsOfThisTire. Why 5 is my question.
When I add another record to table 1 (3,18) and table 2 (18, '205/70R15 89H') for dealer 3 it then shows two records one at 75% and another at 25%.
March 24, 2014 at 8:46 am
tfendt (3/24/2014)
ChrisM@Work (3/24/2014)
Have a look at the NTILE function in Books Online (SQL Server help).NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.
I was aware it split the data into 5 groups I was just unaware how it knows to select 80%. So 5 groups 20% in each group?
So from what I understand is the data is first partitioned by DealerID and then split into 5 groups within each DealerID based on RepairsOfThisTire. Why 5 is my question.
When I add another record to table 1 (3,18) and table 2 (18, '205/70R15 89H') for dealer 3 it then shows two records one at 75% and another at 25%.
80% is 4/5, hence NTILE(5) and discard for n=5. As you've mentioned above, if the number of qualifying rows is not divisible by 5 then you have a choice to make. The query is written the simplest way which might meet your business requirement, but is adjustable if the requirement is different.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 24, 2014 at 8:55 am
ChrisM@Work (3/24/2014)
tfendt (3/24/2014)
ChrisM@Work (3/24/2014)
Have a look at the NTILE function in Books Online (SQL Server help).NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.
I was aware it split the data into 5 groups I was just unaware how it knows to select 80%. So 5 groups 20% in each group?
So from what I understand is the data is first partitioned by DealerID and then split into 5 groups within each DealerID based on RepairsOfThisTire. Why 5 is my question.
When I add another record to table 1 (3,18) and table 2 (18, '205/70R15 89H') for dealer 3 it then shows two records one at 75% and another at 25%.
80% is 4/5, hence NTILE(5) and discard for n=5. As you've mentioned above, if the number of qualifying rows is not divisible by 5 then you have a choice to make. The query is written the simplest way which might meet your business requirement, but is adjustable if the requirement is different.
:pinch: Makes sense now. Not sure why I was trying to make it more complex than it really was.
March 24, 2014 at 9:00 am
tfendt (3/24/2014)
ChrisM@Work (3/24/2014)
tfendt (3/24/2014)
ChrisM@Work (3/24/2014)
Have a look at the NTILE function in Books Online (SQL Server help).NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.
I was aware it split the data into 5 groups I was just unaware how it knows to select 80%. So 5 groups 20% in each group?
So from what I understand is the data is first partitioned by DealerID and then split into 5 groups within each DealerID based on RepairsOfThisTire. Why 5 is my question.
When I add another record to table 1 (3,18) and table 2 (18, '205/70R15 89H') for dealer 3 it then shows two records one at 75% and another at 25%.
80% is 4/5, hence NTILE(5) and discard for n=5. As you've mentioned above, if the number of qualifying rows is not divisible by 5 then you have a choice to make. The query is written the simplest way which might meet your business requirement, but is adjustable if the requirement is different.
:pinch: Makes sense now. Not sure why I was trying to make it more complex than it really was.
80% of software is more complex than it needs to be for this same reason 😉
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 24, 2014 at 10:37 am
Not sure of the performance of this solution, but it is worth a try.
CREATE TABLE dbo.Table1 (DealerId INT, RepairId INT);
INSERT INTO dbo.Table1 VALUES
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(1,8),(1,9),(1,10),(2,11),(2,12),(2,13),
(3,14),(3,15),(3,16),
(4,17),(4,18),(4,19),(4,20),(4,21),(4,22),(4,23),(4,24),(4,25),(4,26);
CREATE TABLE dbo.Table2 (RepairId INT, tire VARCHAR(15));
INSERT INTO dbo.Table2 VALUES (1,'225/65R17 102T'),
(2,'225/65R17 102T'),(3,'225/65R17 102T'),(4,'235/60R18 102V'),
(5,'235/60R18 102V'),(6,'235/60R18 102V'),(7,'235/60R18 102V'),
(8,'205/55R16 89H'),(9,'205/70R15 89H'),(13,'225/65R17 102T'),
(14,'235/60R18 102V'),(15,'235/60R18 102V'),(16,'235/60R18 102V'),
(17,'235/60R18 102V'),(18,'235/60R18 102V'),(19,'235/60R18 102V'),
(20,'235/60R18 102V'),(21,'235/60R18 102V'),(22,'235/60R18 102V'),
(23,'235/60R18 102V'),(24,'235/60R18 102V'),(25,'235/60R18 102V'),
(26,'225/65R17 102T');
go
with BaseData as (
select
t1.DealerId,
t1.RepairId,
t2.tire,
tirecnt = count(t2.tire) over (partition by t1.DealerId, t2.tire),
repaircnt = count(t1.RepairId) over (partition by t1.DealerId)
from
dbo.Table1 t1 left join
dbo.Table2 t2
on t1.RepairId = t2.RepairId
), Level1 as (
select
bd.DealerId,
bd.RepairId,
bd.tire,
bd.repaircnt,
bd.tirecnt,
PercentOrders = (bd.tirecnt * 100.0 / bd.repaircnt),
cnt = (row_number() over (partition by bd.DealerId
order by
case when bd.tire is null then 1 else 0 end,
bd.tirecnt desc,
bd.tire,
bd.RepairId) * 100.0) / bd.repaircnt
from
BaseData bd
), Level2 as (
select *, rn = row_number() over (partition by DealerId, tire order by cnt desc)
from Level1
where cnt <= 80.0)
select DealerId, tire, tirecnt as RepairsThisTire, repaircnt as RepairCnt, PercentOrders as PercentOfTotal from Level2 where rn = 1 and tire is not null
order by DealerId, tirecnt desc;
go
DROP TABLE dbo.Table1;
DROP TABLE dbo.Table2;
March 24, 2014 at 11:02 am
ChrisM@Work (3/24/2014)
tfendt (3/24/2014)
ChrisM@Work (3/24/2014)
Have a look at the NTILE function in Books Online (SQL Server help).NTILE(n) splits the data into n equal-sized partitions numbered 1 to n in a user-defined order and with defined rules for when the data set is not divisible by n.
I was aware it split the data into 5 groups I was just unaware how it knows to select 80%. So 5 groups 20% in each group?
So from what I understand is the data is first partitioned by DealerID and then split into 5 groups within each DealerID based on RepairsOfThisTire. Why 5 is my question.
When I add another record to table 1 (3,18) and table 2 (18, '205/70R15 89H') for dealer 3 it then shows two records one at 75% and another at 25%.
80% is 4/5, hence NTILE(5) and discard for n=5. As you've mentioned above, if the number of qualifying rows is not divisible by 5 then you have a choice to make. The query is written the simplest way which might meet your business requirement, but is adjustable if the requirement is different.
Just got confirmation from the business. When the number of rows is not divisible by 5 we will display all the results. I will modify the code to use your new query.
Viewing 11 posts - 16 through 25 (of 25 total)
You must be logged in to reply to this topic. Login to reply