February 17, 2020 at 12:00 am
Comments posted to this topic are about the item Handling Aggregations on a Poorly Designed Database
Aveek has been working as an Analytics Engineer for the past couple of years now. His main areas of interests is SQL Server, SSIS, and SSAS. On his leisure, he likes to travel and is also an amateur photographer.
https://twitter.com/avikoleum
https://www.linkedin.com/in/aveekd22/
February 17, 2020 at 11:50 am
I think you could also just use a CTE such as:
;WITH CTE AS
(
SELECT
us.Name ,us.TeamID
,SUM(wl.HoursWorked) HoursWorked,
ROW_NUMBER() OVER(PARTITION BY NAME ORDER BY Name) RN
FROM Users us
INNER JOIN WorkLog wl ON wl.UserKey = us.UserKey
GROUP BY us.Name ,us.TeamID
)
SELECT Name,HoursWorked
from CTE
WHERE RN = 1
February 17, 2020 at 8:08 pm
How's about this instead?!:
SELECT
us.Name
,SUM(wl.HoursWorked) / COUNT(DISTINCT us.TeamID) AS HoursWorked
FROM Users us
INNER JOIN WorkLog wl ON wl.UserKey = us.UserKey
GROUP BY us.Name
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 24, 2020 at 5:19 am
The original design should have imposed a unique constraint on WorkLog.UserKey. Removing such a unique constraint (if done, in order to accommodate new design requirements) can have far reaching and unexpected consequences (beyond this view) - such as SQL that has been embedded in client application code.
The following assumes only 1 set of duplicate user hours will be inserted by the new design (i.e., WorkLog will remain designed as-is - there will be no splitting of a user's work hours between teams, and the same user will not work the same number of hours in different WorkLog entries that are being summed):
SELECT
us.Name
,SUM(distinct wl.HoursWorked) HoursWorked
FROM Users us
INNER JOIN WorkLog wl ON wl.UserKey = us.UserKey
GROUP BY us.Name
If the real tables and their uses are more complex, consider aggregating via the windowing function OVER(PARTITION BY ...), where "..." can be what I am imagining to be a column that is named (or serving the same purpose as a) [day] - using this can eliminate some of my assumptions.
I did not compare execution plans or statistics io.
With respect to the original design, there is also a possibility to sp_rename the original tables, create new base tables (with new names) that are normalized per the new design requirements, import the original tables' data into the new tables, and create update-able views that are named identically to the original tables (and reference the new tables). While that possibility is more work, it allows a new design to be more formally, publicly, and robustly declared in SQL, and yet still allow legacy applications to address what they believe to be the tables (but are now views). As a bonus, those who follow you (years later) will see your intent, formally declared :). There can be concerns with this approach (such as table or index maintenance activities), but usually "a rose by any other name smells just as sweet".
March 5, 2020 at 3:44 pm
It seems likely that reporting by team would be a feature of the application. Otherwise, why have teams? The data structure cannot support this without modification. I guess my question is, at what point do you just say 'Your application data structure requires modification in order to handle NEW requirements.'? Instead of band-aiding unsustainable solutions into an already poor design.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply