May 4, 2012 at 1:29 pm
Hi all,
Just a quick question which I'm fairly sure I already know the answer to.
Let's say I have a table which stores a clientID and the purchases the client made, as well as a table that stored the clientID and his different accounts.
Also, let's say the client had multiple purchases and multiple accounts:
CREATE TABLE #ClientPurchases
(
ID INT IDENTITY PRIMARY KEY,
ClientID INT,
Purchase DECIMAL(18, 2)
)
CREATE TABLE #ClientAccounts
(
ID INT IDENTITY PRIMARY KEY,
ClientID INT,
AccountNumber BIGINT
)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (1, 10)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (1, 20)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (1, 50)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (1, 100)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (2, 100)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (2, 500)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (2, 400)
INSERT INTO #ClientPurchases (ClientID, Purchase)
VALUES (2, 400)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (1, 1234567)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (1, 1512512)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (1, 2362632)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (1, 412421)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (2, 235236262)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (2, 141241)
INSERT INTO #ClientAccounts (ClientID, AccountNumber)
VALUES (2, 124124)
Now let's say I want to get a count of the total number of accounts, purchases, and the total value of the purchases. Normally I'd write a request like this:
SELECT
#ClientPurchases.ClientID,
COUNT(DISTINCT AccountNumber) AS NumAccounts,
COUNT(DISTINCT #ClientPurchases.ID) AS NumPurchases,
SUM(DISTINCT Purchase) AS AmtPurchases
FROM #ClientPurchases
JOIN #ClientAccounts ON #ClientPurchases.ClientID = #ClientAccounts.ClientID
GROUP BY #ClientPurchases.ClientID
This works for the number of accounts and purchases, but clearly doesn't work for the value of the purchases, because the SUM(DISTINCT Purchase) part is looking at a distinct value of purchases, and not a distinct purchase record.
Obviously there's ways I can get this to work, such as for example splitting the request into two:
WITH CountAccounts AS
(
SELECT
ClientID,
COUNT(*) AS NumAccounts
FROM #ClientAccounts
GROUP BY ClientID
),
ClientPurchases AS
(
SELECT
ClientID,
COUNT(*) AS NumPurchases,
SUM(Purchase) AS AmtPurchases
FROM #ClientPurchases
GROUP BY ClientID
)
SELECT
ISNULL(CountAccounts.ClientID, ClientPurchases.ClientID) AS ClientID,
ISNULL(NumAccounts, 0) AS NumAccounts,
ISNULL(NumPurchases, 0) AS NumPurchases,
ISNULL(AmtPurchases, 0) AS AmtPurchases
FROM CountAccounts
FULL JOIN ClientPurchases ON CountAccounts.ClientID = ClientPurchases.ClientID
My question is just whether there is a simpler way of doing this that I'm forgetting about, since my real query is considerably bigger and I'd rather not take the time to figure out how to split it up. Basically is there a simple way I can get a SUM of distinct values, but where the "distinct-itude" is not decided by the value, but rather by a different key.
May 4, 2012 at 1:34 pm
Unfortunately not that I'm aware of. Because the Join will distribute the two halves of the join into both counts, you have to use correllated subqueries (basically the CTE you built) for each value.
Personally, I'd probably do something like:
SELECT
clientID,
(SELECT COUNT(*) FROM #ClientAccount WHERE ClientID = d.ClientID)
...
FROM
(SELECT DISTINCT ClientID FROM tblClients) AS d
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
May 4, 2012 at 2:43 pm
Based on your sample data, what should the expected output look like? Manually build us a table that represents the expected results.
May 4, 2012 at 2:48 pm
The second query provides the correct results from my example set, but here they are anyway:
ClientIDNumAccountsNumPurchasesAmtPurchases
144180.00
2341400.00
May 9, 2012 at 4:35 am
I would do this with a correlated subquery, thusly:
SELECT ClientID
,(SELECT COUNT(AccountNumber)
FROM #ClientAccounts ca
WHERE ca.ClientID = cp.ClientID
GROUP BY ClientID) AS NumAccounts
,NumPurchases, AmtPurchases
FROM (
SELECT ClientID, COUNT(ClientID) AS NumPurchases, SUM(Purchase) AS AmtPurchases
FROM #ClientPurchases
GROUP BY ClientID) cp
IMHO, this version looks somewhat simpler than your version that you said works and its query plan cost shows less. Actual performance may vary though so run a timing test to be sure.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
May 9, 2012 at 6:14 am
Actually, I have a better idea. If you need to worry about ClientIDs that have accounts but no purchases (the above query won't work), use this.
;WITH CTE (ClientID, AccountNumber, NumPurchases, AmtPurchases) AS (
SELECT ClientID, 1, 0, 0 FROM #ClientAccounts
UNION ALL
SELECT ClientID, 0, 1, Purchase FROM #ClientPurchases
)
SELECT ClientID, SUM(AccountNumber) AS NumAccounts
,SUM(NumPurchases) AS NumPurchases, SUM(AmtPurchases) AS AmtPurchases
FROM CTE
GROUP BY ClientID
Still pretty simple and has the same query plan cost as my prior suggestion.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply