January 18, 2007 at 4:04 pm
Earlier I posted with my exact query but I think that's too confusing, so I am making an independent example up to demonstrate my question without any interference from the specifics of my code exactly.
The following tables are used in this example: (pseudo-code)
CREATE TABLE Category (
CategoryId INT PRIMARY KEY,
CategoryMin BIGINT,
CategoryMax BIGINT
)
CREATE TABLE CategoryHistory (
CategoryId INT,
Year CHAR(4),
TargetPercentage FLOAT,
PRIMARY KEY (CategoryId, Year)
)
CREATE TABLE SomeDataTable (
someId INT PRIMARY KEY,
categoryValue BIGINT,
Year CHAR(4)
)
Now for what values to test with:
Category -
CategoryId, CategoryMin, CategoryMax
1 0 100
2 101 200
3 201 300
CategoryHistory -
CategoryId, Year, TargetPercentage
1 200020.0
1 200120.0
1 200220.0
2 200045.0
2 200145.0
2 200245.0
3 200035.0
3 200135.0
3 200235.0
SomeDataTable -
SomeId, CategoryValue
145
289
378
4156
5256
6299
7179
8122
9222
10224
Now what I need to do is get counts from SomeDataTable based on two separate sets of constraints. For this example, I will just make the second count count all the rows since I only have 10 rows, but in the real table I have there are other additional constraints, but the bottom line is one join has more constaints, the other has fewer, and the joins are both returning sets from the same table for the purpose of obtaining a count...
SELECT ch.[year], c.CategoryId, c.categoryMin, c.categorymax, ch.targetPercentage,
COUNT(s.someId) AS MoreConstrainedCount
COUNT(s2.someId) AS LessConstrainedCount
FROM CategoryHistory ch
JOIN Category c ON c.categoryid = ch.categoryId
LEFT JOIN SomeDataTable s ON ch.[year] = s.[year]
AND s.CategoryValue > c.categoryMin
AND s.CategoryValue <= c.CategoryMax
LEFT JOIN SomeDataTable s2 ON ch.[year] = s2.[year]
GROUP BY ch.[year], c.CategoryId, c.categoryMin, c.categorymax, ch.TargetPercentage
ORDER BY ch.[year], c.categoryId
The counts are messed up if you do both joins. It's cubing them, or adding them together or something, and it makes them both the same, which is not what I intuitively thought would happen.
So I am wondering why this can not work, and what will work to give me correct counts. If I do only one join, then I get correct counts. It's when I introduce the second join it throws them all off and makes both counts the same in each row.
Anyone have any ideas? Thanks in advance.
January 18, 2007 at 4:44 pm
If the join is essentially the same in both cases, but with more constraints on the count, you can solve it with just one join, by moving the expression into a CASE ... WHEN that resolves to zero or one, and summing the result:
SELECT ch.[year], c.CategoryId, c.categoryMin, c.categorymax, ch.targetPercentage,
--COUNT(s.someId) AS MoreConstrainedCount
Sum(
Case When s.CategoryValue > c.categoryMin AND s.CategoryValue <= c.CategoryMax
Then 1
Else 0
End
) As MoreConstrainedCount
COUNT(s2.someId) AS LessConstrainedCount
FROM CategoryHistory ch
JOIN Category c ON c.categoryid = ch.categoryId
LEFT JOIN SomeDataTable s ON ch.[year] = s.[year]
GROUP BY ch.[year], c.CategoryId, c.categoryMin, c.categorymax, ch.TargetPercentage
ORDER BY ch.[year], c.categoryId
So, you join to ALL the rows, count them all as the LessConstrainedCount, and then count only a subset using the SUM( CASE ... WHEN ) construct to apply the additional constraints for the lessor count.
January 18, 2007 at 4:58 pm
Oh my gosh, I spent nearly all day banging my head against the wall trying to come up with this. Thank you very much for the help and your time.
However, I still don't understand why my way does not work and it's just bothering me. I just can't see any good reason for why it should not work.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply