April 14, 2014 at 3:33 pm
I have one table that contains members, and a query which contains related records of payments made grouped by month and year. I would like to find all member records having at least one record for every month of the year for the past 12 months. How can I do that? Nothing I do seems to work.:w00t: Thanks in advance.
April 14, 2014 at 5:51 pm
This should get you started 😎
DECLARE @REF_DATE DATE = '2014-06-01';
DECLARE @MEM_PAY_DATE TABLE
(
PAYMENT_ID INT NOT NULL
,MEMBER_ID INT NOT NULL
,PAYMENT_DATE DATE NOT NULL
,PAYMENT_AMOUNT INT NOT NULL
)
INSERT INTO @MEM_PAY_DATE (PAYMENT_ID,MEMBER_ID,PAYMENT_DATE,PAYMENT_AMOUNT)
SELECT
PAYMENT_ID,MEMBER_ID,CAST(PAYMENT_DATE AS DATE) AS PAYMENT_DATE,PAYMENT_AMOUNT
FROM (VALUES
( 1,2,'2013-01-01',500.00),( 2,2,'2013-02-01',500.00),( 3,2,'2013-03-01',500.00)
,( 4,2,'2013-04-01',500.00),( 5,2,'2013-05-01',500.00),( 6,2,'2013-06-01',500.00)
,( 7,2,'2013-07-01',500.00),( 8,2,'2013-08-01',500.00),( 9,2,'2013-09-01',500.00)
,(10,2,'2013-10-01',500.00),(11,2,'2013-11-01',500.00),(12,2,'2013-12-01',500.00)
,(13,2,'2014-01-01',500.00),(14,2,'2014-02-01',500.00),(15,2,'2014-03-01',500.00)
,(16,2,'2014-04-01',500.00),(17,2,'2014-05-01',500.00),(18,2,'2014-06-01',500.00)
,(19,2,'2014-07-01',500.00),( 1,3,'2013-01-01',500.00),( 2,3,'2013-02-01',500.00)
,( 3,3,'2013-03-01',500.00),( 4,3,'2013-04-01',500.00),( 5,3,'2013-05-01',500.00)
,( 6,3,'2013-06-01',500.00),( 7,3,'2013-07-01',500.00),( 8,3,'2013-08-01',500.00)
,( 9,3,'2013-09-01',500.00),(10,4,'2013-10-01',500.00),(11,4,'2013-11-01',500.00)
,(12,4,'2013-12-01',500.00),(13,4,'2014-01-01',500.00),(14,4,'2014-02-01',500.00)
,(15,4,'2014-03-01',500.00),(16,4,'2014-04-01',500.00),(17,4,'2014-05-01',500.00)
,(18,4,'2014-06-01',500.00),(19,4,'2014-07-01',500.00),( 8,2,'2014-12-01',500.00)
) AS X(PAYMENT_ID,MEMBER_ID,PAYMENT_DATE,PAYMENT_AMOUNT);
;WITH DIST_MONTHS AS
(SELECT * FROM
(
SELECT
MPD.MEMBER_ID
,MPD.PAYMENT_DATE
,ROW_NUMBER() OVER
(
PARTITION BY MEMBER_ID, YEAR(MPD.PAYMENT_DATE), MONTH(MPD.PAYMENT_DATE)
ORDER BY YEAR(MPD.PAYMENT_DATE) ASC, MONTH(MPD.PAYMENT_DATE) ASC
) AS MPD_RID
,(YEAR(MPD.PAYMENT_DATE) * 100) + MONTH(MPD.PAYMENT_DATE) AS MPD_YM
FROM @MEM_PAY_DATE MPD
) AS X WHERE X.MPD_RID = 1)
,FULL_YEAR_PAY AS
(
SELECT
DM.MEMBER_ID
,COUNT(DM.PAYMENT_DATE) OVER
(PARTITION BY DM.MEMBER_ID) AS MP_COUNT
,ROW_NUMBER() OVER
(
PARTITION BY DM.MEMBER_ID
ORDER BY (SELECT NULL)
) AS MP_RID
FROM DIST_MONTHS DM
WHERE DM.MPD_YM BETWEEN ((YEAR(DATEADD(MONTH,-11,@REF_DATE)) * 100) + MONTH(DATEADD(MONTH,-11,@REF_DATE)))
AND ((YEAR(@REF_DATE) * 100) + MONTH(@REF_DATE))
)
SELECT
*
FROM FULL_YEAR_PAY FYP
WHERE FYP.MP_RID = 1
AND FYP.MP_COUNT = 12
April 15, 2014 at 3:29 am
Using Eirikur's data set, here's a different way:
;WITH MyOriginalQuery AS ( -- a query which contains related records of payments made grouped by month and year
SELECT PAYMENT_ID, MEMBER_ID, [Year] = YEAR(PAYMENT_DATE), [Month] = MONTH(PAYMENT_DATE), PAYMENT_AMOUNT
FROM @MEM_PAY_DATE
)
SELECT m.*
FROM (
SELECT
MEMBER_ID,
MonthsPaid = COUNT(*) OVER(PARTITION BY MEMBER_ID)
FROM MyOriginalQuery q
INNER JOIN (
SELECT
[Year] = YEAR(DATEADD(month,-n,GETDATE())),
[Month] = MONTH(DATEADD(month,-n,GETDATE()))
FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) d (n)
) ym
ON ym.[Year] = q.[Year] AND ym.[Month] = q.[Month]
) d
INNER JOIN Members m
ON m.MEMBER_ID = d.MEMBER_ID
WHERE d.MonthsPaid = 12
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 13, 2014 at 5:31 pm
Thank you! I'll try it.:-)
May 13, 2014 at 5:33 pm
Thank you! I'll try this one too!:-)
May 14, 2014 at 10:07 am
It's funny how Joe Celko can suggest that someone learns so many ISO standards and can't even learn an ANSI standard that has been out for 22 years. Please use SQL-92 JOINS.
The period table is a good option, but it should be tested as well to see if it's the best.
May 14, 2014 at 11:46 am
You're assuming that all people think the same way.
People that think in sets, are very capable to understand commutativity so they can understand that a + b + c = c + a + b. Which can use "big sigma" notation.
The problem would be consistency. As this is a SQL Server 2014 forum, the old join syntax would not be available for outer joins. Even worse than using SQL-86 joins is to mix both standards just because you claim it's the way to think on a set based manner.
May 14, 2014 at 5:01 pm
You really love to overcomplicate things.
Even worse, I can force the order of execution with derived tables. I materialize one of the joins, thus
((T1
INNER JOIN
T2
ON T1.a = T2.a)
AS X
T3
ON X.a = T3.a);
You can't be sure about that. In SQL Server (because this is a SQL Server site), there's no guarantee of the order of the execution unless you use a hint to your query.
Another advantage is visual.
If you take your time to format your code correctly, you don't need to move around the predicates to identify the relations of the tables. Using ON clauses, you can differentiate between relationships and conditions. I find a lot easier to read when you write one table per row with its relationships explained, than throwing all of them in a single row and trying to differentiate what's a relationship and what's a condition for the query.
I also try to use parens to get the inner joins together so they can be optimized.
Again, that won't give optimization. If you aren't sure about how a specific engine works, you shouldn't claim something will optimize a query. And no, there's no one-query-fits-all.
May 14, 2014 at 9:14 pm
< horse riding in ... clop, clop, clop, clop, clop > Everyone else is wrong!! My way is the only way!! < clop, clop, clop, clop, clop ... horse riding out >
(here we go again)
May 15, 2014 at 2:58 am
SQL Server will join tables in whatever order results in the lowest-cost plan. Join order doesn’t matter. The order of ON clauses matters – and can be manipulated to do some funky stuff.
“You have a query which joins three tables, say Customers, Orders and Order Lines. It’s an outer join between Customers and Orders because you want all customers in the output whether or not they’ve ever placed an order. The order can’t be empty, it must actually have some lines, and these should appear in the output.”
Create a customer table with three customers, Peter, Simon and Chris
CREATE TABLE #Customers (CustomerID INT IDENTITY(1,1), CustomerName VARCHAR(20))
INSERT INTO #Customers (CustomerName) VALUES ('Peter'), ('Simon'), ('Chris')
Three orders for Peter, two orders for Simon and none for Chris (boo hoo)
CREATE TABLE #Orders (OrderID INT IDENTITY(1,1), CustomerID INT)
INSERT INTO #Orders (CustomerID) VALUES (1),(1),(1),(2),(2)
Only one of those orders has any items on it – Peter’s first order
CREATE TABLE #Orderlines (OrderlineID INT IDENTITY(1,1), OrderID INT, PartName VARCHAR(20))
INSERT INTO #Orderlines (OrderID, PartName) VALUES (1, 'Peter01'), (1, 'Peter02')
Then you write the obvious query:
-- Query 1
SELECT c.*, o.*, ol.*
FROM #Customers c
LEFT JOIN #Orders o ON o.CustomerID = c.CustomerID
INNER JOIN #Orderlines ol ON ol.OrderID = o.OrderID
The result set isn’t what you might expect – there are only two rows, corresponding to Peter’s two items. The result set looks as if SQL Server has changed the outer join to an inner join and the execution plan confirms this. Most TSQL coders know and will sometimes begin by testing a query where both child tables are outer joined:
-- Query 2
SELECT c.*, o.*, ol.*
FROM #Customers c
LEFT JOIN #Orders o ON o.CustomerID = c.CustomerID
LEFT JOIN #Orderlines ol ON ol.OrderID = o.OrderID
-Which returns too many rows and is tricky to filter to meet the requirements. So they switch to this:
-- Query 3
SELECT c.*, o.*
FROM #Customers c
LEFT JOIN (
SELECT o.*, ol.OrderlineID, ol.PartName
FROM #Orders o
INNER JOIN #Orderlines ol ON ol.OrderID = o.OrderID
) o
ON o.CustomerID = c.CustomerID
-Which generates the correct result set as you would expect, and the execution plan confirms an inner join and an outer join.
You could also bracket your joins, but it’s so counterintuitive and tricky to maintain that I’m not even going to provide an example. If you must have a look, use Google – then forget what you’ve seen because if you use it, you risk defenestration by other TSQL coders. There is another way, and that is to change the order of the ON clauses, so that Orders and Orderlines are inner joined before the product is then joined to the Customer table:
-- Query 4
SELECT c.*, o.*, ol.*
FROM #Customers c
LEFT JOIN #Orders o
INNER JOIN #Orderlines ol
ON ol.OrderID = o.OrderID
ON o.CustomerID = c.CustomerID
This also generates the correct result set, confirmed by the execution plan. It’s easy to see what’s going on and the plan is only trivially different from query 3 – it’s slightly “cheaper”.
There are two points to make here. Firstly, join order doesn’t matter to SQL Server but ON clause order does. Secondly, TSQL coders like any other programmers choose a formatting style for their code which offers to them the fastest “read-to-understand” interval – the length of time it takes to figure out what the code is supposed to be doing. For most of us, that means putting join predicates in ON clauses, putting ON clauses immediately after the second table to which they relate, and putting filters in the WHERE clause for inner joins or in the ON clause for outer joins. There are rules for the first table in the FROM list too. If you take a look at the TSQL posted on ssc by the most prolific TSQL responders, almost all of it adheres to this format. I don’t believe any of them considered ANSI standards when they chose to code in this way. Most don’t give a flying snake. What they care about is their own productivity, and what they gain is exchangeability with other TSQL coders. We can read and understand each other’s code really fast.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 15, 2014 at 7:19 am
CELKO (5/14/2014)
horse riding in ... clop, clop, clop, clop, clop > Everyone else is wrong!! My way is the only way!! < clop, clop, clop, clop, clop ... horse riding out > (here we go again)
Did I fail to give an example? fail to quote an industry standard ? A known principle or fact?
Nope! I am pedantic! I even post ISO Standards by number! If you can post an equally strong argument for another position with the supporting evidence and references then please do it.
I spent a decade working on the SQL Standard, and before that I wrote columns in the trade press on Software Engineering. I was researching this stuff full time for AIRMICS while I was at GA Tech.
It is your style. Just reading it screams that you are always right and everyone else is wrong. You push the ISO standards but I know of few companies willing to pay for copies of the standards and to require their developers to following them (with perhaps the exception of some companies working government contracts where the government requires that those be followed, which hasn't been the ones I have worked for up to now) and I don't know many developers (database of or otherwise) that are willing to pay for them out of their own pocket either. So, instead of espousing your own personal views on this in every post you make, back off a bit and just try helping and teaching. You could always post a link to a fixed blog post that espouses your pedantic view of things.
I have been working and learning about SQL Server for over 15 years now, and I know I still have more to learn and discover about it. And I plan on using the system to its fullest which means I will use the local dialect (T-SQL) to get the most from the system. If this means rewriting the code if it has to be ported to PostgreSQL or Oracle at some point, so be it.
You may have much to teach even me but your style simply turns me off. Which is why I won't even buy or read your books or come to watch you at conferences. Maybe if you made an effort to be nicer, more approachable online I might consider changing my attitude.
Edit:
And this will be the last I have to say. I will refrain from making any more comments to you on any future forum thread you may venture to comment.
May 15, 2014 at 8:48 am
CELKO (5/14/2014)
I disagree. In typography and visual psychology there is a principle called the Law of Proximity; the basic idea is that things in proximity on a plane are grouped together in your mind. This why the sentence : “I saw the cheese the mouse the cat the dog chased caught ate. “ is hard to read unless your native language uses Reverse Polish notation 🙂 The verbs are clustered away from their subjects. There are also visual illusions based on it.
Let me help you with a visual example and how a correct formatting will comply with your argument of Law of Proximity and will also help people be able to identify any part of a query with a quick look.
Tables are close to each other vertically and close to their relationships horizontally. Alias are also close to each other. If you want to look for all join conditions of a table, you look to the rigth to find out the conditions to previous tables and down for any condition that start with the corresponding alias.
Furthermore, not all predicates are simple binary; BETWEEN is 3-ary and I can do n-ary relations and they are not always reduced to binary predicates. Try to use "T1.a BETWEEN T2.b AND T3.c" in an ON clause; which ON clause do you use? Toi ne, it is in the WHERE clause that applies to the whole query, not one step in a sequence of processing steps.
As you can see, I wouldn't use that horrible condition. I'd keep each relation on a separate condition. Why would you use something like that? Maybe you need to review your design.
I think what you are trying to say is that you want join conditions in the ON clauses and SARG (Search ARGuments) in the WHERE clause. This is what ACCESS does in its generated code. But it violates Law of Proximity and it measurably harder to read for understanding. While not important, optimizers usually (not always!) apply SARGs first to reduce the size of the working set before it does the join.
I'm not aware about other optimizers, but SQL Server (in most cases at least) will apply the Join first, so this argument doesn't help.
Another advantage on using this syntax is that you can identify faster any accidental cross join caused by the lack of a join condition. As you say, it's part of the Law of Proximity. The subject (tables) must be near to the verb (join condition).
EDIT: I forgot, here's a clean image of the code.
May 15, 2014 at 12:09 pm
I thought that it was clear that I posted an example of a bad design just to show you how to include the BETWEEN on 3 different tables as you mentioned (Could you deliver a good example of how would you do that?).
You changed your argument to something that even if it's correct, it has nothing to do with the previous discussion.
Your query doesn't seem as readable as mine. You don't give the proper space to be able to differentiate one part from the other and your query would need a major change if you needed to change an inner join into an outer join.
Views aren't "the law of proximity at the extreme", they're just code obfuscation. They won't allow you to easily identify the involved objects.
May 15, 2014 at 12:48 pm
Lynn Pettis (5/14/2014)
< horse riding in ... clop, clop, clop, clop, clop > Everyone else is wrong!! My way is the only way!! < clop, clop, clop, clop, clop ... horse riding out >(here we go again)
The only thing missing from this scene is a dry voice "When a naked man's chasing a woman through an alley with a butcher knife and a
, I figure he's not out collecting for the Red Cross."
😎
May 15, 2014 at 1:59 pm
Lynn Pettis (5/15/2014)
It is your style. Just reading it screams that you are always right and everyone else is wrong.
Lynn Pettis (5/15/2014)
I have been working and learning about SQL Server for over 15 years now, and I know I still have more to learn and discover about it. And I plan on using the system to its fullest which means I will use the local dialect (T-SQL) to get the most from the system. If this means rewriting the code if it has to be ported to PostgreSQL or Oracle at some point, so be it.
Lynn Pettis (5/15/2014)
You may have much to teach even me but your style simply turns me off. Which is why I won't even buy or read your books or come to watch you at conferences. Maybe if you made an effort to be nicer, more approachable online I might consider changing my attitude.
+1000 on each point
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply