July 1, 2016 at 12:35 pm
Hi guys. I'm fairly new to SQL, but learning quickly. I've got a situation where I need to query a list of jobs, select some that meet a specific criteria (easy) but I need to also select some other's that would be related to these (don't know how).
Here's some sample data to illustrate:
Source table:
create table projects
(
ProjectNo nvarchar (20),
ProjectType nvarchar(10),
Location nvarchar (10),
)
insert into projects (ProjectNo, ProjectType, Location)
values ('abc123', 'new', '111');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc124', 'std', '222');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc125', 'new', '333');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc126', 'std', '111');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc127', 'std', '111');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc128', 'std', '666');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc129', 'std', '777');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc134', 'std', '1332');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc135', 'new', '1443');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc136', 'new', '1554');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc137', 'new', '1665');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc138', 'std', '1776');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc139', 'std', '1554');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc140', 'std', '1998');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc141', 'std', '1554');
insert into projects (ProjectNo, ProjectType, Location)
values ('abc142', 'std', '2220');
I need to query the rows that have "new" as the project type. Easy. But for each one of the rows with ProjectType = "new" I also need to append any row that has a matching value in the "Location" field. Call these "children". So for project abc123, at location 111 with ProjectType="new", there are 2 other records tha are "children" of this record. abc126 and abc127.
I'd like the result of my query to look like this. Each parent retrieved, along with any related "child" records. (Order is not important, I grouped by location and I added blank lines for readability)
ProjectProjectTypeLocation
abc123new111
abc126std111
abc127std111
abc125new333
abc135new1443
abc136new1554
abc139std1554
abc141std1554
abc137new1665
Can someone help me with this query?
Thanks very much.
Scott
July 1, 2016 at 12:49 pm
;WITH NewProjects
AS
(
SELECT Location
FROM projects
WHERE ProjectType = 'New'
)
SELECT p.ProjectNo
,p.ProjectType
,p.Location
FROM NewProjects AS np
JOIN Projects AS p ON p.Location = np.Location
ORDER BY p.Location ASC
First I identified the projects with type New in the CTE.
Then use that location to return information for all projects/children from the Projects table.
July 1, 2016 at 12:54 pm
Or this:
SELECT *
FROM dbo.projects
WHERE Location IN
(
SELECT Location
FROM dbo.projects
WHERE ProjectType = 'new'
)
-- Itzik Ben-Gan 2001
July 1, 2016 at 1:03 pm
Thank you! An alternative that may work double duty for some other things I do might be to have a 4th field that flags the children when I build the source table, call that field "ChildDef" for example.
Can you show me how I could query the sample table and produce a resulting output where any row at a location where there is also an associated "new" ProjectType gets assigned the value of "newChild"?
Output like this:
ProjectProjectTypeLocationChildDef
abc123new111
abc124222
abc125new333
abc126111newChild
abc127111newChild
abc128666
abc129777
abc1341332
abc135new1443
abc136new1554
abc137new1665
abc1381776
abc1391554newChild
abc1401998
abc1411554newChild
abc1422220
July 1, 2016 at 1:13 pm
SELECT
p.ProjectNo,
ProjectType = CASE WHEN p.ProjectType = 'new' AND x.ProjectType IS NOT NULL THEN 'new' ELSE '' END,
p.Location,
ChildDef = ISNULL(ChildDef,'')
FROM dbo.projects p
LEFT JOIN
(
SELECT *, ChildDef = CASE ProjectType WHEN 'new' THEN '' ELSE 'newChild' END
FROM dbo.projects
WHERE Location IN
(
SELECT Location
FROM dbo.projects
WHERE ProjectType = 'new'
)
) x ON x.ProjectNo = p.ProjectNo;
-- Itzik Ben-Gan 2001
July 1, 2016 at 1:19 pm
Thank guys VERY MUCH. All work perfectly of course. I'm studying to dissect and comprehend the mechanics. I get it when I read it, now to get it to sink in so I can apply next time 🙂
July 5, 2016 at 12:24 pm
I thought I had a handle on this and was able to make some mods for my real problem. But I ran into an issue I didn't anticipate originally. My "children" are only children if their finish date is close (within 2 months) of each other.
Revised sample table:
create table projects
(
ProjectNo nvarchar (20),
ProjectType nvarchar(10),
Location nvarchar (10),
Finish date
)
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc123', 'new', '111','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc124', 'std', '222','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc125', 'new', '333','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc126', 'std', '111','12/1/16');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc127', 'std', '111','11/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc128', 'std', '666','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc129', 'std', '777','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc134', 'std', '1332','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc135', 'new', '1443','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc136', 'new', '1554','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc137', 'new', '1665','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc138', 'std', '1776','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc139', 'std', '1554','12/1/16');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc140', 'std', '1998','12/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc141', 'std', '1554','11/1/15');
insert into projects (ProjectNo, ProjectType, Location, Finish)
values ('abc142', 'std', '2220','12/1/15');
How can I restrict the sql above to only mark the children when their finish date is within 2 months?
The result should be:
ProjectNoProject TypeLocationChildDefFinish
abc123new11112/1/2016
abc12422212/1/2016
abc125new33312/1/2016
abc12611112/1/2017
abc127111newChild11/1/2016
abc12866612/1/2016
abc12977712/1/2016
abc134133212/1/2016
abc135new144312/1/2016
abc136new155412/1/2016
abc137new166512/1/2016
abc138177612/1/2016
abc139155412/1/2017
abc140199812/1/2016
abc1411554newChild11/1/2016
abc142222012/1/2016
There are 2 projects at location 111, one is the "parent" (new) project. One of the other 2 is within 2 months finish date so it is a "child". But the other one is a year away so it is not a child.
Hopefully that makes sense.
I tried to figure out how to parse it but I'm stuck 🙁
July 5, 2016 at 1:00 pm
;WITH NewProjects
AS
(
SELECT Location
, Finish
FROM Projects
WHERE ProjectType = 'New'
)
SELECT p.ProjectNo
, CASE p.ProjectType WHEN 'New' THEN p.ProjectType ELSE '' END AS ProjectType
, p.Location
, CASE WHEN (p.ProjectType <> 'New') AND (ABS(DATEDIFF(month, p.Finish, np.Finish)) <= 2) THEN 'newChild' ELSE '' END AS ChildDef
, p.Finish
FROM Projects AS p
LEFT JOIN NewProjects AS np ON np.Location = p.Location
ORDER BY p.Location ASC, p.ProjectType, p.Finish ASC
Edit - I changed it to a LEFT JOIN so that all projects are returned in the select
July 5, 2016 at 1:27 pm
Works, I'm digesting now. I think I was having some trouble understanding the virtual tables (my noob description) that get created in here and at what point they were "alive", I think I see now. Studying.
Thanks very much for the help and lesson. Again.
July 6, 2016 at 7:04 am
I made one small change so I get "new" or "newChild" in the ChildDef field.
How can I restrict this to just rows with one or the other of those 2 values? I gather I can't use WHERE with ChildDef, it errors with "invalid column" when I try it. I guess because it doesn't exist yet. I also tried limiting by projectType and had success with one case but not another. Which I didn't understand. See my commented out WHERE's at the end of this code:
Is there a way to filter the output on Childef?
WITH NewProjects
AS (
SELECT Location
,Finish
FROM Projects
WHERE ProjectType = 'New'
)
SELECT p.ProjectNo
,CASE p.ProjectType
WHEN 'New'
THEN p.ProjectType
ELSE ''
END AS ProjectType
,p.Location
,CASE
WHEN (p.ProjectType <> 'New')
AND (ABS(DATEDIFF(month, p.Finish, np.Finish)) <= 2)
THEN 'newChild'
WHEN p.ProjectType = 'New'
THEN 'new'
ELSE NULL
END AS ChildDef
,p.Finish
FROM Projects AS p
LEFT JOIN NewProjects AS np
ON np.Location = p.Location
--where p.ChildDef in('new','newChild') --fails
--where projecttype in('new') -- works
--where projecttype = '' -- no error but no results
ORDER BY p.Location ASC
,p.ProjectType
,p.Finish ASC
July 6, 2016 at 7:17 am
WHERE p.ProjectType <> 'New' AND (ABS(DATEDIFF(month, p.Finish, np.Finish)) <= 2
or
WHERE p.ProjectType = 'New'
John
July 6, 2016 at 7:55 am
Got it. Can't use ChildDef but can just repeat the criteria that it used. I tried this to get both cases and it worked
WHERE (
p.ProjectType <> 'New' AND
(ABS(DATEDIFF(month, p.Finish, np.Finish)) <= 2)
)
OR
(p.ProjectType = 'New')
Curios though, why doesn't this work?
where p.projecttype = ''
July 6, 2016 at 8:06 am
What do you mean, why doesn't it work? Error message or wrong results? Do you have any projects where ProjectType is blank?
By the way, you need to be careful with your WHERE clause. Since it includes a column from the outer table, you may find that your left join has turned back to an inner join.
John
July 6, 2016 at 8:13 am
In your sample data ProjectType was either "new" or "std" there was no empty string ('').
You are changing it to an empty string in your SELECT statement via CASE. So you cannot use it in the WHERE clause.
Quote from this book by Itzik Ben-Gan
The logical query processing order of the six main query clauses is
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
end quote.
So the only clause that can see after SELECT is evaluated is the ORDER BY clause; even though we write the SELECT first before the WHERE.
July 6, 2016 at 8:29 am
In your sample data ProjectType was either "new" or "std" there was no empty string ('').
You are changing it to an empty string in your SELECT statement via CASE. So you cannot use it in the WHERE clause.
Got it. I understand now. Kinda the same issue I was having trying to use ChildDef, just didn't see it. Poor choice on my part for my example reusing the same field name. Thanks!
where p.projecttype='std' gave the result I expected.
Also, thanks for the order.
By the way, you need to be careful with your WHERE clause. Since it includes a column from the outer table, you may find that your left join has turned back to an inner join.
Now I'm worried. 🙂 Is the way I have the where clause correct for this example? I'm getting the right result in this dataset, but I'm about to adapt this to run on 100k records. So maybe some case I didn't duplicate in the sample data.
WHERE (
p.ProjectType <> 'New' AND
(ABS(DATEDIFF(month, p.Finish, np.Finish)) <= 2)
)
OR
(p.ProjectType = 'New')
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply