August 30, 2009 at 10:38 am
Hello everyone!
I have two tables with a 1-0,N relationship and I am having a hard time trying to retrieve a parent row and ONE (or NULL) child row. In layman terms, the parent table holds Primary Accountholder's name while the child table may (or may not) contain authorized secondary names, like the spouse, siblings, children, mistress :-), etc. Right now, I am only concerned in showing; for each of our Primary customers the first secondary name (or none if there aren't any) that exists in the child table. The parent table's unique ID is on the child table and each child row has a unique ID as well. I've tried the following but obviously did not work because it only gives me the columns of the parent table and just the parentid from the child table, and I need to see all columns from both tables.
select *
from tbl_parent pt
left join (select distinct parentid from tbl_child) ch on pt.parentid = ch.parentid
Any help will be greatly appreciated!
Thanks!
August 30, 2009 at 11:09 am
Define 'first'. There's no order within a table in SQL so there has to be something that determines which row it is that you want. If you can explain that and give a table structure, some sample data and expected results we can get you a query.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 30, 2009 at 12:24 pm
Thank Gail!
What I mean by "first" is any child row having the same parentID, I do not care what order or distinctive indicator, mainly because there is none. In reading other posts here, I guess what I am really looking for is to build a query that will give me a resul set of unique rows basing it on the parentID column. But to follow your advice and willingness, here's the table structure and some sample data.
IF OBJECT_ID('[dbo].[tbl_Parent]', 'U') IS NOT NULL
DROP TABLE [dbo].[tbl_Parent]
GO
IF OBJECT_ID('[dbo].[tbl_Children]', 'U') IS NOT NULL
DROP TABLE [dbo].[tbl_Children]
GO
--Parent Table
CREATE TABLE [dbo].[tbl_Parent](
[ParentID] [int] IDENTITY(1,1) NOT NULL,
[OfficeID] [int] NULL,
[Company] [varchar](100) NULL,
[Salutation] [varchar](40) NULL,
[LastName] [varchar](50) NULL,
[FirstName] [varchar](50) NULL
) ON [PRIMARY]
GO
--Children table
CREATE TABLE [dbo].[tbl_children](
[ChildID] [int] IDENTITY(1,1) NOT NULL,
[ParentID] [int] NULL,
[Salutation] [varchar](40) NULL,
[LastName] [varchar](50) NULL,
[FirstName] [varchar](50) NULL
) ON [PRIMARY]
GO
DECLARE @UID INT
--Insert Parent rows
INSERT tbl_Parent SELECT '1','Systems Integrated Corp.','Mrs.','Johanssen','Scarlett'
SET @UID = @@Identity
--Insert Children rows
INSERT tbl_children SELECT @UID,'Mr.','Montessori','Marcos'
INSERT tbl_children SELECT @UID,'Mrs.','Montessori','Mary'
INSERT tbl_Parent SELECT '1','DataComm Inc.','Mr.','Belfast','Martin James'
SET @UID = @@Identity
INSERT tbl_children SELECT @UID, 'Mrs.','Belfast','Josephine'
INSERT tbl_children SELECT @UID, NULL,'Belfast','Martin James II'
INSERT tbl_children SELECT @UID, NULL,'Belfast','Sophia'
INSERT tbl_Parent SELECT '2','TroubleMakers LLC.','Dr.','Bakula','Scott'
SET @UID = @@Identity
INSERT tbl_children SELECT @UID, 'Mrs.','Bakula','Rebecca'
INSERT tbl_Parent SELECT '2','Global Dynamics','Ms.','Ferguson','Mary'
SET @UID = @@Identity
INSERT tbl_children SELECT @UID,'Mr.','Ferguson','Paul'
INSERT tbl_children SELECT @UID,'Mr.','Ferguson','Esteban'
INSERT tbl_children SELECT @UID,'Mr.','Ferguson','Robert'
INSERT tbl_children SELECT @UID,'Mr.','Ferguson','Frank'
INSERT tbl_Parent SELECT '3','Austral Motors Inc.','Mr.','Perez','Joe'
SET @UID = @@Identity
INSERT tbl_children SELECT @UID,'Mrs.','Perez','Erika'
INSERT tbl_children SELECT @UID,'Ms.','Perez','Bertha'
INSERT tbl_children SELECT @UID,'Mr.','Perez','Armand'
INSERT tbl_children SELECT @UID,'Mrs.','Perez','Anne Marie'
INSERT tbl_children SELECT @UID,'Ms.','Perez','Beatriz'
INSERT tbl_Parent SELECT '3','Thermal Conductors and Diagrams Corp.','Mr.','Jones','Davy'
SET @UID = @@Identity
--Now Here show all the records joined
select *
from tbl_parent parent
left join tbl_children child on parent.parentid = child.parentid
The output I am looking for is just one row for each instead of two or three or as many parent rows as matching children rows there are. Am I making any sense???
Thank Again
August 30, 2009 at 12:55 pm
So it doesn't matter in the slightest which row you get back from the child table?
I must admit, that's a little odd.... Why is it that any one will do?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 30, 2009 at 4:34 pm
I agree with the above post. Unless you have some criteria defining with what you want back in that one row, it doesnt make sense. Or maybe we're just missing some vital information. Let us know.
Thanks...
S
--
:hehe:
August 30, 2009 at 4:58 pm
Thanks to both for your interest. What I need is a query that will return only one row for each parentID. In the script I provided you will notice that for every row in the parent table, there are 0, 1 or n rows in the child table. For example, the ParentID 1 (Systems Integrated Corp, Mrs. Scarlett Johansen) has two children (ChildID 1 for Mr. Marcos Monessori & ChildID 2 for Mrs. Mary Montessori) and using a regular join, I will get back two rows (one per child) both containing the same data of ParentID 1. I need to return just One Parent row and one Child row (or NULL if there are no children)
I have attached a word file with the screenshots representing this.
Once again, thank you for all your input.
August 30, 2009 at 5:55 pm
After my previous post, I kept digging and I found the solution. Here's what I was able to create and obtained just one child for each parent.
select child.*
from tbl_children child
join (SELECT ParentID, min(childid) as ChildID
FROM tbl_Children
GROUP BY ParentID) idx on child.childid = idx.childid
I want to still thank both of you for your interest and helping me understand this a bit better. I am sure I will have more questions aong the way so I hope I don't become too much of a nissuance.
Kind regards,
August 31, 2009 at 1:57 am
I understand what it is you want. What I'm curious about is why you want to see any one of the child rows. Usually for this kind of thing there's something which indicates which row (oldest date, lowest sequence, something)
There's an easier way than the subquery on SQL 2005.
SELECT Parent.LastName, Parent.FirstName, FirstChild.LastName, FirstChild.FirstName
FROM tbl_Parent Parent LEFT OUTER JOIN
(SELECT LastName, FirstName, Row_Number() OVER (Partition By ParentID Order By ChildID) AS RowNo FROM tbl_Child) FirstChild
WHERE (RowNo IS NULL OR RowNo = 1)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 31, 2009 at 4:26 pm
Thanks Gail!
Sorry I took this long to respond. I tried to execute your code but you missed the join condition and, you will excuse my ignorance, what columns are you thinking these two tables should be joined under?
SELECT Parent.LastName, Parent.FirstName, FirstChild.LastName, FirstChild.FirstName
FROM tbl_Parent Parent LEFT OUTER JOIN
(SELECT LastName, FirstName, Row_Number() OVER (Partition By ParentID Order By ChildID) AS RowNo FROM tbl_Child) FirstChild
WHERE (RowNo IS NULL OR RowNo = 1)
Please school/educate me! Honestly, I will really appreciate it anything you can provide me with.
Regards,
September 1, 2009 at 2:20 am
Sorry, I left a column and the join condition out. They should be joined on ParentID as that's the foreign key between the tables.
This should work better
SELECT Parent.LastName, Parent.FirstName, FirstChild.LastName, FirstChild.FirstName
FROM tbl_Parent Parent LEFT OUTER JOIN
(SELECT Parent_ID, LastName, FirstName, Row_Number() OVER (Partition By ParentID Order By ChildID) AS RowNo FROM tbl_Child) FirstChild
ON Parent.ParentID = FirstChild.PArentID AND RowNo = 1
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 1, 2009 at 6:26 am
Gail,
Now it works perfectly! I will attempt to modify it to test it with production data and let you know. Your query gives me a nice starting point for a project I always wanted to do.
Thank you again for your help and assistance.
Kind regards,
😀
September 1, 2009 at 8:24 am
Just for diversity's sake you can also use CROSS APPLY to accomplish this. It may just be a Coke vs Pepsi thing but I find this way a bit more readable, though with the small sample data it's hard to see if it has detrimental performance issues (in general, from what I've seen, any advice Gail provides is likely to be right on the money, so favor her method over mine). So, just wanted to show an alternative method; in my example the first child alphabetically will be returned. Vive la difference:
SELECT
Parent.LastName,
Parent.FirstName,
FirstChild.LastName,
FirstChild.FirstName
FROM
tbl_Parent Parent
CROSS APPLY
(
SELECT TOP 1
fc1.ParentID,
fc1.ChildID,
LastName,
FirstName
FROM
tbl_Children fc1
WHERE
fc1.ParentID = Parent.ParentID
ORDER BY
fc1.LastName,
fc1.FirstName
) FirstChild
September 1, 2009 at 9:48 am
Cross Apply forces the subquery to be run once per row of the outer query. A simple join with ROW_NUMBER does not. It's very easy to see the performance characteristics of the two.
Also, the two queries return different data. To return the desired data (showing nulls if there's no child) you should use OUTER APPLY, not CROSS APPLY. CROSS APPLY is an inner join
Using the sample data that Johnny provided (though I added primary keys and a clustered index on the foreign key column on the child table):
Outer join with row_number
Table 'tbl_children'. Scan count 1, logical reads 2, physical reads 0
Table 'tbl_Parent'. Scan count 1, logical reads 2, physical reads 0
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Outer Apply with TOP and Order By
Table 'tbl_children'. Scan count 6, logical reads 12, physical reads 0
Table 'tbl_Parent'. Scan count 1, logical reads 2, physical reads 0
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
The performance difference will not be noticeable on a small table, but when there are lots of rows in Parent, this will become very painful.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 1, 2009 at 10:43 am
Thanks for the correction, Gail. I meant to use OUTER APPLY in my query but didn't double-check my post carefully enough.
I have the luxury of a database with most table rows < 1M, so I can often choose code that "scans" better without focusing on the under the hood performance; not that I can ignore performance completely, it's just that if I were to code out a similar query both ways, the execution time ends up being comparable, so I don't see the performance hit as much.
But I can definitely see why the CROSS/OUTER APPLY solution wouldn't scale.
I think I'll start refactoring places where I use APPLYs to adopt your ROWNUMBER derived table method.
Thanks, again.
September 1, 2009 at 12:19 pm
Thanks again to both of you.
Since I have more than 1M rows in the production parent table, it would be very painful. Gail, I applied your exercise of (though I added primary keys and a clustered index on the foreign key column on the child table) and added the keys and it is blazing fast. I get the resut set (during off-peak hours) in less than 2 seconds, which is totally amazing.
I don't know how would I had done it without your expert input and advice. I am very grateful for your help!
Regards,:-D
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply