Display records only if they have multiple child records

Question

Display records only if they have multiple child records

fsnell

Old Hand

Points: 397
More actions
July 22, 2011 at 3:37 pm

#244788

I am hoping someone can help me with the following query. I am trying to display only those "Case_Number" which have 2 or more "Order_Number".
If I have the Count = 1, all records show up, if I have Count > 1, no records show up. I have many records have 1, 2, or more Orders.
Thank you for your help.
Frank
DECLARE @StartDate DATETIME;
DECLARE @EndDate DATETIME;
SET @StartDate = '07/01/2011';
SET @EndDate = '07/29/2011';
SELECT DISTINCT
POS.CASE_NUMBER
, POS.ORDER_NUMBER
, POS.START_DATETIME
, POS.STOP_DATETIME
, HF.AHFS_THERAPEUTIC_CLASS
, HF.GENERIC_NAME
, HF.BRAND_NAME
, Adm.Patient_Lastname
, Adm.Patient_Firstname
, CASE WHEN POS.SCHEDULE LIKE 'P%' THEN 'P' ELSE 'S' END AS [PRN/Sched]
FROM POS_Orders POS
INNER JOIN Hospital_Formulary HF ON POS.NDC_REPACKAGED_NBR
= HF.NDC_NBR + ISNULL(HF.REPACKAGED_NBR, '')
INNER JOIN Admission Adm ON POS.CASE_NUMBER = Adm.Case_Number
WHERE (HF.AHFS_THERAPEUTIC_CLASS = '281608')
AND (POS.START_DATETIME > @StartDate)
AND (POS.STOP_DATETIME < @EndDate)
GROUP BY POS.CASE_NUMBER
, POS.ORDER_NUMBER
, POS.START_DATETIME
, POS.STOP_DATETIME
, HF.AHFS_THERAPEUTIC_CLASS
, HF.GENERIC_NAME
, HF.BRAND_NAME
, Adm.Patient_Lastname
, Adm.Patient_Firstname
,SCHEDULE
HAVING COUNT(POS.ORDER_NUMBER) > 1

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 1

Disclaimer: I do not have your tables on my side so this is untested code.

If this code does not work for you right off the bat then hopefully it will give you a start. The idea is to first use the case_order CTE to isolate the case_number and order_number combinations you're interested in and then use that result set to join to the rest of your tables to pickup the data necessary to populate the result set you are trying to deliver.

DECLARE @StartDate DATETIME ;

DECLARE @EndDate DATETIME ;

SET @StartDate = '07/01/2011' ;

SET @EndDate = '07/29/2011' ;

WITH case_order ( CASE_NUMBER, ORDER_NUMBER )

AS (

SELECT CASE_NUMBER,

ORDER_NUMBER

FROM POS_Orders

WHERE START_DATETIME > @StartDate

AND STOP_DATETIME < @EndDate

GROUP BY CASE_NUMBER,

ORDER_NUMBER

HAVING COUNT(*) > 1

)

SELECT DISTINCT

POS.CASE_NUMBER,

POS.ORDER_NUMBER,

POS.START_DATETIME,

POS.STOP_DATETIME,

HF.AHFS_THERAPEUTIC_CLASS,

HF.GENERIC_NAME,

HF.BRAND_NAME,

Adm.Patient_Lastname,

Adm.Patient_Firstname,

CASE WHEN POS.SCHEDULE LIKE 'P%' THEN 'P'

ELSE 'S'

END AS [PRN/Sched]

FROM case_order co

JOIN POS_Orders POS ON co.CASE_NUMBER = POS.CASE_NUMBER

AND co.ORDER_NUMBER = POS.ORDER_NUMBER

INNER JOIN Hospital_Formulary HF ON POS.NDC_REPACKAGED_NBR = HF.NDC_NBR + ISNULL(HF.REPACKAGED_NBR, '')

INNER JOIN Admission Adm ON POS.CASE_NUMBER = Adm.Case_Number

WHERE HF.AHFS_THERAPEUTIC_CLASS = '281608' ;

If you still have question, please provide the following:

1. CREATE TABLE statements to build your tables.

2. INSERT INTO statements to populate the tables with usable test data.

3. The expected results.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

fsnell Old Hand Points: 397 More actions · Answer 2

Thank you. I will have to test it Monday.

Not familiar with the "with" statement...will have to research that one.

again, thanks, and will let you know.

fsnell Old Hand Points: 397 More actions · Answer 3

CELKO (7/23/2011)
Basic Netiquette is to post DDL; read the section at the start of the Forum.

What is DLL? Read what section at the start of Forum? I searched, and see no guidance on how to request help. By all means assist, so I can provide the correct information.

If all you want is a case number, why all this other stuff? Why do you have dates? You never told us.

If I didn't need the other stuff, I would not have put it in my query. The End User needs to see each Order information for the Case Numbers. Dates are parameters, and is Set only to test in query. Orders have a start date and and end date, and to properly capture the information, I have to make sure the order has not ended before the start of the requested date range, nor started after the requested date range.

Read the specs you posted and think about someone trying to program from them.

I am the one programming from them, and they seemed to make perfect sense to me. BUT... I understand what you are saying. Same problems I have dealing with End User helping them get what they want, and not necessarily what they are asking. Constructive criticism is always appreciated.

You used a SELECT DISTINCT and a GROUP BY together; think about it.

Because of the way records are entered (not my choice), every change in an Orders record table results in a new record, but none of the info I am requesting is changed, but results in duplicate records, hence the DISTINCT. The Group BY is required because of the aggregate I am using (or so the errors keep telling me). This error and Group By problem is part of my problem with this query.

We have a DATE data type now, so use it. There is no such thing as a generic start or end date; it has to be a particular start or a particular end date.

My Dates are DateTime data types, as matching the data type I am putting the parameter in (Old Database that has been upgaded a couple of times). Not sure what you mean by generic start or end date. I am, and have used in similar querys, both start AND end dates, as they are required to make sure the order (medication) was being taken within the requested (Parameter) date range.

Since we have no DDL, I will guess that the POS Orders have the case number in them. We do not need the drug codes or the patient's name to get the count.

Again, not sure what a DDL is, but yes, the POS Orders does have the case number. I am not after the count, only the case numbers that have more than one order, and the corresponding orders (and details) for those case numbers.

This is thinking in sets and not records (rows are not records).

And I thought I was. I have the set in the way I need it, except for the last filtering part. Some Case Numbers have only one order, some have 2, and others have more. The audit requires only those case numbers that have 2 or more Orders (and details).

Here is my guess based on your specs:

My specs, as you call them, are correct, just missing the how to filter for the last step, and are no where near what you are guessing.

BEGIN
DECLARE @in_report_start_date DATE;
DECLARE @in_report_end_date DATE;
–- we use ISO-8601 date formats, not dialect
SET @in_report_start_date = '2011-07-01';
SET @in_report_end_date = '2011-07-29';

I am an On The Job self learning DBA. I have never heard of ISO-8601, but will look it it and add it in my repository. But the Set commands are for query design only, and will be removed from the final product.

SELECT POS.case_number, @in_report_start_date, @in_report_end_date,
COUNT(DISTINCT order_number) OVER (PARTITION BY case_number)
AS order_cnt
FROM POS_Orders AS POS
AND POS.something_start_date > @in_report_start_date
AND POS.something_end_date < @in_report_end_date;
END;

Not even close to what I am asking for, but thanks for trying. This is missing all the other information requested, like Drug Name, Patients name, etc.

Want to try again, with better specs and basic Netiquette?

Does that help? I really don't want to have to add the full tables, as there are 40-50 fields per table. I joined the tables I needed, and selected the fields I needed from those tables. If needed, I will create an mini version of the tables in question.

That being said, opc.three appeared to understand the question, gave me a viable solution and a couple of pointers to look at and research, and without the negative critique. You could probably learn something from him.

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 4

fsnell (7/23/2011)
CELKO (7/23/2011)
Basic Netiquette is to post DDL; read the section at the start of the Forum.
What is DLL? Read what section at the start of Forum? I searched, and see no guidance on how to request help. By all means assist, so I can provide the correct information.

DDL = Data Definition Language. In this context, think CREATE TABLE statements. See my #1 above.

DML = Data Manipulation Language. In this context, think INSERT INTO statements. See my #2 above.

Here is an article with details for next time you decide to ask a question on this or any other forum: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

You used a SELECT DISTINCT and a GROUP BY together; think about it.
Because of the way records are entered (not my choice), every change in an Orders record table results in a new record, but none of the info I am requesting is changed, but results in duplicate records, hence the DISTINCT. The Group BY is required because of the aggregate I am using (or so the errors keep telling me). This error and Group By problem is part of my problem with this query.

CELKO is pointing out that DISTINCT and GROUP BY are both de-duplicating the rows in the returned set...and that it's a waste of resources to introduce both into the same query. There is no need for both here. This is constructive criticism, please research and understand why CELKO is correct here.

This is thinking in sets and not records (rows are not records).
And I thought I was. I have the set in the way I need it, except for the last filtering part. Some Case Numbers have only one order, some have 2, and others have more. The audit requires only those case numbers that have 2 or more Orders (and details).

Your terminology is saying that you are not especially experienced in thinking in sets, as it relates to a database. Rows are not records...and columns are not fields (as they are referred too later in your post). A database table is made up of rows and columns. You may be thinking in sets...but the concept of a "field" implies the discrete value stored at the intersection of a single row and single column, and saying you want to operate on that concept implies that you are not thinking in terms of sets. A field is a thing on a computer screen, as in where you would type your name into a web form, or a place where hopefully people will be playing professional American Football in a couple months.

BEGIN
DECLARE @in_report_start_date DATE;
DECLARE @in_report_end_date DATE;
–- we use ISO-8601 date formats, not dialect
SET @in_report_start_date = '2011-07-01';
SET @in_report_end_date = '2011-07-29';
I am an On The Job self learning DBA. I have never heard of ISO-8601, but will look it it and add it in my repository. But the Set commands are for query design only, and will be removed from the final product.

The date format you chose, MM/DD/YYYY, is supported by SQL Server, but is not a standard way to represent dates in a way that multiple database platforms that implement standards may recognize it. It may not affect the outcome of your specific query when run on SQL Server, but it implies you may not be aware of some fundamental database concepts and the standards that went into building them. Again, this is constructive criticism, please research and understand why CELKO is correct here.

Want to try again, with better specs and basic Netiquette?
Does that help? I really don't want to have to add the full tables, as there are 40-50 fields per table. I joined the tables I needed, and selected the fields I needed from those tables. If needed, I will create an mini version of the tables in question.
That being said, opc.three appeared to understand the question, gave me a viable solution and a couple of pointers to look at and research, and without the negative critique. You could probably learn something from him.

Maybe his tact in this online forum is an area where CELKO could improve but in terms of database theory and history I look at him as a teacher. I have a practical eye and focus on solutions in these forums so do not look to teach and call out errors in terminology and violations of standards as much as others. In many cases people do not want to hear such things when they are trying to solve a tough problem to meet a deadline. That said, be quick to see where CELKO is right...and leave the rest. His delivery in these forums is awkward (in my opinion) but he was trying to help you.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

R.P.Rozema SSChampion Points: 12317 More actions · Answer 5

Searching for rows for which more than one with the same value exists by using a "having count() > 1" is going to be slower the bigger your data set becomes. A much faster way to find the case numbers for which multiple order numbers exist is easily seen by rephrasing your question like this:

"Return the requested set of information for all orders matching the selection criteria and having a case number for which at least one other order number with the same case number exists." This way you don't need SQL server to first go through all case numbers summing all of their orders. Instead it can simply pick the first case number from the set to see if one other order exists with the same case number. If it doesn't, it can discard the row. If it does, it can return you the order. Then repeat this until all available case numbers are processed. This can theoretically be further improved on, but since you've got some awkward filter criteria on the orders I doubt that these will help you much in this case.

I don't have a server available to me right now, nor did you specify the DDL to reproduce your test bed, so I am going to give you an untested example query.

DECLARE @StartDate DATETIME;

DECLARE @EndDate DATETIME;

SET @StartDate = '07/01/2011';

SET @EndDate = '07/29/2011';

WITH cteOrders AS (

SELECT POS.CASE_NUMBER

,POS.ORDER_NUMBER

,POS.START_DATETIME

,POS.STOP_DATETIME

,HF.AHFS_THERAPEUTIC_CLASS

,HF.GENERIC_NAME

,HF.BRAND_NAME

,Adm.Patient_Lastname

,Adm.Patient_Firstname

,CASE WHEN POS.SCHEDULE LIKE 'P%' THEN 'P' ELSE 'S' END AS [PRN/Sched]

FROM POS_Orders POS

INNER JOIN Hospital_Formulary HF ON (

POS.NDC_REPACKAGED_NBR = HF.NDC_NBR + ISNULL(HF.REPACKAGED_NBR, ''))

INNER JOIN Admission Adm ON (

POS.CASE_NUMBER = Adm.Case_Number)

WHERE HF.AHFS_THERAPEUTIC_CLASS = '281608'

AND POS.START_DATETIME > @StartDate

AND POS.STOP_DATETIME < @EndDate

)

SELECT o1.*

FROM cteOrders o1

WHERE EXISTS (

SELECT *

FROM cteOrders o2

WHERE o2.CASE_NUMBER = o1.CASE_NUMBER

AND o2.ORDER_NUMBER <> o1.ORDER_NUMBER

);

And let me add a tip on selection ranges -especially those with dates/times in them-, it is better to search for a range including the start value and excluding the end value. i.e. in your case POS.START_DATETIME >= @StartDate and POS.STOP_DATETIME < @EndDate. Your specified criteria return all rows starting from July first, excluding those starting exactly at midnight on July 1st, up until the 29th of July. My guess is that you intended to return all rows for July, but by specifying it the way you did you failed to include not only the July 1st midnight orders, but also all from July 30th and 31st. If it were not your intention to return all of July's orders, but a shifting window, then it would be better to use dateadd() to specify this date range.

And a second tip: Next time better check whose help you're rejecting before responding.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

fsnell Old Hand Points: 397 More actions · Answer 6

First of all, let me apologize to Celko...I responded before I did my breathing exercises. Althought I was not fond of the "presentation", I do appreciate your efforts, and have taken some of your comments to heart, and will be using them immediately.

I am still confused about how Group By de-duplicates, as I have never seen it refered to that way, or used for that purpose, but will take your word for it, and will research it in the near future (as I like know how and why something happens).

As for working with sets...I think I understand. Everytime we do add a criteria, it it goes through each record to see if it meets the criteria stated. That is NOT thinking in sets, and will take a little bit of mental shifting to grasp the concept.

opc.three, Celko, and R.P. Rozeme....I do appreciate all your comments, helping me increase my knowledge.

If the posted solutions to not work this week when I get a chance to test them, I will redo my question with the proper DDL.

As for your question R.P. Rozeme on the Start Date and Stop Date. The dates are ranges the user needs at the time, whether 2 days apart, 2 weeks, or 6 months. Different auditors come in and have different criteria for thier date ranges.

Thank you everybody for your help,

Frank

Rich Mechaber SSChampion Points: 11131 More actions · Answer 7

fsnell (7/24/2011)
I am still confused about how Group By de-duplicates, as I have never seen it refered to that way, or used for that purpose, but will take your word for it, and will research it in the near future (as I like know how and why something happens).

Frank, try this out and you'll see that GROUP BY de-duplicates:

CREATE TABLE #orders (ProductID INT, OrderDate DATETIME, CustomerID INT);

INSERT INTO #orders

(ProductID, OrderDate, CustomerID)

SELECT 1, '2011-07-01', 5

UNION

SELECT 1, '2011-07-01', 6

UNION

SELECT 1, '2011-07-01', 7

UNION

SELECT 1, '2011-07-01', 8

UNION

SELECT 1, '2011-07-01', 9

SELECT ProductID, OrderDate

FROM #orders

GROUP BY ProductID, OrderDate;

DROP TABLE #orders;

Do you see what's happening? Five different customers placed an order for the same product on the same day. My SELECT query only requested ProductID and OrderDate and grouped on both columns. Grouping will return only 1 row from my table.

Had I included CustomerID in the SELECT clause but not in the GROUP BY clause, I would indeed have received the oh-so-familiar error: "Column '#orders.CustomerID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." Try it yourself. This occurs b/c SQL has no idea how to group these rows.

If you change one of the dates in the original code I posted to something different, then you'll get 2 rows returned, 1 for each unique combination of ProductID and OrderDate.

Don't be put off too much by the sometimes pointed replies here. This site contains a wealth of information, provided by generous experts taking time away from their real jobs to help us out. Do read and follow the link about how to post the most useful information here: as a matter of selfish pragmatism, the easier you make it for others to work on your problem, the more likely you will receive a solution to it!

Good luck and welcome to SQL Server Central,

Rich

Rich Mechaber SSChampion Points: 11131 More actions · Answer 8

fsnell (7/22/2011)
Thank you. I will have to test it Monday.
Not familiar with the "with" statement...will have to research that one.
again, thanks, and will let you know.

Google "Common Table Expression" to find more. New to SQL 2005.

Couple of persnickity syntax items that will cause you to forcibly uproot your hair if you're not familiar with them:

- Every column in the CTE must be aliased.

- The last entry in your code before the beginning of the CTE ("WITH cte AS ...") must be a semi-colon terminating the preceding statement. This is one of those few instances where T-SQL requires a statement terminator. If you forget the semi-colon, the error you get will not be helpful: "Incorrect syntax near ..."

- The CTE may be queried only once, and only in the statement immediately following the CTE definition.

- Note there is no semi-colon after the CTE definition/before the subsequent query.

- If you want to "concatenate" CTEs in a row before executing your "real" query, place a comma after all but the last one:

SELECT 'This is a CTE';--Note the terminator!

WITH cte1 As

(

SELECT 1 As MyINT

),

cte2 As

(

SELECT (2 + MyINT) As YourINT

FROM cte1

),

cte3 As

(

SELECT (2 * YourINT) As HisINT

FROM cte2

)

SELECT HisINT

FROM cte3;

HTH,

Rich

fsnell Old Hand Points: 397 More actions · Answer 9

I finally got it to work. I ended up using R.P. Rozema's solution, but I learned a lot more, and greatly appreciate everybodys comments.

Some things I got out of these comments:

R.P. Rozema - Thanks for showing me a different way processing my query. I had to sit down and analyze it, but now it makes sense, and I understand.

rmechaber - thanks for the examples and explainations. I actually went out and googled "Distinct vs Group By", and (from my understanding), there is not much difference, except one is used with aggregates, and the other with out.

CELKO - Thanks for the new Standard (ISO-8601). I do prefer to use standards, but do not have a lot of time to research them.

opc.three - Thanks for answering various questions, and introducing me to the "With". I played with it in various ways this weekend, trying to learn about it, and I can already see lots of benefits.

-- Beginning of the reporting period, but used to capture all records still active at end of reporting period

DECLARE @StartDate DATETIME;

-- End of the reporting period, but used to capture all records active at start of reporting period

DECLARE @EndDate DATETIME;

--Paramaters that will be input at run time, with varying time frames

SET @StartDate = '2011-07-01';

SET @EndDate = '2011-07-25';

WITH Orders AS

(

SELECT

POS.CASE_NUMBER

, POS.ORDER_NUMBER

, POS.START_DATETIME

, POS.STOP_DATETIME

, HF.AHFS_THERAPEUTIC_CLASS

, HF.GENERIC_NAME

, HF.BRAND_NAME

, Adm.Patient_Lastname

, Adm.Patient_Firstname

, CASE WHEN POS.SCHEDULE LIKE 'P%' THEN 'P' ELSE 'S' END AS [PRN/Sched]

FROM POS_Orders POS

INNER JOIN Hospital_Formulary HF ON POS.NDC_REPACKAGED_NBR = HF.NDC_NBR + ISNULL(HF.REPACKAGED_NBR, '')

INNER JOIN Admission Adm ON POS.CASE_NUMBER = Adm.Case_Number

WHERE (HF.AHFS_THERAPEUTIC_CLASS = '281608')

-- Actually messed up my logic earlier, and R.P. Rozema made me rethink it and correct it.

AND (POS.START_DATETIME < @EndDate)

AND (POS.STOP_DATETIME > @StartDate )

)

SELECT o1.*

FROM Orders o1

WHERE EXISTS

( SELECT *

FROM Orders o2

WHERE o2.CASE_NUMBER = o1.CASE_NUMBER

AND o2.ORDER_NUMBER <> o1.ORDER_NUMBER

);