selecting TOP row from a result set

Question

selecting TOP row from a result set

sql_learner29

SSCrazy Eights

Points: 9107
More actions
January 9, 2010 at 8:26 pm

#140552

[font="Comic Sans MS"]Hi all.
I have question in a query that is returning me a set of data....with following columns:
account_no / emply_no / dept_no / address / Balance / phone / route / city / ....and 10 more columns...
now i am getting with some repeated account numbers in my result...
I want to select only the first row for every account reegardless of values in other columns,,,
the values in other columns may or may not be same ...
for example dept no can be D1 for the same account and can also be D2 for the same account..I just want the first row that is returned from the main query...
any help
thanks a lot
[/font]
[font="Comic Sans MS"]
---------------------------------------------------
Thanks [/font]

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 1

SQL Learner-684602 (1/9/2010)
[font="Comic Sans MS"]Hi all.
I have question in a query that is returning me a set of data....with following columns:
account_no / emply_no / dept_no / address / Balance / phone / route / city / ....and 10 more columns...
now i am getting with some repeated account numbers in my result...
I want to select only the first row for every account reegardless of values in other columns,,,
the values in other columns may or may not be same ...
for example dept no can be D1 for the same account and can also be D2 for the same account..I just want the first row that is returned from the main query...
any help
thanks a lot
[/font]

How about

SELECT DISTINCT account_no FROM mytable

If it doesn't matter what the values are in the other columns, then why retrieve them?

Okay, that doesn't answer your question but it might set you thinking about the problem.

You could use ROW_NUMBER() OVER(PARTITION BY account_no ORDER BY somecolumn).

Write this into your query as the first i.e. leftmost column of your SELECT list and eyeball the result.

It should be clear at this point: if not, please post the new query, your original query, plus some sample data which they will run against. Sample data should be prepared as a create table statement with a series of INSERTS to populate with data.

Cheers

ChrisM@home

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 2

Please post table definitions, sample data and desired output. Read this to see the best way to post this to get quick responses.

http://www.sqlservercentral.com/articles/Best+Practices/61537/

You said you want the 'first row'. First by what criteria? Bear in mind that there's no order of rows in a table.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Dugi SSCoach Points: 17998 More actions · Answer 3

I think one of the solution is to use Row_number() function but for more precise result, please post table structure and some sample data as GilaMonster said!

============================================================
SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 4

Here's one way of obtaining the sort of results you seem to be after. It is based on data in the AdventureWorks sample database.

SELECT TOP (10)

E.EmployeeID,

EA.Rate,

EA.ModifiedDate,

EA.RateChangeDate

FROM AdventureWorks.HumanResources.Employee E

CROSS

APPLY (

-- One row returned for each row from Employee

-- The ORDER BY clause determines which row is returned

-- Omit the ORDER BY if you don't care which row is returned

SELECT TOP (1)

EPH.ModifiedDate,

EPH.Rate,

EPH.RateChangeDate

FROM AdventureWorks.HumanResources.EmployeePayHistory EPH

WHERE EPH.EmployeeID = E.EmployeeID

ORDER BY

EPH.RateChangeDate DESC

) EA

ORDER BY

E.EmployeeID ASC;

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 5

This is a great example but what if the employee table had duplicate entries. and one or more columns had non-distinct values and you wanted to return the the first record regardless of whether they had a PayHistory Record?

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 6

Again I'm going to ask for your definition of 'first row'. First by what criteria? Of do you just mean 'any one row, reagardless'?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Phil Factor SSC-Insane Points: 20254 More actions · Answer 7

While we are waiting for more information, this is how I'd tackle the problem as I understand it... (actually, I'd make sure there was a primary key before I'd do anything else!)

[font="Courier New"]

--we are assuming that there is no unique key to this table

--from what has been said, Department and Account_No together

--seem to provide a unique key but this has never been provided

--with an index or constraint

SELECT

account_no, emply_no, dept_no, [address], Balance, phone, [route], city --etc.

FROM

BadlyDesignedTable

INNER JOIN (SELECT

account_no, MIN(Dept_no) AS FirstDeptNo

FROM

BadlyDesignedTable

GROUP BY

account_no

) ChosenRecord

ON ChosenRecord.account_no=BadlyDesignedTable.account_no

AND ChosenRecord.FirstDeptNo=BadlyDesignedTable.Dept_no[/font]

Best wishes,
Phil Factor

Paul White SSC Guru Points: 150468 More actions · Answer 8

Welsh Corgi
This is a great example but what if:

Welsh Corgi
...the employee table had duplicate entries.

Use a CTE or derived table in place of the Employee table. A DISTINCT or GROUP BY is all that is required.

Welsh Corgi
one or more columns had non-distinct values and you wanted to return the the first record regardless of whether they had a PayHistory Record?

Change the CROSS APPLY to OUTER APPLY.

As Gail points out, and as I tried to hint in my code comments, it all rather hinges on the details - especially what criteria one chooses to define 'first record'.

Phil's example is another example of a valid approach - though personally I do tend to prefer TOP with an APPLY and ORDER BY...but it's often just personal preference.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

sql_learner29 SSCrazy Eights Points: 9107 More actions · Answer 9

Hi All[font="Comic Sans MS"]

thank you so much for all your Replies.....

The criteria is to get the first Row...

SELECT

Area

,StateName

,CityName

,ACCOUNT_NO

,CUSTOMER_NAME

,CounterNo

,ROUTE_No

,Address

,TeleNumber

FROM FACT F

,Dim1 1

,Dim2 2

,Dim3 3

WHERE

F.DW_1_KEY = 1.DW_1_KEY

AND F.DW_2_KEY = 2.DW_2_KEY

AND F.DW_3_KEY = 3.3_KEY

and CAL.[year] = year(DATEADD(month, -1, GETDATE()))

and CAL.[month] = month(DATEADD(month, -1, GETDATE()))

order by Area

Now as I said earlier,

the output may contain repeated account numbers, but I want to select only the top row for every account number.

I cant use group by as Same account number may have different Count number or address...I just want to show the first row that comes from the table for every account number....

I cannot Ignore other columns as I will show the value for other columns too from the first row for that account number...

I dont know how to use RowNumber...

Let me try CROSS and OUTER APPLY...

.Any more help?

Thanku

[/font]

[font="Comic Sans MS"]
---------------------------------------------------

Thanks [/font]

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 10

Ok, so the 'first' row is the one with the lowest Area?

If so, try this.

SELECT

Area

,StateName

,CityName

,ACCOUNT_NO

,CUSTOMER_NAME

,CounterNo

,ROUTE_No

,Address

,TeleNumber

FROM

(SELECT

Area

,StateName

,CityName

,ACCOUNT_NO

,CUSTOMER_NAME

,CounterNo

,ROUTE_No

,Address

,TeleNumber

, Row_Number() OVER (PARTITION BY ACCOUNT_NO ORDER BY AREA) AS RowNo

FROM FACT F

INNER JOIN Dim1 D1 on F.DW_1_KEY = D1.DW_1_KEY

INNER JOIN Dim2 D2 ON F.DW_2_KEY = D2.DW_2_KEY

INNER JOIN Dim3 D3 ON F.DW_3_KEY = D3.3_KEY

WHERE

CAL.[year] = year(DATEADD(month, -1, GETDATE()))

and CAL.[month] = month(DATEADD(month, -1, GETDATE()))) sub

WHERE RowNo = 1

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

adatla 98538 Newbie Points: 1 More actions · Answer 11

I had similar need and here is how I got it done. I assume that even though there will be duplication account numbers, there will be unique ACCOUNT_NO and AREA combinations. If not then you will need to find the unique combination that you are interested and change the order by with in the row_number() call to have the unique columns.

;WITH My_INFO AS (

SELECT

Area

,StateName

,CityName

,ACCOUNT_NO

,CUSTOMER_NAME

,CounterNo

,ROUTE_No

,Address

,TeleNumber

,row_number() over (Partition by ACCOUNT_NO order by Area) as row_num

FROM FACT F

,Dim1 1

,Dim2 2

,Dim3 3

WHERE

F.DW_1_KEY = 1.DW_1_KEY

AND F.DW_2_KEY = 2.DW_2_KEY

AND F.DW_3_KEY = 3.3_KEY

and CAL.[year] = year(DATEADD(month, -1, GETDATE()))

and CAL.[month] = month(DATEADD(month, -1, GETDATE()))

)

SELECT a.Area

,a.StateName

,a.CityName

,a.ACCOUNT_NO

,a.CUSTOMER_NAME

,a.CounterNo

,a.ROUTE_No

,a.Address

,a.TeleNumber

FROM My_INFO WHERE row_num = 1

ORDER BY Area

shannonjk SSCarpal Tunnel Points: 4286 More actions · Answer 12

Not to create more confusion, but what is the business need here? If you only want to return one row of data per account number, then why return the address or phone number or any other data? It would be kind of strange that there is a business requirement to obtain random data per customer or account?

I know there 'are' some needs every once in a while the require this but very seldom. Sometimes you can get around building tricky and excessively advanced code by refining the business requirements 😉

Link to my blog http://notyelf.com/

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 13

shannonjk (1/13/2010)
If you only want to return one row of data per account number, then why return the address or phone number or any other data? It would be kind of strange that there is a business requirement to obtain random data per customer or account?

But the queries with Row number don't return random data. It's one account with that account's telephone number, address, etc in it.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 14

I just walked into the situation.

The Databases were created by Non-IT Professionals.

I have experience with the entire SDLC.

I appreciate your comments and I hope to change the way things are done but that remains to be seen.

I'm sort of in a void at this time. I was ask if I could do that so until I can eliminate duplicates, I have done it before. The business owner is out of town and I do not yet have permissions to the Server.

I believe all that I need is an outer and an inner sub query with a SELECT TOP and an Order by clause to get the first record.

After I examine each table in the database I hope the database is designed in a way and the process that will afford me the opportunity to create constraints, triggers and what ever else is needed to prevent this from happening in the future so that.

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/