January 9, 2010 at 8:26 pm
[font="Comic Sans MS"]Hi all.
I have question in a query that is returning me a set of data....with following columns:
account_no / emply_no / dept_no / address / Balance / phone / route / city / ....and 10 more columns...
now i am getting with some repeated account numbers in my result...
I want to select only the first row for every account reegardless of values in other columns,,,
the values in other columns may or may not be same ...
for example dept no can be D1 for the same account and can also be D2 for the same account..I just want the first row that is returned from the main query...
any help
thanks a lot
[/font]
Thanks [/font]
January 10, 2010 at 2:29 am
SQL Learner-684602 (1/9/2010)
[font="Comic Sans MS"]Hi all.I have question in a query that is returning me a set of data....with following columns:
account_no / emply_no / dept_no / address / Balance / phone / route / city / ....and 10 more columns...
now i am getting with some repeated account numbers in my result...
I want to select only the first row for every account reegardless of values in other columns,,,
the values in other columns may or may not be same ...
for example dept no can be D1 for the same account and can also be D2 for the same account..I just want the first row that is returned from the main query...
any help
thanks a lot
[/font]
How about
SELECT DISTINCT account_no FROM mytable
If it doesn't matter what the values are in the other columns, then why retrieve them?
Okay, that doesn't answer your question but it might set you thinking about the problem.
You could use ROW_NUMBER() OVER(PARTITION BY account_no ORDER BY somecolumn).
Write this into your query as the first i.e. leftmost column of your SELECT list and eyeball the result.
It should be clear at this point: if not, please post the new query, your original query, plus some sample data which they will run against. Sample data should be prepared as a create table statement with a series of INSERTS to populate with data.
Cheers
ChrisM@home
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 10, 2010 at 2:39 am
Please post table definitions, sample data and desired output. Read this to see the best way to post this to get quick responses.
http://www.sqlservercentral.com/articles/Best+Practices/61537/
You said you want the 'first row'. First by what criteria? Bear in mind that there's no order of rows in a table.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 10, 2010 at 2:46 am
I think one of the solution is to use Row_number() function but for more precise result, please post table structure and some sample data as GilaMonster said!
January 10, 2010 at 5:07 am
Here's one way of obtaining the sort of results you seem to be after. It is based on data in the AdventureWorks sample database.
SELECT TOP (10)
E.EmployeeID,
EA.Rate,
EA.ModifiedDate,
EA.RateChangeDate
FROM AdventureWorks.HumanResources.Employee E
CROSS
APPLY (
-- One row returned for each row from Employee
-- The ORDER BY clause determines which row is returned
-- Omit the ORDER BY if you don't care which row is returned
SELECT TOP (1)
EPH.ModifiedDate,
EPH.Rate,
EPH.RateChangeDate
FROM AdventureWorks.HumanResources.EmployeePayHistory EPH
WHERE EPH.EmployeeID = E.EmployeeID
ORDER BY
EPH.RateChangeDate DESC
) EA
ORDER BY
E.EmployeeID ASC;
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
January 10, 2010 at 11:05 pm
This is a great example but what if the employee table had duplicate entries. and one or more columns had non-distinct values and you wanted to return the the first record regardless of whether they had a PayHistory Record?
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
January 11, 2010 at 2:12 am
Again I'm going to ask for your definition of 'first row'. First by what criteria? Of do you just mean 'any one row, reagardless'?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 11, 2010 at 2:37 am
While we are waiting for more information, this is how I'd tackle the problem as I understand it... (actually, I'd make sure there was a primary key before I'd do anything else!)
[font="Courier New"]
--we are assuming that there is no unique key to this table
--from what has been said, Department and Account_No together
--seem to provide a unique key but this has never been provided
--with an index or constraint
SELECT
account_no, emply_no, dept_no, [address], Balance, phone, [route], city --etc.
FROM
BadlyDesignedTable
INNER JOIN (SELECT
account_no, MIN(Dept_no) AS FirstDeptNo
FROM
BadlyDesignedTable
GROUP BY
account_no
) ChosenRecord
ON ChosenRecord.account_no=BadlyDesignedTable.account_no
AND ChosenRecord.FirstDeptNo=BadlyDesignedTable.Dept_no[/font]
Best wishes,
Phil Factor
January 11, 2010 at 2:48 am
Welsh Corgi
This is a great example but what if:
Welsh Corgi
...the employee table had duplicate entries.
Use a CTE or derived table in place of the Employee table. A DISTINCT or GROUP BY is all that is required.
Welsh Corgi
one or more columns had non-distinct values and you wanted to return the the first record regardless of whether they had a PayHistory Record?
Change the CROSS APPLY to OUTER APPLY.
As Gail points out, and as I tried to hint in my code comments, it all rather hinges on the details - especially what criteria one chooses to define 'first record'.
Phil's example is another example of a valid approach - though personally I do tend to prefer TOP with an APPLY and ORDER BY...but it's often just personal preference.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
January 12, 2010 at 2:13 pm
Hi All[font="Comic Sans MS"]
thank you so much for all your Replies.....
The criteria is to get the first Row...
SELECT
Area
,StateName
,CityName
,ACCOUNT_NO
,CUSTOMER_NAME
,CounterNo
,ROUTE_No
,Address
,TeleNumber
FROM FACT F
,Dim1 1
,Dim2 2
,Dim3 3
WHERE
F.DW_1_KEY = 1.DW_1_KEY
AND F.DW_2_KEY = 2.DW_2_KEY
AND F.DW_3_KEY = 3.3_KEY
and CAL.[year] = year(DATEADD(month, -1, GETDATE()))
and CAL.[month] = month(DATEADD(month, -1, GETDATE()))
order by Area
Now as I said earlier,
the output may contain repeated account numbers, but I want to select only the top row for every account number.
I cant use group by as Same account number may have different Count number or address...I just want to show the first row that comes from the table for every account number....
I cannot Ignore other columns as I will show the value for other columns too from the first row for that account number...
I dont know how to use RowNumber...
Let me try CROSS and OUTER APPLY...
.Any more help?
Thanku
[/font]
Thanks [/font]
January 13, 2010 at 12:14 am
Ok, so the 'first' row is the one with the lowest Area?
If so, try this.
SELECT
Area
,StateName
,CityName
,ACCOUNT_NO
,CUSTOMER_NAME
,CounterNo
,ROUTE_No
,Address
,TeleNumber
FROM
(SELECT
Area
,StateName
,CityName
,ACCOUNT_NO
,CUSTOMER_NAME
,CounterNo
,ROUTE_No
,Address
,TeleNumber
, Row_Number() OVER (PARTITION BY ACCOUNT_NO ORDER BY AREA) AS RowNo
FROM FACT F
INNER JOIN Dim1 D1 on F.DW_1_KEY = D1.DW_1_KEY
INNER JOIN Dim2 D2 ON F.DW_2_KEY = D2.DW_2_KEY
INNER JOIN Dim3 D3 ON F.DW_3_KEY = D3.3_KEY
WHERE
CAL.[year] = year(DATEADD(month, -1, GETDATE()))
and CAL.[month] = month(DATEADD(month, -1, GETDATE()))) sub
WHERE RowNo = 1
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 13, 2010 at 12:22 pm
I had similar need and here is how I got it done. I assume that even though there will be duplication account numbers, there will be unique ACCOUNT_NO and AREA combinations. If not then you will need to find the unique combination that you are interested and change the order by with in the row_number() call to have the unique columns.
;WITH My_INFO AS (
SELECT
Area
,StateName
,CityName
,ACCOUNT_NO
,CUSTOMER_NAME
,CounterNo
,ROUTE_No
,Address
,TeleNumber
,row_number() over (Partition by ACCOUNT_NO order by Area) as row_num
FROM FACT F
,Dim1 1
,Dim2 2
,Dim3 3
WHERE
F.DW_1_KEY = 1.DW_1_KEY
AND F.DW_2_KEY = 2.DW_2_KEY
AND F.DW_3_KEY = 3.3_KEY
and CAL.[year] = year(DATEADD(month, -1, GETDATE()))
and CAL.[month] = month(DATEADD(month, -1, GETDATE()))
)
SELECT a.Area
,a.StateName
,a.CityName
,a.ACCOUNT_NO
,a.CUSTOMER_NAME
,a.CounterNo
,a.ROUTE_No
,a.Address
,a.TeleNumber
FROM My_INFO WHERE row_num = 1
ORDER BY Area
January 13, 2010 at 12:57 pm
Not to create more confusion, but what is the business need here? If you only want to return one row of data per account number, then why return the address or phone number or any other data? It would be kind of strange that there is a business requirement to obtain random data per customer or account?
I know there 'are' some needs every once in a while the require this but very seldom. Sometimes you can get around building tricky and excessively advanced code by refining the business requirements 😉
Link to my blog http://notyelf.com/
January 13, 2010 at 1:09 pm
shannonjk (1/13/2010)
If you only want to return one row of data per account number, then why return the address or phone number or any other data? It would be kind of strange that there is a business requirement to obtain random data per customer or account?
But the queries with Row number don't return random data. It's one account with that account's telephone number, address, etc in it.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 13, 2010 at 1:42 pm
I just walked into the situation.
The Databases were created by Non-IT Professionals.
I have experience with the entire SDLC.
I appreciate your comments and I hope to change the way things are done but that remains to be seen.
I'm sort of in a void at this time. I was ask if I could do that so until I can eliminate duplicates, I have done it before. The business owner is out of town and I do not yet have permissions to the Server.
I believe all that I need is an outer and an inner sub query with a SELECT TOP and an Order by clause to get the first record.
After I examine each table in the database I hope the database is designed in a way and the process that will afford me the opportunity to create constraints, triggers and what ever else is needed to prevent this from happening in the future so that.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Viewing 15 posts - 1 through 15 (of 27 total)
You must be logged in to reply to this topic. Login to reply