select first entry in every set?

Question

select first entry in every set?

Russ-230211

SSC Veteran

Points: 293
More actions
June 2, 2006 at 10:40 am

#113215

In a SQL query, my results sets sometime contains records in which there are 2 or more records which are not quite duplicates. For instance, there may be 2 or more records with the same application id, but different dates. I only want one of the records.
If I could I would assign an sequence identity id to every record (in the results set) and restart the sequence identify values to 1 every time there was a change in the application id, thus all the records I wanted would have sequence id of 1.
How can you do something like that with SQL?
Thanks for any and all help.
Russ....

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

PW-201837 SSC-Insane Points: 20805 More actions · Answer 1

In SQL 2005, there are RANK functions which can generate sequences for situations like this.

In SQL2000, you typically join to a derived table, which uses 1 ofthe SQL aggregate functions to isolate the 1 record for each set. For example, in your data, if you want the Application ID plus the row with the most recent date:

>>For instance, there may be 2 or more records with the same application id, but different dates.

Select a.*

From YourApplictionTable As a

Inner Join

-- Build a derived table of most recent date per set of ApplicationID

(

Select ApplicationID, Max(AppDate) As MostRecentDate

From YourApplictionTable

Group By ApplicationID

) dtRecent

On (a.ApplicationID = dtRecent.ApplicationID And

a.AppDate = dtRecent.MostRecentDate

)

This approach obviously runs into issues if there are ties and 2 or more records have the same value for the column you're aggregating.

stax68 SSChampion Points: 11711 More actions · Answer 2

Do you want the earliest of the records, latest, don't care? If you want first or last, how will you break ties? If you don't care which record you keep it would make it easier to write a view or derived table returning the PKs of the desired records (but if you don't care what's in the field, do you even want the field?)

You can use numbering with an UPDATE of a temp table/variable, but a single SELECT statment is likely to be more efficient.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

stax68 SSChampion Points: 11711 More actions · Answer 3

PW you assume dates are unique within a group. Not a very safe assumption, esp. if there's no time portion, or frequent transactions.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

stax68 SSChampion Points: 11711 More actions · Answer 4

It's a bit inelegant but you can use a hash value to avoid problems with duplicate dates.

Select

a.*

From

YourApplicationTable As a

Inner

Join

(

Select ApplicationID, cast(substring(max(select convert(char(8),App_Date,112) + str(ApplicationTablePK,10)),9,10) as int) As ApplicationTablePK

From YourApplictionTable

Group By ApplicationID

)

V

On

(a.ApplicationTablePK = V.ApplicationTablePK

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Vladan SSC-Insane Points: 22072 More actions · Answer 5

Almost identical question appeared several times during the last month, I'm sure you'll be able to find some of these discussions.

General warning : if you want to simply get one of the records and don't care which one it is (often described in the requirements as "need the first row, without any ordering"), then there is something wrong with the design. It is absolutely legitimate to retrieve a row that is the oldest, newest, last updated, with greatest value in "amount" column etc., but bringing up "any first row" is not correct. Theoretically, if you run such query 5 times without any changes in data, you could get 5 different results.