Find the First Row
Recently I was asked an interesting programming question about how to return certain
information. At first glance, it appeared to be simple to me, but a few people had
the same problem. After a little testing, I found it was interesting and decided to make
some notes:
The Problem
A colleague had a table that contained information about employees. Some department number, a name,
and a code. I built a short test table using the following code:
create table MyTest ( id int, nm char( 10), code int )
The challenge in this instance was to return the first employee from each department. Suppose
the table contained the following information:
IDNmCode ----------------- 1Joe 550 1Jim 600 2Mary 300 3Mike 350 4Pam 50 4Jeff 200 4Judge 100
The idea was to return the follwing results:
IDNmCode ----------------- 1Joe 550 2Mary 300 3Mike 350 4Pam 50
The Solution
There are actually a few solutions, as there are for most problems in SQL. To return the
desired results, we want to return the first instance of each id value. The easiest
way to do this, IMHO, is to use a correlated subquery.
In constructing this query, we first build our column list as a simple select:
select id ,nm ,code from MyTest a
Now we can build a select that finds the first name. We cannot use the MIN()
function for the following reason. Suppose we run:
select id, min( nm) from MyTest b group by id
we get
1Jim 2Mary 3Mike 4Jeff
The problem is with department 4. The first instance is Pam, not Jeff. Using an ordering
clause also causes a similar problem. What we want is the first item in Natural order, or
the order in which the items are stored.
This is a perfect place for the TOP operator. We can use this operator to limit our
result set, and without an ORDER BY clause or other operator to affect the result set, the
natural order takes over. The following code:
select top 1 nm from MyTest b where id = 4
we get
which is what we want.
NOTE: The example is returning the natural order of the data, which is what was requested.
The presence of an index can change the ordering of the data. An ORDER by clause can be
used to force the first row based on a column of data, but there must be such a column
available for ordering.
We can now link these two queries together, making the second a subquery of the first as follows:
select id ,nm ,code from MyTest a where nm = ( select top 1 nm from MyTest b where b.id = a.id )
Notice that the outer query (the first query) has a reference embedded in the subquery (the
second query). This reference makes the correlation between the subquery and the outer
query. This query gives us the results we desire.
IDNmCode ----------------- 1Joe 550 2Mary 300 3Mike 350 4Pam 50
Code used to create this article is here: findfirst.sql
Conclusions
I didn't think this was a difficult query, but it is a little tricky. A couple
years ago, however, it may have stumped be for awhile. Having a broad knowledge of the
different functions in T-SQL can come in extremely handy in devising solutions to a query problem.
Hopefully this has helped you and as always, I welcome feedback on this topic.