Selecting only rows with changed value

Question

Post reply

Selecting only rows with changed value

marcus

SSC Enthusiast

Points: 103
More actions
May 5, 2004 at 12:23 pm

#108299

Hello,
I have a table with three columns:
1920 | ID16 | Value1
1922 | ID16 | Value1
1923 | ID19 | Value1
1925 | ID10 | Value2
1926 | ID16 | Value2
1929 | ID20 | Value3
1930 | ID16 | Value3
1933 | ID16 | Value1
1934 | ID30 | Value1
Now I need the following result set:
1920 | ID16 | Value1
1925 | ID10 | Value2
1929 | ID20 | Value3
1933 | ID16 | Value1
which means I need a Stored Procedure which gives me back only these rows, which have an altered value (3rd. column), independently of the other columns.
And the best of all, it should be quite fast, because the table contains really a lot of rows.
Has anybody an idea how to solve that?
thanks, marcus

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Matt Miskinnis SSC-Addicted Points: 461 More actions · Answer 1

hard to follow what your asking?

first off - these columns I'm presuming are the following:

column 1 incremental numbers?

column 2 ID's?

first off you would need to know what the original value was because in your example you show:

1920 | ID16 | Value1

1925 | ID10 | Value2

1929 | ID20 | Value3

1933 | ID16 | Value1

which basically shows the ID16 having the same value??

In cases where you have changing values it might be a good idea to create date/time column and have it default the date and time when data is entered. this way no matter what ID is being pulled you could query it having the date column descending giving you the lateset updated value.

Jeff Moden SSC Guru Points: 1004433 More actions · Answer 2

Hey Matt,

I get it... when sorted in the order of the first column, you want to show only those rows whose value in the third column has changed from the previous row...

Does your table have an auto-numbering ID or is the 1st column, by any chance, unique?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Phil Parkin SSC Guru Points: 246991 More actions · Answer 3

If the answer to Jeff's question is that column 1 is unique and ascending (as it appears from your example), I still can't think of a quick way of doing this ... seems like you're heading towards the dreaded CURSOR to me.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

marcus SSC Enthusiast Points: 103 More actions · Answer 4

Yes Jeff is right, columns 1 is unique and ascending

Jeff Moden SSC Guru Points: 1004433 More actions · Answer 5

Ugh... No cursors... although it can certainly be done with one I don't have an answer right now (on the way to work) but, I've been challenged. I whack at this tonight...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

PaulR-122207 Ten Centuries Points: 1003 More actions · Answer 6

This may not be the most elegant, but how about:

-----------------------

set nocount on
drop table #tmp
select 1920 as AutoID, 'ID16' as ID, 'Value1' as Value
into #tmp
insert into #tmp values (1922, 'ID16', 'Value1')
insert into #tmp values (1923, 'ID19', 'Value1')
insert into #tmp values (1925, 'ID10', 'Value2')
insert into #tmp values (1926, 'ID16', 'Value2')
insert into #tmp values (1929, 'ID20', 'Value3')
insert into #tmp values (1930, 'ID16', 'Value3')
insert into #tmp values (1933, 'ID16', 'Value1')
insert into #tmp values (1934, 'ID30', 'Value1')
print 'Here is our original data:'
select * from #tmp
print 'Here are the results we get of changed items for col 3'
print '(This assumes that col1 is an autoid)'
print '(Remember, tho, that natural order of a relation should, in theory, be irrelevant...hence, why do this?)'
select *
from #tmp
where AutoID = (select min(AutoID) from #tmp)
UNION
select t1.*
from #tmp t1
where value <> (select value
  from #tmp t2
  where t2.autoid = (select max(autoid)
     from #tmp t3
     where t3.autoid < t1.autoid))
print 'We want to see:'
print '1920 | ID16 | Value1'
print '1925 | ID10 | Value2'
print '1929 | ID20 | Value3'
print '1933 | ID16 | Value1'
set nocount off

-----------------------

It returns:

-----------------------

Here is our original data:
AutoID      ID   Value
----------- ---- ------
1922        ID16 Value1
1923        ID19 Value1
1925        ID10 Value2
1926        ID16 Value2
1929        ID20 Value3
1930        ID16 Value3
1933        ID16 Value1
1934        ID30 Value1
1920        ID16 Value1
Here are the results we get of changed items for col 3
(This assumes that col1 is an autoid)
(Remember, tho, that natural order of a relation should, in theory, be irrelevant...hence, why do this?)
AutoID      ID   Value
----------- ---- ------
1920        ID16 Value1
1925        ID10 Value2
1929        ID20 Value3
1933        ID16 Value1
We want to see:
1920 | ID16 | Value1
1925 | ID10 | Value2
1929 | ID20 | Value3
1933 | ID16 | Value1

-----------------------

Is this what you're looking for?

Paul

David Burrows SSC Guru Points: 65144 More actions · Answer 7

This should work, but may be bad performance (depending on indexing) for a lot of rows

select a.*

from

a

left outer join

b

on b.col1 = (select max(col1) as col1 from

where col1 < a.col1)

where isnull(b.col3,'') <> a.col3

order by a.col1

Far away is close at hand in the images of elsewhere.
Anon.

marcus SSC Enthusiast Points: 103 More actions · Answer 8

Okay, you guys are great.

Thanks a lot for all the effort you put in. It seems that this is exactly what I was looking for.

thanks again, marcus

David Burrows SSC Guru Points: 65144 More actions · Answer 9

Jeff,

Well done, I do not want to steal your thunder but I must take issue with your post.

First, my query does return the first value (ID=1920).

Second, I extended the data to 100,000 rows and put a PK on ID (otherwise query takes too long) and ran your query against mine and mine ran consistently faster.

I know it was only a test but I did run the queries several times to check and was surprised at the results

Far away is close at hand in the images of elsewhere.
Anon.

Jeff Moden SSC Guru Points: 1004433 More actions · Answer 10

David,

I'd say that I owe you an apology! Your code does, in fact, return all of the rows expected! I don't know what I did to your code late last night but I copied and modified it again this morning to match the test table I'd set up and your code ran as expected.

Again, I sure am sorry... don't know what the heck happened. I've deleted my previous erroneous post.

Thank you for the feed back on the speed... Yeah, that really surprises the heck out of me, too! The <> on my the machine I'm using seems to be a real killer. Sent the same code samples I used to a buddy to run and he agrees with your test results. Don't know what the difference is, yet. Guess I have more testing to do 'cause speed will really be important on the project coming up.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

David Burrows SSC Guru Points: 65144 More actions · Answer 11

No apology needed.

I liked your code, it is the way I like to do it, not sure why I did my differently other than it seemed to work.

As for the speed, I did it out of curiosity more than anything. I suspect someone will find some test that will reverse the timings, who knows.

I ran my tests on SQL2K on a box that is currently only used by our Finance people so performance may be relevant to server load. At one point I did notice my query used more cpu, so again may be relevant.

The other thing that surprised me was the difference in the execution plans.

Far away is close at hand in the images of elsewhere.
Anon.

Jeff Moden SSC Guru Points: 1004433 More actions · Answer 12

David,

Thank you for your kindness...

I've sent the code samples to 4 different people and have run it on the server at work. In every case, your code with the <> ran nearly twice as fast as the "postive" logic code I wrote. And you're right... the differences in the Execution Plan were amazing.

I still don't know why my machine was (is) indicating the reverse of our now mutual findings.

Rather than mislead anyone, I've deleted my posting because it was flat-out wrong! Man, am I glad you did the followup speed test! Thanks a million!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)