Using ROW_NUMBER() function with partition by clause

Question

Post reply

Using ROW_NUMBER() function with partition by clause

Ringo-394296

Old Hand

Points: 371
More actions
April 18, 2011 at 8:34 am

#249180

Hi All,
With the table and dataset below does anyone know how I could group rows that contain a common policy reference and currency with a numeric value? For e.g. from the test dataset below I'd like the following results:
Policy Currency GroupID Amount
a GBP 1 -30
a GBP 1 100
a JPY 2 70
a JPY 2 300
a USD 3 10
a USD 3 20
a USD 3 25
b GBP 1 55
b GBP 1 55
b USD 2 -20
c JPY 1 -80
(So the group ID increments only for every currency within a policy, regardless of the number of rows that use that policy, and is reset between policies.)
If I try something like:
select
policy
,currency
,row_number() over
(
partition by
policy
,currency
order by
policy
) as 'groupID'
,amount
Then the row number function gives incrementing numbers to groups of rows with a common policy and currency, so too granular a row_number is being generated, but if I try:
select
policy
,currency
,row_number() over
(
partition by
policy
order by
policy
) as 'groupID'
,amount
from partition_testing
Then I get a new set of incrementing numbers for groups of rows with a common policy.
Am I barking up the wrong tree with this, I'd just like to get a set of incrementing numbers for each currency within a policy as opposed to each row with a common currency/policy!
Any help greatly appreciated.
Many thanks.
--Test table and data:
create table partition_testing
(
policy varchar(10)
,currency char(3)
,sequence int
,amount int
)
insert into partition_testing (policy,currency,sequence,amount) values ('a','USD',1,10)
insert into partition_testing (policy,currency,sequence,amount) values ('a','USD',2,20)
insert into partition_testing (policy,currency,sequence,amount) values ('a','USD',3,25)
insert into partition_testing (policy,currency,sequence,amount) values ('a','GBP',4,-30)
insert into partition_testing (policy,currency,sequence,amount) values ('a','GBP',5,100)
insert into partition_testing (policy,currency,sequence,amount) values ('a','JPY',6,70)
insert into partition_testing (policy,currency,sequence,amount) values ('a','JPY',7,300)
insert into partition_testing (policy,currency,sequence,amount) values ('b','GBP',1,55)
insert into partition_testing (policy,currency,sequence,amount) values ('b','GBP',2,90)
insert into partition_testing (policy,currency,sequence,amount) values ('b','USD',3,-20)
insert into partition_testing (policy,currency,sequence,amount) values ('c','JPY',1,80)

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Dan.Humphries SSChampion Points: 12125 More actions · Answer 1

could you add a sample outcome. It may be due to monday but I am not sure I follow what you are looking for.

Dan

If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 2

You're nearly there...check out RANK() and DENSE_RANK().

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Ringo-394296 Old Hand Points: 371 More actions · Answer 3

Hi Dan,

Apologies, my OP is a little unclear!

What I would like is to be able to generate row numbers that are applied through a 'window' which views the sample data (in my original post) by policy.

So each currency used by a policy gets a new row number, and the row numbering restarts for each policy, as follows:

[font="Courier New"]

Policy-----Currency-----Row Number

A---------USD---------1

A---------GBP----------2

A---------JPY----------3

B---------USD----------1

B---------GBP----------2

[/font]

This is to be able to group sets of rows which are each against a common policy and currency.

The problem that I have is I cannot get the row numbers to be common across several rows which each use a common policy and currency, the following example shows the results that I get:

[font="Courier New"]

Policy-----Currency-----Amount------Row_Number() as GroupID

A---------USD---------10------------1

A---------USD---------10------------2

A---------USD---------10------------3

A---------GBP----------10-----------4

A---------JPY----------10-----------5

B---------USD----------10-----------1

B---------GBP----------10-----------2

[/font]

So each row within a common policy/currency gets a new row number, with the rownumber being restarted for every new policy.

Whereas I am trying to get the following results, note the values in the row number column, these only increment per policy and currency, rather than per row:

[font="Courier New"]

Policy-----Currency-----Amount------Row_Number() as GroupID

A---------USD---------10------------1

A---------GBP----------10-----------2

A---------JPY----------10-----------3

B---------USD----------10-----------1

B---------GBP----------10-----------2

[/font]

Hopefully this explains more clearly what I'm trying to achieve.

Many thanks,

Iain

Dan.Humphries SSChampion Points: 12125 More actions · Answer 4

Given your example

Policy-----Currency-----Amount------Row_Number() as GroupID
A---------USD---------10------------1
A---------USD---------10------------1
A---------USD---------10------------1
A---------GBP----------10-----------2
A---------JPY----------10-----------3
B---------USD----------10-----------1
B---------GBP----------10-----------2

Row_number would not be the tool to use. Data wise there is absolutley no difference that I can tell between

A---------USD---------10------------1

and

A---------USD---------10------------1

since there are identical records Row_Number would not assign them the same value no mater what you did.

Dan

If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 5

DENSE_RANK is what you need , though you have not specified how to order the currencies.

Clear Sky SQL
My Blog[/url]

Ringo-394296 Old Hand Points: 371 More actions · Answer 6

My apologies again, I meant to have different amounts on each row (being a bit hasty), but also for arguments sake I've put in an additional column which contains data which cannot be aggregated:

[font="Courier New"]

Policy-----Currency-----Amount------Row_Number() as GroupID---Person

A---------USD---------10------------1--------------------------Bob

A---------USD---------20------------1--------------------------Fred

A---------USD---------40------------1--------------------------Sally

A---------GBP----------70-----------2--------------------------Harry

A---------JPY----------90-----------3--------------------------Simon

B---------USD----------15-----------1--------------------------Tammy

B---------GBP----------80----------2--------------------------Lisa

[/font]

Cheers,

Iain

Ringo-394296 Old Hand Points: 371 More actions · Answer 7

Thank you for the for the tip, I did look briefly at these functions but they seem to continue the number series across each window ... is it possible to change this behaviour so that the numbers restart at one at the beginning of each window created by the partition clause?

Many thanks,

Iain

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 8

Ringo-394296 (4/18/2011)
Thank you for the for the tip, I did look briefly at these functions but they seem to continue the number series across each window ... is it possible to change this behaviour so that the numbers restart at one at the beginning of each window created by the partition clause?
Many thanks,
Iain

Try it out, Iain. You've clearly got the hang of ROW_NUMBER(). Best way to learn is to experiment 😉

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Ringo-394296 Old Hand Points: 371 More actions · Answer 9

Dan/Chris/Dave,

Thanks for your help and suggestions, it sounds like I should research the other ranking functions but for the meantime (I have a deadline which is due in about 2 hours!) I've discovered a workaround:

Anyone reading this with a similar problem - my issue was being cause by the granularity of the data. The data contained the following hiererachy:

--Policy Reference

---------Currency

----------------Transaction

I was trying to gerate rownumber at the level of currency with data that was at the level of transaction, hence the rownumber function didn't stop at currency and generated the additional (unwanted) row numbers, even though transaction level details weren't shown in the result set).

My workaround is to take a distinct list of policy references and currencies into a side table and then run the row_number/partition by query over the resulting data (which is at the higher granularity of only currency).

The row_number results generated from that query (at the level of currency) I then wrote back to my original dataset using a join between my side table and main dataset on policy reference and currency.

A bit messy but it works and means I might be able to get home on time tonight!

Again thanks for all help.

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 10

Try this - it might not give you exactly what you need but it will be close enough for you to correct:

SELECT policy, currency, sequence, amount,

GroupID = DENSE_RANK() OVER(PARTITION BY policy ORDER BY currency)

FROM partition_testing

ORDER BY policy, currency, sequence

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]