April 18, 2011 at 8:34 am
Hi All,
With the table and dataset below does anyone know how I could group rows that contain a common policy reference and currency with a numeric value? For e.g. from the test dataset below I'd like the following results:
Policy Currency GroupID Amount
a GBP 1 -30
a GBP 1 100
a JPY 2 70
a JPY 2 300
a USD 3 10
a USD 3 20
a USD 3 25
b GBP 1 55
b GBP 1 55
b USD 2 -20
c JPY 1 -80
(So the group ID increments only for every currency within a policy, regardless of the number of rows that use that policy, and is reset between policies.)
If I try something like:
select
policy
,currency
,row_number() over
(
partition by
policy
,currency
order by
policy
) as 'groupID'
,amount
Then the row number function gives incrementing numbers to groups of rows with a common policy and currency, so too granular a row_number is being generated, but if I try:
select
policy
,currency
,row_number() over
(
partition by
policy
order by
policy
) as 'groupID'
,amount
from partition_testing
Then I get a new set of incrementing numbers for groups of rows with a common policy.
Am I barking up the wrong tree with this, I'd just like to get a set of incrementing numbers for each currency within a policy as opposed to each row with a common currency/policy!
Any help greatly appreciated.
Many thanks.
--Test table and data:
create table partition_testing
(
policy varchar(10)
,currency char(3)
,sequence int
,amount int
)
insert into partition_testing (policy,currency,sequence,amount) values ('a','USD',1,10)
insert into partition_testing (policy,currency,sequence,amount) values ('a','USD',2,20)
insert into partition_testing (policy,currency,sequence,amount) values ('a','USD',3,25)
insert into partition_testing (policy,currency,sequence,amount) values ('a','GBP',4,-30)
insert into partition_testing (policy,currency,sequence,amount) values ('a','GBP',5,100)
insert into partition_testing (policy,currency,sequence,amount) values ('a','JPY',6,70)
insert into partition_testing (policy,currency,sequence,amount) values ('a','JPY',7,300)
insert into partition_testing (policy,currency,sequence,amount) values ('b','GBP',1,55)
insert into partition_testing (policy,currency,sequence,amount) values ('b','GBP',2,90)
insert into partition_testing (policy,currency,sequence,amount) values ('b','USD',3,-20)
insert into partition_testing (policy,currency,sequence,amount) values ('c','JPY',1,80)
April 18, 2011 at 8:44 am
could you add a sample outcome. It may be due to monday but I am not sure I follow what you are looking for.
Dan
If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.
April 18, 2011 at 8:46 am
You're nearly there...check out RANK() and DENSE_RANK().
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
April 18, 2011 at 9:02 am
Hi Dan,
Apologies, my OP is a little unclear!
What I would like is to be able to generate row numbers that are applied through a 'window' which views the sample data (in my original post) by policy.
So each currency used by a policy gets a new row number, and the row numbering restarts for each policy, as follows:
[font="Courier New"]
Policy-----Currency-----Row Number
A---------USD---------1
A---------GBP----------2
A---------JPY----------3
B---------USD----------1
B---------GBP----------2
[/font]
This is to be able to group sets of rows which are each against a common policy and currency.
The problem that I have is I cannot get the row numbers to be common across several rows which each use a common policy and currency, the following example shows the results that I get:
[font="Courier New"]
Policy-----Currency-----Amount------Row_Number() as GroupID
A---------USD---------10------------1
A---------USD---------10------------2
A---------USD---------10------------3
A---------GBP----------10-----------4
A---------JPY----------10-----------5
B---------USD----------10-----------1
B---------GBP----------10-----------2
[/font]
So each row within a common policy/currency gets a new row number, with the rownumber being restarted for every new policy.
Whereas I am trying to get the following results, note the values in the row number column, these only increment per policy and currency, rather than per row:
[font="Courier New"]
Policy-----Currency-----Amount------Row_Number() as GroupID
A---------USD---------10------------1
A---------USD---------10------------1
A---------USD---------10------------1
A---------GBP----------10-----------2
A---------JPY----------10-----------3
B---------USD----------10-----------1
B---------GBP----------10-----------2
[/font]
Hopefully this explains more clearly what I'm trying to achieve.
Many thanks,
Iain
April 18, 2011 at 9:18 am
Given your example
Policy-----Currency-----Amount------Row_Number() as GroupID
A---------USD---------10------------1
A---------USD---------10------------1
A---------USD---------10------------1
A---------GBP----------10-----------2
A---------JPY----------10-----------3
B---------USD----------10-----------1
B---------GBP----------10-----------2
Row_number would not be the tool to use. Data wise there is absolutley no difference that I can tell between
A---------USD---------10------------1
and
A---------USD---------10------------1
since there are identical records Row_Number would not assign them the same value no mater what you did.
Dan
If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.
April 18, 2011 at 9:23 am
DENSE_RANK is what you need , though you have not specified how to order the currencies.
April 18, 2011 at 9:28 am
My apologies again, I meant to have different amounts on each row (being a bit hasty), but also for arguments sake I've put in an additional column which contains data which cannot be aggregated:
[font="Courier New"]
Policy-----Currency-----Amount------Row_Number() as GroupID---Person
A---------USD---------10------------1--------------------------Bob
A---------USD---------20------------1--------------------------Fred
A---------USD---------40------------1--------------------------Sally
A---------GBP----------70-----------2--------------------------Harry
A---------JPY----------90-----------3--------------------------Simon
B---------USD----------15-----------1--------------------------Tammy
B---------GBP----------80----------2--------------------------Lisa
[/font]
Cheers,
Iain
April 18, 2011 at 9:30 am
Thank you for the for the tip, I did look briefly at these functions but they seem to continue the number series across each window ... is it possible to change this behaviour so that the numbers restart at one at the beginning of each window created by the partition clause?
Many thanks,
Iain
April 18, 2011 at 9:38 am
Ringo-394296 (4/18/2011)
Thank you for the for the tip, I did look briefly at these functions but they seem to continue the number series across each window ... is it possible to change this behaviour so that the numbers restart at one at the beginning of each window created by the partition clause?Many thanks,
Iain
Try it out, Iain. You've clearly got the hang of ROW_NUMBER(). Best way to learn is to experiment 😉
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
April 18, 2011 at 9:52 am
Dan/Chris/Dave,
Thanks for your help and suggestions, it sounds like I should research the other ranking functions but for the meantime (I have a deadline which is due in about 2 hours!) I've discovered a workaround:
Anyone reading this with a similar problem - my issue was being cause by the granularity of the data. The data contained the following hiererachy:
--Policy Reference
---------Currency
----------------Transaction
I was trying to gerate rownumber at the level of currency with data that was at the level of transaction, hence the rownumber function didn't stop at currency and generated the additional (unwanted) row numbers, even though transaction level details weren't shown in the result set).
My workaround is to take a distinct list of policy references and currencies into a side table and then run the row_number/partition by query over the resulting data (which is at the higher granularity of only currency).
The row_number results generated from that query (at the level of currency) I then wrote back to my original dataset using a join between my side table and main dataset on policy reference and currency.
A bit messy but it works and means I might be able to get home on time tonight!
Again thanks for all help.
April 18, 2011 at 10:05 am
Try this - it might not give you exactly what you need but it will be close enough for you to correct:
SELECT policy, currency, sequence, amount,
GroupID = DENSE_RANK() OVER(PARTITION BY policy ORDER BY currency)
FROM partition_testing
ORDER BY policy, currency, sequence
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply