June 29, 2012 at 1:18 am
This is my original table:
declare @originalTable table (Username varchar(30), Date datetime, Code varchar(10))
insert into @originalTable
select 'User3', '2012-06-25', '007' union
select 'User1', '2012-06-20', '002' union
select 'User1', '2012-06-28', '010' union
select 'User1', '2012-06-21', '003' union
select 'User2', '2012-06-22', '004' union
select 'User2', '2012-06-23', '005' union
select 'User1', '2012-06-19', '001' union
select 'User3', '2012-06-26', '008' union
select 'User2', '2012-06-29', '011' union
select 'User3', '2012-06-24', '006' union
select 'User3', '2012-06-30', '012' union
select 'User1', '2012-06-27', '009'
I am looking for a select, which would give me this (in this exact order):
declare @expectedTable table (Username varchar(30), DateFrom datetime, DateTo datetime, CodeFrom varchar(10), CodeTo varchar(10))
insert into @expectedTable
select 'User1', '2012-06-19', '2012-06-21', '001', '003' union
select 'User2', '2012-06-22', '2012-06-23', '004', '005' union
select 'User3', '2012-06-24', '2012-06-26', '006', '008' union
select 'User1', '2012-06-27', '2012-06-28', '009', '010' union
select 'User2', '2012-06-29', '2012-06-29', '011', '011' union
select 'User3', '2012-06-30', '2012-06-30', '012', '012'
First I have to sort the original table by Code, I get this:
User12012-06-19 00:00:00.000001
User12012-06-20 00:00:00.000002
User12012-06-21 00:00:00.000003
User22012-06-22 00:00:00.000004
User22012-06-23 00:00:00.000005
User32012-06-24 00:00:00.000006
User32012-06-25 00:00:00.000007
User32012-06-26 00:00:00.000008
User12012-06-27 00:00:00.000009
User12012-06-28 00:00:00.000010
User22012-06-29 00:00:00.000011
User32012-06-30 00:00:00.000012
Now I should somehow group by Username and select min(Date), max(Date), min(Code), max(Code). But if I group by Username, then ALL records of an user are grouped together, but I need only those, who are together.
I guess I would need to add another column, like this:
1User12012-06-19 00:00:00.000001
1User12012-06-20 00:00:00.000002
1User12012-06-21 00:00:00.000003
2User22012-06-22 00:00:00.000004
2User22012-06-23 00:00:00.000005
3User32012-06-24 00:00:00.000006
3User32012-06-25 00:00:00.000007
3User32012-06-26 00:00:00.000008
4User12012-06-27 00:00:00.000009
4User12012-06-28 00:00:00.000010
5User22012-06-29 00:00:00.000011
6User32012-06-30 00:00:00.000012
Then I could group by this new column and Username and get expected results. How can I add this new column? I tried with ranking functions, but couldn't figure it out...
June 29, 2012 at 1:27 am
This, perhaps?
; with cte as
(
select *
, rn1 = ROW_NUMBER() over(Order by ot.code)
- ROW_NUMBER() over(partition by ot.username Order by ot.code)
from @originalTable ot
)
select c.Username
,DateFrom = MIN (c.Date)
,Dateto = max (c.Date)
,CodeFrom = MIN (c.Code)
,CodeTo = max (c.Code)
from cte c
group by c.Username , c.rn1
order by CodeFrom
June 29, 2012 at 1:33 am
@ColdCoffee: no, I tried that. The query
select
row_number() over (order by Code) RowNumber,
row_number() over (partition by username order by code) RowNumberWithPartition,
row_number() over (order by Code) - row_number() over (partition by username order by code) Combined,
*
from @originalTable
order by Code
yields this:
110User12012-06-19 00:00:00.000001
220User12012-06-20 00:00:00.000002
330User12012-06-21 00:00:00.000003
413User22012-06-22 00:00:00.000004
523User22012-06-23 00:00:00.000005
615User32012-06-24 00:00:00.000006
725User32012-06-25 00:00:00.000007
835User32012-06-26 00:00:00.000008
945User12012-06-27 00:00:00.000009
1055User12012-06-28 00:00:00.000010
1138User22012-06-29 00:00:00.000011
1248User32012-06-30 00:00:00.000012
The row_number function with partition by is incorrect because it continues with counting instead of starting it over again when the user changes.
June 29, 2012 at 2:34 am
SELECT Username,
DateFrom = MIN([Date]),
DateTo = MAX([Date]),
CodeFrom = MIN(Code),
CodeTo = MAX(Code)
FROM (
SELECT Username, [Date], Code,
rn = ROW_NUMBER() OVER(ORDER BY Username, Code),
dr = DENSE_RANK() OVER(ORDER BY Code)
FROM @originalTable
) d
GROUP BY Username, dr-rn
ORDER BY 2
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 29, 2012 at 3:03 am
_simon_ (6/29/2012)
@ChrisM@Work - that's it, thanks πIs there any specific reason why you used dense_rank function? I tried with two row_number functions and it still works ok.
I'd be interested to see your version with two row_number functions. I used dense_rank because it's easier to work out -I can't remember how to do this stuff.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 29, 2012 at 3:10 am
It is exactly the same as yours, I just replaced dense_rank function with row_number:
SELECT Username,
DateFrom = MIN([Date]),
DateTo = MAX([Date]),
CodeFrom = MIN(Code),
CodeTo = MAX(Code)
FROM (
SELECT Username, [Date], Code,
rn = ROW_NUMBER() OVER(ORDER BY Username, Code),
dr = ROW_NUMBER() OVER(ORDER BY Code)
FROM @originalTable
) d
GROUP BY Username, dr-rn
ORDER BY 2
I tested it on large data and the results is the same in both cases.
June 29, 2012 at 9:40 am
_simon_ (6/29/2012)
The row_number function with partition by is incorrect because it continues with counting instead of starting it over again when the user changes.
My query produces the exact same output that you asked for. Did you ask for a extra colum with increasing numbers or did you want you result to be Grouped by a the order of code with recurring Username?
July 1, 2012 at 11:43 pm
I want my results to be grouped, your query is totally correct. I was just curious if there is any reason behind the decision, that you used dense_rank() instead of row_number().
These two queries returns the same result:
SELECT Username,
DateFrom = MIN([Date]),
DateTo = MAX([Date]),
CodeFrom = MIN(Code),
CodeTo = MAX(Code)
FROM (
SELECT Username, [Date], Code,
rn = ROW_NUMBER() OVER(ORDER BY Username, Code),
dr = DENSE_RANK() OVER(ORDER BY Code)
FROM @originalTable
) d
GROUP BY Username, dr-rn
ORDER BY 2
SELECT Username,
DateFrom = MIN([Date]),
DateTo = MAX([Date]),
CodeFrom = MIN(Code),
CodeTo = MAX(Code)
FROM (
SELECT Username, [Date], Code,
rn = ROW_NUMBER() OVER(ORDER BY Username, Code),
dr = ROW_NUMBER() OVER(ORDER BY Code)
FROM @originalTable
) d
GROUP BY Username, dr-rn
ORDER BY 2
Edit - but it doesn't matter really, main thing is, that it just works π
July 2, 2012 at 1:52 am
_simon_ (7/1/2012)
I want my results to be grouped, your query is totally correct. I was just curious if there is any reason behind the decision, that you used dense_rank() instead of row_number().These two queries returns the same result:
SELECT Username,
DateFrom = MIN([Date]),
DateTo = MAX([Date]),
CodeFrom = MIN(Code),
CodeTo = MAX(Code)
FROM (
SELECT Username, [Date], Code,
rn = ROW_NUMBER() OVER(ORDER BY Username, Code),
dr = DENSE_RANK() OVER(ORDER BY Code)
FROM @originalTable
) d
GROUP BY Username, dr-rn
ORDER BY 2
SELECT Username,
DateFrom = MIN([Date]),
DateTo = MAX([Date]),
CodeFrom = MIN(Code),
CodeTo = MAX(Code)
FROM (
SELECT Username, [Date], Code,
rn = ROW_NUMBER() OVER(ORDER BY Username, Code),
dr = ROW_NUMBER() OVER(ORDER BY Code)
FROM @originalTable
) d
GROUP BY Username, dr-rn
ORDER BY 2
Edit - but it doesn't matter really, main thing is, that it just works π
No other reason than habit π
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply