Get spans from a table

  • I have a table with codes which are inserted by a user. Now I need to display this data with spans, so it is easily visible who inserted which codes and when.

    Sorry for the bad description, but I think the example will describe it better:

    if object_id('tempdb..#temp') is not null drop table #temp

    create table #temp (Id int identity(1,1), Date datetime, UserId int, Code bigint)

    go

    insert into #temp (Date, UserId, Code)

    select '2010-08-10 08:02:00', 1, 1000001 union all

    select '2010-08-10 08:03:00', 1, 1000002 union all

    select '2010-08-10 08:04:00', 1, 1000003 union all

    select '2010-08-10 08:05:00', 1, 1000004 union all

    select '2010-08-10 08:06:00', 1, 1000005 union all

    select '2010-08-10 08:07:00', 2, 1000006 union all

    select '2010-08-10 08:08:00', 2, 1000007 union all

    select '2010-08-10 08:09:00', 2, 1000008 union all

    select '2010-08-10 08:10:00', 1, 1000009 union all

    select '2010-08-10 08:11:00', 1, 1000010 union all

    select '2010-08-10 08:12:00', 1, 1000011 union all

    select '2010-08-10 08:13:00', 3, 1000012 union all

    select '2010-08-10 08:14:00', 1, 1000013 union all

    select '2010-08-10 08:15:00', 1, 1000014 union all

    select '2010-08-10 08:16:00', 2, 1000015 union all

    select '2010-08-10 08:17:00', 2, 1000016 union all

    select '2010-08-10 08:18:00', 3, 1000017 union all

    select '2010-08-10 08:19:00', 3, 1000018 union all

    select '2010-08-10 08:20:00', 3, 1000019 union all

    select '2010-08-10 08:21:00', 1, 1000020

    -- I want to get this data

    select '2010-08-10 08:02:00' DateFrom, '2010-08-10 08:06:00' DateTo, 1 UserId, 1000001 CodeFrom, 1000005 CodeTo union all

    select '2010-08-10 08:07:00' DateFrom, '2010-08-10 08:09:00' DateTo, 2 UserId, 1000006 CodeFrom, 1000008 CodeTo union all

    select '2010-08-10 08:10:00' DateFrom, '2010-08-10 08:12:00' DateTo, 1 UserId, 1000009 CodeFrom, 1000011 CodeTo union all

    select '2010-08-10 08:13:00' DateFrom, '2010-08-10 08:13:00' DateTo, 3 UserId, 1000012 CodeFrom, 1000012 CodeTo union all

    select '2010-08-10 08:14:00' DateFrom, '2010-08-10 08:15:00' DateTo, 1 UserId, 1000013 CodeFrom, 1000014 CodeTo union all

    select '2010-08-10 08:16:00' DateFrom, '2010-08-10 08:17:00' DateTo, 2 UserId, 1000015 CodeFrom, 1000016 CodeTo union all

    select '2010-08-10 08:18:00' DateFrom, '2010-08-10 08:20:00' DateTo, 3 UserId, 1000017 CodeFrom, 1000019 CodeTo union all

    select '2010-08-10 08:21:00' DateFrom, '2010-08-10 08:21:00' DateTo, 1 UserId, 1000020 CodeFrom, 1000020 CodeTo

    -- which produces...

    DateFrom DateTo UserId CodeFrom CodeTo

    2010-08-10 08:02:00 2010-08-10 08:06:00 1 10000011000005

    2010-08-10 08:07:00 2010-08-10 08:09:00 2 10000061000008

    2010-08-10 08:10:00 2010-08-10 08:12:00 1 10000091000011

    2010-08-10 08:13:00 2010-08-10 08:13:00 3 10000121000012

    2010-08-10 08:14:00 2010-08-10 08:15:00 1 10000131000014

    2010-08-10 08:16:00 2010-08-10 08:17:00 2 10000151000016

    2010-08-10 08:18:00 2010-08-10 08:20:00 3 10000171000019

    2010-08-10 08:21:00 2010-08-10 08:21:00 1 10000201000020

    Again I could use a cursor, but there must be a cleaner solution.

    Thanks

  • Please test if performance is acceptable in you case:

    ;with termrec

    as

    (

    select t1.UserId, t1.Date, t1.Code

    from #temp t1

    left join #temp t2

    on t2.UserId = t1.UserId

    and t2.Code = t1.Code + 1

    where t2.Date is null

    )

    select t1.UserId, min(t1.Date) DateFrom, tor.DateTo, min(t1.Code) CodeFrom, tor.CodeTo

    from #temp t1

    cross apply (select min(Date) DateTo, min(Code) CodeTo from termrec tr where tr.UserId = t1.UserId and tr.Date >= t1.Date) tor

    group by t1.UserId, tor.DateTo, tor.CodeTo

    order by tor.DateTo, t1.UserId

    There is an option to implement this using "quirky update" method well described by Jeff Moden here:

    http://www.sqlservercentral.com/articles/T-SQL/68467/

    NOTE: Pay especial attention to the last "the RULES" section of the above article!

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Works great, thanks!

    And the article goes to my to-do list 🙂

  • ... but extremely slow on big tables though, probably because of cross join...

  • Use "quirky update"!

    Also, you may speed up my query by adding indices on Date and Code columns...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Here it is, to help you:

    -- you will need table with two extra columns for DateFrom and CodeFrom

    -- "select into" is quite fast even for large tables

    select *, CAST(null as datetime) DateFrom, CAST(null as bigint) CodeFrom

    into #temp1

    from #temp

    -- the clustered index (a must have) is required to ensure order of update!

    -- I hope it can be declared as unique...

    create unique clustered index ix_temp1 on #temp1(Date, UserId)

    -- variable to store intermdiate values and anchor

    declare @DateFrom datetime

    declare @CodeFrom bigint

    declare @crusr int

    -- "quirky" update will set relevant DateFrom and CodeFrom for each record,

    -- so we will be able to group by them

    update #temp1

    set @DateFrom = DateFrom = case when UserId = @crusr then @DateFrom else [Date] end

    ,@CodeFrom = CodeFrom = case when UserId = @crusr then @CodeFrom else Code end

    ,@crusr = UserId

    option (maxdop 1)

    -- final query is smiple enough...

    select UserId, DateFrom, MAX([Date]) DateTo, CodeFrom, MAX(Code) CodeTo

    from #temp1

    group by UserId, DateFrom, CodeFrom

    order by DateFrom, UserId

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • This alternative doesn't use the quirky update, and doesn't have any specific indexing requirements in order to produce the right answer, though appropriate indexes on Code and UserId columns would improve performance.

    ;WITH cteSPAN AS (

    SELECT

    [Date],

    UserId,

    Code,

    ROW_NUMBER() OVER (ORDER BY Code)

    - ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Code) AS SpanId

    FROM #temp

    )

    SELECT

    MIN([Date]) AS DateFrom,

    MAX([Date]) AS DateTo,

    UserId,

    MIN(Code) AS CodeFrom,

    MAX(Code) AS CodeTo

    FROM cteSPAN

    GROUP BY UserId, SpanId

    ORDER BY CodeFrom

    If you can guarantee that the Code column contains sequential integers with no gaps then the following will also work and will be faster:

    ;WITH cteSPAN AS (

    SELECT

    [Date],

    UserId,

    Code,

    Code - ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Code) AS SpanId

    FROM #temp

    )

    SELECT

    MIN([Date]) AS DateFrom,

    MAX([Date]) AS DateTo,

    UserId,

    MIN(Code) AS CodeFrom,

    MAX(Code) AS CodeTo

    FROM cteSPAN

    GROUP BY UserId, SpanId

    ORDER BY CodeFrom

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply