How to de-dup values in a specific column?

  • How can I dedup the same values in a particular column?

    Here is my data:

    Col1             Col2

    -------------------------------------------------------------------

    917            2294,2294,2294,2294 

    918            2294,2294,2294

    920            2060

    Col2 has duplicate values in row 1 and 2.

    I need to transform col 2 data to look as below: 

    Col1          Col2

    ----------------------------------------

    917            2294

    918            2294

    920            2060

     

    Thanks

  • I know this isn't what you're asking for, but here's a solution which works for your example. Now, can you give us an example which breaks this solution? That will make it easier to understand your problem

    --data

    declare @t table (Col1 int, Col2 varchar(100))

    insert @t

              select 917, '2294,2294,2294,2294'

    union all select 918, '2294,2294,2294'

    union all select 920, '2060'

    --calculation

    select Col1, left(Col2, 4) as Col2 from @t

    --or

    select Col1, left(Col2, charindex(',', Col2+',')-1) as Col2 from @t

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Could you pl. provide some more information...

    1) How many different values can be there in Col2 ?! What is the length ?!

    2) Is it always comma separated and are the values always ints ?!







    **ASCII stupid question, get a stupid ANSI !!!**

  • Sushila,

    1.  Col2 is of data type NTEXT so there can be any number of  values in it.

    2.  Yes the values stored in Col2 are always type integer and separated by commas.

    Thanks

  • Hi Ryan,

    Thanks for proposing some solutions.  However my data comes in some rather ad-hoc format.

    Samples below did not yield expected results. 

    declare @t table (Col1 int, Col2 varchar(100))

    insert @t

              select 917, '2294,2294,2294,2294'

    union all select 918, '2294,2294,2294'

    union all select 920, '2060'

    union all select 921, '208,2294,2341,2294,399,2294,2294'

    select Col1, left(Col2, 4) as Col2 from @t

    Results:

    917 2294

    918 2294

    920 2060

    921 208,

    ----------------------------------------------------------

    insert @t

              select 917, '2294,2294,2294,2294'

    union all select 918, '2294,2294,2294'

    union all select 920, '2060'

    union all select 921, '208,2294,2341,2294,399,2294,2294'

    select Col1, left(Col2, charindex(',', Col2+',')-1) as Col2 from @t

    Results

    917 2294

    918 2294

    920 2060

    921 208

    expected Result for last row

    921     208,2294,2341,399

    Thanks

  • There's what we needed! Col2 can have different values in its list, and you've shown what you'd want to do with them. Now, on with the solution...

    Oh, last thought on understanding the problem - Is there some significance to the ordering of Col2, or can the results be in any order? E.g. does it have to be "208,2294,2341,399" in your example, or can it be, say, 208,399,2294,2341?

    Thanks...

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Ryan,

    "208,2294,2341,399" is OK.  208,399,2294,2341 is OK too.

    In the database there is a code table:

    ID      description

    ----------------------

    208     Project208

    399     Project399

    2294    Project2294

    2341    Project2341

    What I will have to do next is take 208,2294,2341,399 or

    208,399,2294,2341  and replace them with their descriptions to display them on the report.

    End result:

    Project1, Project399, Project 2294, Project2341 or

    Project208,Project2294,Project2341,Project399.

    Thanks

     

     

     

     

     

  • Ryan,

    "208,2294,2341,399" is OK.  208,399,2294,2341 is OK too.

    In the database there is a code table:

    ID      description

    ----------------------

    208     Project208

    399     Project399

    2294    Project2294

    2341    Project2341

    What I will have to do next is take 208,2294,2341,399 or

    208,399,2294,2341  and replace them with their descriptions to display them on the report.

    End result:

    Project1, Project399, Project 2294, Project2341 or

    Project208,Project2294,Project2341,Project399.

    Thanks

     

     

     

     

     

  • Hoa,

    How's this...?

    --data

    declare @t table (Col1 int, Col2 varchar(100))

    insert @t

              select 917, '2294,2294,2294,2294'

    union all select 918, '2294,2294,2294'

    union all select 920, '2060'

    union all select 921, '208,2294,2341,2294,399,2294,2294'

    --calculation

    declare @numbers table (i int identity(1,1), j bit)

    insert @numbers select top 101 null from master.dbo.syscolumns a, master.dbo.syscolumns b

    create table mytemprolluptable (Id int, v varchar(100))

    insert mytemprolluptable

    select distinct Col1, substring(Col2 + ',', i, charindex(',', substring(Col2 + ',', i, 100))-1) as Col2

    from @numbers, @t t

    where substring(',' + Col2, i, 1) = ','

    go

    --create function

    create function mytemprollupfunction (@id int) returns varchar(100) as

    begin

        declare @v-2 varchar(100)

        set @v-2 = ''

        select @v-2 = @v-2 + v + ',' from mytemprolluptable where id = @id

        return left(@v, len(@v) - 1)

    end

    go

    --select results

    select distinct Id as Col1, dbo.mytemprollupfunction(Id) as Col2 from mytemprolluptable

    --tidy up

    drop table mytemprolluptable

    drop function mytemprollupfunction

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Hi Ryan,

    It works wonderfully.  Thanks for your patience and help!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply