Sorting Columns

  • I have a table with 5 columns which contains values entered by the users.

    The problem I have is that the users might enter the values in any order but I need to work out whether the same set of values has already been entered.

    For example:

    
    
    col1 col2 col3
    A B C
    B A C

    These two lines are the same as far as the application is concerned. How can I 'sort the columns' to that I can get two rows like:

    
    
    A B C
    A B C

    I have an idea involving a case statement and another idea normalising the data into a temporary table and then using an order by but I'm not sure whether it will acutally work or whether there is a very simple solution that somebody knows about.

    Jeremy

  • Hi Jeremy,

    quote:


    I have a table with 5 columns which contains values entered by the users.

    The problem I have is that the users might enter the values in any order but I need to work out whether the same set of values has already been entered.

    For example:

    
    
    col1 col2 col3
    A B C
    B A C

    These two lines are the same as far as the application is concerned. How can I 'sort the columns' to that I can get two rows like:

    
    
    A B C
    A B C

    I have an idea involving a case statement and another idea normalising the data into a temporary table and then using an order by but I'm not sure whether it will acutally work or whether there is a very simple solution that somebody knows about.


    does this need to be done at the server, or isn't this validation which could be done at the client.

    What do you need it for?

    Frank

    http://www.insidesql.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • It is something that should have been done on the client at the time they entered the data but hindsight is a wonderful thing.

    I need to do this within SQL Server.

    The applications uses the data to, essentially, create a where clause for an sql statement and the values contain both the column/value/operator which I build in a statement.

    For example:

    
    
    Col1 Col2
    sex = 'M' age > 50
    age > 50 sex = 'M'

    If I create an SQL statement from each line:

    select *

    from table

    where

    sex = 'M'

    and

    age > 50

    select *

    from table

    where

    age > 50

    and

    sex = 'M'

    These two SQL Statements will return the exactly the same set of results but the way the 'criteria' is stored in the table is different.

    Jeremy

  • The simplest way would be to use a UDF or UDF's, passing in all 3 columns, and getting it/them to return the columns in the correct order.

    Depending on how you are going to use the data, you could use a single Table valued UDF to return 3 columns, or write 3 separate UDF's so that each will return a single value.

  • if columns have only A,B,C VALUES (Or limited variations)

    if the problem is so simple; i offer u to use some kind of mathematical Total Approach.

    Hope this Helps.

    A=1

    B=25

    C=1000

    select

    ID,

    CASE COL1

    WHEN 'A' then 1

    WHEN 'B' THEN 25

    WHEN 'C' THEN 1000

    END +

    CASE COL2

    WHEN 'A' then 1

    WHEN 'B' THEN 25

    WHEN 'C' THEN 1000

    END +

    CASE COL3

    WHEN 'A' then 1

    WHEN 'B' THEN 25

    WHEN 'C' THEN 1000

    END as MATHEMATICAL_TOTAL

    FROM XXXXX_OPTIONS_TABLE

  • I agree with alkanfer and ian ...U can use a simple query with case statements in case the data u r trying to fetch is limited to less number of columns or the amount of data is less...else u can implement the logic in a UDF and use it to give the expected output if the amt of data is more.

    Cheers!

    Arvind


    Arvind

  • Hmm.. and what if one user writes age > 50 and another one age>50 (without spaces around operator)? It is still the same condition, but unless you set up the function to trap it, it will be considered as two different values.

    I'm still not sure what you want to do - check and correct existing data, while new data will be already checked on client side? Perform the check periodically? Or do you want to start this check as soon as someone is trying to enter new conditions, and in case such conditions are already stored, make sure that the stored set is used instead of the freshly entered (i.e., prevent insert of a "duplicate" condition)?

  • Vladan,

    I wanted to check existing data because of a problem we had with the production system.

    We solved it in the end by manually rearranging the columns in excel and doing it that way.

    I will be re-writing that part of the application and will put in the proper checking in the new screen.

    Thanks for all the ideas.

    Jeremy

    Edited by - Jeremy Kemp on 09/19/2003 05:44:35 AM

  • A little bit late!

    Set NoCount On
    
    Create Table Tmp1(Col1 Varchar(15),Col2 Varchar(15))
    Go
    Insert Tmp1
    Select 'sex = ''M''','age > 50'
    Insert Tmp1
    Select 'age > 55','sex = ''F'''
    Insert Tmp1
    Select 'age > 50','sex = ''M'''
    Insert Tmp1
    Select 'age > 55','sex = ''M'''
    Insert Tmp1
    Select 'A','B'
    Insert Tmp1
    Select 'B','A'
    Insert Tmp1
    Select 'D','B'
    Go
    -- Warning ~ must not be part of the data, is used as a delimiter
    Select t1.*,t2.*
    From Tmp1 as t1
    Cross Join Tmp1 as t2
    Where t1.Col1<T2.Col1 And
    Replace(Replace('~'+t1.Col1+'~'+t1.Col2+'~',t2.Col1,'~'),t2.Col2,'~')='~~~~~'

    Go
    Drop Table Tmp1
    Go
    Col1 Col2 Col1 Col2
    --------------- --------------- --------------- ---------------
    age > 50 sex = 'M' sex = 'M' age > 50
    A B B A

  • Jeremy,

    I have a fast idea which is not very clean because it involves a cursor but it seems to work.

    The idea is to make a double transformation. here is the script:

    create table #t (id int, a varchar(50), b varchar(50), c varchar(50))

    create table #t1 (id int, ord int, col varchar(50), val varchar(50))

    create table #t2 (id int, a varchar(50), b varchar(50), c varchar(50))

    --initial feed

    insert into #t values (1, 'A', 'B', 'C')

    insert into #t values (2, 'B', 'C', 'A')

    --first transformation

    insertinto #t1

    selectid,

    0,

    'a',

    a

    from#t

    union all

    selectid,

    0,

    'b',

    b

    from#t

    union all

    selectid,

    0,

    'c',

    c

    from#t

    --this is just to getting the right order

    --because you cannot put any order by to an updateable cursor

    create clustered index i1 on #t1(id, val)

    declare@i int,

    @id int,

    @idBase int

    declareC cursor for select id from #t1 for update

    open C

    fetch C into @id

    set @idBase = 0

    while @@fetch_status = 0

    begin

    if @id != @idBase

    begin

    set @idBase = @id

    set @i = 0

    end

    set @i = @i + 1

    update#t1 set ord = @i where current of C

    fetch C into @id

    end

    close C

    deallocate C

    --second transformation

    insert into #t2

    selectt11.id,

    t11.val,

    t12.val,

    t13.val

    from#t1 as t11 join #t1 as t12 on t11.id = t12.id

    join #t1 as t13 on t11.id = t13.id

    wheret11.ord = 1

    andt12.ord = 2

    andt13.ord = 3

    -- here is what you want to get

    select * from #t2

    From now on you can make on the #t2 table what ever you want (search for existence...)

    I hope it helps

    Bye

    Gabor



    Bye
    Gabor

  • Here's another solution using the CHECKSUM function.

    --

    -- Create a temp table variable for demonstration purposes.

    --

    declare @temp table

    (

    pk int identity primary key, -- I assume the table in question has an ID column...

    col1 varchar(20),

    col2 varchar(20),

    col3 varchar(20)

    )

    --

    -- Populate the temp table with some sample values, including duplicates.

    --

    insert @temp (col1, col2, col3) values ('A', 'B', 'C')

    insert @temp (col1, col2, col3) values ('E', 'F', 'G')

    insert @temp (col1, col2, col3) values ('H', 'I', 'J')

    insert @temp (col1, col2, col3) values ('Xx', 'Yy', 'Zz')

    insert @temp (col1, col2, col3) values ('age > 50', 'name = ''John Smith'' ', 'ssn = 123121234')

    insert @temp (col1, col2, col3) values ('B', 'A', 'C') -- Dupe

    insert @temp (col1, col2, col3) values ('C', 'B', 'A') -- Dupe

    insert @temp (col1, col2, col3) values ('Zz', 'Yy', 'Xx') -- Dupe

    insert @temp (col1, col2, col3) values ('name = ''John Smith'' ', 'age > 50', 'ssn = 123121234') -- Dupe

    --

    -- We might occasionally get an overflow from adding checksum values.

    -- We need to temporarily ignore them.

    --

    set arithabort off

    set arithignore on

    --

    -- Use the CHECKSUM function to detect columns with duplicate values.

    -- Usually checksum is used like: checksum(col1, col2, col3), but that

    -- would not work here because the return value is dependent on column order.

    --

    select t1.*,

    checksum(t1.col1) + checksum(t1.col2) + checksum(t1.col3) as chksum

    from @temp as t1

    where exists

    (

    select *

    from @temp as t2

    where ( checksum(t2.col1) + checksum(t2.col2) + checksum(t2.col3) ) =

    ( checksum(t1.col1) + checksum(t1.col2) + checksum(t1.col3) )

    and t1.pk <> t2.pk

    )

    order by chksum, pk

    --

    -- Return settings to normal.

    --

    set arithabort on

    set arithignore off

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply