September 18, 2003 at 8:33 am
I have a table with 5 columns which contains values entered by the users.
The problem I have is that the users might enter the values in any order but I need to work out whether the same set of values has already been entered.
For example:
col1 col2 col3
A B C
B A C
These two lines are the same as far as the application is concerned. How can I 'sort the columns' to that I can get two rows like:
A B C
A B C
I have an idea involving a case statement and another idea normalising the data into a temporary table and then using an order by but I'm not sure whether it will acutally work or whether there is a very simple solution that somebody knows about.
Jeremy
September 18, 2003 at 8:51 am
Hi Jeremy,
quote:
I have a table with 5 columns which contains values entered by the users.The problem I have is that the users might enter the values in any order but I need to work out whether the same set of values has already been entered.
For example:
col1 col2 col3
A B C
B A C
These two lines are the same as far as the application is concerned. How can I 'sort the columns' to that I can get two rows like:
A B C
A B CI have an idea involving a case statement and another idea normalising the data into a temporary table and then using an order by but I'm not sure whether it will acutally work or whether there is a very simple solution that somebody knows about.
does this need to be done at the server, or isn't this validation which could be done at the client.
What do you need it for?
Frank
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
September 18, 2003 at 9:01 am
It is something that should have been done on the client at the time they entered the data but hindsight is a wonderful thing.
I need to do this within SQL Server.
The applications uses the data to, essentially, create a where clause for an sql statement and the values contain both the column/value/operator which I build in a statement.
For example:
Col1 Col2
sex = 'M' age > 50
age > 50 sex = 'M'
If I create an SQL statement from each line:
select *
from table
where
sex = 'M'
and
age > 50
select *
from table
where
age > 50
and
sex = 'M'
These two SQL Statements will return the exactly the same set of results but the way the 'criteria' is stored in the table is different.
Jeremy
September 18, 2003 at 9:11 am
The simplest way would be to use a UDF or UDF's, passing in all 3 columns, and getting it/them to return the columns in the correct order.
Depending on how you are going to use the data, you could use a single Table valued UDF to return 3 columns, or write 3 separate UDF's so that each will return a single value.
September 18, 2003 at 9:40 pm
if columns have only A,B,C VALUES (Or limited variations)
if the problem is so simple; i offer u to use some kind of mathematical Total Approach.
Hope this Helps.
A=1
B=25
C=1000
select
ID,
CASE COL1
WHEN 'A' then 1
WHEN 'B' THEN 25
WHEN 'C' THEN 1000
END +
CASE COL2
WHEN 'A' then 1
WHEN 'B' THEN 25
WHEN 'C' THEN 1000
END +
CASE COL3
WHEN 'A' then 1
WHEN 'B' THEN 25
WHEN 'C' THEN 1000
END as MATHEMATICAL_TOTAL
FROM XXXXX_OPTIONS_TABLE
September 19, 2003 at 4:22 am
I agree with alkanfer and ian ...U can use a simple query with case statements in case the data u r trying to fetch is limited to less number of columns or the amount of data is less...else u can implement the logic in a UDF and use it to give the expected output if the amt of data is more.
Cheers!
Arvind
Arvind
September 19, 2003 at 5:28 am
Hmm.. and what if one user writes age > 50 and another one age>50 (without spaces around operator)? It is still the same condition, but unless you set up the function to trap it, it will be considered as two different values.
I'm still not sure what you want to do - check and correct existing data, while new data will be already checked on client side? Perform the check periodically? Or do you want to start this check as soon as someone is trying to enter new conditions, and in case such conditions are already stored, make sure that the stored set is used instead of the freshly entered (i.e., prevent insert of a "duplicate" condition)?
September 19, 2003 at 5:44 am
Vladan,
I wanted to check existing data because of a problem we had with the production system.
We solved it in the end by manually rearranging the columns in excel and doing it that way.
I will be re-writing that part of the application and will put in the proper checking in the new screen.
Thanks for all the ideas.
Jeremy
Edited by - Jeremy Kemp on 09/19/2003 05:44:35 AM
September 19, 2003 at 6:00 am
A little bit late!
Set NoCount On
Create Table Tmp1(Col1 Varchar(15),Col2 Varchar(15))
Go
Insert Tmp1
Select 'sex = ''M''','age > 50'
Insert Tmp1
Select 'age > 55','sex = ''F'''
Insert Tmp1
Select 'age > 50','sex = ''M'''
Insert Tmp1
Select 'age > 55','sex = ''M'''
Insert Tmp1
Select 'A','B'
Insert Tmp1
Select 'B','A'
Insert Tmp1
Select 'D','B'
Go
-- Warning ~ must not be part of the data, is used as a delimiter
Select t1.*,t2.*
From Tmp1 as t1
Cross Join Tmp1 as t2
Where t1.Col1<T2.Col1 And
Replace(Replace('~'+t1.Col1+'~'+t1.Col2+'~',t2.Col1,'~'),t2.Col2,'~')='~~~~~'
Go
Drop Table Tmp1
Go
Col1 Col2 Col1 Col2
--------------- --------------- --------------- ---------------
age > 50 sex = 'M' sex = 'M' age > 50
A B B A
September 19, 2003 at 6:53 am
Jeremy,
I have a fast idea which is not very clean because it involves a cursor but it seems to work.
The idea is to make a double transformation. here is the script:
create table #t (id int, a varchar(50), b varchar(50), c varchar(50))
create table #t1 (id int, ord int, col varchar(50), val varchar(50))
create table #t2 (id int, a varchar(50), b varchar(50), c varchar(50))
--initial feed
insert into #t values (1, 'A', 'B', 'C')
insert into #t values (2, 'B', 'C', 'A')
--first transformation
insertinto #t1
selectid,
0,
'a',
a
from#t
union all
selectid,
0,
'b',
b
from#t
union all
selectid,
0,
'c',
c
from#t
--this is just to getting the right order
--because you cannot put any order by to an updateable cursor
create clustered index i1 on #t1(id, val)
declare@i int,
@id int,
@idBase int
declareC cursor for select id from #t1 for update
open C
fetch C into @id
set @idBase = 0
while @@fetch_status = 0
begin
if @id != @idBase
begin
set @idBase = @id
set @i = 0
end
set @i = @i + 1
update#t1 set ord = @i where current of C
fetch C into @id
end
close C
deallocate C
--second transformation
insert into #t2
selectt11.id,
t11.val,
t12.val,
t13.val
from#t1 as t11 join #t1 as t12 on t11.id = t12.id
join #t1 as t13 on t11.id = t13.id
wheret11.ord = 1
andt12.ord = 2
andt13.ord = 3
-- here is what you want to get
select * from #t2
From now on you can make on the #t2 table what ever you want (search for existence...)
I hope it helps
Bye
Gabor
Bye
Gabor
September 19, 2003 at 11:28 am
Here's another solution using the CHECKSUM function.
--
-- Create a temp table variable for demonstration purposes.
--
declare @temp table
(
pk int identity primary key, -- I assume the table in question has an ID column...
col1 varchar(20),
col2 varchar(20),
col3 varchar(20)
)
--
-- Populate the temp table with some sample values, including duplicates.
--
insert @temp (col1, col2, col3) values ('A', 'B', 'C')
insert @temp (col1, col2, col3) values ('E', 'F', 'G')
insert @temp (col1, col2, col3) values ('H', 'I', 'J')
insert @temp (col1, col2, col3) values ('Xx', 'Yy', 'Zz')
insert @temp (col1, col2, col3) values ('age > 50', 'name = ''John Smith'' ', 'ssn = 123121234')
insert @temp (col1, col2, col3) values ('B', 'A', 'C') -- Dupe
insert @temp (col1, col2, col3) values ('C', 'B', 'A') -- Dupe
insert @temp (col1, col2, col3) values ('Zz', 'Yy', 'Xx') -- Dupe
insert @temp (col1, col2, col3) values ('name = ''John Smith'' ', 'age > 50', 'ssn = 123121234') -- Dupe
--
-- We might occasionally get an overflow from adding checksum values.
-- We need to temporarily ignore them.
--
set arithabort off
set arithignore on
--
-- Use the CHECKSUM function to detect columns with duplicate values.
-- Usually checksum is used like: checksum(col1, col2, col3), but that
-- would not work here because the return value is dependent on column order.
--
select t1.*,
checksum(t1.col1) + checksum(t1.col2) + checksum(t1.col3) as chksum
from @temp as t1
where exists
(
select *
from @temp as t2
where ( checksum(t2.col1) + checksum(t2.col2) + checksum(t2.col3) ) =
( checksum(t1.col1) + checksum(t1.col2) + checksum(t1.col3) )
and t1.pk <> t2.pk
)
order by chksum, pk
--
-- Return settings to normal.
--
set arithabort on
set arithignore off
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply