November 25, 2011 at 5:19 am
Hi,
I have 2 columns:
col1 col2
abc\aaa aaa\abc
ax\pq pqr\aaa
prq\aaa prt\arz
I want to query out results with matching characters:
Result
col1 col2
abc\aaa aaa\abc
prq\aaa pqr\aaa
Is there any way of doing this using SQL?
Regards,
Ankit
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”
November 25, 2011 at 6:33 am
November 25, 2011 at 11:07 am
Hi Gila,
Thanks for your reply. The data is normalized but I have concatenated fields separated by "/". I now want to populate an ID for these fields, matching them regardless of part order, e.g. [abc/pqr] = [pqr/abc].
I wanted to know if it was possible without splitting.
Eagerly waiting for your response.
Ankit
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”
November 28, 2011 at 2:33 am
SQL_By_Chance (11/25/2011)
The data is normalized but I have concatenated fields separated by "/".
Oxymoron. The former or the latter, it can't be both at the same time.
I wanted to know if it was possible without splitting.
No, not possible. Why don't you want to split? You can do it on the fly in your query, you don't need to apply changes to your existing data.
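For example, here is a minimal sketch of splitting on the fly, using the sample values from the first post (CHARINDEX finds the separator at query time, so the stored data is never modified; it assumes every value contains exactly one separator):
-- Minimal sketch: split both columns at query time and compare the parts
-- in either order. Assumes every value contains the '\' separator.
DECLARE @t TABLE (col1 VARCHAR(20), col2 VARCHAR(20))
INSERT INTO @t
SELECT 'abc\aaa', 'aaa\abc'
UNION ALL SELECT 'ax\pq', 'pqr\aaa'
UNION ALL SELECT 'prq\aaa', 'prt\arz'

SELECT t.col1, t.col2
FROM @t t
CROSS APPLY (SELECT LEFT(t.col1, CHARINDEX('\', t.col1) - 1) AS c1a,    -- part before '\'
                    STUFF(t.col1, 1, CHARINDEX('\', t.col1), '') AS c1b, -- part after '\'
                    LEFT(t.col2, CHARINDEX('\', t.col2) - 1) AS c2a,
                    STUFF(t.col2, 1, CHARINDEX('\', t.col2), '') AS c2b) s
WHERE (s.c1a = s.c2a AND s.c1b = s.c2b) -- same parts, same order
   OR (s.c1a = s.c2b AND s.c1b = s.c2a) -- same parts, swapped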
-- Gianluca Sartori
November 28, 2011 at 3:14 am
The data is normalized but I have concatenated fields separated by "/".
Only one question. Why???
November 28, 2011 at 3:21 am
SQL_By_Chance (11/25/2011)
The data is normalized but I have concatenated fields separated by "/"
Hope that CELKO doesn't see that comment 😉
SQL_By_Chance (11/25/2011)
I wanted to know if it was possible without splitting.
So you want to compare split parts of a string without splitting? I don't think you'll find a way to do that.
DECLARE @TABLE AS TABLE (col1 VARCHAR(7), col2 VARCHAR(7))

INSERT INTO @TABLE
SELECT 'abc\aaa', 'aaa\abc'
UNION ALL SELECT 'ax\pq', 'pqr\aaa'
UNION ALL SELECT 'prq\aaa', 'prt\arz'
UNION ALL SELECT 'axi\pq', 'prq\aaa'

-- Pair every col1 with every col2
;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM @TABLE a
    CROSS JOIN @TABLE b)
SELECT col1, col2
FROM CTE a
-- Locate the separator and measure each value once per row
CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                    PATINDEX('%\%', col2) AS patternCol2,
                    LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
             FROM CTE
             WHERE a.col1 = col1 AND a.col2 = col2) b
-- Each part of col1 must match one of the two parts of col2
WHERE ((SUBSTRING(col1,1,patternCol1-1) = SUBSTRING(col2,1,patternCol2-1))
    OR (SUBSTRING(col1,1,patternCol1-1) = SUBSTRING(col2,patternCol2+1, lengthCol2)))
  AND ((SUBSTRING(col1,patternCol1+1, lengthCol1) = SUBSTRING(col2,1,patternCol2-1))
    OR (SUBSTRING(col1,patternCol1+1, lengthCol1) = SUBSTRING(col2,patternCol2+1, lengthCol2)))
The above, which is not as efficient as the string splitter that you were pointed at, will compare the string parts with each other.
e.g. abc\aaa = aaa\abc but prq\aaa != pqr\aaa
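As a hedged aside (this extends the idea; it is not from the original reply): for the "populate an ID" half of the question, canonicalising each value so the alphabetically smaller part comes first would let DENSE_RANK hand matching values the same ID. Something like:
-- Hypothetical sketch: build a 'smaller part first' canonical key per value,
-- then rank over it so order-insensitive matches share an ID.
DECLARE @t TABLE (col1 VARCHAR(7))
INSERT INTO @t
SELECT 'abc\aaa'
UNION ALL SELECT 'aaa\abc'
UNION ALL SELECT 'prq\aaa'

SELECT col1,
       DENSE_RANK() OVER (ORDER BY x.canonical) AS matchID -- same ID for abc\aaa and aaa\abc
FROM @t
CROSS APPLY (SELECT CASE WHEN LEFT(col1, CHARINDEX('\', col1) - 1)
                              <= STUFF(col1, 1, CHARINDEX('\', col1), '')
                         THEN col1 -- already smaller-part-first
                         ELSE STUFF(col1, 1, CHARINDEX('\', col1), '')
                              + '\' + LEFT(col1, CHARINDEX('\', col1) - 1)
                    END) x(canonical)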
November 28, 2011 at 3:42 am
Thanks Gila
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”
November 28, 2011 at 3:45 am
Because the client requirement is like that 🙂.
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”
November 28, 2011 at 3:48 am
Thanks Cadavre,
Wish you many more centuries for providing the solution.
Thanks,
Ankit
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”
November 28, 2011 at 4:21 am
SQL_By_Chance (11/28/2011)
Thanks Cadavre,
Wish you many more centuries for providing the solution.
Thanks,
Ankit
Bear in mind that my solution is going to die a horrible death on a real table with even a few thousand rows. I'll set up a test to prove how poor the performance is later on.
This can be avoided by normalising the data.
Tests, as promised:
BEGIN TRAN
SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
    DROP TABLE #testEnvironment
END

--1000 Random rows of data
SELECT TOP 1000 IDENTITY(INT,1,1) AS ID,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col1,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col2
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT COUNT(*)
FROM #testEnvironment

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)

PRINT '========== CTE MATCH =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM #testEnvironment a
    CROSS JOIN #testEnvironment b)
SELECT COUNT(*)
FROM (SELECT col1, col2
      FROM CTE a
      CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                          PATINDEX('%\%', col2) AS patternCol2,
                          LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
                   FROM CTE
                   WHERE a.col1 = col1 AND a.col2 = col2) b
      WHERE ((SUBSTRING(col1,1,patternCol1-1) = SUBSTRING(col2,1,patternCol2-1))
          OR (SUBSTRING(col1,1,patternCol1-1) = SUBSTRING(col2,patternCol2+1, lengthCol2)))
        AND ((SUBSTRING(col1,patternCol1+1, lengthCol1) = SUBSTRING(col2,1,patternCol2-1))
          OR (SUBSTRING(col1,patternCol1+1, lengthCol1) = SUBSTRING(col2,patternCol2+1, lengthCol2)))
     ) a

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
ROLLBACK
First, the above script tests 1000 rows.
========== BASELINE ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000026'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
================================================================================
========== CTE MATCH ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000026'. Scan count 19, logical reads 42, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 13320, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2235 ms, elapsed time = 619 ms.
================================================================================
OK, not too terrible. Let's try 5,000 rows.
BEGIN TRAN
SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
    DROP TABLE #testEnvironment
END

--5000 Random rows of data
SELECT TOP 5000 IDENTITY(INT,1,1) AS ID,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col1,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col2
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT COUNT(*)
FROM #testEnvironment

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)

PRINT '========== CTE MATCH =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM #testEnvironment a
    CROSS JOIN #testEnvironment b)
SELECT COUNT(*)
FROM (SELECT col1, col2
      FROM CTE a
      CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                          PATINDEX('%\%', col2) AS patternCol2,
                          LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
                   FROM CTE
                   WHERE a.col1 = col1 AND a.col2 = col2) b
      WHERE ((SUBSTRING(col1,1,patternCol1-1) = SUBSTRING(col2,1,patternCol2-1))
          OR (SUBSTRING(col1,1,patternCol1-1) = SUBSTRING(col2,patternCol2+1, lengthCol2)))
        AND ((SUBSTRING(col1,patternCol1+1, lengthCol1) = SUBSTRING(col2,1,patternCol2-1))
          OR (SUBSTRING(col1,patternCol1+1, lengthCol1) = SUBSTRING(col2,patternCol2+1, lengthCol2)))
     ) a

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
ROLLBACK
========== BASELINE ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000027'. Scan count 1, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
================================================================================
========== CTE MATCH ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000027'. Scan count 19, logical reads 154, physical reads 0, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 157464, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 74110 ms, elapsed time = 19701 ms.
================================================================================
Ack!! Not scaling well at all, as promised 😉
Try it yourself, but you'll find that this sort of problem has to be sorted out by normalisation rather than methods like this.
If you insist on sticking with this sort of design, then I think you'll always have massive performance degradation.
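For contrast, here is a minimal sketch of the normalised alternative (the schema and names are assumptions, not from this thread): with each part stored in its own column, the order-insensitive match collapses to plain equality predicates that can be indexed.
-- Assumed normalised layout: one column per part instead of 'part\part' strings.
DECLARE @t TABLE (
    ID    INT IDENTITY(1,1),
    col1a VARCHAR(10), col1b VARCHAR(10), -- parts of the old col1
    col2a VARCHAR(10), col2b VARCHAR(10)  -- parts of the old col2
)
INSERT INTO @t
SELECT 'abc', 'aaa', 'aaa', 'abc'
UNION ALL SELECT 'prq', 'aaa', 'prt', 'arz'

SELECT ID
FROM @t
WHERE (col1a = col2a AND col1b = col2b) -- same order
   OR (col1a = col2b AND col1b = col2a) -- swapped order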
November 28, 2011 at 8:06 am
SQL_By_Chance (11/28/2011)
Cause the client requirement is like that 🙂 .
How often do I hear this! In reality it is not that the client requires the data to be stored a certain (and horrible) way; it is that the client wants it presented in a certain way. VERY rarely do clients even want to be involved at the data level, let alone make system design decisions about how the data is stored. The reason they are hiring this work out is that they lack the expertise to do it themselves. If the client is mandating how you write your code, it may be time to fire the client. 😛
Splitting this as Gianluca suggested is by far the best approach.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
November 28, 2011 at 10:24 am
Sean, you had me ... The data is very well normalized, but I wanted to know a logic for matching such data other than normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET).
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”
November 28, 2011 at 10:52 am
SQL_By_Chance (11/28/2011)
Sean, you had me ... The data is very well normalized, but I wanted to know a logic for matching such data other than normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET).
There is nothing like normalized data, and your description is nothing like normalized data.
SQL_By_Chance (11/25/2011)
The data is normalized but I have concatenated fields separated by "/".
Data in that type of format is not normalized at all and is a total PITA to deal with.
...the power of SQL vs .net
SQL is incredibly powerful at doing what it does best. .NET is incredibly powerful at doing what it does best. They do NOT do the same thing. That is why they are frequently used together to provide a complete solution. There are some things that can be done in .NET that should be done in SQL, and some things you can do in SQL should be left to .NET. The challenge is in knowing which technology is best suited for the task at hand.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
November 28, 2011 at 10:57 am
Sean Lange (11/28/2011)
SQL_By_Chance (11/28/2011)
Sean, you had me ... The data is very well normalized, but I wanted to know a logic for matching such data other than normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET).
There is nothing like normalized data, and your description is nothing like normalized data.
SQL_By_Chance (11/25/2011)
The data is normalized but I have concatenated fields separated by "/".
Data in that type of format is not normalized at all and is a total PITA to deal with.
...the power of SQL vs .net
SQL is incredibly powerful at doing what it does best. .NET is incredibly powerful at doing what it does best. They do NOT do the same thing. That is why they are frequently used together to provide a complete solution. There are some things that can be done in .NET that should be done in SQL, and some things you can do in SQL should be left to .NET. The challenge is in knowing which technology is best suited for the task at hand.
+100 for it 😀
November 28, 2011 at 11:09 am
What I mean is that the data is normalized in the table, but I have created a derived column by concatenating the columns 🙂.
I just wanted a way to match [pq/ab] = [ab/pq], out of curiosity, other than using an array in .NET. I apologise for my inability to express my intent to you. But Cadavre's solution (no matter how underperforming it was) gave me a chance to think in that direction.
I can't say exactly why I asked the question; I think in weird directions 😛
http://www.mercurialneophyte.blogspot.com/
______________________________________________________________________
Ankit
MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
"Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
You are already naked. There is no reason not to follow your heart.”