March 27, 2012 at 9:41 am
I have two tables that I need to compare columns in. This is the design:
Both tables have the same field (column b in table 1, column cb in table 2). b is varchar(12), and will contain both numbers and alpha characters. cb is int, and only contains numbers.
I cannot compare the columns directly, because of the data type differences. So I've tried CAST and CONVERT - both of these work around the solution, but the run time is extremely long.
SELECT column1, column2, b,
(SELECT count(index) from TABLE2 WHERE cast(cb as varchar(12) = b) as count
FROM TABLE1
I've looked at adding a join to TABLE1 a second time, and only pull in those records where the column B only has numbers, but then I'm missing some of the data that I'd expect to retrieve.
I'm wondering if there is any other solutions that I might be overlooking. If you need any additional details, please ask. Thanks for all the help in advance!
March 27, 2012 at 9:46 am
Could you please post DDL and some data sample as per http://www.sqlservercentral.com/articles/Best+Practices/61537/
What exact output you watt to see?
How large your tables are?
March 27, 2012 at 10:23 am
While you are working on getting us proper DDL for the 2 tables, can you also clarify your SQL statement?
cast(cb as varchar(12) = b)
is not syntactically correct and will not run.
Jared
CE - Microsoft
March 27, 2012 at 10:25 am
Sure, here is some dummy data and the query that would provide my expected results:
--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#mytable1','U') IS NOT NULL
DROP TABLE #mytable1
IF OBJECT_ID('TempDB..#mytable2','U') IS NOT NULL
DROP TABLE #mytable2
--===== Create the test table with
CREATE TABLE #mytable1(
col1 varchar(12),
col2 varchar(12),
b varchar(12))
INSERT INTO #mytable1(col1, col2, b)
SELECT 'test', 'test', '12345' UNION ALL
SELECT 'a', 'a', '23456' UNION ALL
SELECT 'cat', 'cat', '09876' UNION ALL
SELECT 'bat', 'bat', '22222' UNION ALL
SELECT 'dog', 'dog', '11111' UNION ALL
SELECT 'frog', 'frog', '33333' UNION ALL
SELECT 'weird', 'weird', 'IOK11'
CREATE TABLE #mytable2(
ind int IDENTITY(1,1),
cb int)
SET IDENTITY_INSERT #mytable2 ON
INSERT INTO #mytable2(ind, cb)
SELECT 1, '12345' UNION ALL
SELECT 2, '23456' UNION ALL
SELECT 3, '09876' UNION ALL
SELECT 4, '22222' UNION ALL
SELECT 5, '11111' UNION ALL
SELECT 6, '33333' UNION ALL
SELECT 7, '22222' UNION ALL
SELECT 8, '33333' UNION ALL
SELECT 9, '22222' UNION ALL
SELECT 10, '12345'
select col1, col2, b,
(Select count(ind) from #mytable2 where cast(cb as varchar(12)) = b) as cnt
FROM #mytable1
Regarding the size of my tables, they are quite large, over 1.5 million rows each.
March 27, 2012 at 10:37 am
Using cast or convert won't make any differencr performance wise.
And unfortunately using any of them in join or where clause will panalize you for performance.
Why don't you change the datatype of int to varchar?
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
March 27, 2012 at 10:44 am
Unfortunately, I cannot change the datatype. But that gave me a thought of potentially warehousing the data in that table overnight, and then having that warehoused datatype being varchar(12). Then, I could just change my query to look at the table with the varchar, and I'd hope my performance would be better.
If anyone has any other ideas, please feel free to chime in.
Thanks,
Jamey
March 27, 2012 at 10:48 am
S_Kumar_S (3/27/2012)
Using cast or convert won't make any differencr performance wise.
That's not true...
Jared
CE - Microsoft
March 27, 2012 at 11:24 am
There is one thing we know for sure and that is that there will be no match if the data in b is alphanumeric. Try this and see if it helps. Note: if you have characters like $ you might need to expand the check a bit.
(CASE WHEN IsNumeric(b) = 0 THEN 0 ELSE (Select count(ind) from #mytable2 where cb = cast(b as int)) END)
The sample is too small to test well enough for speed.
March 27, 2012 at 11:31 am
SELECT mt1.col1,
mt1.col2,
mt1.b,
cnt.countInd
FROM #mytable1 mt1
CROSS APPLY (SELECT COUNT(ind) countInd
FROM #mytable2
WHERE CAST(cb AS VARCHAR(12)) = b) cnt
Jared
CE - Microsoft
March 27, 2012 at 11:33 am
Oh another thing I just noticed which puts what I say ahead is this. You logged 09876 for cat in both tables, but if you store it in int the 0 is stripped off. From there when you compare 9876 to 09876 as varchar they are not equal. So the question is, should they be? If so then you will need to either strip the leading zero from b to be sure or convert b to int.
Also here is another possibility.
select col1, col2, b,
SUM(CASE WHEN B.cb IS NOT NULL THEN 1 ELSE 0 END) as cnt
FROM #mytable1 A
LEFT JOIN
#mytable2 B
ON
(CASE WHEN IsNumeric(b) = 0 THEN 0 ELSE CASE WHEN cb = cast(b as int) THEN 1 ELSE 0 END END) = 1
group by
col1, col2, b
March 27, 2012 at 12:15 pm
Thanks so much everyone. Really appreciate the help and the fresh perspectives on my issue. I will not have leading zeros, but it's something good to think about in the future.
Thanks!
March 28, 2012 at 2:05 am
Hi
Can you pls guide to some link that shows this difference. Most of the places I know just say that the performance is similar. e.g this one:
http://www.informit.com/articles/article.aspx?p=31283&seqNum=4
SQLKnowItAll (3/27/2012)
S_Kumar_S (3/27/2012)
Using cast or convert won't make any differencr performance wise.That's not true...
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
March 28, 2012 at 6:18 am
S_Kumar_S (3/28/2012)
HiCan you pls guide to some link that shows this difference. Most of the places I know just say that the performance is similar. e.g this one:
http://www.informit.com/articles/article.aspx?p=31283&seqNum=4
SQLKnowItAll (3/27/2012)
S_Kumar_S (3/27/2012)
Using cast or convert won't make any differencr performance wise.That's not true...
Why do you need a link when you can test? This is just one way to test and look at different metrics.
USE tempdb
CREATE TABLE testDate (testcol datetime)
INSERT INTO testDate
SELECT TOP 1000000 GETDATE()
FROM sys.columns c1
CROSS JOIN sys.columns c2
CROSS JOIN sys.columns c3
--RUN separately
SELECT CAST(DATEDIFF(d,0,testcol) AS datetime)
FROM testDate
SELECT CONVERT(datetime, datediff(d,0, testcol),126)
FROM testDate
--DROP TABLE testDate
Jared
CE - Microsoft
March 28, 2012 at 7:41 am
SQLKnowItAll (3/28/2012)
S_Kumar_S (3/28/2012)
HiCan you pls guide to some link that shows this difference. Most of the places I know just say that the performance is similar. e.g this one:
http://www.informit.com/articles/article.aspx?p=31283&seqNum=4
SQLKnowItAll (3/27/2012)
S_Kumar_S (3/27/2012)
Using cast or convert won't make any differencr performance wise.That's not true...
Why do you need a link when you can test? This is just one way to test and look at different metrics.
OK, I have to agree with Kumar either one can be used without a performance hit. The stats are always the same for either one. They produce the same reads/writes and generate the same execution plan. There is no difference between the two with performance. The same arguement has been made about IsNull and Coalesce. Other than parameter options they have the same performance.
March 28, 2012 at 7:45 am
I too executed your query several times on a development server. Depending on load, either of them worked faster but the tests never convinced me that one is faster than other....
May be a stand alone machine might show some standard result but I trust still the performance difference will be negligible...
Antares686 (3/28/2012)
SQLKnowItAll (3/28/2012)
S_Kumar_S (3/28/2012)
HiCan you pls guide to some link that shows this difference. Most of the places I know just say that the performance is similar. e.g this one:
http://www.informit.com/articles/article.aspx?p=31283&seqNum=4
SQLKnowItAll (3/27/2012)
S_Kumar_S (3/27/2012)
Using cast or convert won't make any differencr performance wise.That's not true...
Why do you need a link when you can test? This is just one way to test and look at different metrics.
OK, I have to agree with Kumar either one can be used without a performance hit. The stats are always the same for either one. They produce the same reads/writes and generate the same execution plan. There is no difference between the two with performance. The same arguement has been made about IsNull and Coalesce. Other than parameter options they have the same performance.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply