December 19, 2007 at 10:45 am
Im trying to find ssn's that do not have a complete number, i.e. 213 instead of 213-12-3456 ect. all the numbers i need to find start with and only have the first 3 characters, how can i write this in a like statement?
Would something like this work?
select first_name
,last_name
,government_id
from people
where government_id like '[0-9][0-9][0-9]'
December 19, 2007 at 11:03 am
Perhaps, something like this...
select first_name
,last_name
,government_id
from people
where government_id like '[0-9][0-9][0-9]-%'
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 11:05 am
The ssn im looking for have ONLY 3 digits, not the full 9. So i dont need to liook further than the first three digits?
December 19, 2007 at 11:15 am
Oh, sorry... yes, your query is the ticket... mine is wrong.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 1:38 pm
For the sake of discussion...
I have in the past done this:
select first_name
,last_name
,government_id
from people
where len(government_id ) < 9
Would this be faster, slower?
December 19, 2007 at 1:56 pm
The LIKE will be just a tiny bit faster in this case because it returns only the rows that have precisely 3 digits... the LEN()<9 will return things that have 1 or 2 characters (actually, up to 8 characters) which may be more rows. The two methods are not functionally identical and rely on the condition of the data to come up with the same answer.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 2:02 pm
Edited for correctness.
And if there's an index on government_id, the like might allow the optimiser to use it for a partial scan, the len (since it's a function on a column) will force a full scan.
Depending on a really large number of factors, including the data values in the column
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 19, 2007 at 2:12 pm
Actually, both will do an index scan, I believe...
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 2:20 pm
Not sure. I'll have to check. (Tomorrow. Is bed time here. I shouldn't post this late at night)
Since the like determines that the first character is between 0 and 9, the optimiser might be able to do a partial scan, depending, of course, on what else is in the table. (emphasis on might. Thinking a second time, I'm not sure it's that smart)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 19, 2007 at 2:53 pm
Heh... no need :blush: ... dunno what I was thinking about, but I was wrong... the LIKE will use and Index Seek... the LEN will not... here's the code...
DROP TABLE JBMTest
GO
SELECT TOP 1000000
RowNum = IDENTITY(INT,1,1),
SomeDigits = CAST(ABS(CHECKSUM(NEWID()))%50000+1 AS VARCHAR(10))
INTO dbo.JBMTest
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 12 seconds to execute.
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (RowNum)
--===== Add an index...
CREATE INDEX IX_JBMTest_SomeDigits
ON dbo.JBMTest (SomeDigits)
GO
SET STATISTICS TIME ON
SET STATISTICS IO ON
--===== This does not...
SELECT RowNum,SomeDigits
FROM dbo.JBMTest
WHERE LEN(SomeDigits) < 4
PRINT '----------------------------------------------'
--===== This uses an index seek
SELECT RowNum,SomeDigits
FROM dbo.JBMTest
WHERE SomeDigits LIKE '[0-9][0-9][0-9]'
BUT... take a look at the execution times and the CPU usage... the Index Seek is twice as slow as the Index Scan in this case (3 times slower if you look at the CPU time), even though the Index Scan returned more rows...
Surprise, surprise... in the presence of the "correct" index, LEN() took less time than the LIKE...
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(19773 row(s) affected)
Table 'JBMTest'. Scan count 1, logical reads 2324, physical reads 0, read-ahead reads 0.
SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 503 ms.
SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 503 ms.
SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 503 ms.
----------------------------------------------
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(17723 row(s) affected)
Table 'JBMTest'. Scan count 1, logical reads 2324, physical reads 0, read-ahead reads 0.
SQL Server Execution Times:
CPU time = 797 ms, elapsed time = 984 ms.
SQL Server Execution Times:
CPU time = 797 ms, elapsed time = 984 ms.
SQL Server Execution Times:
CPU time = 797 ms, elapsed time = 984 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 3:04 pm
P.S.
Technically speaking, the LEN() example does, in fact, come up with the WRONG answer if you are looking for precisely 3 digits...
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 11:18 pm
Jeff Moden (12/19/2007)
BUT... take a look at the execution times and the CPU usage... the Index Seek is twice as slow as the Index Scan in this case (3 times slower if you look at the CPU time), even though the Index Scan returned more rows...
Interesting. Showing (again) that sometimes an index scan IS the best way to run a query.
Did you try that on 2000 or 2005? I want to take a look at the exec plan, see if I can figure out why the seek is taking longer. One thing I notice is that the reads for the two are the same. Implies that the 'seek' read all of the leaf pages, just like the scan did.
The seek would probably perform better than the scan if there were rows in the table that didn't start with a digit.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 20, 2007 at 10:01 am
It was in 2k... Heh... I'm thinking that there's a hidden "bookmark" that didn't show up in the EP. But, correct, it does show that just because an index is used, it doesn't always mean faster or better... can't put my fingers on the code just now, but I've seen where a Table Scan proved to be a bit faster than an Index Seek... of course, it wasn't a clustered index.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 20, 2007 at 10:56 am
I'll try it out in 2005, see if the results are different.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 29, 2008 at 4:49 am
Finally got round to trying on sql 2005. (SP 2 developer edition)
The LEN - index scan. Rated by the execution plan as 98% of the overall cost of the two queries
Table 'JBMTest'. Scan count 1, logical reads 2333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 266 ms, elapsed time = 472 ms.
----------------------------------------------
The Like - index seek. Rated by the execution plan as 2% of the overall cost of the two queries
Table 'JBMTest'. Scan count 1, logical reads 2333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 672 ms, elapsed time = 691 ms.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply