May 30, 2017 at 5:59 pm
High everyone,
I've a simple table that holds UK postcodes and I want to do a search on the postcode. The thing is the same postcode could be stored in different formats due to who ever imputed the data, as an example the postcode 'SW183FD' could appear as 'SW18 3FD' or 'SW183FD' or 'SW 183 FD'. I know I could have set rules in the data entry coding to take out the spaces before the postcode is saved to the table, which would have made searching easier, unfortunately I've inherited the table from someone else and there was no such ruling in place.
I did try 'SELECT * FROM POSTCODES WHERE [POSTCODE} LIKE '%SW183FD%' but that obviously doesnt work. Can anyone suggest a way round this?
May 30, 2017 at 6:11 pm
There was another topic similar to this one recently, but I can't find it.
If it were me, I'd create a persisted computed column (populated with the spaces removed) query it instead. I don't know much about UK postal codes other than what I've read online, so I picked a length of 12 for the length and inserted a bunch of different possibilities for the data.
IF OBJECT_ID('tempdb.dbo.Zips', 'u') IS NOT NULL DROP TABLE dbo.Zips;
CREATE TABLE dbo.Zips (
ID Integer identity(1, 1),
CONSTRAINT Zips_PK PRIMARY KEY (ID),
PostalCode Varchar(12),
CleanedCode AS REPLACE(PostalCode, ' ', '') PERSISTED );
INSERT INTO dbo.Zips(PostalCode)
VALUES('410000'),
('100 000'),
(' 222 333 '),
('N6F 8D1'),
('SW 18 3FD'),
('SW 18 3F D'),
('SW1 83 FD'),
('SW18 3FD'),
('SW1 8 3FD');
Then you can query against CleanedCode using search values without spaces.
May 30, 2017 at 6:37 pm
Thanks Ed for that snippet, I can see many uses for it, unfortunately it's not quite what I was looking for.
May 31, 2017 at 12:28 am
Ed's idea of a new persisted column is your best bet. You can then even index the column for better performance when seraching.
But, it seems like you are looking for a non-performant query to do the work. So here goes
The fact that you are doing a wildcard search at the start of your string means that SQL has to do a full table/index scan to find your data anyway.
SELECT *
FROM POSTCODES
WHERE REPLACE([POSTCODE], ' ', '') LIKE '%SW183FD%';
May 31, 2017 at 5:16 am
DesNorton - Wednesday, May 31, 2017 12:28 AMEd's idea of a new persisted column is your best bet. You can then even index the column for better performance when seraching.But, it seems like you are looking for a non-performant query to do the work. So here goes
The fact that you are doing a wildcard search at the start of your string means that SQL has to do a full table/index scan to find your data anyway.
SELECT *
FROM POSTCODES
WHERE REPLACE([POSTCODE], ' ', '') LIKE '%SW183FD%';
Thanks, Des.
Mick, what do you have in mind?
Whatever solution you end up with, look at the reads and execution plans of the candidates. If you're searching for addresses, then it probably isn't a simple lookup table and you're going to have a lot of rows. Your solution should scale well so it doesn't have to be rewritten later.
May 31, 2017 at 9:24 am
Many thanks gentlemen, both solutions look good to me. I'm still learning SQL and these snippets are a useful way of adding to my knowledge. I'm going to try both methods on my project to see which performs better, before settling on the solution. I'm grateful for the quick response to my question.
Regards
Mick
May 31, 2017 at 11:29 am
mick burden - Wednesday, May 31, 2017 9:24 AMMany thanks gentlemen, both solutions look good to me. I'm still learning SQL and these snippets are a useful way of adding to my knowledge. I'm going to try both methods on my project to see which performs better, before settling on the solution. I'm grateful for the quick response to my question.Regards
Mick
Don't forget to add a nonclustered index to the persisted computed column. I expect it'll fly.
May 31, 2017 at 11:41 am
Will do, thanks Ed
May 31, 2017 at 12:10 pm
Ed Wagner - Wednesday, May 31, 2017 5:16 AMDesNorton - Wednesday, May 31, 2017 12:28 AMEd's idea of a new persisted column is your best bet. You can then even index the column for better performance when seraching.But, it seems like you are looking for a non-performant query to do the work. So here goes
The fact that you are doing a wildcard search at the start of your string means that SQL has to do a full table/index scan to find your data anyway.
SELECT *
FROM POSTCODES
WHERE REPLACE([POSTCODE], ' ', '') LIKE '%SW183FD%';Thanks, Des.
Mick, what do you have in mind?
Whatever solution you end up with, look at the reads and execution plans of the candidates. If you're searching for addresses, then it probably isn't a simple lookup table and you're going to have a lot of rows. Your solution should scale well so it doesn't have to be rewritten later.
I'd suggest the indexed computed column as well.
However, if we go the route searching on the original un-scrubbed column, and we're willing to assume that the users are not going to enter leading spaces, then leaving off the % at the front of the LIKE expression will at least perform a partial index scan.
SELECT *
FROM POSTCODES
WHERE [POSTCODE] LIKE 'SW%183%FD%';
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
May 31, 2017 at 12:49 pm
mick burden - Tuesday, May 30, 2017 5:59 PMHigh everyone,
I've a simple table that holds UK postcodes and I want to do a search on the postcode. The thing is the same postcode could be stored in different formats due to who ever imputed the data, as an example the postcode 'SW183FD' could appear as 'SW18 3FD' or 'SW183FD' or 'SW 183 FD'. I know I could have set rules in the data entry coding to take out the spaces before the postcode is saved to the table, which would have made searching easier, unfortunately I've inherited the table from someone else and there was no such ruling in place.
I did try 'SELECT * FROM POSTCODES WHERE [POSTCODE} LIKE '%SW183FD%' but that obviously doesnt work. Can anyone suggest a way round this?
You could still add rules to strip the spaces before saving the postcode.
Then fix the existing data
UPDATE POSTCODES
SET [POSTCODE] = REPLACE([POSTCODE], ' ', '');
If you don't already have one, add an index on the [POSTCODE] column.
Finally, you can have pretty decent index seeks with something like this
SELECT *
FROM POSTCODES
WHERE [POSTCODE] = 'SW183FD';
OR
SELECT *
FROM POSTCODES
WHERE [POSTCODE] LIKE 'SW18%';
June 1, 2017 at 12:46 am
Thank you again Gentlemen, all those suggestions are great and gives me more to play with. I hadn't realised I could have fixed the 'problem' data as easy as that. It's one of the great things about this forum, that not only do you members post possible solutions but you add why this would work.
Regards Mick
June 1, 2017 at 8:35 am
mick burden - Tuesday, May 30, 2017 6:37 PMThanks Ed for that snippet, I can see many uses for it, unfortunately it's not quite what I was looking for.
Not sure what you're looking for then, Mick. Would you clarify, please?
As a bit of a sidebar, I use the same wonderful trick for many things. You just need a function to convert (delete the spaces, whatever) what the user inputs as search criteria and the search will be lightning fast.
{Edit} Guess I posted the nearly the same time as you just did. Are you all set for sure?
--Jeff Moden
Change is inevitable... Change for the better is not.
June 1, 2017 at 8:40 am
Yes Jeff, Thanks
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply