March 19, 2002 at 1:14 pm
I am using a system to determine a country name from a telephone prefix. Right now I cursor through each of the prefixes in my codes table and update my other table with the country name. Like so:
declare x cursor for select prefix, country from countrycodes order by prefix desc
open x
while 1=1 begin
fetch next from x into @prefix, @country
if (@@fetch_status<>0) break
update calltable set country=@country where left(callnumber,len(@prefix))=@prefix and country is null
end
close x deallocate x
This is a very slow process when you are dealing with millions of records. How can I eliminate the cursor from this scenario?
March 19, 2002 at 1:24 pm
This should do it.
UPDATE calltable
SET ct.country = cc.country
FROM calltable ct
INNER JOIN
countrycodes cc
ON
left(ct.callnumber,len(cc.prefix)) = cc.prefix
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
March 20, 2002 at 9:10 am
I am having a little trouble getting this to work nicely. It is actually running about twice as long as the cursor. Also it is defaulting to the proper countrycode. Here is a sample of the prefix table (sorted descending)
998 Uzbekistan
996 Kyrgyzstan
995328 Georgia - Audiotext
995325 Georgia - Audiotext
995322 Georgia - Audiotext
995321 Georgia - Audiotext
995 Georgia
994 Azerbaijan
993 Turkmenistan
992 Tajikistan
989 Iran - Cellular
98 Iran
9779 Nepal - Cellular
977 Nepal
976 Mongolia
975 Bhutan
9745 Qatar - Cellular
97422 Qatar - Cellular
974 Qatar
9739 Bahrain - Cellular
973 Bahrain
9729 Israel - Cellular
97264 Israel - Cellular
9725 Israel - Cellular
9723 Israel - Tel Aviv
972 Israel
97179 United Arab Emirates - Cellular
97150 United Arab Emirates - Cellular
97144 United Arab Emirates - Cellular
971 United Arab Emirates
968 Oman
96779 Yemen - Cellular
967179 Yemen - Cellular
967 Yemen
966630 Saudi Arabia - Cellular
9665 Saudi Arabia - Cellular
966430 Saudi Arabia - Cellular
9663 Saudi Arabia - Dahran
9662 Saudi Arabia - Mecca
Let's say the actual country is Qatar - Cellular, the system is automatically defaulting to Qatar. The way I controlled this in the cursor was by cursoring through the prefixes one at a time in descending order and only updating null values. By this I would 'check off' the breakouts so to speak.
March 20, 2002 at 9:22 am
Can you post the DDL of the tables so I can understand the structure. Also cursors can sometimes actually perform better than straight code but it is hard to tell what I am truely looking at.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
March 20, 2002 at 9:28 am
Um, I am kinda a newby. I am not sure how to post the DDL.
March 20, 2002 at 9:57 am
Quickest way to get is open enterprise manager and drill to your db and the tables in question. Right click on each table an choose copy, then paste into notepad or something to old text while you get the other tabel as well, then just paste into forum reply screen.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
March 20, 2002 at 10:16 am
This is what I got. I scripted the tables out. I hope this is what you needed. They are in separate databases.
if exists (select * from sysobjects where id = object_id(N'[dbo].[CallTable]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[CallTable]
GO
CREATE TABLE [dbo].[CallTable] (
[recid] [int] IDENTITY (1, 1) NOT NULL ,
[CallDateTime] [datetime] NULL ,
[StatsDate] [datetime] NULL ,
[Inbound] [char] (10) NULL ,
[CustomerRatePlan] [varchar] (50) NULL ,
[Outbound] [char] (10) NULL ,
[ProviderRatePlan] [varchar] (50) NULL ,
[CallNumber] [nvarchar] (50) NULL ,
[StatsCountry] [varchar] (100) NULL ,
[CallANI] [nvarchar] (50) NULL ,
[CallDuration] [numeric](10, 0) NULL ,
[StatsMinutes] [numeric](18, 1) NULL ,
[Importfile] [varchar] (20) NULL ,
[CallClearingCause] [nvarchar] (50) NULL ,
[Unused] [nvarchar] (50) NULL ,
[ProviderMinutes] [numeric](18, 1) NULL ,
[CustomerRate] [numeric](18, 4) NULL ,
[ProviderRate] [numeric](18, 4) NULL ,
[R1] [numeric](18, 0) NULL ,
[R2] [numeric](18, 0) NULL ,
[R3] [numeric](18, 0) NULL ,
[R4] [numeric](18, 0) NULL ,
[ProviderCountry] [varchar] (100) NULL
)
GO
if exists (select * from sysobjects where id = object_id(N'[dbo].[CountryCodes]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[CountryCodes]
GO
CREATE TABLE [dbo].[CountryCodes] (
[Prefix] [char] (25) NOT NULL ,
[Country] [varchar] (100) NULL
)
GO
March 20, 2002 at 12:44 pm
This might be a dumb question, but on your table defs I can't see any indexes, you do have some don't you?
Assuming you have indexes I notice you're using a LEFT function in the where clause. Im not too sure in SQLSERVER, but in ORACLE using any form of modification on a WHERE value means the query engine does a full table scan, rather than use an index.
I have found it is possible to please all of the people all of the time if you do exactly what they want. Harold Macmillan 1961
March 20, 2002 at 1:00 pm
Yes, I have a unique clustered index on recid in the calltable, and index on the callnumber. I created a temp table for the countrycodes table which I tried to use many different types of indexes on the prefix field. A clustered index will not allow me to sort descending. A unique nonclustered index allows this.
I just did a test using a select instead of a update statement and I can see whre a lot of my problem lies:
22505661512 Ivory Coast 225
22505661512 Ivory Coast - Cellular 2250
22505661700 Ivory Coast 225
22505661700 Ivory Coast - Cellular 2250
22505661700 Ivory Coast 225
22505661700 Ivory Coast - Cellular 2250
22505662952 Ivory Coast 225
22505662952 Ivory Coast - Cellular 2250
22505663149 Ivory Coast 225
22505663149 Ivory Coast - Cellular 2250
22505664393 Ivory Coast 225
22505664393 Ivory Coast - Cellular 2250
22505664750 Ivory Coast 225
22505664750 Ivory Coast - Cellular 2250
22505667742 Ivory Coast 225
22505667742 Ivory Coast - Cellular 2250
22505667742 Ivory Coast 225
22505667742 Ivory Coast - Cellular 2250
22505668198 Ivory Coast 225
22505668198 Ivory Coast - Cellular 2250
22505668949 Ivory Coast 225
22505668949 Ivory Coast - Cellular 2250
22505669560 Ivory Coast 225
22505669560 Ivory Coast - Cellular 2250
22505670574 Ivory Coast 225
22505670574 Ivory Coast - Cellular 2250
22505673137 Ivory Coast 225
22505673137 Ivory Coast - Cellular 2250
22505675454 Ivory Coast 225
22505675454 Ivory Coast - Cellular 2250
22505678321 Ivory Coast 225
22505678321 Ivory Coast - Cellular 2250
22505678622 Ivory Coast 225
22505678622 Ivory Coast - Cellular 2250
22505678800 Ivory Coast 225
22505678800 Ivory Coast - Cellular 2250
22505679056 Ivory Coast 225
22505679056 Ivory Coast - Cellular 2250
22505681469 Ivory Coast 225
22505681469 Ivory Coast - Cellular 2250
22505682320 Ivory Coast 225
The join is causing it to find every instance which matches the phone number. If there is more than one prefix per country then I will get multiple instances of each. Any idea how to get around that?
March 20, 2002 at 1:08 pm
Crosspatch is right as I see no index, also the other problem with having to do the left is the size will vary based on the other tables value. I also noted you have items like
995321
995325
995328
and
995
so the update I wrote may actually be causing more updates than needs if the left value matches 995321 it will also match 995 so I have to think around this. I have SQL do work fine with LEFT function on indexes but it does an index scan instead of seek (which can be faster that a table scan but not as efficient as seek). I will sugest that if you don't have indexes at least put one clustered on your CountryCodes table on the column Prefix and a Non-Clustered on CallTable on the column callnumber can be helpfull.
Then here is a thought and a question with my answer. Does the column country in calltable start out null if does then you could put a Non-Clustered index on it as well then add the line
WHERE ct.country IS NULL
to speed up the update, then to make sure if a number changes you could have a trigger that sets the country = null on any record where column callnumber is changed so the update will fix.
Now on the subject of trigger if you do that then you could create updates for on UPDATE to process these as they are entered and not have to worry about the update it any other time. Something like this for updates
CREATE TRIGGER tr_SetCountry ON dbo.CallTable
FOR UPDATE
AS
IF UPDATE(callnumber)
BEGIN
UPDATE calltable
SET ct.country = cc.country
FROM calltable ct
INNER JOIN
countrycodes cc
ON
left(ct.callnumber,len(cc.prefix)) = cc.prefix AND
ct.recid in (SELECT recid FROM inserted)
END
Let me know what path helps or anything else that you question.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
March 20, 2002 at 1:11 pm
You apparently posted at the same time I was, I am trying to think about that to make sure it is covered.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
March 20, 2002 at 1:37 pm
Wow, you guys are really working hard on this!! Thanks so much for your help. Yes, the the callcountry column starts out as null. I did try the WHERE ct.callnumber is null thing and saw no improvement. I thought it may fix the problem since this is how I fixed my cursor to work correctly.
March 20, 2002 at 2:07 pm
Still thinking about the problem with the multiple problem which is the base issue and will slow you down. I will try a few things later and let you know what I think will work. Anyone else fell free to join in.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
March 20, 2002 at 7:22 pm
Ok this was really fun and here is how I figured to get around the duplicates
UPDATE calltable
SET ct.country = cc.country
FROM
calltable ct
INNER JOIN
countrycodes cc
ON
left(ct.callnumber,len(cc.prefix)) = cc.prefix
WHERE
cc.prefix = (SELECT TOP 1 pt.prefix FROM countrycodes pt WHERE pt.prefix = left(ct.callnumber,len(pt.prefix)) ORDER BY prefix DESC)
Which you can test the difference with the SELECT version if you prefer. I did have a non-clustered index on callnumber and a clustered index on prefix but unfortunately did not try your way of non-clustered sorted descending (may improve performance).
I am posting my execution plan to show that the indexes are being used (I'll leave it until I need to conserve space on my geocities account).
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
Edited by - antares686 on 03/20/2002 7:28:13 PM
March 21, 2002 at 5:00 am
Hit this about 30 minutes after going to bed last night and see it much better than my previous query.
UPDATE calltable SET ProviderCountry =
(SELECT TOP 1 country FROM countrycodes WHERE prefix = left(callnumber,len(prefix)) ORDER BY prefix DESC)
The I decided to add a non-clustered index to ProviderCountry which on an initial UPDATE when it was not set does not help. But on subsequent runs when I add more data doing
WHERE ProviderCountry IS NULL
really helped get it done faster than just doing the straight update on all records again. Run both as a select and look at the execution plan, the major difference is I get rid of several steps with the join method that I didn't need. To do as a select just do
SELECT recid, callnumber,
(SELECT TOP 1 country FROM countrycodes WHERE prefix = left(callnumber,len(prefix)) ORDER BY prefix DESC) AS Country
FROM calltable
I cannot see much more than can be done to this to improve and with it I recommend having an index as follows
CountryCodes = Clustered PirmaryKey Index on Prefix which will be utilized when looking for where prefix = left(callnumber,len(prefix))
CallTable = NonClustered NonUnique index on ProviderCountry which will help with added records later, an index on callnumber will not bennifit this query and does not have to exist if you do not need for any other query specifically, if so I suggest don't waste the space for a callnumber index.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply