July 12, 2002 at 12:25 pm
I'm creating a billing system for an international carrier's telephone switch. Each call that goes through the switch is logged in SQL server, in a table called DSC_Archive that includes the following fields, among others:
- OrigGrp (the ID of the customer initiating the call - i.e. BellSouth, Qwest, etc.)
- DialedNo (the number dialed)
- OrigDate (the date the call started)
- CallDur (the length of the call in minutes)
- RateID (the ID of the rate at which the call should be billed)
I also have a table called Rates, which holds the per-minute price charged each customer for each dialing area.
- RateID
- OrigGrp (the ID of the customer)
- Prefix (the first few digits of the dialed number, which identifies the dialing area)
- RegionName (the name of the dialing area)
What makes this problem challenging is that the dialing areas aren't mutually exclusive. For instance, you have these rates for a given customer:
RateID Prefix RegionName Rate
------ ------ ------------ ------
1727 62 Indonesia .17250
1728 6221 Indonesia - Jakarta .02870
1729 628 Indonesia - Mobile .12420
So if the number dialed is 628149234, the call is billed as "Indonesia - Mobile", while if the number dialed is 62595923, the call is billed as "Indonesia". The rule is that the _longest_ matching prefix determines the rate.
As the data is added, each call is tagged with the correct RateID. However, sometimes the Rates table is updated after the calls have been logged, and I need to go through the entire DSC_Archive table (which currently has 12 million rows, for three months' worth of practice data) and make sure the rates are correct.
Here's the code I use right now to go through DSC_Archive and assign the correct RateID to each call:
UPDATE DSC_Archive
SET
RateID=
(
SELECT TOP 1 r.RateID
FROM Rates r
WHERE d.DialedNo LIKE r.Prefix+'%' // match prefix
ANDd.OrigGrp = r.OrigGrp // match customer
ORDER BY LEN(r.Prefix) DESC // longest matching prefix wins
)
FROM DSC_Archive d
This works, but it takes hours with only three month's worth of data. Is there a more efficient way to do this?
Thanks
Herb
Edited by - hcaudill on 07/12/2002 12:28:46 PM
Edited by - hcaudill on 07/12/2002 12:30:39 PM
July 12, 2002 at 1:50 pm
I couldn't get the query to work so I don't have proof but my guess is that the Order By in your correlated subselect is killing you.
You don't state it but from your example my guess is that there are only 3 possible lengths for the prefixes. How about running the following queries:
UPDATE DSC_Archive
SET RateID = r.RateID
FROM Rates r INNER JOIN DSC_Archive d
ON r.Prefix = SUBSTRING(d.DialedNo,1,2)
UPDATE DSC_Archive
SET RateID = r.RateID
FROM Rates r INNER JOIN DSC_Archive d
ON r.Prefix = SUBSTRING(d.DialedNo,1,3)
UPDATE DSC_Archive
SET RateID = r.RateID
FROM Rates r INNER JOIN DSC_Archive d
ON r.Prefix = SUBSTRING(d.DialedNo,1,4)
The order is important and I would run them as one transaction. The solution isn't elegant and I haven't tried it with a large dataset but the joins are simple enough.
July 12, 2002 at 2:00 pm
I believe I helped with this same thing previously. Can you run the following so I can get an idea of what the Query Manager is doing. Run
SET SHOWPLAN_TEXT ON
GO
YourQueryRunsHere
GO
And paste the resulting output here. May can see a way to speed things along.
Also it may help to play with the Rates table by adding a length column and put a trigger on it for inserts and updates to keep the length accurate. Then place a clustered index key on that column in desc order so you can reduce the order time. (This may not help as I cannot test for sure, but it might).
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
July 15, 2002 at 2:06 pm
First the short term fix:
A helper view to make things easier
CREATE VIEW HelperView
AS
SELECT
Prefix,
DialedNo,
RateID,
CHARINDEX(REVERSE(Prefix), REVERSE(DialedNo)) AS PrefixPositionFromRight
FROM DSC_Archive a, Rates b
WHERE a.DialedNo LIKE b.Prefix + '%'
Finally, to get the data we want:
SELECT *
FROM HelperView a
WHERE
PrefixPositionFromRight = (SELECT MIN(PrefixPositionFromRight)
FROM HelperView b
WHERE a.DialedNo = b.DialedNo)
Since all the data you need is already in this query, I suggest removing/not using the RateID column in the DSC_Archive table. You can wrap this SELECT statement as another view so that other tools and users can use it without knowing how its implemented
The medium term fix:
Assign the value of RateID in DSC_Archive as calls are placed. Its likely that the rate should be determined when the call is placed and not some time after. You can "precompute" the rate so that your queries will effectively handle only a small subset of data at a time avoiding massive, long running UPDATE queries every month or so.
July 15, 2002 at 2:35 pm
I rewrote the SELECT statement I originally posted as it uses a correlated subquery and we all know correlated subqueries are almost always slower than an equivalent query implemented thru joins. Although the queries are equivalent, the query optimizer is not smart enough to know and rewrite the query itself.
The first version takes a little more than a minute. Any interactive scenario that takes more than a minute is too slow for me =). This on the computer Im working on as I browse the web in the foreground with the query running in the background for 1,000,000 rows on the table with the DialedNo column.
This is the rewritten version:
SELECT a.DialedNo, a.Prefix, a.PrefixPositionFromRight
FROM HelperView a, HelperView b
WHERE
a.DialedNo = b.DialedNo
GROUP BY
a.DialedNo, a.Prefix, a.PrefixPositionFromRight
HAVING
a.PrefixPositionFromRight = MIN(b.PrefixPositionFromRight)
This version takes only 15 seconds!!!!! I guess that should be fast enough for most purposes =)
July 15, 2002 at 2:36 pm
I rewrote the SELECT statement I originally posted as it uses a correlated subquery and we all know correlated subqueries are almost always slower than an equivalent query implemented thru joins. Although the queries are equivalent, the query optimizer is not smart enough to know and rewrite the query itself.
The first version takes a little more than a minute. Any interactive scenario that takes more than a minute is too slow for me =). This on the computer Im working on as I browse the web in the foreground with the query running in the background for 1,000,000 rows on the table with the DialedNo column.
This is the rewritten version:
SELECT a.DialedNo, a.Prefix, a.PrefixPositionFromRight
FROM HelperView a, HelperView b
WHERE
a.DialedNo = b.DialedNo
GROUP BY
a.DialedNo, a.Prefix, a.PrefixPositionFromRight
HAVING
a.PrefixPositionFromRight = MIN(b.PrefixPositionFromRight)
This version takes only 15 seconds!!!!! I guess that should be fast enough for most purposes =)
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply