Locking

Question

Post reply

Locking

chisholmd

SSCrazy

Points: 2490
More actions
September 22, 2005 at 11:21 am

#91313

Hi I run a daily job that looks up country codes based on IP. In one table are IP address ranges as "longs" and the matching country code. In another table I have log entries and dotted IP's. New rows are imported daily and I need to update them with country codes.
My strategy right now has been to run a daily job that populates a column in the logData rows with the country country code. My proc has two steps. First convert the dotted IP's to serial IP's. Then do a join update.
When this job runs it ends up blocking selects against the log table. The first statement uses a UDF to convert the dotted IP's to serial IP's. The second statement does the country code look ups.
Which proc do you think is causing the blocking?
Is there a better way to do this?
(The wmsLog table has about 15million rows)
PROC:
update dbo.wmsLog set longIP = dbo.IPStringToNumber(dbo.wmsLog.cip)
where dbo.wmsLog.longIP =0
and [datetime] > dateadd(d,-5,getdate())
UPDATE dbo.wmsLog
SET dbo.wmsLog.countrya2 = dbo.GeoIP.a2
FROM dbo.wmsLog INNER JOIN dbo.GeoIP ON dbo.wmsLog.longIP BETWEEN dbo.GeoIP.startLongIP AND dbo.GeoIP.endLongIP
where dbo.wmsLog.countrya2='' and not dbo.wmsLog.longIP =0
and [datetime] > dateadd(d,-5,getdate())
Dave
Trainmark.com IT Training B2B Marketplace
(Jobs for IT Instructors)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Jo Pattyn SSC-Dedicated Points: 32342 More actions · Answer 1

At which part there are blocking selects? section 1 (the conversion) or section 2 (assigning the language) of the code?

What is the code behind dbo.IPStringToNumber?

-------------------------------

Declare @DateInPast datetime

set @DateInPast=dateadd(d,-5,getdate())

update dbo.wmsLog set longIP = dbo.IPStringToNumber(dbo.wmsLog.cip)

/*converts 10.0.50.1 to long 010 000 050 001 ?*/

where dbo.wmsLog.longIP =0 /*long ip not set*/

and [datetime] > @DateInPast /*recent entry*/

UPDATE dbo.wmsLog /* logtable*/

SET dbo.wmsLog.countrya2 = dbo.GeoIP.a2 /*assign country code*/

FROM dbo.wmsLog wmsLog /*logtable*/

INNER JOIN dbo.GeoIP GeoIP /*contains IP ranges by country*/

ON wmsLog.longIP BETWEEN GeoIP.startLongIP AND GeoIP.endLongIP

/*Must be in a GEO IP-range used by a country*/

where wmsLog.countrya2='' /*country not filled in*/

and not wmsLog.longIP =0 /*must have an valid IP*/

/*is >0 as valid as not =0? this for a possible index*/

and [datetime] > @DateInPast /*recent entry*/

chisholmd SSCrazy Points: 2490 More actions · Answer 2

When I do a sp_who2 it shows the proc as blocking, not sure which statement in the proc is doing it. I guess I can split it into two and check it out.

Here is the function that I got somewhere on the net:

CREATE FUNCTION dbo.IPStringToNumber( @vcIPAddress varchar(15))

/**************************************************************************

DESCRIPTION: Returns Numeric IP, otherwise returns null

PARAMETERS:

@vcIPAddress- The string containing a valid IP

RETURNS:IP converted to bigint or null if not a valid IP

USAGE: SELECT dbo.IPStringToNumber( '10.255.255.255')

AUTHOR:Karen Gayda

DATE: 06/11/2003

MODIFICATION HISTORY:

WHODATEDESCRIPTION

----------------------------------------------------------------

***************************************************************************/

RETURNS bigint

AS

BEGIN

DECLARE

@biOctetA bigint,

@biOctetBbigint,

@biOctetCbigint,

@biOctetDbigint,

@biIP bigint

DECLARE @tblArray TABLE

(

OctetIDsmallint, --Array index

Octetbigint --Array element contents

)

--split the IP string and insert each octet into a table row

INSERT INTO @tblArray

SELECT ElementID, Convert(bigint,Element) FROM dbo.Split(@vcIPAddress, '.')

--check that there are four octets and that they are within valid ranges

IF (SELECT COUNT(*) FROM @tblArray WHERE Octet BETWEEN 0 AND 255) = 4

BEGIN

SET @biOctetA = (SELECT (Octet * 256 * 256 * 256) FROM @tblArray WHERE OctetID = 1)

SET @biOctetB = (SELECT (Octet * 256 * 256 ) FROM @tblArray WHERE OctetID = 2)

SET @biOctetC = (SELECT (Octet * 256 ) FROM @tblArray WHERE OctetID = 3)

SET @biOctetD = (SELECT (Octet) FROM @tblArray WHERE OctetID = 4)

SET @biIP = @biOctetA + @biOctetB + @biOctetC + @biOctetD

END

RETURN(@biIP)

END

Dave

Trainmark.com IT Training B2B Marketplace
(Jobs for IT Instructors)

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 3

why don't you play with that... seems to be considerably faster :

CREATE FUNCTION dbo.IPStringToNumber_Set( @vcIPAddress varchar(15))

/**************************************************************************

DESCRIPTION: Returns Numeric IP, otherwise returns null

PARAMETERS:

@vcIPAddress - The string containing a valid IP

RETURNS: IP converted to bigint or null if not a valid IP

USAGE: SELECT dbo.IPStringToNumber( '10.255.255.255')

AUTHOR: Karen Gayda

DATE: 06/11/2003

MODIFICATION HISTORY:

WHO DATE DESCRIPTION

--- ---------- ---------------------------------------------------

RGR'us - 2005/09/23 - Removed the inserts/select for performance, set the answer right away

***************************************************************************/

RETURNS bigint

AS

BEGIN

DECLARE

@biOctetA bigint,

@biOctetB bigint,

@biOctetC bigint,

@biOctetD bigint,

@tiCount tinyint,

@biIP bigint

--split the IP string and insert each octet into a table row

SELECT

@biOctetA = dtValues.biOctetA

, @biOctetB = dtValues.biOctetB

, @biOctetC = dtValues.biOctetC

, @biOctetD = dtValues.biOctetD

, @tiCount = dtValues.Total

FROM

(

SELECT

MAX(CASE WHEN ElementID = 1 THEN Convert(bigint,Element) * 256 * 256 * 256 END) AS biOctetA,

MAX(CASE WHEN ElementID = 2 THEN Convert(bigint,Element) * 256 * 256 END) AS biOctetB,

MAX(CASE WHEN ElementID = 3 THEN Convert(bigint,Element) * 256 END) AS biOctetC,

MAX(CASE WHEN ElementID = 4 THEN Convert(bigint,Element) END) AS biOctetD,

MAX(ElementID) AS Total

FROM dbo.Split(@vcIPAddress, '.')

) dtValues

--check that there are four octets and that they are within valid ranges

SET @biIP = NULL

IF @tiCount = 4

BEGIN

SET @biIP = @biOctetA + @biOctetB + @biOctetC + @biOctetD

END

RETURN(@biIP)

END

GO

Select dbo.IPStringToNumber_Set('255.123.234.104') AS IP_SET

GO

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

Not sure if it's faster that way, but you could test it :

SELECT

@biIP = CASE WHEN @tiCount = 4 THEN biOctetA + biOctetB + biOctetC + biOctetD ELSE NULL END

FROM

...

RETURN (@biIP )

You can also try this split version to see if it works faster than the procedural one (don't expect much of a difference over a small string... but times 1 million records it might make one)

IF Object_id('fnSplit_Set') > 0

DROP FUNCTION dbo.fnSplit_Set

GO

IF Object_id('Numbers') > 0

DROP TABLE dbo.Numbers

GO

CREATE TABLE dbo.Numbers (PkNumber int identity(1,1), dude bit null, CONSTRAINT Pk_Number PRIMARY KEY CLUSTERED (PkNumber))

GO

INSERT INTO dbo.Numbers (dude)

SELECT NULL FROM (SELECT TOP 100 NULL AS A FROM master.dbo.spt_values) dt100 cross join (SELECT TOP 80 null AS A FROM master.dbo.spt_values) dt80

GO

ALTER TABLE dbo.Numbers

DROP COLUMN dude

GO

--Select min(PkNumber) as MinA, max(PkNumber) as MaxA from dbo.Numbers

--1, 8000

GO

CREATE FUNCTION [dbo].[fnSplit_Set] (@IDS as varchar(8000), @vcDelimiter varchar(3))

RETURNS TABLE

WITH SCHEMABINDING

AS

Return

Select dtSplitted.EachID, dtSplitted.Rank from (

SELECT SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter),

CHARINDEX(@vcDelimiter, @vcDelimiter + @IDs + @vcDelimiter, N.PkNumber + len(@vcDelimiter)) - N.PkNumber - len(@vcDelimiter)) as EachID

, (LEN(SUBSTRING(@IDs, 1, N.PkNumber)) - LEN (REPLACE (SUBSTRING(@IDs, 1, N.PkNumber), ',', ''))) + 1 AS Rank

FROM dbo.Numbers N

WHERE SUBSTRING(@vcDelimiter + @IDs + @vcDelimiter, N.PkNumber, len(@vcDelimiter)) = @vcDelimiter

AND PkNumber 0

GO

DECLARE @Ids as varchar(8000)

SET @IDS = ''

Select @IDS = @IDS + CAST(id as varchar(10)) + ',' from (Select TOP 10 id from dbo.SysObjects order by NEWID()) dtIds

SET @IDS = left(@IDS, ABS(len(@IDS) - 1))

PRINT @IDS

Select O.id, dtIDS.EachId, O.name, dtIDS.Rank from dbo.SysObjects O inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set (@IDS, ',')) dtIDS on O.id = dtIDS.EachID order by O.Name

Select O.id, dtIDS.EachId, O.name, dtIDS.Rank from dbo.SysObjects O inner join (Select CAST(EachID as INT) AS EachID, Rank from dbo.fnSplit_Set (@IDS, ',')) dtIDS on O.id = dtIDS.EachID order by dtIDS.Rank

--Dont't ever forget to cast the eachid column to the same datatype as in the join to avoid table scans.

chisholmd SSCrazy Points: 2490 More actions · Answer 5

I am working a number of angles on this. I just realized that although I have 11 indexes on the main table in question I don't have one suitable for the joined update statement. Adding an index for the longIP field should help. Currently the two statements together take an hour to run which is far longer then should be neccessary.

Second I am trying to get my ADO connection object to do dirty reads. This is primarily a reporting application and 98% of the reports could care less about country code. So I would think that if I set the connection object to do dirty reads (adXactReadUncommitted) then shouldn't it ignore any locks that the update statements are placing?

btw: from what I have seen so far it does not appear to be the function itself that is the bottleneck, or rather the alternatives don't give me a large enough performance increase. I want this proc to run in minutes at the most if I can manage it.

Thanks all and any adivce still welcome.

Dave

Trainmark.com IT Training B2B Marketplace
(Jobs for IT Instructors)

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 6

Can you post the full execution plan? Maybe we'll see something you missed.

chisholmd SSCrazy Points: 2490 More actions · Answer 7

OK I will. I have the execution plan on my monitor right now but it's the graphical one. I'll choose the text version and run it again later tonight when server traffic is low.

For now it is the update via join that consumes 98% of the execution time of 1:09:00. 60% of the cost is a clustered index seek on the geoIP table. my geoIP table is [country],[startIP],[endIP] and the where clause is a "between" the start and end. The innerJoin nested loop consumes another 26% so those two together account for 85% percent of the time.

Hard for me to see how I can do better then the

index seek thats taking 60% of the time. Maybe I can drop countries that are likly to appear as "all others" on the report.

I'll post the full plan tonight. Thanks.

Dave

Trainmark.com IT Training B2B Marketplace
(Jobs for IT Instructors)

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

Just a taught... maybe it would be faster to run the clustered index seek if the startIP and endIP columns were at the start of the index instead of a char column (assuming country is in text).

chisholmd SSCrazy Points: 2490 More actions · Answer 9

Sorry, sloppy on my part when describing the fields. The clustered index is [start],[end],[country] already I just wrote it wrong in the post.

However, I just took a look at the index being used on the other table and that can be improved. There is no index with dbo.wmsLog.longIP. In the where claus of the join it is used as "not dbo.wmsLog.longIP = 0". Might be a small gain there.

Im not sure if an additional index for that column is worth it though because it would add time to the first statement updating the index after each update in the first statement. I'll try it and measure the difference.

I might just resign myself to running this job at 2am when the server is very idle.

thx

Dave

Trainmark.com IT Training B2B Marketplace
(Jobs for IT Instructors)

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

Maybe you could speed this up if you added an indexed calculated field on the long ip (the ones you calculate on the fly)... maybe that could give you some more options... and at least flush one big update statement.