is there any difference between != and <>

  • Matt Miller (11/14/2007)


    Giants? Nah - I aint that tall....

    More like - boys with their toys:)

    You said it. 😉

    What I'll do is grab one of the string functions that I have in my main database, rewrite it in as efficient T-SQL as I can manage, then hack up a C# version, and see how they compare. Doubt it'll be before friday though. Considering that my sQL is way better than my C#, it won't be really fair

    Oh, I will add in an additional challenge if you two feel like it.

    Take data of the following form, with a lot, lot more data (say 100 000 rows and 250 categories)

    CREATE TABLE #Test

    Category INT,

    Code VARCHAR(5)

    )

    INSERT INTO #Test VALUES (1,'a')

    INSERT INTO #Test VALUES (1,'t')

    INSERT INTO #Test VALUES (1,'e')

    INSERT INTO #Test VALUES (1,'s')

    INSERT INTO #Test VALUES (1,'u')

    INSERT INTO #Test VALUES (2,'u')

    INSERT INTO #Test VALUES (2,'e')

    INSERT INTO #Test VALUES (2,'n')

    INSERT INTO #Test VALUES (2,'a')

    INSERT INTO #Test VALUES (3,'i')

    INSERT INTO #Test VALUES (3,'z')

    and output data of the form. (order of the characters isn't important, since there's nothing to order by)

    1, 'a,t,e,s,u'

    2, 'u,e,n,a'

    3, 'i,z'

    My gut feel is that a user-defined aggregate will out perform any T-SQL by quite a way.

    Or if Jeff wants to produce the T-SQL, I can give the C# a try, but it won't be for a couple days.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • All right - here's the CLR solution for the Comedy-Separated Value field challenge.

    First - the CLR function:

    Imports System

    Imports System.Data

    Imports System.Data.SqlClient

    Imports System.Data.SqlTypes

    Imports Microsoft.SqlServer.Server

    Imports System.Runtime.InteropServices

    Partial Public Class UserDefinedFunctions

    Private Class SplitInfo

    Private _val As Object

    Private _pos As Integer

    Property value() As Object

    Get

    Return _val

    End Get

    Set(ByVal value As Object)

    _val = value

    End Set

    End Property

    Property position() As Integer

    Get

    Return _pos

    End Get

    Set(ByVal value As Integer)

    _pos = value

    End Set

    End Property

    End Class

    <Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName:="FillRowSplitExp", IsDeterministic:=True, IsPrecise:=True, TableDefinition:="val nvarchar(max),pos integer")> _

    Public Shared Function RegexSplitExp(ByVal input As SqlString, ByVal pattern As SqlString) As IEnumerable

    Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value)

    Dim c As New Collection

    Dim s As SplitInfo

    Dim i As Integer

    Dim T As String()

    Dim m As System.Text.RegularExpressions.MatchCollection

    If input.IsNull Then

    s = New SplitInfo

    s.value = System.DBNull.Value

    s.position = -1

    c.Add(s)

    Else

    If rex.IsMatch(input.Value) = False Then

    s = New SplitInfo

    s.position = 0

    s.value = ""

    c.Add(s)

    Else

    T = rex.Split(input.Value)

    m = rex.Matches(input.Value)

    For i = 0 To T.GetUpperBound(0)

    s = New SplitInfo

    s.value = T(i).ToString

    If i = 0 Then

    s.position = 0

    Else

    s.position = m(i - 1).Index + 1

    End If

    'If s.value.Equals("") Then

    ' s.value = " "

    'End If

    c.Add(s)

    Next

    End If

    End If

    Return c

    End Function

    Public Shared Sub FillRowSplitExp(ByVal obj As Object, ByRef pos As Integer)

    Dim s As SplitInfo

    s = CType(obj, SplitInfo)

    Val = s.value.ToString

    pos = s.position

    End Sub

    End Class

    the SQL code is next

    use testing

    go

    dbcc freeproccache

    dbcc dropcleanbuffers

    --set things up

    declare @StartTime datetime

    IF OBJECT_ID('TempDB..#mattcomedyTest','U') IS NOT NULL

    drop table #mattcomedyTest

    PRINT REPLICATE('=',78)

    PRINT SPACE(12)+'Matt''s CLR Solution for Indexed Parsing.'

    PRINT REPLICATE('-',78)

    --the clock is ticking...

    set @StartTime=getdate()

    --build the table

    select someid,

    case when s.pos=-1 then null else s.val end as val, --handle the NULL values

    s.pos+1 as idx

    into #mattcomedyTest

    from jbmtest

    cross apply dbo.regexsplitExp(somecsv,',') s

    --the unique index - not clustered since you're not doing clustered either :)

    CREATE UNIQUE INDEX IX_mattcomedyTest_SomeID_Idx

    ON #mattcomedyTest (SomeID,[Idx])

    --===== Display the duration

    PRINT CONVERT(CHAR(12),GETDATE()-@StartTime,114) + ' Duration (hh:mi:ss:mmm)'

    PRINT REPLICATE('=',78)

    --The 100

    select top 100 * from #mattcomedyTest order by SomeID,Idx

    As expected - the T-SQL wins thanks to the CROSS Apply in the CLR version.

    results:

    =============================================================================

    Matt's CLR Solution for Indexed Parsing.

    ------------------------------------------------------------------------------

    (999956 row(s) affected)

    (1 row(s) affected)

    00:00:13:383 Duration (hh:mi:ss:mmm)

    =============================================================================

    vs

    =============================================================================

    Jeff Moden's Tally Table Solution for Indexed Parsing.

    ------------------------------------------------------------------------------

    (999956 row(s) affected)

    00:00:08:927 Duration (hh:mi:ss:mmm)

    =============================================================================

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Phone Test phase 1 - CLR version....

    the SQL

    use testing

    go

    dbcc freeproccache

    dbcc dropcleanbuffers

    --set things up

    declare @StartTime datetime

    IF OBJECT_ID('TempDB..#mattPhoneExtTest','U') IS NOT NULL

    drop table #mattPhoneExtTest

    PRINT REPLICATE('=',78)

    PRINT SPACE(12)+'Matt''s CLR Solution for Extracting Phone Numbers'

    PRINT REPLICATE('-',78)

    --the clock is ticking...

    set @StartTime=getdate()

    --create the table

    select

    rid,

    case when r.pos=-1 then null else r.val end as val,

    r.pos+1 as idx

    into #mattPhoneExtTest

    from MattTestText

    cross apply dbo.regexmatches(doc,'\([0-9]{3}\)[ ]{0,1}[0-9]{3}\-[0-9]{4} ') r

    --the unique index - not clustered since you're not doing clustered either :)

    CREATE UNIQUE INDEX IX_mattPhoneExtTestt_RID_Idx

    ON #mattPhoneExtTest (rID,[Idx])

    --===== Display the duration

    PRINT CONVERT(CHAR(12),GETDATE()-@StartTime,114) + ' Duration (hh:mi:ss:mmm)'

    PRINT REPLICATE('=',78)

    --The 100

    select top 100 * from #mattPhoneExtTest order by rID,Idx

    The CLR function

    Imports System

    Imports System.Data

    Imports System.Data.SqlClient

    Imports System.Data.SqlTypes

    Imports Microsoft.SqlServer.Server

    Imports System.Runtime.InteropServices

    Partial Public Class UserDefinedFunctions

    Private Class MatchInfo

    private _rank as integer

    Private _val As Object

    Private _pos As Integer

    Private _len As Integer

    Public Property rank() As Integer

    Get

    Return _rank

    End Get

    Set(ByVal value As Integer)

    _rank = value

    End Set

    End Property

    Public Property value() As Object

    Get

    Return _val

    End Get

    Set(ByVal val As Object)

    _val = val

    End Set

    End Property

    Public Property position() As Integer

    Get

    Return _pos

    End Get

    Set(ByVal value As Integer)

    _pos = value

    End Set

    End Property

    Public Property length() As Integer

    Get

    Return _len

    End Get

    Set(ByVal value As Integer)

    _len = value

    End Set

    End Property

    End Class

    <Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName:="FillRowMatches", IsDeterministic:=True, IsPrecise:=True, TableDefinition:="Occurrence int,val nvarchar(100),POS int,length int")> _

    Public Shared Function RegexMatches(ByVal input As SqlChars, ByVal pattern As SqlString) As IEnumerable

    Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value, optionS)

    Dim t As System.Text.RegularExpressions.MatchCollection

    Dim matchlist As New Collection

    Dim m_info As MatchInfo

    Dim i As Integer

    If input.IsNull Then

    m_info = New MatchInfo

    m_info.value = System.DBNull.Value

    m_info.position = -1

    m_info.length = -1

    matchlist.Add(m_info)

    Else

    If rex.IsMatch(input.Value) = False Then

    m_info = New MatchInfo

    m_info.position = 0

    m_info.length = 0

    m_info.value = ""

    matchlist.Add(m_info)

    Else

    t = rex.Matches(input.Value)

    For i = 0 To t.Count - 1

    m_info = New MatchInfo

    m_info.rank = i

    m_info.value = t(i).Value

    m_info.position = t(i).Index

    m_info.length = t(i).Length

    matchlist.Add(m_info)

    Next

    End If

    End If

    Return matchlist

    rex = Nothing

    matchlist = Nothing

    End Function

    Public Shared Sub FillRowMatches(ByVal obj As Object, <Out()> ByRef Occurrence As SqlInt32, <Out()> ByRef Val As SqlString, <Out()> ByRef Pos As SqlInt32, <Out()> ByRef Length As SqlInt32)

    Val = CType(CType(obj, MatchInfo).value, SqlString)

    Pos = CType(CType(obj, MatchInfo).position, SqlInt16)

    Length = CType(CType(obj, MatchInfo).Length, SqlInt16)

    Occurrence = CType(CType(obj, MatchInfo).rank, SqlInt16)

    End Sub

    End Class

    The results

    ==============================================================================

    Matt's CLR Solution for Extracting Phone Numbers

    ------------------------------------------------------------------------------

    (4040205 row(s) affected)

    00:02:02:923 Duration (hh:mi:ss:mmm)

    ==============================================================================

    UPDATE - this time with NO other disk activity slowing things down

    ==============================================================================

    Matt's CLR Solution for Extracting Phone Numbers

    ------------------------------------------------------------------------------

    (4040205 row(s) affected)

    00:01:46:083 Duration (hh:mi:ss:mmm)

    ==============================================================================

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Phone test - Phase 2 (redaction)

    The CLR code is:

    Imports System

    Imports System.Data

    Imports System.Data.SqlClient

    Imports System.Data.SqlTypes

    Imports Microsoft.SqlServer.Server

    Imports System.Text.RegularExpressions

    Partial Public Class UserDefinedFunctions

    _

    Public Shared Function RegexReplace(ByVal input As SqlChars, ByVal pattern As SqlString, ByVal rep As SqlString) As SqlString

    ' Add your code here

    Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value)

    If input.IsNull Then

    Return New SqlString(SqlString.Null.Value)

    Else

    Return New SqlString(rex.Replace(New String(input.Value), rep.Value))

    End If

    End Function

    End Class

    Here's the SQL

    use testing

    go

    dbcc freeproccache

    dbcc dropcleanbuffers

    --clean out the column

    update mattTestText

    set docred=null

    go

    dbcc freeproccache

    dbcc dropcleanbuffers

    go

    --set things up

    declare @StartTime datetime

    PRINT REPLICATE('=',78)

    PRINT SPACE(12)+'Matt''s CLR Solution for Phone Redaction'

    PRINT REPLICATE('-',78)

    --the clock is ticking...

    set @StartTime=getdate()

    update matttesttext

    set docred=dbo.regexreplace(doc,'\([0-9]{3}\)(? [0-9]{4}) ','(XXX)${1}YYY-${2} ')

    from matttesttext

    --===== Display the duration

    PRINT CONVERT(CHAR(12),GETDATE()-@StartTime,114) + ' Duration (hh:mi:ss:mmm)'

    PRINT REPLICATE('=',78)

    --The 100

    select top 100 rid,doc,docred, len(doc),len(docred) from matttesttext

    No cross apply this time, so time flies when you're having fun...

    Results -

    ==============================================================================

    Matt's CLR Solution for Phone Redaction

    ------------------------------------------------------------------------------

    (500000 row(s) affected) -- also known as 4040205 phone numbers redacted

    00:01:04:063 Duration (hh:mi:ss:mmm)

    ==============================================================================

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Shaun McGuile (11/14/2007)


    Good idea Gail, I'm not gonna get in the way of battling giants!:D

    Matt said it right... "boys with their toys"... just having a bit of fun... c'mon in... water's fine here...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GilaMonster (11/14/2007)


    Or if Jeff wants to produce the T-SQL, I can give the C# a try, but it won't be for a couple days.

    Heck ya... great idea for a test, Gail...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • --the unique index - not clustered since you're not doing clustered either

    Boy! Talk about a typo on my part :sick:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GilaMonster (11/14/2007)


    Matt Miller (11/14/2007)


    Giants? Nah - I aint that tall....

    More like - boys with their toys:)

    You said it. 😉

    What I'll do is grab one of the string functions that I have in my main database, rewrite it in as efficient T-SQL as I can manage, then hack up a C# version, and see how they compare. Doubt it'll be before friday though. Considering that my sQL is way better than my C#, it won't be really fair

    Oh, I will add in an additional challenge if you two feel like it.

    Take data of the following form, with a lot, lot more data (say 100 000 rows and 250 categories)

    CREATE TABLE #Test

    Category INT,

    Code VARCHAR(5)

    )

    INSERT INTO #Test VALUES (1,'a')

    INSERT INTO #Test VALUES (1,'t')

    INSERT INTO #Test VALUES (1,'e')

    INSERT INTO #Test VALUES (1,'s')

    INSERT INTO #Test VALUES (1,'u')

    INSERT INTO #Test VALUES (2,'u')

    INSERT INTO #Test VALUES (2,'e')

    INSERT INTO #Test VALUES (2,'n')

    INSERT INTO #Test VALUES (2,'a')

    INSERT INTO #Test VALUES (3,'i')

    INSERT INTO #Test VALUES (3,'z')

    and output data of the form. (order of the characters isn't important, since there's nothing to order by)

    1, 'a,t,e,s,u'

    2, 'u,e,n,a'

    3, 'i,z'

    My gut feel is that a user-defined aggregate will out perform any T-SQL by quite a way.

    Or if Jeff wants to produce the T-SQL, I can give the C# a try, but it won't be for a couple days.

    There's a XML trick that might be worth throwing into the mix for this one. it's VERY slick, but I've been reading mixed reviews as to its speed on long strings. or rather, concatenation of LOTS of rows.

    Of course - it's your test - you make up the rules 🙂

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Jeff Moden (11/14/2007)


    --the unique index - not clustered since you're not doing clustered either

    Boy! Talk about a typo on my part :sick:

    It's alright - at least we're comparing apples to apples this way. It wouldn't have built it if it weren't unique, so it proved the point.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • ==============================================================================

    Matt's CLR Solution for Extracting Phone Numbers

    ------------------------------------------------------------------------------

    (4040205 row(s) affected)

    00:01:46:083 Duration (hh:mi:ss:mmm)

    ==============================================================================

    Ok... I might have just missed in in your code and because of the rowcount you have (I know it's random but I think it should be closer)... did you remember to NOT include the 3rd telephone number in a row?

    And, I've been up for almost 40 hours... Haven't had much of a chance to work on the 2nd part of the phone number thing and would probably drool on myself a bit if I tried right now... I'll take a whack at it in 8 or 10 hours :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/14/2007)


    ==============================================================================

    Matt's CLR Solution for Extracting Phone Numbers

    ------------------------------------------------------------------------------

    (4040205 row(s) affected)

    00:01:46:083 Duration (hh:mi:ss:mmm)

    ==============================================================================

    Ok... I might have just missed in in your code and because of the rowcount you have (I know it's random but I think it should be closer)... did you remember to NOT include the 3rd telephone number in a row?

    And, I've been up for almost 40 hours... Haven't had much of a chance to work on the 2nd part of the phone number thing and would probably drool on myself a bit if I tried right now... I'll take a whack at it in 8 or 10 hours :hehe:

    :Whistling:

    umh...sure...it's there - it's just the ranger eyes 🙂

    Oh all right - I will rerun it!

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (11/14/2007)


    There's a XML trick that might be worth throwing into the mix for this one. it's VERY slick, but I've been reading mixed reviews as to its speed on long strings. or rather, concatenation of LOTS of rows.

    Of course - it's your test - you make up the rules 🙂

    I'd like to see that trick. At the moment the ways I know to do this require either a cursor or a UDA

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Me too! Anyone know where there might be a posting for it? Or, maybe post it here so we can do some testing on it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • New results with the CORRECT query for the phone extraction challenge. The "no 3rd" took a chunk out of it (as expected), although the runs just seem slower today anyway.

    I actually ran it two different ways. One was with the restriction in the CLR, and the other in the T-SQL. The two versions:

    CLR has the restriction: 00:02:29:497 Duration (hh:mi:ss:mmm)

    SQL has the restriction: 00:02:36:077 Duration (hh:mi:ss:mmm)

    which kind of makes sense to me. 500,000 rows fewer received from CLR, meaning less effort in the cross apply, and fewer rows to have to throw out.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Gail - as to your test - I will try to dig out and use the XML data() trick, we'll let Jeff come up with the "pure" T-SQL trick, and you can handle the CLR?

    Let us know what the test data looks like, and set up whatever other rules you wish. We can have a three-way cage match (unless someone else has another version they'd like).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 31 through 45 (of 62 total)

You must be logged in to reply to this topic. Login to reply