Accessing Functions in Master Data Services..

  • I need to use some Sql functions like 'Similarity' that exist in Master Data Services database. I am not going to do any web application development, will just need to use these functions in stored procedures.

    So what I am trying to find is, do I need to run masterdataservices.msi and go through the whole process or is there a way to be able to just create the database and access all the functions in it.

    I see the following steps and am wondering if I can get by with just #1. What would be the best way to achieve my objective.

    1.Create a Master Data Services Database

    2.Create a Master Data Manager Web Application (Master Data Services)

    3.Associate a Master Data Services Database and Web Application

    4.Enable Web Services (Master Data Services)

    Thanks,

    Bvip.

  • I used to have all of the functions and stored procs but can't find them at the moment.

    If you create the Master Data Services DB (#1) you just need to create the assembly then you can create the functions and store procs.

    This article will show you how to create the assembly:

    http://xmlsqlninja.blogspot.com/2013/04/setting-up-mdqxmltransform.html

    Again, I don't have all of them but here's what I do have at the moment:

    ITVF CLR:

    --NGrams:

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[NGrams](@input [nvarchar](4000), @n [tinyint], @padSpace [bit])

    RETURNS TABLE (

    [Sequence] [int] NULL,

    [Token] [nvarchar](255) NULL

    ) WITH EXECUTE AS CALLER

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[NGrams]

    GO

    --Regex Matches:

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    ALTER FUNCTION [mdq].[RegexMatches](@input [nvarchar](4000), @pattern [nvarchar](4000), @mask [tinyint] = 0)

    RETURNS TABLE (

    [Sequence] [int] NULL,

    [Token] [nvarchar](4000) NULL

    ) WITH EXECUTE AS CALLER

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[RegexMatches]

    --RegexSplit

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    ALTER FUNCTION [mdq].[RegexSplit](@input [nvarchar](4000), @splitPattern [nvarchar](4000), @tokenPattern [nvarchar](4000), @mask [tinyint])

    RETURNS TABLE (

    [Sequence] [int] NULL,

    [Token] [nvarchar](4000) NULL,

    [IsValid] [bit] NULL

    ) WITH EXECUTE AS CALLER

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[RegexSplitWithCheck]

    -- Split

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[Split](@input [nvarchar](4000), @separators [nvarchar](10), @removeEmpty [bit], @tokenPattern [nvarchar](4000), @mask [tinyint])

    RETURNS TABLE (

    [Sequence] [int] NULL,

    [Token] [nvarchar](4000) NULL,

    [IsValid] [bit] NULL

    ) WITH EXECUTE AS CALLER

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[SplitWithCheck]

    Scalar CLR:

    --RegexExtract

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[RegexExtract](@input [nvarchar](4000), @pattern [nvarchar](4000), @GROUP [nvarchar](4000), @mask [tinyint])

    RETURNS [nvarchar](4000) WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[RegexExtract]

    GO

    --RegexIsMatch

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[RegexIsMatch](@input [nvarchar](4000), @pattern [nvarchar](4000), @mask [tinyint])

    RETURNS [bit] WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[RegexIsMatch]

    GO

    --RegexReplace

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[RegexReplace](@input [nvarchar](4000), @pattern [nvarchar](4000), @REPLACE [nvarchar](4000), @mask [tinyint])

    RETURNS [nvarchar](4000) WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[RegexReplace]

    GO

    --Similarity

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[Similarity](@input1 [nvarchar](4000), @input2 [nvarchar](4000), @method [tinyint], @containmentBias [float], @minScoreHint [float])

    RETURNS [float] WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[Similarity]

    GO

    --SimilarityDate

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER OFF

    GO

    CREATE FUNCTION [mdq].[SimilarityDate](@date1 [datetime], @date2 [datetime])

    RETURNS [float] WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT

    AS

    EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[SimilarityDate]

    GO

    A few things to consider:

    (1) mdq.xmltransform is available in the aforementioned article with some examples of how to use.

    (2) If you need to split a string based on a single-character delimiter, Jeff Moden's T-SQL Splitter (in my signature line) is faster than mdq.split.

    (3) The very limited Microsoft documentation does not come up in a google search but can be found here: http://msdn.microsoft.com/en-us/library/ee633712%28SQL.105%29.aspx

    (4) The Similarity Function is nonsensical - for a better Levenshtein (Damerau-Levenshtein) implementation see: http://www.sqlservercentral.com/articles/Fuzzy+Match/92822/[/url]

    (5) For examples of the regex functions you can google, "mdq.regex clr"... I have a couple articles out there on how to use them; there are a few others out there as well. That said, if you learn how to use the splitters referenced in my signature line, STUFF, CHARINDEX, PATINDEX and a tally table you will find that SQL Server handles strings just fine and you don't need regex.

    Hopefully this post helps. 😉

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thanks Alan!

    This will certainly help.

    BVip.

  • No problem. Glad to help.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply