How to use values keyword as parameter of a function

Question

How to use values keyword as parameter of a function

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004435 More actions · Answer 1

ScottPletcher (11/30/2012)
Jeff Moden (11/29/2012)
capnhector (11/29/2012)
EDIT: This is also for me to play with a couple of things and get more experience with them. so it may not be fast or work in your situation but does use some things i have been studying.
No.... you're absolutely on the correct path. The MIN=MAX method the others used is twice as slow as the original function. You can make it a bit faster still by turning it into an iTVF instead of a scalar function. And lose the join. If you rework the GenericTable TYPE just a bit, you won't need it. Give it a shot. You've got this, Cap'n!
Really??
Doesn't the DISTINCT require an (expensive) sort but min/max don't?

Correct but the min/max still turns out slower. I'll try to remember to post my test code when I get home from work.

I also have an idea for solving the "base" problem here that might make both solutions seem slow.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004435 More actions · Answer 2

Actually, I spoke a bit too soon. I was testing with 10 columns per row to be compared. There’s a cross-over point where the MIN/MAX method wins for 1 to 4 comparison columns and the COUNT(DISTINCT) wins for 5 and above.

Here’s the code I’ve been using to test with. To make things faster, I did like CapnHector did and converted both functions to iTVFs instead of leaving them as mTVFs.

--=====================================================================================================================

-- This section conditionally drops all of the objects and rebuilds them to make reruns in SSMS easier.

--=====================================================================================================================

--===== Do this in a nice safe place that everyone has.

USE tempdb

;

--===== Conditionally drop objects in dependency order to make reruns easier in SSMS

IF OBJECT_ID('dbo.AreEqualCount' ,'IF') IS NOT NULL DROP FUNCTION dbo.AreEqualCount;

IF OBJECT_ID('dbo.AreEqualMinMax','IF') IS NOT NULL DROP FUNCTION dbo.AreEqualMinMax;

IF OBJECT_ID('tempdb..#TestTable','U' ) IS NOT NULL DROP TABLE #TestTable;

IF EXISTS (SELECT * FROM sys.types WHERE name = 'GenericTable') DROP TYPE dbo.GenericTable;

GO

--===== First, rework the GenericTable just a bit

CREATE TYPE dbo.GenericTable AS TABLE (I SQL_Variant, D SQL_VARIANT);

GO

--===== Create the COUNT function with column "I" added

CREATE FUNCTION dbo.AreEqualCount

(@T AS dbo.GenericTable READONLY)

RETURNS TABLE WITH SCHEMABINDING

RETURN

SELECT I,

AllEqual = CASE WHEN COUNT(DISTINCT D) = 1 THEN 1 ELSE 0 END

FROM @T

GROUP BY I

;

GO

--===== Create the Min/Max version of the function with column "I" added

CREATE FUNCTION AreEqualMinMax

(@T AS dbo.GenericTable READONLY)

RETURNS TABLE WITH SCHEMABINDING

RETURN

SELECT I,

AllEqual = CASE WHEN MIN(D) = MAX(D) THEN 1 ELSE 0 END

FROM @T

GROUP BY I

;

GO

--===== Create the Test Table and populate it,

-- Note that it has some "L" columns that we don't want to work with.

SELECT TOP 100000

I = IDENTITY(INT,1,1),

D1 = ABS(CHECKSUM(NEWID()))%3+1,

D2 = ABS(CHECKSUM(NEWID()))%3+1,

D3 = ABS(CHECKSUM(NEWID()))%3+1,

D4 = ABS(CHECKSUM(NEWID()))%3+1,

D5 = ABS(CHECKSUM(NEWID()))%3+1,

D6 = ABS(CHECKSUM(NEWID()))%3+1,

D7 = ABS(CHECKSUM(NEWID()))%3+1,

D8 = ABS(CHECKSUM(NEWID()))%3+1,

D9 = ABS(CHECKSUM(NEWID()))%3+1,

D10 = ABS(CHECKSUM(NEWID()))%3+1,

L1 = ABS(CHECKSUM(NEWID()))%3+1,

L2 = ABS(CHECKSUM(NEWID()))%3+1,

L3 = ABS(CHECKSUM(NEWID()))%3+1

INTO #TestTable

FROM master.sys.all_columns ac1

CROSS APPLY master.sys.all_columns ac2

;

--===== Set every 5th row to have all equal values so we can see that things work correctly.

UPDATE #TestTable

SET D1 = I,

D2 = I,

D3 = I,

D4 = I,

D5 = I,

D6 = I,

D7 = I,

D8 = I,

D9 = I,

D10 = I

WHERE I%5 = 0

;

--=====================================================================================================================

-- This is the test section for the objects that we know of, so far.

--=====================================================================================================================

--===== Declare some obviously name variables

DECLARE @T GenericTable,

@StartTime DATETIME,

@Bitbucket SQL_VARIANT

;

--===== This unpivots the data and inserts it into the generic table variable.

-- Notice that we have to enter all of the desired column names

INSERT INTO @T

(I,D)

SELECT t.I, cav.D

FROM #TestTable t

CROSS APPLY (VALUES (D1),(D2),(D3),(D4),(D5),(D6),(D7),(D8),(D9),(D10)) cav (D) --Change here to check for cross-over point

;

--===== Test the COUNT function

SELECT @StartTime = GETDATE()

;

SELECT @Bitbucket = I,

@Bitbucket = AllEqual

FROM dbo.AreEqualCount(@T)

;

SELECT CountDur = DATEDIFF(ms,@StartTime, GETDATE())

;

--===== Test the MinMax function

SELECT @StartTime = GETDATE()

;

SELECT @Bitbucket = I,

@Bitbucket = AllEqual

FROM dbo.AreEqualMinMax(@T)

;

SELECT MinMaxDur = DATEDIFF(ms,@StartTime, GETDATE())

;

Here are the durations from the run.

CountDur

-----------

3650

MinMaxDur

-----------

4373

Now, my question is, what is the actual functionality that the OP is asking for in the original post? Although it looks like “using VALUES” in a function, I think the actual goal is to be able to point the code at virtually any table and with as little fuss as possible, identify the columns to compare to see if they’re equal. The number of columns can vary, according to the OP. I’ll also assume that the actual columns names will all be similar to each other (i.e. “follow a pattern”).

I’ll also assume that the OP not only wants this to be easy to use and, of course, accurate, but would also like it to be as fast as possible.

Enter the “Dark Horse” in the form of a very old but tried and true method that avoids RBAR, avoids having to unpivot the rows, is easy to understand, can be made to work in virtually any version of SQL, and runs fast as the wind. Here’s the code and the test.

--=====================================================================================================================

-- If the purpose of this exercise is to make it so that we can generically test if multiple similarly named

-- columns in are identical on any table, then does it matter which method we use? Why can't it be a stored

-- procedure that takes the table name and a pattern of the columns we're looking for? And, why don't we use

-- an old trick with "CASE" to really speed up the works?

--=====================================================================================================================

--===== Conditionally drop the stored procedure to make reruns easier in SSMS.

IF OBJECT_ID('tempdb.dbo.AreEqualDynamic','P') IS NOT NULL DROP PROCEDURE dbo.AreEqualDynamic;

GO

--===== This creates the permanent stored procedure that will do all the work directly on the given table

CREATE PROCEDURE dbo.AreEqualDynamic

@pTableName VARCHAR(128),

@pPkColName VARCHAR(128),

@pColNamePattern VARCHAR(128)

AS

--===== Declare the obviously named local variable(s)

DECLARE @sql VARCHAR(MAX);

;

--===== Build the dynamic SQL. No SQL Injection possible here

WITH

cteEnumerate AS

( --=== Enumerate the column names with the assumption that the first column is the key

SELECT ColNum = ROW_NUMBER() OVER (ORDER BY column_id),

ColName = name

FROM sys.columns

WHERE object_id = OBJECT_ID(@pTableName)

AND name LIKE @pColNamePattern

) --=== Build the comparisons for dynamic SQL

SELECT @sql = ISNULL(@SQL + ' AND ','') + QUOTENAME(e1.ColName) + '=' + QUOTENAME(e2.ColName)

FROM cteEnumerate e1

JOIN cteEnumerate e2

ON e1.ColNum +1 = e2.ColNum

;

--===== Build the rest of the dynamic SQL and execute it

SELECT @sql = '

SELECT PK = ' + (SELECT QUOTENAME(name) FROM sys.columns WHERE object_id = OBJECT_ID(@pTableName) AND name = @pPkColName) + ',

AreEqual = CASE

WHEN ' + @sql + '

THEN 1

ELSE 0

END

FROM #TestTable

;

'

EXEC (@SQL)

print @sql

;

GO

;

--=====================================================================================================================

-- Now, test the code like we did with the others.

--=====================================================================================================================

--===== Test the dynamic CASE stored proc method

DECLARE @StartTime DATETIME

;

SELECT @StartTime = GETDATE()

;

EXEC dbo.AreEqualDynamic 'tempdb..#TestTable', 'I', 'D%'

;

SELECT CaseDur = DATEDIFF(ms,@StartTime, GETDATE())

;

Here’s the duration from that run on the very same data as the other two runs.

CaseDur

-----------

583

Other than the obvious advantage of it blowing the doors off the other two methods even though it didn’t dump its output to variables, there’s a huge hidden advantage. It doesn’t use the memory that a Table Variable requires. Instead, it uses the data from the table directly and it does so in an SQL Injection free manner.

BTW... here's the code that the dynamic SQL came up as...

SELECT PK = ,

AreEqual = CASE

WHEN [D1]=[D2] AND [D2]=[D3] AND [D3]=[D4] AND [D4]=[D5] AND [D5]=[D6] AND [D6]=[D7] AND [D7]=[D8] AND [D8]=[D9] AND [D9]=[D10]

THEN 1

ELSE 0

END

FROM #TestTable

;

Simple, easy to understand, and nasty fast.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

_simon_ SSCrazy Points: 2329 More actions · Answer 3

@jeff Moden:

Now, my question is, what is the actual functionality that the OP is asking for in the original post? Although it looks like “using VALUES” in a function, I think the actual goal is to be able to point the code at virtually any table and with as little fuss as possible, identify the columns to compare to see if they’re equal. The number of columns can vary, according to the OP. I’ll also assume that the actual columns names will all be similar to each other (i.e. “follow a pattern”).

Why I need this: our company is preparing and printing lottery games. Every game is created as a sql table and I am responsible for control of the data.

An example: Lottery ticket contains 9 fields (3x3 matrix). If there are 3 equal integers (or chars or bits or strings,...) in each horizontal, vertical or diagonal rows, then the ticket contains a win. The easyest way to test this is to have a function which accepts 3 parameters (of any type) and return 1 if all parameters are true and false otherwise. I could use something like where T1 = T2 and T1 = T3which isn't that bad really. But what if I need to test inequality, then I would need to write something like this:

where T1 <> T2 and T1 <> T3 and T2 <> T3

As the number of parameters grows the query gets more and more complex. So a function like NotEquals(... random number parameters of same type ...) would reduce complexity of my queries dramaticaly.

As you can imagine, there is a vast amout of possible games with complex logic behind it, so a function which accepts any number of generic parameters and returns a single value is very handy for my job. And it has to be a scalar function (and not procedure), so that I can use it in selects and/or where clause, for example:

select TicketId, Equals(T1, T2, T3), Equals(T1, T5, T9)...

select * from MyTable where Equals(T1, T2, T3)

...

I got a lot of interesting solutions in this thread, but for now I just created functions that I need in a seperate database and wrote a set of tests for them. Then I add a synonym to all other databases with a stored procedure which uses sp_msforeachdb. This way the functions are in one place and thoroughly tested. The good: simple syntax, the bad: I have to write and test each function by it self. Example functions:

create function EqualI2(@i1 int, @i2 int)...

create function EqualI3(@i1 int, @i2 int, @i3 int)...

...

create function EqualS2(@s1 varchar(max), @s2 varchar(max)...

create function EqualS3(@s1 varchar(max), @s2 varchar(max), @s3 varchar(max))...

...

create function EqualB2(@b1 bit, @b2 bit)...

create function EqualB3(@b1 bit, @b2 bit, @b3 bit,...)...

...

Edit:

- yes, this kind of function is easy to write in C# for example (with the 'params' keyword, generic parameters and EqualityComparer), but I like to use sql for set-based problems.

- I don't like the tally table approach, because then you have to use something like 'Equal(convert(varchar, I1) + ', ' + convert(varchar, I2) + ', ' ...

- the basic goal is simple syntax, because the final file of a single control could be hunderts of rows long and I really don't like to make it any longer...

Jeff Moden SSC Guru Points: 1004435 More actions · Answer 4

_simon_ (12/4/2012)
But what if I need to test inequality, then I would need to write something like this:
where T1 <> T2 and T1 <> T3 and T2 <> T3

I'm on my way to work and will have to do a deep dive on all that you wrote tonight but thought I'd quickly answer this...

I would suggest that you never test for such an inequality in such a manner. Always test for the equality and if that test "fails", then it must be an inequality, right?. Testing for equality is almost always faster than testing for inequalities.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

_simon_ SSCrazy Points: 2329 More actions · Answer 5

I would suggest that you never test for such an inequality in such a manner. Always test for the equality and if that test "fails", then it must be an inequality, right?. Testing for equality is almost always faster than testing for inequalities.

Well, this is embarrassing... :hehe: