January 6, 2010 at 9:40 am
Hi all,
I have a function that exports some numeric values to an old COBOL system.
For those of you familiar with the format, I need to export the price (e.g. 123.45) in 9(8)V99 format.
For example, the value 123.45 would give 0000012345, since we want 8 digits for the integer part and 2 for the decimal part. We always have to pad with zeros for the COBOL format, and we need to remove the decimal separator.
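As a minimal sketch of that mapping (illustrative only; the shift-and-pad expression below is just one way to get there, not the function in question):
-- Illustrative only: 123.45 rendered as 9(8)V99 (8 integer digits + 2 decimals).
-- Multiplying by 100 shifts out the decimal point; left-padding gives 10 characters.
SELECT RIGHT(REPLICATE('0', 10)
             + CAST(CAST(123.45 * 100 AS BIGINT) AS VARCHAR(20)),
             10) AS CobolValue ;   -- '0000012345'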
I have come up with this function and test data (10K rows), but it does not scale well: it takes about 7-10 seconds for 10K rows, and I'm looking to extract roughly 1 million rows weekly, so that tends to take a long time.
Anywayz, if anyone wants to take a shot at it, I'd really appreciate it. Here's the code:
IF EXISTS ( SELECT *
FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].[fnGICSPFParseNumericForCobol]')
AND type IN ( N'FN', N'IF', N'TF', N'FS', N'FT' ) )
DROP FUNCTION [dbo].[fnGICSPFParseNumericForCobol]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION dbo.fnGICSPFParseNumericForCobol
(
    @InputNumber NUMERIC(38, 15) ,
    @IntegerPart INT ,
    @DecimalPart INT
)
RETURNS VARCHAR(50)
AS
BEGIN
    DECLARE @StrIntegerPart VARCHAR(50) -- integer part as a string
    DECLARE @StrDecimalPart VARCHAR(50) -- decimal part as a string
    DECLARE @StrInputNumber VARCHAR(50) -- input number as a string

    -- Convert the number to a string.
    SET @StrInputNumber = CONVERT(VARCHAR(50), @InputNumber)

    -- Take the integer part of the number (everything before the separator)
    -- and trim it to the desired number of characters (@IntegerPart).
    SET @StrIntegerPart = RIGHT(SUBSTRING(@StrInputNumber, 0,
                                          PATINDEX('%[.,]%', @StrInputNumber)),
                                @IntegerPart)

    -- Pad with '0' on the left to reach the desired width.
    -- e.g. 12.123 -> 9(04)V9(03) = '0012.123' = 0012123
    WHILE LEN(@StrIntegerPart) < @IntegerPart
    BEGIN
        SET @StrIntegerPart = '0' + @StrIntegerPart
    END

    -- Take the decimal part of the input number. NUMERIC(38, 15) always
    -- converts with 15 decimal digits, so LEFT() returns exactly @DecimalPart
    -- digits, effectively right-padded with zeros.
    SET @StrDecimalPart = LEFT(SUBSTRING(@StrInputNumber,
                                         PATINDEX('%[.,]%', @StrInputNumber) + 1,
                                         LEN(@StrInputNumber)),
                               @DecimalPart)

    -- Return integer + decimal parts.
    RETURN @StrIntegerPart + @StrDecimalPart
END
GO
--=== Drop the table if it exists
IF OBJECT_ID('TestFormat') IS NOT NULL
DROP TABLE TestFormat ;
--=== Create the test Table for performance testing
CREATE TABLE TestFormat
(
ID INT IDENTITY(1, 1) ,
Val1 NUMERIC(10, 2) ,
val2 NUMERIC(10, 2)
)
--=== Create 10K rows of numeric data, for performance testing
INSERT INTO TestFormat (Val1, Val2)
SELECT TOP 10000
    CAST(ABS(CHECKSUM(NEWID()) % 1000000000000 / 100.0) AS DECIMAL(10, 2)) ,
    CAST(ABS(CHECKSUM(NEWID()) % 1000000000000 / 100.0) AS DECIMAL(10, 2))
FROM Master.dbo.SysColumns sc1
CROSS JOIN Master.dbo.SysColumns sc2
SET STATISTICS TIME ON ;
--=== Try the poor performing function, on 10K Rows
SELECT Val1,
dbo.fnGICSPFParseNumericForCobol(Val1, 8, 2) AS ComputedVal1,
Val2,
dbo.fnGICSPFParseNumericForCobol(Val2, 6, 2) AS ComputedVal2
FROM TestFormat
SET STATISTICS TIME OFF ;
Thanks in advance. There must be a way to use a CTE or an inline function to replace this scalar one, but I can't find it.
Cheers,
J-F
January 6, 2010 at 11:23 am
J-F, if possible, can you get rid of the function and just add a persisted computed column instead? PERSISTED stores the value in the data pages, so it would be much faster to access;
would this work for you?
typical results:
len  len  ID  Val1         val2         Val1_formatted  Val2_formatted
---  ---  --  -----------  -----------  --------------  --------------
10   10   1   21471601.93  15630264.86  2147160193      63026486
10   10   2   5960743.79   3111563.78   0596074379      11156378
10   10   3   3246489.88   2933431.45   0324648988      93343145
10   10   4   16442183.35  13033996.38  1644218335      03399638
10   10   5   4365166.84   3532365.53   0436516684      53236553
the change:
ALTER TABLE TestFormat ADD Val1_formatted AS RIGHT('0000000000' + CONVERT(VARCHAR(20), CONVERT(INT, Val1 * 100)), 10) PERSISTED
ALTER TABLE TestFormat ADD Val2_formatted AS RIGHT('00000000' + CONVERT(VARCHAR(20), CONVERT(INT, Val2 * 100)), 8) PERSISTED
SELECT len(Val1_formatted),len(Val1_formatted),* FROM TestFormat
Lowell
January 6, 2010 at 11:43 am
Even if he can't add persisted columns... if I run a straight select and then a select with your computations, they run in exactly the same time over 1M records (1 sec on a low-end workstation with XP SP3 and SQL 2008 SP1 Standard).
I agree that I'd add the persisted columns as a personal choice, but for speed alone, it does not make much of a difference compared to computing the values in the query.
January 6, 2010 at 11:54 am
Not really, that is not an option, since the function is used across the system, every time we need an output for COBOL. It is used in many tables, and sometimes the number of decimals/digits changes; that is the challenge with the function...
Cheers,
J-F
January 6, 2010 at 11:57 am
Ninja's_RGR'us (1/6/2010)
Even if he can't add persisted columns... if I run a straight select and then a select with your computations, they run in exactly the same time over 1M records (1 sec on a low-end workstation with XP SP3 and SQL 2008 SP1 Standard). I agree that I'd add the persisted columns as a personal choice, but for speed alone, it does not make much of a difference compared to computing the values in the query.
Well, how do you handle it when I need 6 digits and 4 decimals?
E.g. 100.35 would give 0001003500.
You need the leading AND trailing zeros to fit the requirement, and you have to remove the delimiter... that is what I have trouble with! 😉
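To spell that requirement out, here is a minimal sketch of the 9(6)V9(4) split (illustrative only, not the solution being sought):
-- Illustrative only: what 9(6)V9(4) requires for 100.35.
-- Integer part '100' must be left-padded to 6 digits, decimal part '35'
-- right-padded to 4 digits, and the separator dropped.
SELECT RIGHT('000000' + '100', 6)            -- integer part padded to '000100'
     + LEFT('35' + '0000', 4) AS Target ;    -- decimal part padded to '3500' => '0001003500'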
Cheers,
J-F
January 6, 2010 at 11:59 am
Going into a meeting; I'll be back in a few. If you have any other suggestions, I'd greatly appreciate them!
Cheers,
J-F
January 6, 2010 at 12:22 pm
EDIT: HOLD ON, I think I have a problem!!!
2nd EDIT: Actually I don't, it was the data, it was getting truncated.
The test data is all NUMERIC(10,2), so when I rechecked my results the values didn't match. When I changed the test query to use 10, 2 instead of 8, 2 and 6, 2, the values looked good.
Since it appears you are using SQL Server 2005, try this:
CREATE FUNCTION dbo.fnGICSPFParseNumericForCobol
(
@InputNumber NUMERIC(38, 15) ,
@precision INT ,
@Scale INT
)
RETURNS TABLE
AS RETURN (
    SELECT RIGHT(REPLICATE('0', @precision)
                 + CAST(CAST(@InputNumber * POWER(10, @Scale) AS INT) AS VARCHAR(64)),
                 @precision) AS CobolData
)
GO
SET STATISTICS TIME ON ;
--=== Try the poor performing function, on 10K Rows
SELECT
tf.Val1,
parse1.CobolData AS ComputedVal1,
tf.Val2,
parse2.CobolData AS ComputedVal2
FROM
dbo.TestFormat tf
cross apply dbo.fnGICSPFParseNumericForCobol(tf.Val1, 8, 2) parse1
cross apply dbo.fnGICSPFParseNumericForCobol(tf.Val2, 6, 2) parse2
SET STATISTICS TIME OFF ;
On my server here at work against 10,000 rows:
(10000 row(s) affected)
SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 207 ms.
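As a quick sanity check of the earlier 6-digit/4-decimal question against this TVF (a sketch, assuming the function above has been created; @precision is the total output width, 6 + 4 = 10):
-- Sketch: J-F's 9(6)V9(4) case through the inline TVF above.
SELECT CobolData
FROM dbo.fnGICSPFParseNumericForCobol(100.35, 10, 4) ;   -- expected '0001003500'
One caveat worth noting: the inner CAST ... AS INT tops out at 2,147,483,647, so a shifted value above that would overflow; casting to BIGINT instead would widen the range.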
January 6, 2010 at 1:47 pm
Lynn, would schema binding that function make any difference to the performance? Could it be modified to a scalar UDF and would that make it perform any differently?
Just questions - I guess I could test it myself 😉
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
January 6, 2010 at 1:53 pm
Not really sure. I just sort of stumbled onto this method of converting data. We used a scalar UDF to convert dates to a specified format for an export. I found out that turning it into an inline TVF actually sped the export up significantly. At this time, I really can't explain why.
January 6, 2010 at 1:56 pm
To continue, I did write a blog post about it: Comparing Hardcoded functions, In-Line TVF's, and Scalar Functions
January 6, 2010 at 2:01 pm
Just finished a quick test - converting to a scalar UDF took the following on my system:
SQL Server Execution Times:
CPU time = 984 ms, elapsed time = 1044 ms.
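For reference, the scalar conversion being timed presumably looked something like this (a reconstruction using Lynn's expression, with a hypothetical name; not the actual code tested):
-- Hypothetical reconstruction of the scalar UDF used for the timing above.
CREATE FUNCTION dbo.fnGICSPFParseNumericForCobol_Scalar
(
    @InputNumber NUMERIC(38, 15) ,
    @precision INT ,
    @Scale INT
)
RETURNS VARCHAR(64)
AS
BEGIN
    -- Same shift-and-pad expression as the inline TVF, wrapped in a scalar UDF;
    -- the per-row function invocation overhead accounts for the slower timings.
    RETURN RIGHT(REPLICATE('0', @precision)
                 + CAST(CAST(@InputNumber * POWER(10, @Scale) AS INT) AS VARCHAR(64)),
                 @precision) ;
END
GO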
So, using Lynn's approach is going to be the best - I think.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
January 6, 2010 at 2:41 pm
Lynn Pettis (1/6/2010)
EDIT: HOLD ON, I think I have a problem!!! 2nd EDIT: Actually I don't, it was the data, it was getting truncated.
The test data is all NUMERIC(10,2), so when I rechecked my results the values didn't match. When I changed the test query to use 10, 2 instead of 8, 2 and 6, 2, the values looked good.
Since it appears you are using SQL Server 2005, try this:
CREATE FUNCTION dbo.fnGICSPFParseNumericForCobol
(
@InputNumber NUMERIC(38, 15) ,
@precision INT ,
@Scale INT
)
RETURNS TABLE
AS RETURN (
    SELECT RIGHT(REPLICATE('0', @precision)
                 + CAST(CAST(@InputNumber * POWER(10, @Scale) AS INT) AS VARCHAR(64)),
                 @precision) AS CobolData
)
GO
SET STATISTICS TIME ON ;
--=== Try the poor performing function, on 10K Rows
SELECT
tf.Val1,
parse1.CobolData AS ComputedVal1,
tf.Val2,
parse2.CobolData AS ComputedVal2
FROM
dbo.TestFormat tf
cross apply dbo.fnGICSPFParseNumericForCobol(tf.Val1, 8, 2) parse1
cross apply dbo.fnGICSPFParseNumericForCobol(tf.Val2, 6, 2) parse2
SET STATISTICS TIME OFF ;
On my server here at work against 10,000 rows:
(10000 row(s) affected)
SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 207 ms.
Lynn, this is more than what I was looking for. I'm really surprised you've come up with a solution that works with the parametrised leading/trailing zeros. The trailing zeros were the part I had such a hard time figuring out!
I will test your solution a bit more, but this looks exactly like what I need!
Thanks a lot, this is really nice.
I guess that's what happens when you provide sample data, 😉
Cheers,
J-F
January 6, 2010 at 2:50 pm
At a previous employer, the retail system was COBOL. When we started using SQL Server for reporting and other activities, we found it easier to complete updates on the SQL side and then simply push them to the application's ISAM databases through simple COBOL update programs, instead of writing the updates themselves in COBOL. The latter would take 2 weeks to complete a price change (for instance), while using SQL we could do it in a couple of hours.
January 6, 2010 at 11:25 pm
Ya beat me to it on the POWER thing. Nicely done, Lynn.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 6, 2010 at 11:32 pm
Lynn Pettis (1/6/2010)
To continue, I did write a blog about it Comparing Hardcoded functions, In-Line TVF's, and Scalar Functions
Nice... I'm curious... why blog these instead of submitting them as articles? Articles don't have to be long to be effective. This would have been a perfect article.
--Jeff Moden
Change is inevitable... Change for the better is not.