TextToDecimal
SUMMARY:
This UDF script takes a text value(nvarchar) and returns a decimal(18,6) number. If the text value can't be interpreted as numeric, the UDF returns NULL.
-----------------------------------------------
USAGE:
SET @MyDecimal = dbo.TextToDecimal('-$123,456.73')
@MyDecimal will now be -123456.730000
SET @MyDecimal = dbo.TextToDecimal('-$123,4560.73') --bad number format
@MyDecimal will now be NULL
------------------------------------------------------
DESCRIPTION:
The ISNUMERIC function incorrectly returns 1 (True) for many non-numeric text values. Even worse, converting to a numeric data type by passing one of these non-numeric values to a CONVERT or CAST function will cause an untrappable error that immediately exits an SP, function, or T-SQL statement batch. Here are a few values that ISNUMERIC thinks are numbers, but will crash with CONVERT or CAST:
--anything with a dollar sign or a a comma in it.--
'-' --single non-numeric characters
'+'
','
'$'
',0' --misplaced commas
'1,'
'1,0'
'1.0,'
',1.0'
'1.0,012,'
'1.0,012,000'
'$1,00'
'$1,00.'
'$1,00.0,'
'10E2' --legacy exp formats
'10D2'
'0000000.,01' --more bizarre forms
'.,'
CREATE FUNCTION dbo.TextToDecimal (@TextNum nvarchar(400) )
--Richard Moldwin 9/16/05
--Replacement for the very buggy ISNUMERIC and CONVERT functions
--ISNUMERIC returns true for many non-numbers with
-- non-CONVERT-able embedded characters like commas, "-", "+", and "$", etc,
-- and also returns true for single characters that are definitely not numbers like: , # $ + -
--CONVERT produces non-trappable errors when "$", commas, leading + sign
-- and related formats are present in number strings.
-- Since the errors are non-trappable (they abort the entire T-SQL statement batch) , we can't work around them with T-SQL code.
--This function fixes all of the above problems, and:
--Accepts a text string of up to 400 Unicode characters.
--Returns a decimal number in 18,6 format
-- (precision = 18, scale = 6) from a input string: (123456789012.123456)
--Returns NULL if the string can't be converted into a decimal format
--The input @TextNum will be:
--Trimmed to remove leading and trailing spaces
--stripped of legal starting characters
--checked for illegal non-numeric characters
--checked for duplicate decimal points
--checked for commas in illegal positions
--stripped of all commas
--checked to ensure that the maximum integer size is small enough to be converted to (18, 6) format
--This function:
--does not convert scientific notation (e.g. 10E4) at this time,
-- because conversion to the resultant decimal could easily
-- produce overflow errors even with the maximum current fixed decimal(38) format.
-- Commas
--Any leading comma (before the first digit) is illegal and produces a non-numeric string and a NULL output
--Commas not obeying the every-third digit rule is illegal, and will produce a non-numeric string and a NULL output
--Commas after a decimal point are illegal and will produce a non-numeric string and a NULL output
--In a string with otherwise valid commas, zeros may be used to pad the leading digit up to 3 digits:
-- 001,100 and 01,100, 000,000,001 and 1,100 are all valid.
-- More than 3 leading zeros when commas are present (e.g. 0001,000 or 0123,000,000) are always illegal,
-- and will produce a non-numeric string and a NULL output
--Leading and trailing spaces are stripped; embedded spaces produce a non-numeric string and a NULL output
--Leading $, -$ or +$ characters are allowed.
--Single leading + or - is allowed.
--Only a single decimal point is allowed.
--To fit Decimal(18, 6) format:
-- No more that 12 digits (18 - 6 = 12) to the left of the decimal are allowed; if there are more than 12, we return NULL
-- if there are more than 6 digits to the right of the decimal point, the extra digits will be rounded and truncated by the CONVERT function.
-- (Decimal(18, 6) may be changed to another convenient format, as desired.)
RETURNS decimal(18, 6) --although we can easily support the maximum decimal(38) size (e.g. (38, 12) ) , this version will support only Decimal(18,6)
BEGIN
DECLARE @Neg AS nvarchar(1) --flag for a negative number
DECLARE @TextLen AS int --length of the current working input text value
--strip leading and trailing spaces
SET @TextNum = LTRIM(RTRIM(@TextNum))
SET @TextLen = LEN(@TextNum)
--abort for NULL text or illegal single characters
IF @TextLen = 0 RETURN NULL
IF @TextLen = 1
IF @TextNum LIKE '[^0-9]' RETURN NULL --Don't allow non-numeric single characters.
--strip off leading + or - or $ or -$ or +$
IF LEFT(@TextNum, 1) = '-'
BEGIN
SET @TextNum = SUBSTRING(@TextNum, 2, @TextLen)
SET @Neg = '-' --flag for a negative number
SET @TextLen = LEN(@TextNum)
END
ELSE
BEGIN
SET @Neg = ''
IF LEFT(@TextNum, 1) = '+'
BEGIN
SET @TextNum = SUBSTRING(@TextNum, 2, @TextLen)
SET @TextLen = LEN(@TextNum)
END
END
IF LEFT(@TextNum, 1) = '$'
BEGIN
SET @TextNum = SUBSTRING(@TextNum, 2, @TextLen)
SET @TextLen = LEN(@TextNum)
END
--Now that we have removed any leading characters: + - $ -$ +$
--we can check for any character that is not a digit or comma, or decimal point
IF PATINDEX('%[^0-9,.]%' ,@TextNum) >0 --non-numeric character (other than decimal point or comma) in any position
BEGIN
RETURN NULL
END
------------Decimal & Comma handling section--------------------------------------------------------------------------------------------------
--Here we see if any commas or decimal points are present or misplaced:
DECLARE @CommaPos AS TinyInt
DECLARE @CommaPosOld AS TinyInt
DECLARE @LastCommaPos AS TinyInt
DECLARE @NoCommaNum nVARCHAR(64) --@num with all commas removed
DECLARE @NumberOfCommas AS int
DECLARE @LastDecimalPos AS TinyInt --position of last decimal point
DECLARE @TextInt AS nvarchar(64) --Integer version of @TextNum
--get position of first comma in string
SET @CommaPos = CHARINDEX(',',@TextNum)
IF @CommaPos = @TextLen RETURN NULL --last character is a comma
IF @CommaPos = 1 RETURN NULL --first character is a comma
--find the position of the LAST decimal point in a string
SET @LastDecimalPos = 1 +@TextLen - CHARINDEX('.', REVERSE(@TextNum))
--Make sure we have no more than 1 decimal point
IF NOT( @LastDecimalPos = @TextLen + 1 --Postition of the right-most decimal
OR @LastDecimalPos = CHARINDEX('.', @TextNum)) --Position of right-most decimal point equals position of left-most decimal point; i.e there is only one decimal point
RETURN NULL
--find the position of the right-most comma, and ensure it occurs to the left of the decimal point
SET @LastCommaPos = 1 +@TextLen- CHARINDEX(',', REVERSE(@TextNum))
IF @LastCommaPos <= @TextLen --we have found a comma
IF @LastCommaPos > @LastDecimalPos RETURN NULL --comma present after decimal, so this is not a proper number
--Get the text to the left of the decimal (the integer part): this is the only place where we will allow commas.
SET @TextInt = SUBSTRING(@TextNum, 1, @LastDecimalPos-1)
IF LEN( REPLACE(@TextInt, ',', '')) >12 RETURN NULL --largest integer part we will handle is 12 digits for decimal (18,6 format)
--This is to prevent CONVERT errors below, when the integer part has more than 12 digits.
--also, digits after the decimal point will be truncated to 6 digits for our preferred decimal (18, 6) format.
--now we find the last comma position in the integer part of the text (BEFORE the last decimal point) : (e.g. produces the value '7' for 1,1bc3,456.001,0)
SET @LastCommaPos = 1 + LEN(@TextInt) - CHARINDEX(',', REVERSE(@TextInt))
DECLARE @IntLen AS int --length of the integer part of the text
DECLARE @Result AS int --search results when looking for valid characters.
SET @IntLen = LEN(@TextInt)
--IF at least one comma is present, make sure commas are in the correct pattern
IF @LastCommaPos < @IntLen + 1
BEGIN --start of long section for handling commas
SELECT @Result=
CASE @IntLen
WHEN 3 THEN
PATINDEX('[0-9][0-9][0-9]', @TextInt) --3 digits, no commas allowed
WHEN 2 THEN
PATINDEX('[0-9][0-9]', @TextInt) --2 digits, no commas allowed
WHEN 1 THEN
PATINDEX('[0-9]', @TextInt) --1 digit, no commas allowed
ELSE
PATINDEX('%,[0-9][0-9][0-9]', @TextInt)
END
if @Result = 0 RETURN NULL
WHILE @IntLen >3 --while a comma is present
--We only allow commas at 3 digit intervals; if comma is out of place, we have an illegal number
BEGIN
SET @TextInt = SUBSTRING(@TextInt, 1, @LastCommaPos-1) --lop off the right side of the integer text at the position of the last comma
SET @IntLen = LEN(@TextInt)
SET @LastCommaPos = 1 +@IntLen - CHARINDEX(',', REVERSE(@TextInt))
SELECT @Result=
CASE @IntLen
WHEN 3 THEN
PATINDEX('[0-9][0-9][0-9]', @TextInt) --3 digits, no commas allowed
WHEN 2 THEN
PATINDEX('[0-9][0-9]', @TextInt) --2 digits, no commas allowed
WHEN 1 THEN
PATINDEX('[0-9]', @TextInt) --1 digit, no commas allowed
ELSE
PATINDEX('%,[0-9][0-9][0-9]', @TextInt) --comma allowed just before the right-most 3 digits
END
IF @Result = 0 RETURN NULL
END
--Now strip out all the commas
SET @TextNum =REPLACE(@TextNum, ',','')
END
-----------End of Decimal & Comma handling section ----------------------------------------------------------------------
--At this point, the input text (@TextNum) has been:
--Trimmed to remove leading and trailing spaces
--stripped of legal starting characters
--checked for illegal non-numeric characters
--checked for duplicate decimal points
--checked for commas in illegal positions, and
--stripped of all commas
--checked to ensure that the maximum integer size is small enough to be converted to (18, 6) format
--If we got to this point, the CONVERT function should never produce errors!
RETURN CONVERT(decimal(18, 6), @Neg + @TextNum)--
END