November 12, 2022 at 11:00 pm
Hello All,
My first statement works but performs poorly. Subsequent statements generate errors.
Any insight into why the error? Any alternate syntaxes that might work that I can try on my live datasets?
DROP TABLE IF EXISTS #Test
CREATE TABLE #Test (TestData VARCHAR(800),TestDataType VARCHAR(10))
INSERT INTO #Test VALUES ('1','Integer')
INSERT INTO #Test VALUES ('A','Integer')
INSERT INTO #Test VALUES ('1.1','Decimal')
INSERT INTO #Test VALUES ('1A','Decimal')
INSERT INTO #Test VALUES ('2022-01-01','Date')
INSERT INTO #Test VALUES ('1','Date')
--This works but performs poorly with larger data sets
SELECT * FROM #Test
WHERE
(TestDataType = 'Integer' AND TRY_CONVERT(INT,TestData) IS NULL) OR
(TestDataType = 'Decimal' AND TRY_CONVERT(DECIMAL,TestData) IS NULL) OR
(TestDataType = 'Date' AND TRY_CONVERT(DATE,TestData) IS NULL)
--This does not work; I hoped it might perform better
--SELECT * FROM (
SELECT
CASE TestDataType
WHEN 'Integer' THEN TRY_CONVERT(INT,TestData)
WHEN 'Decimal' THEN TRY_CONVERT(DECIMAL,TestData)
WHEN 'Date' THEN TRY_CONVERT(DATE,TestData)
END AS TryConvertedValue,
*
FROM #Test
--) T WHERE TryConvertedValue IS NULL
SELECT
CASE
WHEN TestDataType = 'Integer' THEN TRY_CONVERT(INT,TestData)
WHEN TestDataType = 'Decimal' THEN TRY_CONVERT(DECIMAL,TestData)
WHEN TestDataType = 'Date' THEN TRY_CONVERT(DATE,TestData)
END AS TryConvertedValue,
*
FROM #Test
November 13, 2022 at 12:02 am
The reason why it produces an error is because the first thing to qualify in the CASE expression is an INTEGER and so the resulting column formed by the CASE is assigned the INT datatype. When it gets to doing the proper conversion to the DATE data type, the conversion is successful and a DATE is returned but it cannot put a DATE into an INTEGER column.
The only way you can do exactly what your code's intent is to convert the result to an SQL_VARIANT datatype, and I don't which SQL_VARIANT on anyone, with the possible exception of using it in a "By-Column" Audit Table, which I've done in the past and still hate it. 😀
You've posted code according to WHAT you're trying to do... If you tell us WHY you're trying to do it, perhaps we could suggest a decent or better alternative.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 13, 2022 at 3:45 pm
Hello All,
My first statement works but performs poorly. Subsequent statements generate errors.
Any insight into why the error? Any alternate syntaxes that might work that I can try on my live datasets?
Quick questions:
😎
November 23, 2022 at 3:30 pm
Thank you both for replying. I apologize for my delayed response, I though I would get an email when follow-ups were posted.
I am trying to validate data during an ETL process. The column can contain text, dates or integers, classic EAV form. I do have metadata that defines each column as Integer, Decimal or Date. I am trying to use TRY_CONVERT to identify records to delete. My delete statement would then use the OUTPUT statement to log deleted data so there would be a record of data that was thrown away. End goal is to remove and log data that would break the ETL process.
What are you trying to achieve?-Above.
What is the error?-Operand type clash: decimal is incompatible with date
The structure of the actual dataset?-Very similar to what I posted. the column data is VARCHAR(800)
Sample data that conforms with the actual dataset?-Very similar to what I posted. I haven't seen any bad data yet so I am just trying to be proactive. I can't provide exact DDL and data out of security concerns with my employer. I know this is not ideal but I do attempt to accurately represent the problem.
This article was helpful in getting better performance out of my actual data.
In my record set of 600,000 records adding the computed columns reduced query time from 90 to 3 seconds. Of interest creating an index containing the computed columns did not seem to have any impact, the performance improvement occurs with just the creation of the computed columns.
Again apologize for my late response and my inability to provide actual DDL, data and execution plans. Hopefully my response is interesting or could help someone in the future.
November 23, 2022 at 7:40 pm
Thank you both for replying. I apologize for my delayed response, I though I would get an email when follow-ups were posted.
You thought correctly except that functionality was broken starting on Oct 26th and they just fixed it a few days ago.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 23, 2022 at 7:49 pm
Shifting gears back to the problem, is there a UNIQUE column in the table to identify individual rows with?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 23, 2022 at 10:15 pm
Thanks Jeff
There is not a UNIQUE column in the table to identify individual rows with. Its an EAV slowly changing dimension.
DROP TABLE IF EXISTS #Test
CREATE TABLE #Test
(
Entity VARCHAR(20),
Attribute VARCHAR(800),
AttributeType VARCHAR(10),--This is actually in a seperate reference table keyed off Attribute
Value VARCHAR(800),
StartDate DATE NOT NULL,
EndDate DATE NULL,
)
As an aside I did not get an email with your last post. Last email I get from the forum was back in August and my email has not changed. I am checking 'Notify me of follow-up replies via email'.
November 25, 2022 at 3:34 am
But, you DO have a UNIQUE set of column keys based on Entity, Attribute, and StartDate.
This looks suspiciously like a column based audit table or Slowly Changing Dimension format. Is that what the ultimate use is? Where does all the Decimal, Integer, and VARCHAR stuff come into play here? Would that be the purpose of the AttributeType column?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 25, 2022 at 4:08 pm
Yes a unique index can be created on Entity, Attribute, and StartDate and yes this is a Slowly Changing Dimension format. Refined code and data below.
At this point my performance is acceptable once I created the computed columns. I haven't created the index like in the Ozar article.
So I think my problem is resolved but any additional insight is welcome.
DROP TABLE IF EXISTS #Test
CREATE TABLE #Test
(
Entity VARCHAR(20),
Attribute VARCHAR(800),
AttributeType VARCHAR(10),
Value VARCHAR(800),
StartDate DATE NOT NULL,
EndDate DATE NULL,
--These are the computed columns which improve performance on large data sets
ValueTryConvertDate AS TRY_CONVERT(DATE,Value),
ValueTryConvertDecimal AS TRY_CONVERT(DECIMAL,Value),
ValueTryConvertInteger AS TRY_CONVERT(INT,Value)
)
IF NOT EXISTS(SELECT * FROM sys.indexes WHERE Name = 'UI_Test') CREATE UNIQUE INDEX UI_Test ON dbo.#Test (Entity,Attribute,StartDate)
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('A','AppointmentDate','Date','2023-01-02','2022-11-05',NULL)
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('B','Temperataure','Decimal','100.3','2022-11-04','2022-11-04')
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('B','Temperataure','Decimal','100.1','2022-11-05',NULL)
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('C','Grade','Integer','8','2022-11-04','2022-11-04')
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('C','Grade','Integer','5','2022-11-05','2022-11-08')
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('C','Grade','Integer','9','2022-11-09',NULL)
--These records should be flagged becauae TRY_CONVERT converts to NULL
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('D','AppointmentDate','Date','0000-11-05','2022-11-05',NULL)
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('E','Temperataure','Decimal','A','2022-11-04','2022-11-04')
INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('F','Grade','Integer','9.9','2022-11-10',NULL)
SELECT * FROM #Test
WHERE
(AttributeType = 'Integer' AND TRY_CONVERT(INT,Value) IS NULL) OR
(AttributeType = 'Decimal' AND TRY_CONVERT(DECIMAL,Value) IS NULL) OR
(AttributeType = 'Date' AND TRY_CONVERT(DATE,Value) IS NULL)
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply