September 19, 2012 at 3:41 am
I have a query that I run daily to pull data from a table but I only wish to take out the rows of data with today's date..
so my table is:
CREATE TABLE [dbo].[DATAFLOW_CLOB](
[DC_INDEX_FK] [numeric](15, 0) NOT NULL,
[RAW_CLOB] [text] NULL,
[RAW_BLOB] [text] NULL,
CONSTRAINT [UNQDFCL] UNIQUE NONCLUSTERED
(
[DC_INDEX_FK] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
and sample data is:
DC_INDEX_FK
1
RAW_CLOB
AAA|E0221002|D|20120830080542|EC|UK|BP|01|6591|OPER|
FCH|20120830|20120906|20120830075948|56993185|22|
EAC|C|
OED|33094|AP|2|AP|P|20110712||
ECD|33094|AP|
EDD|20120830|20120830|||||
ECP|22||-9.5|
RAW_BLOB
Null
and my query is:
SELECT *
FROM [GATEKEEPER].[dbo].[DATAFLOW_CLOB]
WHERE RAW_CLOB LIKE 'AAA|E0221002|_|20120830%ecp%'
GO
What I want to do is instead of naming the date in my query '20120830' I wish to just take today's date if this is possible?
This would save me having to change the date every day.
Thanks for any help, it's greatly appreciated.
September 19, 2012 at 3:57 am
I guess the easiest thing to do would be to change your query to this: -
SELECT *
FROM [dbo].[DATAFLOW_CLOB]
WHERE RAW_CLOB LIKE 'AAA|E0221002|_|'+CONVERT(VARCHAR(8),GETDATE(),112)+'%ecp%';
You should look into string splitting, as well.
September 19, 2012 at 3:58 am
SELECT a.* FROM OPENROWSET( BULK 'c:\test\values.txt', FORMATFILE = 'c:\test\values.fmt') AS a;
your looking for something like this , additional details available in BOL
edit :- sorry didnt read the question properly , but why store it as clob when u have the delimiter anyway , isnt it better to load it as columns and have them indexed for your queries
September 19, 2012 at 4:07 am
Jayanth_Kurup (9/19/2012)
SELECT a.* FROM OPENROWSET( BULK 'c:\test\values.txt', FORMATFILE = 'c:\test\values.fmt') AS a;your looking for something like this , additional details available in BOL
edit :- sorry didnt read the question properly , but why store it as clob when u have the delimiter anyway , isnt it better to load it as columns and have them indexed for your queries
We don't have control over it.. this data gets sent to us in this format and goes into the database.. I'm just trying to pull the relevant data out for each day.
September 19, 2012 at 4:08 am
Cadavre (9/19/2012)
I guess the easiest thing to do would be to change your query to this: -
SELECT *
FROM [dbo].[DATAFLOW_CLOB]
WHERE RAW_CLOB LIKE 'AAA|E0221002|_|'+CONVERT(VARCHAR(8),GETDATE(),112)+'%ecp%';
You should look into string splitting, as well.
Thanks Cadavre, I will try this now. I have used string splitting for other data very similar to this and it works well, but it is not required for this data. Thank you anyway 🙂 I'll let you know how I get on.
September 19, 2012 at 4:20 am
Cadavre (9/19/2012)
I guess the easiest thing to do would be to change your query to this: -
SELECT *
FROM [dbo].[DATAFLOW_CLOB]
WHERE RAW_CLOB LIKE 'AAA|E0221002|_|'+CONVERT(VARCHAR(8),GETDATE(),112)+'%ecp%';
You should look into string splitting, as well.
That worked great, thank you. I'm slowly learning..
September 19, 2012 at 4:35 am
sqlrd22 (9/19/2012)
That worked great, thank you. I'm slowly learning..
Glad it worked. Check this link for other formatting options for datetimes. Bear in mind that this is generally slower than a SARGable query.
sqlrd22 (9/19/2012)
We don't have control over it.. this data gets sent to us in this format and goes into the database.. I'm just trying to pull the relevant data out for each day.
Does the data get sent to you in a CSV or other format, which you guys then dump into your database? What I alluded to and "Jayanth_Kurup" suggested is that you should instead split this data into columns while you're sticking it into your database.
I'm assuming that your number of deliminators doesn't change and that each deliminator is a piece of data that you know about. If you split this all out, then you could make your query into something like this: -
SELECT *
FROM yourSplitData
WHERE theDateColumn = DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0);
If you indexed your table, then your query would be much faster than the current "LIKE" string operation that you're doing.
September 19, 2012 at 4:44 am
Cadavre (9/19/2012)
sqlrd22 (9/19/2012)
That worked great, thank you. I'm slowly learning..Glad it worked. Check this link for other formatting options for datetimes. Bear in mind that this is generally slower than a SARGable query.
sqlrd22 (9/19/2012)
We don't have control over it.. this data gets sent to us in this format and goes into the database.. I'm just trying to pull the relevant data out for each day.Does the data get sent to you in a CSV or other format, which you guys then dump into your database? What I alluded to and "Jayanth_Kurup" suggested is that you should instead split this data into columns while you're sticking it into your database.
I'm assuming that your number of deliminators doesn't change and that each deliminator is a piece of data that you know about. If you split this all out, then you could make your query into something like this: -
SELECT *
FROM yourSplitData
WHERE theDateColumn = DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0);
If you indexed your table, then your query would be much faster than the current "LIKE" string operation that you're doing.
We don't dump the data into the database, this is done automatically by the software, we didn't create the table either and don't have the rights to change anything within it as it then wouldn't be the right format when the data is put in there by the sending party.
The no. of deliminators does change in each row, I had just shortened the sample data dramatically for the purposes of posting it on here.
September 19, 2012 at 5:23 am
Cadavre (9/19/2012)
sqlrd22 (9/19/2012)
That worked great, thank you. I'm slowly learning..Glad it worked. Check this link for other formatting options for datetimes. Bear in mind that this is generally slower than a SARGable query.
Sorry to slightly hijack this topic, but I've never understood this, am I missing something?
We're talking about comparing a column to a scalar value of a particular type, why does it matter how you arrive at your scalar value? Assuming it ends up as a type that is either the same, or has a lower data type precedence than the column it's comparing, it's SARGable.
An example below. I'm sure I'm missing something as people say all the time that you should use the dateadd method rather than convert, but I can't seem to get a non-SARGable version as long as you're not wrapping a function around the column, rather than the scalar part (in which case, both methods become unSARGable). If you need to get rid of the time part of a datetime/datetime2 column, converting to DATE is the only one that allows a seek in this case.
CREATE TABLE #TEST (DateColumn DATETIME2(0) NOT NULL PRIMARY KEY CLUSTERED);
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b),
E4(N) AS (SELECT 1 FROM E2 a, E2 b)
--code pinched from Jeff's tally splitter
INSERT INTO #test
( DateColumn )
SELECT CAST(DATEADD(Day,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), GETDATE()-1000) AS DATE) FROM E4
--generate 10000 dates
--various different versions of comparing dates and their result (tested SQL 2008 SP3)
--dateadd
SELECT * FROM #test
WHERE DateColumn=DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)
--seek
--convert to varchar
SELECT * FROM #test
WHERE DateColumn=CONVERT(VARCHAR(8),GETDATE(),112)
--seek (with implicit conversion)
--convert to varchar with explicit convert back to datetime
SELECT * FROM #test
WHERE DateColumn=CONVERT(DATETIME,CONVERT(VARCHAR(8),GETDATE(),112))
--seek
--convert to DATE data type
SELECT * FROM #test
WHERE DateColumn=CONVERT(DATE,GETDATE())
--seek
--variations with functions wrapped around the column rather than the scalar value
SELECT * FROM #test
WHERE CONVERT(VARCHAR(8),DateColumn,112)=CONVERT(VARCHAR(8),GETDATE(),112)
--scan
SELECT * FROM #test
WHERE DATEADD(dd, DATEDIFF(dd, 0, DateColumn), 0)=DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)
--scan
SELECT * FROM #test
WHERE CONVERT(DATE,DateColumn)=CONVERT(DATE,GETDATE())
--seek (within a range)
October 12, 2012 at 2:51 am
I'm sorry Howard, I completely missed this.
You're correct that so long as the convert is wrapped around the scalar value it doesn't make a huge difference. It does make a small difference over larger datasets, but the DATEADD and VARCHAR methods will probably be equivalent. Let's see, shall we? 🙂
SET NOCOUNT ON;
IF object_id('tempdb..#TEST') IS NOT NULL
BEGIN
DROP TABLE #TEST;
END;
CREATE TABLE #TEST (ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
DateColumn DATETIME2(0) NOT NULL);
WITH CTE1(N) AS (SELECT 1 FROM(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))a(N)),
CTE2(N) AS (SELECT 1 FROM CTE1 a, CTE1 b),
CTE3(N) AS (SELECT 1 FROM CTE2 a, CTE2 b),
CTE4(N) AS (SELECT 1 FROM CTE3 a, CTE3 b)
INSERT INTO #TEST
SELECT TOP 1000000
RAND(CHECKSUM(NEWID())) * 30000 /*(Number of days in range)*/ +
CAST('1945' AS DATETIME) /*(Start date, e.g. '1945-01-01 00:00:00*/
FROM CTE4;
DECLARE @HOLDER_VARIABLE /*To take display time out of the equation*/ DATETIME;
PRINT '========== BASELINE ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST;
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
PRINT '========== DATEADD ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST WHERE DateColumn=DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0);
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
PRINT '========== CONVERT VARCHAR ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST WHERE DateColumn=CONVERT(VARCHAR(8),GETDATE(),112);
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
PRINT '========== CONVERT DATE ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST WHERE DateColumn = CONVERT(DATE,GETDATE());
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
PRINT '========== VARIATION OF CONVERT VARCHAR ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST WHERE CONVERT(VARCHAR(8),DateColumn,112)=CONVERT(VARCHAR(8),GETDATE(),112);
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
PRINT '========== VARIATION OF DATEADD ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST WHERE DATEADD(dd, DATEDIFF(dd, 0, DateColumn), 0)=DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0);
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
PRINT '========== VARIATION OF CONVERT DATE ==========';
SET STATISTICS TIME, IO ON;
SELECT @HOLDER_VARIABLE = DateColumn FROM #TEST WHERE CONVERT(DATE,DateColumn)=CONVERT(DATE,GETDATE());
SET STATISTICS TIME, IO OFF;
PRINT REPLICATE('=',80);
========== BASELINE ==========
Table '#TEST'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 187 ms, elapsed time = 196 ms.
================================================================================
========== DATEADD ==========
Table '#TEST'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 62 ms, elapsed time = 73 ms.
================================================================================
========== CONVERT VARCHAR ==========
Table '#TEST'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 75 ms.
================================================================================
========== CONVERT DATE ==========
Table '#TEST'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 63 ms, elapsed time = 58 ms.
================================================================================
========== VARIATION OF CONVERT VARCHAR ==========
Table '#TEST'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 531 ms, elapsed time = 525 ms.
================================================================================
========== VARIATION OF DATEADD ==========
Table '#TEST_______________________________________________________________________________________________________________000000000104'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 219 ms, elapsed time = 220 ms.
================================================================================
========== VARIATION OF CONVERT DATE ==========
Table '#TEST'. Scan count 1, logical reads 2353, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 94 ms, elapsed time = 101 ms.
================================================================================
Note the 20ms that you save when converting your GETDATE to a DATE over both the DATEADD and VARCHAR comparisons.
The reason I stress that converting to VARCHAR for date comparisons is bad is because I don't want new SQL developers to start wrapping their columns convert functions. I should learn to clarify when I'm writing on a forum.
October 12, 2012 at 7:42 am
HowardW (9/19/2012)
We're talking about comparing a column to a scalar value of a particular type, why does it matter how you arrive at your scalar value?[/code]
I absolutely agree... the difference in performance is nearly nil.
Some will say that CONVERT isn't portable and while that may be true, writing truly portable but still high performance or complex code is usually an oxymoron.
I agree with Cadavre, though. It's like practicing the piano. There are times where it does make a difference (admittedly usually with millions of rows where every ns saved per row can be a blessing). If you simply practice the fastest method for doing something such as stripping time all the time, then you won't hit the proverbial wrong notes. The old proverb of "Mind the pennies and the dollars will take care of themselves" frequently applies on such things.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply