April 20, 2010 at 9:58 am
I am currently doing string manipulations with T-sql. In the where statement of my code I have the following substring function:
WHERE
SUBSTRING(LineRead,2,4) IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
This produces the output that I want albeit it takes a while. I have been reading up about using EXISITS instead of IN. I just wanted to know if there is another way of rewriting this where clause.
April 20, 2010 at 10:15 am
eseosaoregie (4/20/2010)
I am currently doing string manipulations with T-sql. In the where statement of my code I have the following substring function:
WHERE
SUBSTRING(LineRead,2,4) IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
This produces the output that I want albeit it takes a while. I have been reading up about using EXISITS instead of IN. I just wanted to know if there is another way of rewriting this where clause.
The reason it takes a while is that it can't use an index if one exists. Can you add a computed column to the table with the substring values and index it?
April 21, 2010 at 12:12 am
If i add the computed colum presumably the index to create is a non clustered index?
April 21, 2010 at 7:47 am
eseosaoregie (4/21/2010)
If i add the computed colum presumably the index to create is a non clustered index?
yes it should be non clus. index
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 21, 2010 at 8:38 am
I created the computed column and the index on that column. The computed column is as follows
--Add Computed Column
ALTER TABLE MyTable
ADD comp_column AS substring(LineRead,2,4) varchar
--Create Column
CREATE INDEX IX_COMPColumn ON MyTable(comp_column)
However when I run the query it takes the same time. How can I alter the where clause syntax such that it can utilize the index on the computed column?
April 21, 2010 at 8:53 am
eseosaoregie (4/21/2010)
I created the computed column and the index on that column. The computed column is as follows
--Add Computed Column
ALTER TABLE MyTable
ADD comp_column AS substring(LineRead,2,4) varchar
--Create Column
CREATE INDEX IX_COMPColumn ON MyTable(comp_column)
However when I run the query it takes the same time. How can I alter the where clause syntax such that it can utilize the index on the computed column?
it should not contain any function like "substring" , it will force the sql optimizer NOT TO USE indexes so here
your query WHERE derived_cloumn like 'text%'
Can you also post exec plan along with table and indexes. ?
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 21, 2010 at 9:04 am
try:
comp_column IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
[font="Arial Narrow"]bc[/font]
April 21, 2010 at 9:34 am
my tables are:
---index
CREATE
INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
WITH
DROP_EXISTING
ON [PRIMARY]
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[Importeddate] [datetime] NULL CONSTRAINT [DF__stg_LogFi__Impor__5441852A] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)
April 21, 2010 at 10:03 am
It looks like it still needs to scan the whole table for this:
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//'
April 21, 2010 at 4:26 pm
eseosaoregie (4/21/2010)
my tables are:
---index
CREATE
INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
WITH
DROP_EXISTING
ON [PRIMARY]
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[Importeddate] [datetime] NULL CONSTRAINT [DF__stg_LogFi__Impor__5441852A] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)
Now, if you'd provide a little test data for that, you'd have an answer by now. See the first link in my signature line below for how to do that properly. Thanks.
As a side bar, it looks like you need a very simple split function to hammer all of this out... again, correctly posted test data would go a long way in helping resolve this in the most effecient manner possible.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 21, 2010 at 11:34 pm
I have now added sample data along with all the code for tables and indexes etc. Hopefully this will make things clearer
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) NULL ,
[Importeddate] [datetime] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[dimApplication] (
[Application] [varchar] (5) NULL ,
[ApplicationID] [int] IDENTITY (1, 1) NOT NULL ,
[Database] [varchar] (6) NULL ,
[DatabaseID] [int] NULL
) ON [PRIMARY]
--user table
CREATE TABLE [dbo].[dimUser] (
[UserID] [int] IDENTITY (1, 1) NOT NULL ,
[UserName] [nvarchar] (10) NULL ,
[Country] [nvarchar] (20) NULL ,
[CountryID] [int] NULL
) ON [PRIMARY]
GO
--index for user table
CREATE INDEX [IX_USER] ON [dbo].[dimUser]([UserName]) ON [PRIMARY]
GO
--index for application table
CREATE INDEX [IX_Application] ON [dbo].[dimApplication]([Application]) ON [PRIMARY]
GO
--index for computed column on stg_LogFiles table
CREATE INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
ON [PRIMARY]
--data for application table
INSERT INTO [dimApplication]
SELECT 'GMR08','1','OLTCHK','1' UNION ALL
SELECT 'GMR09','2','OLTCHK','1' UNION ALL
SELECT 'GMR10','3','OLTCHK','1'
--data for user table
INSERT INTO [dimUser]
SELECT '1','KRRHEJ','Venezuela','43' UNION ALL
SELECT '2','SGTAYL','Venezuela','43' UNION ALL
SELECT '3','PHGANI','Venezuela','43' UNION ALL
SELECT '4','ROBAIR','Venezuela','43' UNION ALL
SELECT '5','silven','Venezuela','43' UNION ALL
SELECT '6','COAGUJ','Venezuela','43' UNION ALL
SELECT '7','USFITT','Venezuela','43' UNION ALL
SELECT '8','admin','Netherlands','21'
--data for stg_LogFiles table
INSERT INTO [dbo].[stg_LogFiles] (LineRead)
SELECT '[Mon Jan 05 13:51:05 2009]Local/GMR09///Info(1013205)' UNION ALL
SELECT '[Mon Jan 05 13:51:18 2009]Local/GMR09///Info(1013210)' UNION ALL
SELECT '[Mon Jan 05 13:51:30 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:52:59 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:53:19 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:53:28 2009]Local/GMR09///Info(1013210)' UNION ALL
SELECT '[Mon Jan 05 13:53:40 2009]Local/GMR09///Info(1013210)'
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)
April 21, 2010 at 11:40 pm
Not all people ( including me ) can understand exec paln iin any format (here xlsm format) . i found it all impossible to read it. so always try to post it in XMl/graphical format.
Moreover, do yoo have indexes oon "application" and "users" column, which are being used in LEFT JOINs in query ??
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 21, 2010 at 11:44 pm
there are indexes on the Application and User Tables. My previous post has all the structures and sampl data which i should have posted initially.
April 22, 2010 at 12:05 am
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--SUBSTRING(LineRead, CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) ,2) <> '//' AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B AND
WHERE
--SUBSTRING(LineRead, CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) ,2) <> '//'
both sql are causing table scan ,
1) i tried it with clustered index replacing your indexes , found some improvement
2) "WHERE SUBSTRING" should also be avoided to force optimzer to select SEEK operation
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 22, 2010 at 12:17 am
How did you replace my indexes with a clustered index? Were they added to the apllication and user table?
Viewing 15 posts - 1 through 15 (of 33 total)
You must be logged in to reply to this topic. Login to reply