Substring Function in Where Clause

Question

Substring Function in Where Clause

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 1

eseosaoregie (4/22/2010)
How did you replace my indexes with a clustered index? Were they added to the apllication and user table?

DROP INDEX [IX_USER] ON [dbo].[dimUser]

GO

DROP INDEX [IX_Application] ON [dbo].[dimApplication]

GO

DROP INDEX [IX_FirstFour] ON stg_LogFiles

--

CREATE CLUSTERED INDEX [IX_USER] ON [dbo].[dimUser]([UserName]) ON [PRIMARY]

GO

CREATE CLUSTERED INDEX [IX_Application] ON [dbo].[dimApplication]([Application]) ON [PRIMARY]

GO

CREATE CLUSTERED INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour]) ON [PRIMARY]

--

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

eseosaoregie SSCertifiable Points: 6889 More actions · Answer 2

Why does the clustered index improve performance. More specifically, what is the clustered index doing that is different? I will try this once I get into work.

Also Jeff Moden earlier in the post suggested using a split function. I am just interested in how that might improve perfromance. Does it prevent the optimizer from perfoming a table scan?

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 3

eseosaoregie (4/22/2010)
Why does the clustered index improve performance. More specifically, what is the clustered index doing that is different? I will try this once I get into work.

Create clus index

then

SET STATISTICS TIME ON

your query

SET STATISTICS TIME OFF

see if you can find some improvement.

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 4

eseosaoregie (4/21/2010)
I have now added sample data along with all the code for tables and indexes etc. Hopefully this will make things clearer
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) NULL ,
[Importeddate] [datetime] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[dimApplication] (
[Application] [varchar] (5) NULL ,
[ApplicationID] [int] IDENTITY (1, 1) NOT NULL ,
[Database] [varchar] (6) NULL ,
[DatabaseID] [int] NULL
) ON [PRIMARY]
--user table
CREATE TABLE [dbo].[dimUser] (
[UserID] [int] IDENTITY (1, 1) NOT NULL ,
[UserName] [nvarchar] (10) NULL ,
[Country] [nvarchar] (20) NULL ,
[CountryID] [int] NULL
) ON [PRIMARY]
GO
--index for user table
CREATE INDEX [IX_USER] ON [dbo].[dimUser]([UserName]) ON [PRIMARY]
GO
--index for application table
CREATE INDEX [IX_Application] ON [dbo].[dimApplication]([Application]) ON [PRIMARY]
GO
--index for computed column on stg_LogFiles table
CREATE INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
ON [PRIMARY]
--data for application table
INSERT INTO [dimApplication]
SELECT 'GMR08','1','OLTCHK','1' UNION ALL
SELECT 'GMR09','2','OLTCHK','1' UNION ALL
SELECT 'GMR10','3','OLTCHK','1'
--data for user table
INSERT INTO [dimUser]
SELECT '1','KRRHEJ','Venezuela','43' UNION ALL
SELECT '2','SGTAYL','Venezuela','43' UNION ALL
SELECT '3','PHGANI','Venezuela','43' UNION ALL
SELECT '4','ROBAIR','Venezuela','43' UNION ALL
SELECT '5','silven','Venezuela','43' UNION ALL
SELECT '6','COAGUJ','Venezuela','43' UNION ALL
SELECT '7','USFITT','Venezuela','43' UNION ALL
SELECT '8','admin','Netherlands','21'
--data for stg_LogFiles table
INSERT INTO [dbo].[stg_LogFiles] (LineRead)
SELECT '[Mon Jan 05 13:51:05 2009]Local/GMR09///Info(1013205)' UNION ALL
SELECT '[Mon Jan 05 13:51:18 2009]Local/GMR09///Info(1013210)' UNION ALL
SELECT '[Mon Jan 05 13:51:30 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:52:59 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:53:19 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:53:28 2009]Local/GMR09///Info(1013210)' UNION ALL
SELECT '[Mon Jan 05 13:53:40 2009]Local/GMR09///Info(1013210)'
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)

A fair number of clock cycles have been expended in code trying to determine if characters 2 through 4 are a 3 letter character representation of a week day so let me ask this... are there any rows in the stg_LogFiles table where those characters DON'T actually contain the 3 letter character representation of a week day at characters 2 through 4??? If so, would you include those in the test data, as well, please? And, yes... it's very important so I can set up a large amount of test data to find the most efficient method for the split you're trying to do.

I also notice that all of the "rows of interest" that need to be split seem to have precisely 4 slashes in them. Will that ALWAYS be the case for these "rows of interest"?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 5

bc_ (4/21/2010)
try:
comp_column IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')

Why? comp_column only contains 3 characters according to the OP's posted code.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 6

Ah... almost forgot... I also need to know how many rows are in the staging table that you've been testing with and how long the split runs have been.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

eseosaoregie SSCertifiable Points: 6889 More actions · Answer 7

There are rows which do not contain the 3 letter character representation of a week day at characters 2 through 4. They are:

'[Spreadsheet Extractor Big Block Allocs -- Dyn.Calc.Cache : [15] non-Dyn.Calc.Cache : [0]]'

'[Clear Active on User [NLMAJS] Instance [1]'

'[The sheet contains an unknown member: IS: charged by R-share - Manufacturing.]'

'[The sheet contains an unknown member: A7057120.]'

I have been testing with rought 17million rows. The query is in a stored procedure which takes bout 7 1/2mins

bc_ Hall of Fame Points: 3168 More actions · Answer 8

Jeff Moden (4/22/2010)
bc_ (4/21/2010)
try:
comp_column IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
Why? comp_column only contains 3 characters according to the OP's posted code.

eseosaoregie (4/21/2010)
I created the computed column and the index on that column. The computed column is as follows
--Add Computed Column
ALTER TABLE MyTable
ADD comp_column AS substring(LineRead,2,4) varchar
--Create Column
CREATE INDEX IX_COMPColumn ON MyTable(comp_column)
However when I run the query it takes the same time. How can I alter the where clause syntax such that it can utilize the index on the computed column?

I take that to be 4 characters in length.

The original post had spaces after each day and he labeled the column as "Firstfour" later on as well. Not sure what other data is present in the column, but possibly to prevent picking up other strings like 'sunny','money','thus', etc.

[font="Arial Narrow"]bc[/font]

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 9

eseosaoregie (4/22/2010)
There are rows which do not contain the 3 letter character representation of a week day at characters 2 through 4. They are:
'[Spreadsheet Extractor Big Block Allocs -- Dyn.Calc.Cache : [15] non-Dyn.Calc.Cache : [0]]'
'[Clear Active on User [NLMAJS] Instance [1]'
'[The sheet contains an unknown member: IS: charged by R-share - Manufacturing.]'
'[The sheet contains an unknown member: A7057120.]'
I have been testing with rought 17million rows. The query is in a stored procedure which takes bout 7 1/2mins

Thanks... I'm at work so I can't get to this until tonight (ie: after work). I could do it on a break but I need to write code to gen a couple of million rows and I don't believe they'd appreciate me hammering even on the test box. 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

eseosaoregie SSCertifiable Points: 6889 More actions · Answer 10

eseosaoregie

SSCertifiable

Points: 6889

April 22, 2010 at 10:10 am

#1156269

Thanks. I will wait till later.

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 11

{edit} Just noticed a boo-boo... had to take the code I posted down. I'll be back soon.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 12

Let's try this again...

Here's the test harness I used... make sure you read the comments because it builds 17 million test rows in something less than 10 minutes...

-- WARNING... USE THIS LINE AT YOUR OWN RISK, drop table [stg_LogFiles],[fct_Logs],[dimApplication],[dimUser]

---tables (Time to build test table on my machine... 00:09:35)

-- Note that this test setup build 17 million rows.

go

--===== I changed the calculated column with a little "Modecan Tweekin" here.

-- As a side bar, it also calculates the day of the week in case you need it.

-- second part of the query makes the DoW negative if any information is skipped

-- according to having two slashes together. And, it's NASTY FAST.

CREATE TABLE dbo.stg_LogFiles

(

LineRead varchar (100) NULL ,

Importeddate datetime DEFAULT (getdate()),

DoW AS ((CHARINDEX(LEFT(LineRead,5),'[Mon [Tue [Wed [Thu [Fri [Sat [Sun ')-1)/5+1)

* SIGN(CHARINDEX('/',LineRead)-CHARINDEX('//',LineRead)) PERSISTED

)

GO

CREATE TABLE [fct_Logs] (

[LogDate] [datetime] NULL ,

[Application] [int] NULL ,

[Database] [int] NULL ,

[User] [int] NULL ,

[Retrieval] [int] NULL

) ON [PRIMARY]

GO

CREATE TABLE [dbo].[dimApplication] (

[Application] [varchar] (5) NULL ,

[ApplicationID] [int] NOT NULL PRIMARY KEY CLUSTERED, --I added this PK, took IDENTITY off because of the inserts

[Database] [varchar] (6) NULL ,

[DatabaseID] [int] NULL

) ON [PRIMARY]

--user table

CREATE TABLE [dbo].[dimUser] (

[UserID] [int] NOT NULL PRIMARY KEY CLUSTERED, --I added this PK, took IDENTITY off because of the inserts

[UserName] [nvarchar] (10) NULL ,

[Country] [nvarchar] (20) NULL ,

[CountryID] [int] NULL

) ON [PRIMARY]

GO

--index for user table

CREATE INDEX [IX_USER] ON [dbo].[dimUser]([UserName]) ON [PRIMARY]

GO

--index for application table

CREATE INDEX [IX_Application] ON [dbo].[dimApplication]([Application]) ON [PRIMARY]

GO

--index for computed column on stg_LogFiles table

--===== Note that we don't need this any more.

--CREATE INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])

--ON [PRIMARY]

--data for application table

INSERT INTO [dimApplication]

SELECT 'GMR08','1','OLTCHK','1' UNION ALL

SELECT 'GMR09','2','OLTCHK','1' UNION ALL

SELECT 'GMR10','3','OLTCHK','1'

--data for user table

INSERT INTO [dimUser]

SELECT '1','KRRHEJ','Venezuela','43' UNION ALL

SELECT '2','SGTAYL','Venezuela','43' UNION ALL

SELECT '3','PHGANI','Venezuela','43' UNION ALL

SELECT '4','ROBAIR','Venezuela','43' UNION ALL

SELECT '5','silven','Venezuela','43' UNION ALL

SELECT '6','COAGUJ','Venezuela','43' UNION ALL

SELECT '7','USFITT','Venezuela','43' UNION ALL

SELECT '8','admin','Netherlands','21'

--data for stg_LogFiles table

--===== I modified this a bit so I could build a 17 million rows in under 10 minutes.

-- I also included the "odd" lines that have no DoW at characters 2-4.

INSERT INTO dbo.stg_LogFiles (LineRead)

SELECT TOP (17000000) d.LineRead

FROM (

SELECT '[Mon Jan 05 13:51:05 2009]Local/GMR09///Info(1013205)' UNION ALL

SELECT '[Mon Jan 05 13:51:18 2009]Local/GMR09///Info(1013210)' UNION ALL

SELECT '[Spreadsheet Extractor Big Block Allocs -- Dyn.Calc.Cache : [15] non-Dyn.Calc.Cache : [0]]' UNION ALL

SELECT '[Mon Jan 05 13:51:30 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL

SELECT '[Clear Active on User [NLMAJS] Instance [1]' UNION ALL

SELECT '[Tue Jan 06 13:52:59 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL

SELECT '[Wed Jan 07 13:53:19 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL

SELECT '[Clear Active on User [NLMAJS] Instance [1]' UNION ALL

SELECT '[Thu Jan 08 13:53:28 2009]Local/GMR09///Info(1013210)' UNION ALL

SELECT '[Sun Jan 11 13:53:40 2009]Local/GMR09///Info(1013210)' UNION ALL

SELECT '[The sheet contains an unknown member: IS: charged by R-share - Manufacturing.]' UNION ALL

SELECT '[The sheet contains an unknown member: A7057120.]'

) d (LineRead),

Master.sys.All_Columns ac1,

Master.sys.All_Columns ac2

GO

... and here's the code to solve the problem. It runs in about 4 minutes on my 8 year old single 1.8Ghz 1GB RAM desktop. It should scream on your server. Again, read the comments... they're important for similar problems in the future... hint... "Divide'n'Conquer" makes code shorter, more readable, more maintainable, and faster... a lot faster. 😉

--the query (This runs in just under 4 minutes on my machine which is an 8 year old desktop with a single 1.8GHz

-- CPU and 1GB of RAM. This should scream on your server. Notice how the "Divide'n'Conquer" method

-- has made this code a lot easier to read and maintain. It's also made it quite a bit faster.

-- Notice that I didn't have to do a bunch of juggling on the date conversion.

-- You've just gotta try everything instead of assuming that something doesn't work.

WITH --===== This is where I made ALL the changes except for the calculated column on the staging table

cteFirstSplit AS

( --=== This does the proper split upto the first slash and carries the rest forward

SELECT CAST(SUBSTRING(LineRead,6,20) AS DATETIME) AS [LogDate],

SUBSTRING(LineRead, CHARINDEX('/',LineRead)+1, 8000) AS [TheRest]

FROM dbo.stg_LogFiles

WHERE DoW > 0 --Also filters out NULLs which can't be compared this way

)

,

cteSecondSplit AS

( --=== This splits out Application and carries the rest forward

SELECT [LogDate],

SUBSTRING(TheRest, 1, CHARINDEX('/',TheRest)-1) AS [Application],

SUBSTRING(TheRest, CHARINDEX('/',TheRest)+1, 8000) AS [TheRest]

FROM cteFirstSplit

)

,

cteThirdSplit AS

( --=== This splits out DataBase and carries the rest forward

SELECT [LogDate], [Application],

SUBSTRING(TheRest, 1, CHARINDEX('/',TheRest)-1) AS [DataBase],

SUBSTRING(TheRest, CHARINDEX('/',TheRest)+1, 8000) AS [TheRest]

FROM cteSecondSplit

)

,

cteFourthSplit AS

( --==== This splits out the User and forgets "TheRest"

SELECT [LogDate], [Application], [DataBase],

SUBSTRING(TheRest, 1, CHARINDEX('/',TheRest)-1) AS [User]

FROM cteThirdSplit

)

INSERT INTO [dbo].[fct_Logs]

([LogDate], [Application], [Database], [User], [Retrieval])

SELECT B.LogDate,

A.ApplicationID,

A.DatabaseID,

U.UserID,

1 AS Retreival

FROM cteFourthSplit AS B

LEFT JOIN dbo.dimApplication A ON A.Application = B.Application --Original Join

LEFT JOIN dbo.dimUser U ON U.UserName = B.[User] --Original Join

As a side bar, notice that I used NO indexes on the staging table. Even a clustered index didn't add much value and it took several minutes to build.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

eseosaoregie SSCertifiable Points: 6889 More actions · Answer 13

Thanks jeff. Will try it this weekend and let you know how it runs. Much appreciated

eseosaoregie SSCertifiable Points: 6889 More actions · Answer 14

Hi Jeff,

Just ran this at work on some of the data and it the first CTE failed because it could convert the string type to datetime.

After running some queries to find the row I found a row containing

'[AAB Locations / Hyperion Enterprise Structure ] is an invalid member name in database [OTLCHK]'

I hadn't come across text this before and I am sure there are probably more amongst the millions of rows. How can I amend the computed column to cater for rows like this?

Jeff Moden SSC Guru Points: 1004362 More actions · Answer 15

eseosaoregie (4/23/2010)
Hi Jeff,
Just ran this at work on some of the data and it the first CTE failed because it could convert the string type to datetime.
After running some queries to find the row I found a row containing
'[AAB Locations / Hyperion Enterprise Structure ] is an invalid member name in database [OTLCHK]'
I hadn't come across text this before and I am sure there are probably more amongst the millions of rows. How can I amend the computed column to cater for rows like this?

The problem is that I tried to get too fancy. We have a couple of changes to make in anticipation of other data "anomolies"...

First, change the calculated column in the staging table to the following (instead of Dow)...

LineTest AS CHARINDEX(LEFT(LineRead,5),'[Mon [Tue [Wed [Thu [Fri [Sat [Sun ')

* SIGN(CHARINDEX('/',LineRead)-CHARINDEX('//',LineRead)) PERSISTED

Second, change the final code to the following (again, a change away from DoW and using LineTest, instead)...

--the query (This runs in just under 4 minutes on my machine which is an 8 year old desktop with a single 1.8GHz

-- CPU and 1GB of RAM. This should scream on your server. Notice how the "Divide'n'Conquer" method

-- has made this code a lot easier to read and maintain. It's also made it quite a bit faster.

-- Notice that I didn't have to do a bunch of juggling on the date conversion.

-- You've just gotta try everything instead of assuming that something doesn't work.

WITH --===== This is where I made ALL the changes except for the calculated column on the staging table

cteFirstSplit AS

( --=== This does the proper split upto the first slash and carries the rest forward

SELECT CAST(SUBSTRING(LineRead,6,20) AS DATETIME) AS [LogDate],

SUBSTRING(LineRead, CHARINDEX('/',LineRead)+1, 8000) AS [TheRest]

FROM dbo.stg_LogFiles

WHERE LineTest > 0 --Also filters out NULLs which can't be compared this way

)

,

cteSecondSplit AS

( --=== This splits out Application and carries the rest forward

SELECT [LogDate],

SUBSTRING(TheRest, 1, CHARINDEX('/',TheRest)-1) AS [Application],

SUBSTRING(TheRest, CHARINDEX('/',TheRest)+1, 8000) AS [TheRest]

FROM cteFirstSplit

)

,

cteThirdSplit AS

( --=== This splits out DataBase and carries the rest forward

SELECT [LogDate], [Application],

SUBSTRING(TheRest, 1, CHARINDEX('/',TheRest)-1) AS [DataBase],

SUBSTRING(TheRest, CHARINDEX('/',TheRest)+1, 8000) AS [TheRest]

FROM cteSecondSplit

)

,

cteFourthSplit AS

( --==== This splits out the User and forgets "TheRest"

SELECT [LogDate], [Application], [DataBase],

SUBSTRING(TheRest, 1, CHARINDEX('/',TheRest)-1) AS [User]

FROM cteThirdSplit

)

INSERT INTO [dbo].[fct_Logs]

([LogDate], [Application], [Database], [User], [Retrieval])

SELECT B.LogDate,

A.ApplicationID,

A.DatabaseID,

U.UserID,

1 AS Retreival

FROM cteFourthSplit AS B

LEFT JOIN dbo.dimApplication A ON A.Application = B.Application --Original Join

LEFT JOIN dbo.dimUser U ON U.UserName = B.[User] --Original Join

Let me know if you have any other problems.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)