How to avoid first part of the string and consider the rest of it?

Question

Post reply

How to avoid first part of the string and consider the rest of it?

malavika.ramanathan

SSCrazy

Points: 2790
More actions
October 19, 2010 at 1:40 am

#388039

Hi All,
I have a table with lakhs of records which can probably be grouped into some 25-30 groups based on the names of the objects.
My objects column will have a fairly lengthy name with first 20 characters of it always being constant.
Ex: "Microsoft SQL Server" has 20 characters totally. Now my data would be like:
"Microsoft SQL Server 2005"
"Microsoft SQL Server 2008"
"Microsoft SQL Server 2008R2"
I am trying to make groups so that my data will look like:
2005 2008 2008R2
x y z
for this, since i know first 20 char are constant always, can i start with 21st char and take it till end?
(after 21st letter the length varies.)
In that case how can i consider the end point?
Please suggest me something.
Thanks in advance.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 1

You can do something like:

SELECT RTRIM ( LTRIM ( REPLACE (ColA, 'Microsoft SQL Server','') ) ) TrimmedString

FROM

(

SELECT 'Microsoft SQL Server 2005' UNION ALL

SELECT 'Microsoft SQL Server 2008' UNION ALL

SELECT 'Microsoft SQL Server 2008R2'

) Tab (ColA)

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 2

If u insist in SUBSTRING funtion, then take a look at this:

SELECT

RTRIM ( LTRIM ( REPLACE (ColA, 'Microsoft SQL Server','') ) ) TrimmedString_ReplaceFn,

RTRIM ( LTRIM ( SUBSTRING( ColA , LEN ('Microsoft SQL Server') + 1 , (LEN(ColA) - LEN('Microsoft SQL Server')) ) )) TrimmedString_SUBFn

FROM

(

SELECT 'Microsoft SQL Server 2005' UNION ALL

SELECT 'Microsoft SQL Server 2008' UNION ALL

SELECT 'Microsoft SQL Server 2008R2'

) Tab (ColA)

handkot SSCertifiable Points: 5250 More actions · Answer 3

SUBSTRING (Transact-SQL)

Returns part of a character, binary, text, or image expression

SUBSTRING (value_expression ,start_expression ,length_expression )

length_expression

Is a positive integer or bigint expression that specifies how many characters of the value_expression will be returned. If length_expression is negative, an error is generated and the statement is terminated. If the sum of start_expression and length_expression is greater than the number of characters in value_expression, the whole value expression beginning at start_expression is returned.

I Have Nine Lives You Have One Only
THINK!

malavika.ramanathan SSCrazy Points: 2790 More actions · Answer 4

Hi All,

Thanks for the reply.

What am i trying to do here is to make groups based on names. Currently it is done using case like:

case when name like 'Microsoft SQL Server 2005' then 2005

when name like 'Microsoft SQL Server 2008' then 2008

when name like 'Microsoft SQL Server 2008R2' then 200R2

like this i have more than 30 groups. Now the query is taking a long time. So i thought of tuning this query.

To start, i thought of avoiding the repetition of the first 20 characters.

Is my approach correct? from query tuning point of view, is there anything i can do?

prajey SSC Enthusiast Points: 172 More actions · Answer 5

Hi,

for lakhs of records, sql server is well scaled to perform, but given the problem, it is very open that using such a case statement is bad enough to hit the app's performance. If there is probably another column which can segregate the data, numerically, then it would be of great use.

From the tuning point, you can try to get in the RANK() function and use it to group the data.

Regards,

Prajey

grahamc SSCertifiable Points: 6762 More actions · Answer 6

Is it worthwhile creating a new column that removes the repeated portion of data?

Is it even worthwhile storing the repeated data if its the same for every row?

MysteryJimbo SSC-Insane Points: 24203 More actions · Answer 7

grahamc (10/19/2010)
Is it worthwhile creating a new column that removes the repeated portion of data?
Is it even worthwhile storing the repeated data if its the same for every row?

Exactly what I was thinking. With 25-30 repeating values these could be moved into another table and the repeating column changed to a foreign key reference.

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 8

Malavika, i suggested a solution using REPLACE... that will remove the repeated data leaving only the data you require.. if there are multiple values to be replace, you can use CROSS APPLY, REPLACE, GROUP data....

malavika.ramanathan SSCrazy Points: 2790 More actions · Answer 9

Thanks for all your replies.

@Cold coffee,

i am thinking of moving these groups into a temp table. how about this?

if this is not wrong , then how to proceed on this?

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 10

I did not quite get your full requirement, 'cuz of the fact that, you did not

1. Provide sample data to work with

2. Give a clear visual example of your desired result

Assuming from your initial post, i am posting a code piece that may interest you. I am not sure how this will perform on your multi-lakh table.

; WITH CTE0 (ColA) AS

(

SELECT 'Microsoft SQL Server 2005'

UNION ALL SELECT 'Microsoft SQL Server 2008'

UNION ALL SELECT 'Microsoft SQL Server 2008R2'

UNION ALL SELECT 'Microsoft Visual Studio 2005'

UNION ALL SELECT 'Microsoft Visual Studio 2008'

UNION ALL SELECT 'Microsoft Visual Studio 2010'

UNION ALL SELECT 'Windows 95'

UNION ALL SELECT 'Windows 98'

UNION ALL SELECT 'Windows XP'

UNION ALL SELECT 'Windows Vista'

UNION ALL SELECT 'Windows 7'

UNION ALL SELECT 'Windows 7 Home Premium'

),

CTE1 (TrimmedString) AS

(

SELECT

--Replace the pattern from CROSS APPLY with a blank space and RTRIM and LTRIM it

--

RTRIM ( LTRIM ( REPLACE (ColA, Vals,'') ) ) TrimmedString

FROM

CTE0

CROSS APPLY -- here is where we input the patterns we need to remove

-- add them as a serious of SELECT clauses

(

SELECT 'Microsoft SQL Server'

UNION ALL SELECT 'Microsoft Visual Studio'

UNION ALL SELECT 'Windows'

) Trimmer(Vals)

WHERE CHARINDEX(Vals ,ColA ) > 0 -- This will ensure that only those rows that has a match is processed

)

SELECT

TrimmedString , COUNT(*) CNT

-- INTO TempTable - If you need a persisted table for the grouped count, you can uncomment this

FROM

CTE1

GROUP BY

TrimmedString