Query Help - Grouping same domain in EmailId

  • Hi All,

    I have a more than 3 lacs emailId in my customer table. I want to extract domain name from emailid and wanted to know who are from the same domain. For Exa.

    there are 4 emailId.

    1). xyz@sqlservercentral.com

    2). abc@sqlservercentral.com

    3). kkr@yahoo.com

    4). pqr@gmail.com

    Here 2 emailId are from sqlservercentral.com domain, 1 from yahoo.com and 1 from gmail.com. So i want the output like this.

    Domain Count

    -------------------- -------

    sqlservercentral.com 2

    yahoo.com 1

    gmail.com 1

    ----------------------

    Although my requirement is, i have to ignore those emailIds who are from Yahoo, Gmail, Hotmail, Reddifmail...etc..

    Here is what i have so far..

    Select count(EmailId), Stuff(Ltrim(Rtrim(Emailid)), 1, charindex('@',Ltrim(Rtrim(Emailid))), '') from dbo.Customermaster Where

    Stuff(Ltrim(Rtrim(Emailid)), 1, charindex('@',Ltrim(Rtrim(Emailid))), '') not in ('gmail.com', 'yahoo.com', 'yahoo.co.in', 'msn.com', 'hotmail.com', 'hotmail.co.in', 'rediffmail.com',

    'rediff.com', 'ymail.com', 'indiatimes.com', 'yaghoo.com', 'localmail.com', 'yahoo.co.uk', 'yahoo.in', 'reidffmail.com', 'yahoo.com ', 'rediffmai.com')

    Group by Stuff(Ltrim(Rtrim(Emailid)), 1, charindex('@',Ltrim(Rtrim(Emailid))), '')

    Is there any better way rather than this ????

    need your help

  • i tried the following and found by turning on Client statistics and sql profiler that my version using RIGHT performs consitantly better by a very small margin

    create table #temp(emailid varchar(100))

    insert into #temp select 'mv@yahoo.com'

    insert into #temp select 'mv@yahoo.co.uk'

    insert into #temp select 'mv@yahoo.co.uk'

    insert into #temp select 'mv@yahoo.org'

    insert into #temp select 'mv@yahoo.tv'

    insert into #temp select 'mv@google.com'

    insert into #temp select 'mv@google.com'

    insert into #temp select 'mv@yahoo.com'

    insert into #temp select 'mv@yahoo.com'

    select RIGHT(ltrim(rtrim(Emailid)),LEN(Ltrim(Rtrim(Emailid)))-charindex('@',Ltrim(Rtrim(Emailid))))

    ,count(EmailId)

    from #temp

    group by

    RIGHT(ltrim(rtrim(Emailid)),LEN(Ltrim(Rtrim(Emailid)))-charindex('@',Ltrim(Rtrim(Emailid))))

    GO

    Select count(EmailId), Stuff(Ltrim(Rtrim(Emailid)), 1, charindex('@',Ltrim(Rtrim(Emailid))), '')

    from #temp

    group by Stuff(Ltrim(Rtrim(Emailid)), 1, charindex('@',Ltrim(Rtrim(Emailid))), '')

    drop table #temp

    i scaled it up to 140k rows and the difference is still marginal

    MVDBA

  • How about SUBSTRING?

    select count(EmailId), SUBSTRING(EmailId, charindex('@', EmailId)+1, LEN(EmailId))

    FROM dbo.Customermaster

    where SUBSTRING(EmailId, charindex('@', EmailId)+1, LEN(EmailId)) NOT IN ('gmail.com', 'yahoo.com', 'yahoo.co.in', 'msn.com', 'hotmail.com', 'hotmail.co.in', 'rediffmail.com',

    'rediff.com', 'ymail.com', 'indiatimes.com', 'yaghoo.com', 'localmail.com', 'yahoo.co.uk', 'yahoo.in', 'reidffmail.com', 'yahoo.com ', 'rediffmai.com')

    group by SUBSTRING(EmailId, charindex('@', EmailId)+1, LEN(EmailId))

    Can't comment on performance, but it's a little more readable!

  • Hi,

    I m grateful to all of u for taking the time to offer help.

    Thanks again to all of u.....

  • Consider adding a computed column to your CustomerMaster table to produce the domain name. If you can't update the structure, then you can use a CROSS APPLY to make your code much more legible.

    Select count(EmailId), Domain

    from Customermaster

    CROSS APPLY (

    SELECT SUBSTRING(EmailID, PATINDEX('%@%', EmailID) + 1, 255) AS Domain

    ) AS Domain

    Where Domain not in ('gmail.com', 'yahoo.com', 'yahoo.co.in', 'msn.com', 'hotmail.com', 'hotmail.co.in', 'rediffmail.com',

    'rediff.com', 'ymail.com', 'indiatimes.com', 'yaghoo.com', 'localmail.com', 'yahoo.co.uk', 'yahoo.in', 'reidffmail.com', 'yahoo.com ', 'rediffmai.com')

    Group by Domain

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • computed and PERSISTED would be even better

    MVDBA

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply