February 4, 2009 at 10:04 am
If I run this query
SELECT
sc.SourceContactID
FROM SourceLists_SourceContacts slsc
INNER JOIN [sourceContacts] sc ON sc.SourceContactID = slsc.SourceContactID
LEFT JOIN customers c ON sc.email = c.email
LEFT JOIN vw_UnsubscribedEmails AS ue ON sc.email = ue.email
WHERE slsc.SourceListID = 59
AND ue.email IS NULL -- filter out unsubscribed
it executes in about 5 seconds. However, if I simply add AND c.customerID IS NULL -- filter out customers to the WHERE clause,i.e. this query
SELECT
sc.SourceContactID
FROM SourceLists_SourceContacts slsc
INNER JOIN [sourceContacts] sc ON sc.SourceContactID = slsc.SourceContactID
LEFT JOIN customers c ON sc.email = c.email
LEFT JOIN vw_UnsubscribedEmails AS ue ON sc.email = ue.email
WHERE slsc.SourceListID = 59
AND ue.email IS NULL -- filter out unsubscribed
AND c.customerID IS NULL -- filter out customers
it will run "forever". I will literally run is ManagementStudio for over 10 minutes and I finally have to cancel the query.
customers table is a simple with customerID, email, firstName, etc.
sourceContacts is a simple table with sourceContactID, email, firstName, etc
sourceLists_sourceContacts is just a joining table: sourceListID to sourceContactID
vw_UnsubscribedEmails is
CREATE VIEW [dbo].[vw_UnsubscribedEmails]
AS
SELECT email, dateCreated, source
FROM dbo.unsubscribedEmails
UNION
SELECT email, NULL AS dateCreated, NULL AS source
FROM dbo.customers
WHERE (isOnEmailList = 0 AND isEmailRestricted = 1) OR
(badEmail = 1) OR
(archiveBit = 1)
I simply want to filter out any source contacts where they have a matching email in the customers table. What could be causing the query to run long?
February 4, 2009 at 10:11 am
If use the initial query as a subquery and then filter by customerID IS NULL, i.e.
SELECT qry.sourceContactID
FROM
(
SELECT
sc.SourceContactID
FROM SourceLists_SourceContacts slsc
INNER JOIN [sourceContacts] sc ON sc.SourceContactID = slsc.SourceContactID
LEFT JOIN customers c ON sc.email = c.email
LEFT JOIN vw_UnsubscribedEmails AS ue ON sc.email = ue.email
WHERE slsc.SourceListID = 59
AND ue.email IS NULL -- filter out unsubscribed
) qry
INNER JOIN sourceContacts sc ON qry.sourceContactID = sc.sourceContactID
LEFT JOIN customers c2 ON sc.email = c2.
WHERE c2.[customerID] IS NULL
It runs in about 5 seconds. Not sure why doing it that way is better. Any thoughts?
-Phil
February 4, 2009 at 10:12 am
Hi
By looking at your table structure CustomerID is Primary Key. Not quite sure why do you want to have CustomoerID IS NULL Condition in the query.
Can you also tell us the cardinality between the tables.
Thanks -- Vj
February 4, 2009 at 10:34 am
I want to filter out any sourceContacts that have a matching email in the customer table. Due to the LEFT JOIN, if there is no match by email then customerID (or any of the other customer fields )should be null.
SourceLists and SourceContacts have a many-to-many relationship. Customers and SourceContacts each have email as unique contstraints but have no direct relationship via primary/foreign keys just joining on email. The vw_UnsubscribedEmails only returns unique emails.
-Phil
February 4, 2009 at 10:37 am
Just curious (since we don't have the DDL for the tables, sample data for the tables, or expected results based on the sample data), what happens to your query if you modify the jons to use inner joins instead of outer joins? From what I see reading your code, the WHERE clause is basically doing that any way.
February 4, 2009 at 10:49 am
Lynn Pettis (2/4/2009)
Just curious (since we don't have the DDL for the tables, sample data for the tables, or expected results based on the sample data), what happens to your query if you modify the jons to use inner joins instead of outer joins? From what I see reading your code, the WHERE clause is basically doing that any way.
I can't inner join to customers. sourceContacts is basically a list of email recipients. There might be 100,000 recipients associated with a given sourceListID. Of those 100K, there will be a small percentage of recipients that have the same email in the customers table. I'm doing the left join so I make sure to return all the source contacts and I check there is no match to customers with the customerID null check.
But to answer your question, if I do an inner join it returns quickly (but doesn't give me the result I need).
-Phil
February 4, 2009 at 11:30 am
Then we could use the following: table DDL, sample data (as INSERT statements that can be cut, paste, and executed to load the tables), and the expected results based on the sample data. Please be sure the sample covers all possible data possibilities.
Any questions on this, please read the first article linked below in my signature block regarding asking for assistance.
February 4, 2009 at 4:29 pm
philcruz (2/4/2009)
vw_UnsubscribedEmails isCREATE VIEW [dbo].[vw_UnsubscribedEmails]
AS
SELECT email, dateCreated, source
FROM dbo.unsubscribedEmails
UNION
SELECT email, NULL AS dateCreated, NULL AS source
FROM dbo.customers
WHERE (isOnEmailList = 0 AND isEmailRestricted = 1) OR
(badEmail = 1) OR
(archiveBit = 1)
I'm not sure why you are using vw_UnsubscribedEmails which includes some email values from the customers table and then joining the whole customers to remove the rest. Have you tried this?
SELECT
sc.SourceContactID
FROM SourceLists_SourceContacts slsc
INNER JOIN [sourceContacts] sc ON sc.SourceContactID = slsc.SourceContactID
LEFT JOIN customers c ON sc.email = c.email
LEFT JOIN dbo.UnsubscribedEmails AS ue ON sc.email = ue.email
WHERE slsc.SourceListID = 59
AND ue.email IS NULL -- filter out unsubscribed
AND c.customerID IS NULL -- filter out customers
Derek
February 4, 2009 at 5:16 pm
I agree, union with Customers table in the view looks suspicious.
I tried to create the tables and run the queries to look at the execution plan, but I do not have any problems. May be the problem is in the indexes that you have? Post create scripts for you tables, please. Or the execution plan for the second query, may be.
February 4, 2009 at 6:55 pm
Derek Dongray (2/4/2009)
philcruz (2/4/2009)
vw_UnsubscribedEmails isCREATE VIEW [dbo].[vw_UnsubscribedEmails]
AS
SELECT email, dateCreated, source
FROM dbo.unsubscribedEmails
UNION
SELECT email, NULL AS dateCreated, NULL AS source
FROM dbo.customers
WHERE (isOnEmailList = 0 AND isEmailRestricted = 1) OR
(badEmail = 1) OR
(archiveBit = 1)
I'm not sure why you are using vw_UnsubscribedEmails which includes some email values from the customers table and then joining the whole customers to remove the rest. Have you tried this?
SELECT
sc.SourceContactID
FROM SourceLists_SourceContacts slsc
INNER JOIN [sourceContacts] sc ON sc.SourceContactID = slsc.SourceContactID
LEFT JOIN customers c ON sc.email = c.email
LEFT JOIN dbo.UnsubscribedEmails AS ue ON sc.email = ue.email
WHERE slsc.SourceListID = 59
AND ue.email IS NULL -- filter out unsubscribed
AND c.customerID IS NULL -- filter out customers
Well, the idea for the view is that it abstracts who is unsubscribed. So for other people writing queries that need to filter out unsubscribed emails, they don't have to know about the details of that, they just use the view.
In this particular case, I'm filtering out customers altogether, so doing as you suggest does indeed work much better. So, at this point, I have 2 acceptable solutions: using the subquery or a join not using the view.
It's just not clear to me why I can run the query using the view but adding c.customerID IS NULL makes it run very long. Unfortunately, I can't even get it to complete so that I can look at the execution plan.
I appreciate everyone's input on this. (Sorry I couldn't provide sample data but since it concerns customer/email data I can't share it and can't easily generate it).
-Phil
February 5, 2009 at 3:11 am
philcruz (2/4/2009)
Derek Dongray (2/4/2009)
I appreciate everyone's input on this. (Sorry I couldn't provide sample data but since it concerns customer/email data I can't share it and can't easily generate it).-Phil
I'd still be interested in seeing the full table definitions, indexes and constraints. It's fairly easy to generate some fake email addresses, names, phone numbers (if needed) etc and see if the problem occurs with dummy data as well as the real thing. The only thing then needed would be some statisitcs with rough percentages of each flag value etc.
Regarding the view, I understand why it's there, but in this case I'd use the table if it fixes the problem and put a comment in/near the SELECT indicating why the view wasn't being used.
Derek
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply