June 19, 2015 at 12:02 pm
I have an UPDATE statement that joins two table by SendId. One table, I'll call it T1, has a clustered index on SendId ASC. The other table I will call T2 also has a clustered index on SendID ASC. All the columns from T2 are used to update T1. The execution plan shows a Clustered index scan on T2 and a Clustered Index Seek on T1 going into a Nested Loops inner join. Immediately following is a Distinct Sort that is done on SendId ASC. Why the Distinct SORT if the tables are already ordered by SendID?
June 19, 2015 at 12:35 pm
lmarkum (6/19/2015)
I have an UPDATE statement that joins two table by SendId. One table, I'll call it T1, has a clustered index on SendId ASC. The other table I will call T2 also has a clustered index on SendID ASC. All the columns from T2 are used to update T1. The execution plan shows a Clustered index scan on T2 and a Clustered Index Seek on T1 going into a Nested Loops inner join. Immediately following is a Distinct Sort that is done on SendId ASC. Why the Distinct SORT if the tables are already ordered by SendID?
Without the query at a bare minimum there is nothing we can do help here. It would probably also help to know the table structures and indexes.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 19, 2015 at 12:57 pm
And also post the actual execution plan as a *.sqlplan file.
June 19, 2015 at 2:48 pm
lmarkum (6/19/2015)
I have an UPDATE statement that joins two table by SendId. One table, I'll call it T1, has a clustered index on SendId ASC. The other table I will call T2 also has a clustered index on SendID ASC. All the columns from T2 are used to update T1. The execution plan shows a Clustered index scan on T2 and a Clustered Index Seek on T1 going into a Nested Loops inner join. Immediately following is a Distinct Sort that is done on SendId ASC. Why the Distinct SORT if the tables are already ordered by SendID?
Are the two clustered indexes both unique?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
June 19, 2015 at 3:56 pm
Here is the SQL and the attached sql plan
DECLARE @SendId INT
SET @SendId = 1
UPDATEEmailTrackingSummary
SETEmailName = tets.EmailName,
SentDate = tets.SentDate,
SendDate = tets.SendDate,
NumberSent = tets.NumberSent,
NumberDelivered = tets.NumberDelivered,
Duplicates = tets.Duplicates,
UniqueClicks = tets.UniqueClicks,
UniqueOpens = tets.UniqueOpens,
HardBounces = tets.HardBounces,
Unsubscribes = tets.Unsubscribes,
ExistingUndeliverables = tets.ExistingUndeliverables,
ExistingUnsubscribes = tets.ExistingUnsubscribes,
InvalidAddresses = tets.InvalidAddresses,
NumberExcluded = tets.NumberExcluded,
NumberTargeted = tets.NumberTargeted,
OtherBounces = tets.OtherBounces,
SoftBounces = tets.SoftBounces,
PreviewURL = tets.PreviewURL,
ts = GetDate(),
Reconciled = 0
FROMEmailTrackingSummary ets
INNER JOIN TempEmailTrackingSummary tets ON tets.SendId = ets.SendId
WHERE@SendId IS NULL OR TETS.SendId = @SendId
June 19, 2015 at 4:02 pm
lmarkum (6/19/2015)
Here is the SQL and the attached sql planDECLARE @SendId INT
SET @SendId = 1
UPDATEEmailTrackingSummary
SETEmailName = tets.EmailName,
SentDate = tets.SentDate,
SendDate = tets.SendDate,
NumberSent = tets.NumberSent,
NumberDelivered = tets.NumberDelivered,
Duplicates = tets.Duplicates,
UniqueClicks = tets.UniqueClicks,
UniqueOpens = tets.UniqueOpens,
HardBounces = tets.HardBounces,
Unsubscribes = tets.Unsubscribes,
ExistingUndeliverables = tets.ExistingUndeliverables,
ExistingUnsubscribes = tets.ExistingUnsubscribes,
InvalidAddresses = tets.InvalidAddresses,
NumberExcluded = tets.NumberExcluded,
NumberTargeted = tets.NumberTargeted,
OtherBounces = tets.OtherBounces,
SoftBounces = tets.SoftBounces,
PreviewURL = tets.PreviewURL,
ts = GetDate(),
Reconciled = 0
FROMEmailTrackingSummary ets
INNER JOIN TempEmailTrackingSummary tets ON tets.SendId = ets.SendId
WHERE@SendId IS NULL OR TETS.SendId = @SendId
Can you post the query plan and DDL for EmailTrackingSummary and TempEmailTrackingSummary?
-- Itzik Ben-Gan 2001
June 19, 2015 at 4:58 pm
Answer removed, bad advice.
I think I got to the bottom of this... Give me a few minutes...
Ok I'm back. Again, DDL would really be helpful. Note the link in my signature line (or Sean's or Lynn's) for best practices on getting help/asking questions. What I'm really curious about is if there's a one-to-many relationship between EmailTrackingSummary and TempEmailTrackingSummary. If there is then there is a problem with your logic. Take the query below where I'm using a join to set the val of dbo.T_one/col1 with the value of dbo.T_many/col1.
USE tempdb
GO
IF OBJECT_ID('tempdb.dbo.T_One') IS NOT NULL DROP TABLE dbo.T_One;
IF OBJECT_ID('tempdb.dbo.T_Many') IS NOT NULL DROP TABLE dbo.T_Many;
GO
CREATE TABLE dbo.T_One
(
to_id int identity primary key,
col1 varchar(10)
);
CREATE TABLE dbo.T_Many
(
tm_id int identity primary key,
to_id int,
col1 varchar(10)
);
INSERT INTO dbo.T_One (col1)
VALUES ('xxx'),('xxx'),('xxx'),('xxx'),('xxx'),('xxx'),('xxx');
INSERT INTO dbo.T_Many (to_id, col1)
VALUES (1,'asd'),(1,'fgb'),(2,'ddd'),(3,'ddd'),(3,'ppp'),(3,'ooo'),
(4,'ddd'),(4,'zzz'),(5,'123'),(6,'vvv'),(7,'yyy');
UPDATE dbo.T_One
SET col1 = m.col1
FROM dbo.T_One o
JOIN dbo.T_Many m ON o.to_id = m.to_id;
SELECT * FROM T_One;
Because I'm doing an update based on a one-to-many join the optimizer is selecting the first match between each id and performing the update based on that.
Hopefully this helps you understand the problem with your update statement and why you are getting a DISTINCT sort. See attached query plan.
-- Itzik Ben-Gan 2001
June 21, 2015 at 9:43 pm
Alan,
In your example the SORT is happening right after the table is read and prior to the Nested Loops join, which makes sense given your example. In my situation the TempEmailTrackingSummary table and the EmailTrackingSummary table go into a Nested Loops inner join and then the distinct sort occurs after the tables are joined. The clustered index on TempEmailTrackingSummary is not unique. I will have to learn more about the data and how this table is used in order to know if it can be made a unique index.
Here is the DDL:
/****** Object: Table [dbo].[TempEmailTrackingSummary] Script Date: 6/21/2015 10:32:44 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TempEmailTrackingSummary]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[TempEmailTrackingSummary](
[SendID] [int] NOT NULL,
[EmailName] [varchar](150) NULL,
[SentDate] [datetime] NULL,
[SendDate] [datetime] NULL,
[NumberSent] [int] NULL,
[NumberDelivered] [int] NULL,
[Duplicates] [int] NULL,
[UniqueClicks] [int] NULL,
[UniqueOpens] [int] NULL,
[HardBounces] [int] NULL,
[Unsubscribes] [int] NULL,
[ExistingUndeliverables] [int] NULL,
[ExistingUnsubscribes] [int] NULL,
[InvalidAddresses] [int] NULL,
[NumberExcluded] [int] NULL,
[NumberTargeted] [int] NULL,
[OtherBounces] [int] NULL,
[SoftBounces] [int] NULL,
[PreviewURL] [varchar](150) NULL
) ON [PRIMARY]
END
GO
SET ANSI_PADDING OFF
GO
CREATE CLUSTERED INDEX [ClusteredIndex-SendId] ON [dbo].[TempEmailTrackingSummary]
(
[SendID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[EmailTrackingSummary]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[EmailTrackingSummary](
[SendID] [int] NOT NULL,
[EmailName] [varchar](150) NULL,
[SentDate] [datetime] NULL,
[SendDate] [datetime] NULL,
[NumberSent] [int] NULL,
[NumberDelivered] [int] NULL,
[Duplicates] [int] NULL,
[UniqueClicks] [int] NULL,
[UniqueOpens] [int] NULL,
[HardBounces] [int] NULL,
[Unsubscribes] [int] NULL,
[ExistingUndeliverables] [int] NULL,
[ExistingUnsubscribes] [int] NULL,
[InvalidAddresses] [int] NULL,
[NumberExcluded] [int] NULL,
[NumberTargeted] [int] NULL,
[OtherBounces] [int] NULL,
[SoftBounces] [int] NULL,
[ts] [datetime] NULL,
[PreviewURL] [varchar](150) NULL,
[Reconciled] [bit] NULL DEFAULT ((0)),
[ReconciledCount] [int] NULL DEFAULT ((0)),
CONSTRAINT [PK_EmailTrackingSummary] PRIMARY KEY CLUSTERED
(
[SendID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
END
GO
SET ANSI_PADDING OFF
GO
SET ANSI_PADDING ON
GO
/****** Object: Index [IX_EmailTrackingSummary_EmailName] Script Date: 6/21/2015 10:01:02 PM ******/
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[EmailTrackingSummary]') AND name = N'IX_EmailTrackingSummary_EmailName')
CREATE NONCLUSTERED INDEX [IX_EmailTrackingSummary_EmailName] ON [dbo].[EmailTrackingSummary]
(
[EmailName] ASC
)
INCLUDE ( [NumberSent],
[UniqueClicks],
[UniqueOpens],
[HardBounces],
[Unsubscribes]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
/****** Object: Index [IX_EmailTrackingSummary_Reconciled_ReconciledCount] Script Date: 6/21/2015 10:01:02 PM ******/
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[EmailTrackingSummary]') AND name = N'IX_EmailTrackingSummary_Reconciled_ReconciledCount')
CREATE NONCLUSTERED INDEX [IX_EmailTrackingSummary_Reconciled_ReconciledCount] ON [dbo].[EmailTrackingSummary]
(
[Reconciled] ASC,
[ReconciledCount] ASC
)
INCLUDE ( [SendID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
June 22, 2015 at 7:50 am
lmarkum (6/19/2015)
I have an UPDATE statement that joins two table by SendId. One table, I'll call it T1, has a clustered index on SendId ASC. The other table I will call T2 also has a clustered index on SendID ASC. All the columns from T2 are used to update T1. The execution plan shows a Clustered index scan on T2 and a Clustered Index Seek on T1 going into a Nested Loops inner join. Immediately following is a Distinct Sort that is done on SendId ASC. Why the Distinct SORT if the tables are already ordered by SendID?
I think you've more or less answered your question with some encouragement from other folks. Whether you have or not, here's a demo which clearly shows the point - without a unique index on the source set, SQL Server can't determine if it contains dupes or not, so dedupes with a DISTINCT SORT.
DROP TABLE #a; DROP TABLE #b;
WITH
e1 AS (SELECT d.n FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) d (n)),
e2 AS (SELECT e1.n FROM e1 CROSS JOIN e1 b),
e4 AS (SELECT e2.n FROM e2 CROSS JOIN e2 b),
iT AS (SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM e4 CROSS JOIN e2 CROSS JOIN e1)
SELECT
n,
c1 = NEWID(),
c2 = DATEADD(millisecond,CHECKSUM(NEWID()),GETDATE()),
c3 = ABS(CHECKSUM(NEWID()))
INTO #a
FROM iT
CREATE CLUSTERED INDEX cx_Thing ON #a (n)
SELECT TOP 24 *
INTO #b
FROM #a
CREATE CLUSTERED INDEX cx_NotUnique ON #b (n)
-- Check the execution plan: DISTINCT SORT
UPDATE a SET
c1 = b.c1,
c2 = b.c2,
c3 = b.c3
FROM #a a
INNER JOIN #b b ON b.n = a.n
DROP INDEX cx_NotUnique ON #b
CREATE UNIQUE CLUSTERED INDEX cx_Unique ON #b (n)
-- Check the execution plan: NO SORT
UPDATE a SET
c1 = b.c1,
c2 = b.c2,
c3 = b.c3
FROM #a a
INNER JOIN #b b ON b.n = a.n
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply