January 19, 2011 at 3:07 pm
Hello All, I'm running SQL 2008 R2 x64 (10.50.1600)
I have created the following user-defined table type:
CREATE TYPE [dbo].[UserTableType] AS TABLE(
[Name] [varchar](50) NOT NULL,
PRIMARY KEY CLUSTERED
(
[Name] ASC
)WITH (IGNORE_DUP_KEY = OFF)
)
I pass a list of records into the user-defined table type, which I reference in a stored procedure. Here is how it executes, captured from profiling the application:
declare @p4 dbo.UserTableType
insert into @p4 values(N'blahblah')
insert into @p4 values(N'ab13d03a')
insert into @p4 values(N'9d5663fd')
insert into @p4 values(N'e67f13d2')
insert into @p4 values(N'0b129772')
insert into @p4 values(N'9d6d343e')
insert into @p4 values(N'fac588c3')
insert into @p4 values(N'f9741e71')
exec spsMyStoredProcedure @Paramater1=10,@Paramater2 =4,@UserTableType=@p4
When I look at the execution plan, the Clustered Index Scan against the @UserTableType object shows 4,000,000 actual rows; a scalar operator then filters them down to a handful to join to the table.
If I take the exact SELECT code from the procedure and run it standalone, as in the example below, it's fast and the row count from the user table type reflects the number of insert statements I made. Any ideas why this is the case, or next troubleshooting steps to try?
declare @p4 dbo.UserTableType
insert into @p4 values(N'blahblah')
insert into @p4 values(N'ab13d03a')
insert into @p4 values(N'9d5663fd')
insert into @p4 values(N'e67f13d2')
insert into @p4 values(N'0b129772')
insert into @p4 values(N'9d6d343e')
insert into @p4 values(N'fac588c3')
insert into @p4 values(N'f9741e71')
SELECT * FROM Table1 t1
WHERE
EXISTS
(SELECT NULL AS EMPTY FROM Table2 t2 WHERE t2.id = t1.id AND t2.Name IN (SELECT Name FROM @p4))
January 19, 2011 at 5:11 pm
Any chance you could post the sqlplan?
MM
January 20, 2011 at 7:21 am
Really hard to be sure without seeing the execution plan.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 20, 2011 at 8:41 am
I completely understand; I'll see what I can post, and I appreciate your time. If it looks like you need more details in the text output format, let me know. I'll try writing the SQL code differently to see if and when the plan changes. As a side note, the statistics are up to date; I made sure of that before posting anything. I've attached the two execution plan differences.
The difference I see between the two plans is that when I pass a long list (200 rows) into the user table type, SQL appears to use a NESTED LOOPS physical operator whose output is the product of the two joining tables. When I run the same exact code outside the stored procedure, SQL also uses a NESTED LOOPS, but the outer reference is the smaller table and the output is the exact matches.
In the stored procedure:
--Outer Reference (380,349 rows)
--Nested Loops
--Inner Reference (200 rows)
--380,349 executions against the inner referenced table
The output from the operator is 76,069,800 rows, passed to a Compute Scalar operator which is doing an implicit conversion to the UNIQUEIDENTIFIER data type.
The 76 million appears to be the product of the two inputs (380,349 x 200 = 76,069,800). The execution time difference is 1:20 (one minute twenty) in the procedure versus 4 seconds outside it.
January 20, 2011 at 9:44 am
In the slow stored procedure:
Looks like the Query Optimizer estimates that the Index Seek will return 6 rows; however, it actually returns 380,349 and then joins to the 200 rows in the inner referenced table. I'm still not sure why that results in millions, because I know there are no matches and the output of the Nested Loops is 0 rows. It's making a decision with bad info somehow. I'll have to examine exactly what statistics are on these columns, and why the operator is a LEFT SEMI JOIN rather than an INNER JOIN since my predicate says WHERE t1.id = t2.id. I guess the optimizer thinks it needs the entire result set to join to other outputs later...
I also tried changing the WHERE EXISTS correlated subquery to two INNER JOINs, and that resulted in a better execution plan; not perfect, but better. The estimated row counts are still off, but much closer, hence the better choice.
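The rewrite looked roughly like this (using the simplified table names from my first post); note it's only equivalent to the EXISTS when id and Name are unique on the inner tables:
SELECT t1.*
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t2.id = t1.id
INNER JOIN @p4 AS p ON p.Name = t2.Name;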
January 20, 2011 at 10:34 am
Table variables like you're using show as having a single row because they don't have statistics. This can lead to seriously problematic execution plans, like you're seeing. If you have to do joins against this data, you'd be better off putting it into a temporary table, which will have statistics so the execution plans will be more accurate.
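Something along these lines inside the procedure, based on the type definition you posted:
-- Copy the table-valued parameter into a temp table so the optimizer
-- has statistics and an accurate row count to work with:
CREATE TABLE #Names
(
[Name] varchar(50) NOT NULL PRIMARY KEY CLUSTERED
);

INSERT INTO #Names ([Name])
SELECT [Name]
FROM @UserTableType;

-- ...then reference #Names in the joins/EXISTS instead of @UserTableType.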
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 20, 2011 at 11:01 am
I agree, so I tried creating a temp table with a clustered index inside the stored procedure, inserting all the records from the user-defined table type into it, and doing the join. Sadly, same execution plan...
I know there isn't a lot you can do without seeing the db design and all the code associated with it. I've been peeling back each part of the query and looking at the execution plan, and the steps do not reflect the accurate data counts. The estimates are way off from the actual row counts even if I remove the temp table or table variable from the code. Obviously it runs much faster without that, but it's still not right. I know this is usually bad or out-of-date statistics. I've looked, and all the predicates and columns referenced have statistics that are up to date as of today... For example, this still results in an estimate of 1 row but produces 75,000 for the DataLabel table in this query:
SELECT *
FROM [dbo].vwMyView as vw
WHERE
vw.TypeId = @TypeId AND
vw.MyName LIKE @MyName AND
vw.DataId =
(
SELECT TOP 1 (DataId)
FROM [dbo].DataLabel
WHERE MyViewId = vw.Id AND DataLabelId = @DataLabelId
ORDER BY DataId desc
)
The view is really just the full base table, but it filters out one TypeID (TypeID <> [integervaluehere]).
The full query filters out permissions with this AND EXISTS clause:
) AND
EXISTS
(
SELECT NULL AS EMPTY
FROM [dbo].Permissions
WHERE MyViewId = vw.Id AND Id IN
(SELECT Id FROM @MyUserTableType)
)
Thoughts?
I've tried the INNER JOIN rather than EXISTS, and it sometimes goes back to the poor execution plan. I've tried a temp table; that's still the same too.
January 20, 2011 at 11:14 am
You know you can post the sqlplan here and protect your security without too much trouble...
Save the execution plan as sqlplan.
Edit the sqlplan in notepad.
Replace ALL (Ctrl-h) myTable with GenericTable
Replace ALL myDatabase with GenericDb
etc
Just replace any sensitive names/data you want to hide with something non-sensitive.
Just make sure to be consistent and careful so that you don't end up with bad replaces.
You can open the sqlplan after editing to check that no sensitive data can be seen.
That's what I do.
MM
January 20, 2011 at 11:14 am
Getting the exact same plan with that much data between a table variable and a temp table indicates something else is going on. Is auto-create stats or auto-update of stats turned off? Also, have you updated the stats with a full scan? Specific calls to UPDATE STATISTICS ... WITH FULLSCAN instead of using sp_updatestats?
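For example, something along these lines (the table name here is just illustrative):
-- Full-scan statistics update on a single table, as opposed to the
-- sampled update that sp_updatestats performs:
UPDATE STATISTICS dbo.Permissions WITH FULLSCAN;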
You're right. Without direct knowledge of what's going on here, the structures, the actual queries, execution plans, I'm just kind of guessing.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 20, 2011 at 12:22 pm
Correct, I updated all statistics with FULL SCAN, and I checked that the statistics show they were last updated today. Auto_Create_Statistics is enabled for the database too.
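For reference, a check like this shows the last-updated dates (table name is just an example):
-- Last-updated date for every statistics object on one table:
SELECT s.name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'dbo.Permissions');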
I've attached the .sqlplan; I've replaced the actual data and other names. The thing to note in this plan is that the insert statement is called 200 times to insert 200 unique records before the select statement; for simplicity I only left 6 in the plan so I didn't have to sanitize 200 records...
To recap what I have tested:
1. I've updated all statistics with FULL SCAN
2. Auto Create Statistics is ENABLED
3. I've changed the @UserTableType in the procedure to insert into a #Temp table used in the EXISTS statement; no change in the query plan.
4. I've changed the entire statement to use all INNER JOINs instead of the AND EXISTS; it runs exactly the same as before.
5. I've tried to confirm that all predicates and column combinations have statistics created and current. I created some extra statistics manually and it didn't change the query plan.
6. After each of these changes I ran DBCC FREEPROCCACHE to get a recompiled version to test against (a narrower alternative is sketched after this list).
7. I removed the SELECT statement from the stored procedure and ran it in SQL Server Management Studio; the query plan is different, it uses @UserTableType as the outer reference against the Permissions table and results in an accurate count.
8. Almost all of these tables have FK relationships between them.
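(As an aside on point 6, the narrower alternative would be to mark just the one procedure for recompilation rather than flushing the entire plan cache; the procedure name is from the profiler capture earlier.)
-- Invalidates only this procedure's cached plan; it recompiles on its next execution:
EXEC sp_recompile N'dbo.spsMyStoredProcedure';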
January 20, 2011 at 12:25 pm
It's uploaded and working now
January 20, 2011 at 12:41 pm
Both files when I try to open them are giving me an error:
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 20, 2011 at 12:44 pm
Yeah, me too... It seems to work if you right-click the file and choose "Save Target As".
January 20, 2011 at 1:12 pm
Seems like the root problem is that it's doing a LEFT SEMI JOIN without any join predicate, right? Each row is output as a match?
January 20, 2011 at 5:22 pm
I have to disagree about the root problem.
There are two problems that I see.
1) The table variable with 200 rows is being SCANNED 390,000 times because of the IN.
Change your code like this:
...
EXISTS
(
SELECT 1
FROM [dbo].Permission AS PERM
WHERE FileId = vw.FileId
AND EXISTS (SELECT 1 FROM @UserTableType AS UT WHERE UT.UniqueidentiferID = PERM.UniqueidentiferID)
)
Hopefully that will cut the 76M rows read from there down to about 390,000.
2) 88% cost: The index seek on [DatabaseName].[dbo].[Property].[IX_Property_TypeID_Name_FileStatus_INCLD] is estimating 24K rows but actually retrieving 74K rows - this in turn causes the optimizer to think it will need to seek on [DatabaseName].[dbo].[Ticket].[IX_Ticket_TicketID_INCLD] only 24K times instead of 74K times.
This is made up of the following:
Problems inside the view vwProperty: there is a predicate performing a NOT EQUAL to 2 on column FileStatus - see if you can make a more positive selection here (an equality or IN list of the valid values, for example); it may help.
The optimiser is converting your @Name variable into a varchar(8000) before performing a LIKE on it - what is the definition of column Name in table Property? What is the definition of @Name? Do you really need a LIKE? Are you pattern matching?
Finally, just for fun, try adding OPTION(RECOMPILE) to the end of the SELECT query to see if the plan changes.
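Applied to the query shape you posted earlier, that would look something like this (names are taken from your post; the rest of the query is elided):
SELECT *
FROM [dbo].vwMyView AS vw
WHERE vw.TypeId = @TypeId
AND vw.MyName LIKE @MyName
AND EXISTS
(
SELECT NULL AS EMPTY
FROM [dbo].Permissions
WHERE MyViewId = vw.Id AND Id IN (SELECT Id FROM @MyUserTableType)
)
OPTION (RECOMPILE); -- recompiled each execution, so the optimizer sees the table variable's actual row count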
MM