February 3, 2014 at 2:03 pm
L' Eomot Inversé (2/3/2014)
Oh, if only the reasonably competent used it it would be great
BWAA-HAAA!!!! Absolutely agreed. I said the very same thing when SQLCLR, the XML datatype, the DATE datatype, and recursive CTEs came out (to name a few). 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
February 3, 2014 at 2:54 pm
rodjkidd (1/29/2014)
Talking of Grant, did anyone see the picture of him on the SQLCruise yesterday- almost as red as the livery of that company he works for!Rodders...
Why yes, yes I did see him. Tho, in person he didn't seem as RED.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 3, 2014 at 3:10 pm
Grant Fritchey (2/3/2014)
wolfkillj (2/3/2014)
Grant Fritchey (2/3/2014)
wolfkillj (1/29/2014)
I created and used my first extended events session last week, and it was great - easy and extremely helpful. I just posted about it on my blog, if you're interested in reading about my experience:WHOOPP!!! Extended Events.
Right back at ya, Grant - the segment on extended events in your PASS Summit 2012 performance tuning pre-con blew me away.
Since I have your ear on this topic for a minute, do you know of any good references that describe the significance of each of the 100+ extended events - things like exactly what happens in the database to trigger each event (that's not always clear from their names, especially to those of us with a more superficial understanding of the inner workings of SQL Server) and what the data collected actually represents? There doesn't seem to be any BOL documentation on this (that I could find, at least). I've found and learned much from resources like Jonathan Kehayias's 31 Days of Extended Events blog series, including his queries to view the metadata for extended events (which itself is fairly cryptic). It would be nice to have single source that explains each extended event in plain terms, though.
If the thing does not yet exist and someone were to start putting one together, I'd be happy to contribute write-ups of the (very few) extended events that I understand well enough to explain to others.
Nope. You've just hit the two best sets of documentation, BOL and Jonathan. He is writing a book (and I'm helping tech edit it), but even that won't have a complete reference for what all of them are. We are stuck with Microsoft for that.
You can pull the explanations from the system tables that define the events. I don't have that precise table in my head, but a little google-fu ought to bring it up.
use this
SELECT event_package=o.package_guid, o.description,
event=c.object_name, channel=v.map_value
FROM sys.dm_xe_objects o
LEFT JOIN sys.dm_xe_object_columns c ON o.name = c.object_name
INNER JOIN sys.dm_xe_map_values v ON c.type_name = v.name
AND c.column_value = cast(v.map_key AS nvarchar)
WHERE object_type='event' AND (c.name = 'channel' OR c.name IS NULL)
FWIW - they are getting more and more info available to you as they incorporate more of what has been available to the OS for years now (via XE etc).
Also of note, in 2008 you have 263 events. In 2012 you have 627 events. And currently in 2014 you will see 744 events. Besides the new events, there are new properties for some old events. Definitely on the cool side for getting a better idea of what is happening internally.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 3, 2014 at 3:37 pm
Grant Fritchey (2/3/2014)
wolfkillj (2/3/2014)
EDIT: Any word on when Jonathan's book will hit the streets (or "drop", as the kids today seem to say)?
Not a clue. I've gone through three chapters. I know he has three more done, but I haven't seen them.
Grant, does this mean that you're a tech reviewer for this book? I'd love to be doing that for this one!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 3, 2014 at 3:43 pm
Jeff Moden (2/3/2014)
Personally and just as a non-confontational point of view, I think GUIDs as a clustered index is a really bad thing to do especially since the advent of SEQUENCE. One other problem that I have with them is that people think they're still "Globally Unique" and they haven't been since MS changed them from Type 1 GUIDs (machine sensitive) to Type 4 GUIDs (not much more than a really big random number). While the chances of duplicating a GUID between even several machines is astonomical, it is a possibility that can happen and a lot of people don't bother with unique constraints on supposed key columns (a basic fault in design knowledge, as well).Are GUIDs always bad, IMHO? Not on your life but I treat them much like Cursors and While loops. They have their uses but I generally avoid them.
On that same subject, I'll add to the list of favorite Kimberly Tripp links already posted above.
http://technet.microsoft.com/en-US/sqlserver/gg508879.aspx
To be sure, a good deal of the problem I have with GUIDs is that I'd rather read/troubleshoot a (say) 12 digit number than a 36 character hex representation. 😛
Jeff,
Do you happen to have any links describing the change from Type1 to Type4? Also, does part of the GUID identify the type of GUID it is? (If so, then this would reduce the number of possibilities that the GUID could produce.)
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 3, 2014 at 3:47 pm
SQLRNNR (2/3/2014)
rodjkidd (1/29/2014)
Talking of Grant, did anyone see the picture of him on the SQLCruise yesterday- almost as red as the livery of that company he works for!Rodders...
Why yes, yes I did see him. Tho, in person he didn't seem as RED.
I'd guess that it was either a reflection of his shirt, or he heard of yet another production DBA without a backup.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 3, 2014 at 3:50 pm
WayneS (2/3/2014)
Grant Fritchey (2/3/2014)
wolfkillj (2/3/2014)
EDIT: Any word on when Jonathan's book will hit the streets (or "drop", as the kids today seem to say)?
Not a clue. I've gone through three chapters. I know he has three more done, but I haven't seen them.
Grant, does this mean that you're a tech reviewer for this book? I'd love to be doing that for this one!
Yeah, me and Hugo Kornelis (sp?).
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 3, 2014 at 3:58 pm
Please, please, please, no piling on, but I'm curious what you guys think about this: Unreasonable Defaults. I think most of the fundamental info is fine. I'm just... concerned?... with the statements around clustered indexes and the recommendations to go with a heap instead. It's basically 180 degrees out from what I recommend. I've been thinking about it, and I'm still comfortable with my recommendations, but it is a consideration. Curious what you guys think.
And again, this isn't an invite to beat on someone's head.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 3, 2014 at 4:00 pm
WayneS (2/3/2014)
SQLRNNR (2/3/2014)
rodjkidd (1/29/2014)
Talking of Grant, did anyone see the picture of him on the SQLCruise yesterday- almost as red as the livery of that company he works for!Rodders...
Why yes, yes I did see him. Tho, in person he didn't seem as RED.
I'd guess that it was either a reflection of his shirt, or he heard of yet another production DBA without a backup.
I think it was actually when his ssd died leading into a demo.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 3, 2014 at 4:11 pm
SQLRNNR (2/3/2014)
WayneS (2/3/2014)
SQLRNNR (2/3/2014)
rodjkidd (1/29/2014)
Talking of Grant, did anyone see the picture of him on the SQLCruise yesterday- almost as red as the livery of that company he works for!Rodders...
Why yes, yes I did see him. Tho, in person he didn't seem as RED.
I'd guess that it was either a reflection of his shirt, or he heard of yet another production DBA without a backup.
I think it was actually when his ssd died leading into a demo.
Yeah, that hurt.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 3, 2014 at 4:12 pm
Grant Fritchey (2/3/2014)
Please, please, please, no piling on, but I'm curious what you guys think about this: Unreasonable Defaults. I think most of the fundamental info is fine. I'm just... concerned?... with the statements around clustered indexes and the recommendations to go with a heap instead. It's basically 180 degrees out from what I recommend. I've been thinking about it, and I'm still comfortable with my recommendations, but it is a consideration. Curious what you guys think.And again, this isn't an invite to beat on someone's head.
Not gonna post on the thread at the link because he seems pretty convinced in his philosophy (e.g. 1 IO for a RID lookup vs. several IOs for a Key lookup - it really depends on the scenario for that one).
I recommend a CI on every table for reasons he doesn't bother to mention. I have seen the following with Heaps (with or without additional indexes).
1. Table bloat - deletes, inserts etc causing the table to bloat and not release unused space. Fixed by adding a CI
2. Forwarding records
3. Queries against a heap that should perform a seek that cause a RID lookup. Place a CI on the table (in the cases I have seen this it wasn't a matter of which columns where in the CI - just that one was present) and reduced the plan to a single branch and a CI seek. Query cost dropped by 80%.
I'm sure others have seen plenty of other issues as well. I don't agree with the mindset of throwing every column in the table into the "include" and leaving the table as a heap either. In fact, I don't agree with throwing every column into the include.
Would I consider leaving a table sans CI? Sure. It would need to be well documented as to the reasons why it was done that way first though (maybe the warehouse table gets completely rebuilt nightly or something like that).
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 3, 2014 at 4:40 pm
SQLRNNR (2/3/2014)
Grant Fritchey (2/3/2014)
Please, please, please, no piling on, but I'm curious what you guys think about this: Unreasonable Defaults. I think most of the fundamental info is fine. I'm just... concerned?... with the statements around clustered indexes and the recommendations to go with a heap instead. It's basically 180 degrees out from what I recommend. I've been thinking about it, and I'm still comfortable with my recommendations, but it is a consideration. Curious what you guys think.And again, this isn't an invite to beat on someone's head.
Not gonna post on the thread at the link because he seems pretty convinced in his philosophy (e.g. 1 IO for a RID lookup vs. several IOs for a Key lookup - it really depends on the scenario for that one).
I recommend a CI on every table for reasons he doesn't bother to mention. I have seen the following with Heaps (with or without additional indexes).
1. Table bloat - deletes, inserts etc causing the table to bloat and not release unused space. Fixed by adding a CI
2. Forwarding records
3. Queries against a heap that should perform a seek that cause a RID lookup. Place a CI on the table (in the cases I have seen this it wasn't a matter of which columns where in the CI - just that one was present) and reduced the plan to a single branch and a CI seek. Query cost dropped by 80%.
I'm sure others have seen plenty of other issues as well. I don't agree with the mindset of throwing every column in the table into the "include" and leaving the table as a heap either. In fact, I don't agree with throwing every column into the include.
Would I consider leaving a table sans CI? Sure. It would need to be well documented as to the reasons why it was done that way first though (maybe the warehouse table gets completely rebuilt nightly or something like that).
Azure SQL requires a CI for each table, no exceptions. As Azure SQL and SQLS share overlapping code bases with lots of code shared, I have a feeling that this may be a requirement in a future version of SQLS, too.
February 3, 2014 at 5:04 pm
Revenant (2/3/2014)
SQLRNNR (2/3/2014)
Grant Fritchey (2/3/2014)
Please, please, please, no piling on, but I'm curious what you guys think about this: Unreasonable Defaults. I think most of the fundamental info is fine. I'm just... concerned?... with the statements around clustered indexes and the recommendations to go with a heap instead. It's basically 180 degrees out from what I recommend. I've been thinking about it, and I'm still comfortable with my recommendations, but it is a consideration. Curious what you guys think.And again, this isn't an invite to beat on someone's head.
Not gonna post on the thread at the link because he seems pretty convinced in his philosophy (e.g. 1 IO for a RID lookup vs. several IOs for a Key lookup - it really depends on the scenario for that one).
I recommend a CI on every table for reasons he doesn't bother to mention. I have seen the following with Heaps (with or without additional indexes).
1. Table bloat - deletes, inserts etc causing the table to bloat and not release unused space. Fixed by adding a CI
2. Forwarding records
3. Queries against a heap that should perform a seek that cause a RID lookup. Place a CI on the table (in the cases I have seen this it wasn't a matter of which columns where in the CI - just that one was present) and reduced the plan to a single branch and a CI seek. Query cost dropped by 80%.
I'm sure others have seen plenty of other issues as well. I don't agree with the mindset of throwing every column in the table into the "include" and leaving the table as a heap either. In fact, I don't agree with throwing every column into the include.
Would I consider leaving a table sans CI? Sure. It would need to be well documented as to the reasons why it was done that way first though (maybe the warehouse table gets completely rebuilt nightly or something like that).
Azure SQL requires a CI for each table, no exceptions. As Azure SQL and SQLS share overlapping code bases with lots of code shared, I have a feeling that this may be a requirement in a future version of SQLS, too.
Maybe it's just me, but in a data warehouse, data is usually inserted mostly by Date and Company, so it isn't a complete mystery what might be included in a CI. Include too many columns, that is bad, usage will go way down.
February 3, 2014 at 5:16 pm
Grant Fritchey (2/3/2014)
Please, please, please, no piling on, but I'm curious what you guys think about this: Unreasonable Defaults. I think most of the fundamental info is fine. I'm just... concerned?... with the statements around clustered indexes and the recommendations to go with a heap instead. It's basically 180 degrees out from what I recommend. I've been thinking about it, and I'm still comfortable with my recommendations, but it is a consideration. Curious what you guys think.And again, this isn't an invite to beat on someone's head.
I saw that article via a LinkedIn post (to which I responded to sometime over the weekend, IIRC) that he'd created the article.
In some aspects, I agree but he sure did take a hell of a long time to say what we already know... that RID lookups can be terribly expensive whether you're looking up a single row or a million rows. So, IMHO, nothing new there. He just took way too long to explain the RID lookup problem.
The thing that really concerns me is his almost casual suggestion (see the ***BAM*** comment he made) that you use INCLUDE to make covering indexes. While that certainly avoids what he calls the "Clustered Index Penalty", I'm hoping that people don't read into that and start making coving indexes for every bloody query because of the data duplication caused by every non-Clustered Index. It doesn't take very many such indexes to turn a large but managable table into a monster that will eat huge amounts of disk space, especially during extent splits when adding new data, that will also easily double or triple already long index maintenance times not to mention giving the database a much larger footprint and related duration for backups and restores.
The other thing that he didn't test was INSERTs. I use Clustered Indexes for IDENTITY PKs like a lot of folks do but not for the same reasons. I use them to control the order of inserts to prevent page splits. Ostensibly, having a heap prevents that but even heaps will split on UPDATEs. Contrary to belief, disk space isn't as cheap as a lot of people think it is. Correct... the disks themselves don't cost that much.
A possible glaring fault with his testing is the order of the HEAP itself. What order were the rows inserted in? If, as typical with a heap, the rows were inserted in order by date (sales_date, in this case) and the only thing you ever looked up ranges of data by was the sales_date, then I agree... the HEAP will have less reads than if a clustered index is present even if the clustered index is on the date. However, what happens if you lookup the 200,000 rows based on criteria for a different indexed column? If the index is covering, we know it won't matter but if the index is not covering, what happens to the number of reads then? I'm thinking full table scan or, worse yet, a page read per row much like a wayward Index Seek can be.
To wit, I very much appreciate his findings and concerns but I believe his testing is grossly inadequate to support the conclusion that he's apparently reached. More testing needs to be done before anyone uses terms such as "Unreasonable Defaults" or {gasp} new "Best Practice". And, yes, I've seen this phenomena in my own testing where not even having an index for a date range lookup on a heap was almost as fast as traditional indexing but I've also seen where the addition of an index, even one that does a RID lookup, can be make the difference between night and day, so I'm not bagging everything this guy is raking.
This guy isn't the only person of such ilk when it comes to indexes. I also see that Brent Ozar published an article about what a waste of time it is to even REORGANIZE indexes. While I, again, see some merit in his proposal, I don't believe that enough testing has been done to make it a standard practice where I work.
And, to be absolutely clear... I'm not saying that either is wrong. I'm just saying that I've not seen enough testing to form a standard opinion of either practice other than "It Depends". Further testing may very well prove both individuals right.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 3, 2014 at 10:37 pm
Grant Fritchey (2/3/2014)
Please, please, please, no piling on, but I'm curious what you guys think about this: Unreasonable Defaults. I think most of the fundamental info is fine. I'm just... concerned?... with the statements around clustered indexes and the recommendations to go with a heap instead. It's basically 180 degrees out from what I recommend. I've been thinking about it, and I'm still comfortable with my recommendations, but it is a consideration. Curious what you guys think.And again, this isn't an invite to beat on someone's head.
I haven't read it in detail, but I suspect the problem is:
Although this article uses SQL Server as demo database, the article is equally relevant for MySQL/MariaDB with InnoDB and the Oracle database when using index-organized tables.
Those are 3 very, very different database engines. From what little I understand of Oracle, clustered indexes aren't great there. Hard to make an over-arching rule as to table structures on 3 quite different database engines that work and behave differently.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 42,751 through 42,765 (of 66,688 total)
You must be logged in to reply to this topic. Login to reply