January 4, 2016 at 9:03 am
Comments posted to this topic are about the item Stairway to Columnstore Indexes Level 6: Updating and Deleting Data in a Columnstore Index
December 8, 2016 at 8:09 am
Hello Hugo,
Thanks for this article.
I am a bit confused by your explanation along with Listing 6-8 and Figure 6-5 because it looks like "Rowgroup 2" and "Segment 2" seem to be used interchangeably here.
In the query of Figure 6-5, the number 2 you specify is the segment ID, then how do you confirm that this belongs to Rowgroup 2?
Also, the list of columns: segment ID, min data ID, max data ID, and then min data ID for segment 2, max data ID for segment 2.
Are these all for rowgroup 2?
The more I read, the more it gets confusing here...
Would appreciate your explanation.
Thanks again
December 10, 2016 at 7:59 am
choiclark (12/8/2016)
Hello Hugo,Thanks for this article.
I am a bit confused by your explanation along with Listing 6-8 and Figure 6-5 because it looks like "Rowgroup 2" and "Segment 2" seem to be used interchangeably here.
In the query of Figure 6-5, the number 2 you specify is the segment ID, then how do you confirm that this belongs to Rowgroup 2?
Also, the list of columns: segment ID, min data ID, max data ID, and then min data ID for segment 2, max data ID for segment 2.
Are these all for rowgroup 2?
The more I read, the more it gets confusing here...
Would appreciate your explanation.
Thanks again
Thank you, Choiclark, for your kind words.
And thanks for asking this question. I can easily see where this confusion is coming from, especially after rereading that part of the article with your question in mind.
When SQL Server 2012 was released with the first iteration of columnstore indexes, a lot of the documentation and other supporting material used the terms "segment" and "rowgroup" interchangeably. This has resulted in a lot of confusion, and a lot of people thinking that segments and rowgroups are the same thing, which is not the case. In fact, the process where entire rowgroups are eliminated from a query based in min_data_id and max_data_id is still most often called "segment eliminiation" instead of the more correct "rowgroup elimination".
When you just look at my text and compare it to the query text used, you will see this same confusion of terms. In the text, I just double checked and I did not find any invalid use of the terms. But the text refers to rowgroups, and the query uses a column named segment_id to identify a single ... rowgroup.
The one part where I did mess up (and add to the confusion) is when I assigned aliases to the columns in the listings. The two columns that I aliased as min_data_id__segment_2 and max_data_id__segment_2 should have been called min_data_id__rowgroup_2 and max_data_id__rowgroup_2 instead. So the data on line 20 in figure 6-5 can be interpreted as: "In the segment for column SalesOrderHeader in rowgroup 3, the minimum and maximum data id are 1 and 4972; as compared to 1 and 4439 for the same column in rowgroup 2".
I hope it makes more sense now. If not, please ask and I'll try to explain better.
Cheers,
Hugo
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply