March 29, 2017 at 6:23 am
Sergiy - Tuesday, March 28, 2017 10:48 PM
Semantics or no semantics - can you answer the simple question?
Why columnar storage (any implementation of it) is not updateable?
When ColumnStore tables were first introduced in SQL Server 2012, they weren't updatable, but starting with SQL Server 2014 they support inserts, updates, and deletes just like a traditional RowStore table.
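To make that concrete, here's a minimal T-SQL sketch of a ColumnStore table accepting ordinary DML (SQL Server 2014 and later). The table and column names are made up for illustration; they're not from any real schema.

```sql
-- Illustrative example only: table/column names are hypothetical.
CREATE TABLE dbo.SalesFact
(
    SaleID   INT           NOT NULL,
    SaleDate DATE          NOT NULL,
    Amount   DECIMAL(18,2) NOT NULL
);

-- Converting the table to columnar storage (updatable from SQL Server 2014 on).
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesFact ON dbo.SalesFact;

-- Ordinary DML works directly against the columnstore table.
INSERT INTO dbo.SalesFact (SaleID, SaleDate, Amount)
VALUES (1, '2017-03-01', 100.00);

UPDATE dbo.SalesFact SET Amount = 125.00 WHERE SaleID = 1;

DELETE FROM dbo.SalesFact WHERE SaleID = 1;
```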
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
March 29, 2017 at 7:41 pm
Google BigQuery also supports Updates now too.
March 30, 2017 at 5:03 am
When you add a word to an existing file and then immediately search for that word in the file you get a positive response.
This is a true data scan.
Whatever changed in the data is immediately reflected in the scan output.
Does it work like that with columnstore updates?
No.
Which means - there is no source data scan in Big Data queries.
The source data needs to be "prepared" before it's made available for scans.
You don't like the term "normalization"? Ok, let it be "compression".
You don't like "index"? Ok, let's name it "columnar store".
Whatever sells.
_____________
Code for TallyGenerator
March 30, 2017 at 7:37 am
Sergiy - Thursday, March 30, 2017 5:03 AM
When you add a word to an existing file and then immediately search for that word in the file you get a positive response.
This is a true data scan.
Whatever changed in the data is immediately reflected in the scan output.
Does it work like that with columnstore updates?
No.
Which means - there is no source data scan in Big Data queries.
The source data needs to be "prepared" before it's made available for scans.
You don't like the term "normalization"? Ok, let it be "compression".
You don't like "index"? Ok, let's name it "columnar store".
Whatever sells.
When a ColumnStore table is inserted into, updated, or deleted from, the modification is first persisted temporarily to a row-based DeltaStore. Then, depending on the COMPRESSION_DELAY <MINUTES> setting, or when the DeltaStore has reached its full point (1,048,576 rows by default), the Tuple Mover process compresses those rows into a ColumnStore segment. Is this what you're referring to? Row modifications still sitting in the DeltaStore are queryable and seamlessly integrated with the ColumnStore results, just with a higher degree of latency before they're compressed.
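A hedged sketch of what that looks like in practice: the COMPRESSION_DELAY option (SQL Server 2016 and later) controls how long rows linger in the DeltaStore, and the row-group DMV shows which rowgroups are still open (row-based) versus compressed (columnar). The index and table names here are hypothetical.

```sql
-- Illustrative only; index/table names are made up.
-- Keep modified rows in the row-based DeltaStore for at least 10 minutes
-- before the Tuple Mover is allowed to compress them (SQL Server 2016+).
ALTER INDEX CCI_SalesFact ON dbo.SalesFact
    SET (COMPRESSION_DELAY = 10);

-- Inspect rowgroup states: 'OPEN' rowgroups are the row-based DeltaStore,
-- 'COMPRESSED' rowgroups are columnar segments.
SELECT OBJECT_NAME(object_id) AS table_name,
       row_group_id,
       state_desc,
       total_rows
FROM   sys.dm_db_column_store_row_group_physical_stats
WHERE  object_id = OBJECT_ID('dbo.SalesFact');
```

Queries against the table automatically combine both rowgroup states, which is why DeltaStore rows remain visible before compression.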
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
April 5, 2017 at 9:07 pm
However you name it - the source data requires preparing/restructuring after being loaded and before it's made available for "scanning".
And however you name that preparation process, it's still nothing else but indexing.
Because BigTable and all its descendants are nothing more than big indexes.
https://courses.cs.washington.edu/courses/cse444/14sp/lectures/lecture26-bigtable.pdf
You may use the words "map", "key", "block", etc., but they're all other names for "index".
Big data systems go through your data, normalise it (sorry, should have said "compress") and automatically create for you that index you failed to identify and create in a relational database.
_____________
Code for TallyGenerator