June 10, 2016 at 10:08 am
Steve Thompson-454462 (6/10/2016)
Brandie Tarvin (6/10/2016)
Thanks for the reply.
I wonder... Our reporting team has a database where they essentially flatten all our data into giant tables so their reports read quicker. Would this be a candidate for a NoSQL solution?
I think it could be; it would depend on the downstream system generating the reports - can it consume/parse JSON-like data? There are plenty of reporting stacks that work with NoSQL stores; for example, Kibana for Elasticsearch.
If the data you are flattening has a variable schema (e.g. sometimes columns are missing, sometimes new columns are added, etc.), document stores will happily accept this at write, so that might be a nice value add. Then it would be up to the reporting client to manage the data (e.g. come up with a new binding for the new column).
This sounds like a data warehouse, which could be the best way to handle things, if the ad hoc querying is met well by indexing and the ETL isn't too bad.
I hesitate to move technologies if something works, since you're introducing work for questionable benefit. My experience with using new technologies is that you need someone with experience and expertise to make it work well. If you try it yourself, you don't necessarily know the best architecture or design and end up with a system that may not work.
It isn't the technology, but the knowledge, that makes things work well. I have seen quite a few NoSQL/Hadoop/MongoDB migrations not produce benefits. However, others with similar problem domains have great success. It has to be the people and knowledge that make the big difference.
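The variable-schema point above can be sketched without leaving SQL Server. This is a minimal, hedged example (assuming SQL Server 2016's JSON support; the documents and column names are made up) showing how documents with missing keys behave: the absent key simply surfaces as NULL, which is roughly what a document store hands back to a reporting client that has to manage new or missing columns.

```sql
-- Two "documents" with different shapes: the second has an extra PromoCode key.
DECLARE @docs NVARCHAR(MAX) = N'[
    {"OrderId": 1, "Amount": 100.00},
    {"OrderId": 2, "Amount": 250.00, "PromoCode": "SPRING16"}
]';

SELECT OrderId, Amount, PromoCode
FROM OPENJSON(@docs)
WITH (
    OrderId   INT            '$.OrderId',
    Amount    DECIMAL(10, 2) '$.Amount',
    PromoCode VARCHAR(20)    '$.PromoCode'  -- NULL for documents missing the key
);
```

A document store accepts the second shape at write time with no ALTER; here the WITH clause plays the role of the reporting client's binding.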
June 10, 2016 at 10:22 am
Steve Jones - SSC Editor (6/10/2016)
This sounds like a data warehouse, which could be the best way to handle things, if the ad hoc querying is met well by indexing and the ETL isn't too bad.
I hesitate to move technologies if something works, since you're introducing work for questionable benefit. My experience with using new technologies is that you need someone with experience and expertise to make it work well. If you try it yourself, you don't necessarily know the best architecture or design and end up with a system that may not work.
It isn't the technology, but the knowledge, that makes things work well. I have seen quite a few NoSQL/Hadoop/MongoDB migrations not produce benefits. However, others with similar problem domains have great success. It has to be the people and knowledge that make the big difference.
Agree with Steve that the old "if it ain't broke, don't fix it" paradigm applies. Data Warehouses feeding downstream reporting services is a well-established model. Incentive to explore a new way of handling it could come from scope changes like the need to combine log data (or change data) with relational master data or, as mentioned before, issues with scaling. But keeping up with the NoSQL Joneses probably isn't a compelling reason.
My organization has spiked out several system designs based on Hadoop or Mongo, only to decide that good old SQL Server was still the right tool for the job. That said, the critical point is not closing your mind to new stuff simply because it's new and perhaps intimidating. It can be beneficial to find a way to spike the new tech so that, if it's not going to work for you, you fail early and walk away with some shiny new knowledge, if not a shiny new toy in production.
June 10, 2016 at 10:29 am
Steve Jones - SSC Editor (6/10/2016)
Steve Thompson-454462 (6/10/2016)
Brandie Tarvin (6/10/2016)
Thanks for the reply.
I wonder... Our reporting team has a database where they essentially flatten all our data into giant tables so their reports read quicker. Would this be a candidate for a NoSQL solution?
I think it could be; it would depend on the downstream system generating the reports - can it consume/parse JSON-like data? There are plenty of reporting stacks that work with NoSQL stores; for example, Kibana for Elasticsearch.
If the data you are flattening has a variable schema (e.g. sometimes columns are missing, sometimes new columns are added, etc.), document stores will happily accept this at write, so that might be a nice value add. Then it would be up to the reporting client to manage the data (e.g. come up with a new binding for the new column).
This sounds like a data warehouse, which could be the best way to handle things, if the ad hoc querying is met well by indexing and the ETL isn't too bad.
I hesitate to move technologies if something works, since you're introducing work for questionable benefit. My experience with using new technologies is that you need someone with experience and expertise to make it work well. If you try it yourself, you don't necessarily know the best architecture or design and end up with a system that may not work.
It isn't the technology, but the knowledge, that makes things work well. I have seen quite a few NoSQL/Hadoop/MongoDB migrations not produce benefits. However, others with similar problem domains have great success. It has to be the people and knowledge that make the big difference.
We're currently using a data warehouse for this. The ETL isn't too nasty, but it took us forever to figure out the best way for our different systems to meld into it.
I'm only asking about this as a use case so that I can better understand what NoSQL could be used for, not because I want to spend the next two years disrupting something that works just to apply a NoSQL experiment that may crash on us. @=) I'm sure my boss wouldn't see the ROI in trying such a thing since we're not hurting.
June 10, 2016 at 10:54 am
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
I've never seen an existence check for a column done like this before... it's always been more complicated. I like it!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
June 10, 2016 at 11:05 am
Brandie Tarvin (6/10/2016)
We're currently using a data warehouse for this. The ETL isn't too nasty, but it took us forever to figure out the best way for our different systems to meld into it.
I'm only asking about this as a use case so that I can better understand what NoSQL could be used for, not because I want to spend the next two years disrupting something that works just to apply a NoSQL experiment that may crash on us. @=) I'm sure my boss wouldn't see the ROI in trying such a thing since we're not hurting.
What I'd do, Brandie, is POC this. I'd take one report, maybe one that's run a lot, and then get the minimum data I needed for this. I'd put together a small, side ETL pipeline to pull that data into some format. Maybe use JSON, but have it ready.
Now use that as a source to pull into Mongo. Or Couchbase, or Neo4j, or something. Maybe multiple items. See what it's like to move the data. Get help from people on that part.
Now, can I report from that? How easy? Make that a separate POC.
Start small and play. I often see people thinking they need to make some big, complex change, especially when looking at CI/CD. Don't do that. Start small and work in stages. At each point, see what you learn and what's better or worse, and question whether you really understand the new tech.
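The "small, side ETL pipeline" step above could be as little as one query. This is a hedged sketch (assuming SQL Server 2016's FOR JSON support; the table and column names are purely illustrative) of pulling the minimum data one report needs and emitting it as JSON, ready to load into Mongo or whichever store the POC targets:

```sql
-- Emit last month's report rows as a JSON array, e.g. for mongoimport.
SELECT CustomerId,
       OrderDate,
       TotalDue
FROM dbo.ReportSource   -- hypothetical flattened reporting table
WHERE OrderDate >= DATEADD(MONTH, -1, GETDATE())
FOR JSON PATH, ROOT('orders');
```

Keeping the extract this small makes it cheap to throw away if the POC fails early, which is the point of the exercise.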
June 10, 2016 at 11:06 am
BTW, if you want to play with Mongo, Redgate invested in 3T, and you can use their tools to work with the db: http://3t.io/mongochef/
June 10, 2016 at 11:20 am
WayneS (6/10/2016)
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
I've never seen an existence check for a column done like this before... it's always been more complicated. I like it!
Me either. I've always used:
if (select count(*) from sys.columns where object_id = object_id('dbo.table_name', 'u') and name = 'column_name') = 0
begin
alter table dbo.table_name add column_name integer;
end;
This way looks pretty cool. I'm going to have to play around with this and see if I can find a way to break it. Thanks.
June 10, 2016 at 11:32 am
Steve Jones - SSC Editor (6/10/2016)
Brandie Tarvin (6/10/2016)
We're currently using a data warehouse for this. The ETL isn't too nasty, but it took us forever to figure out the best way for our different systems to meld into it.
I'm only asking about this as a use case so that I can better understand what NoSQL could be used for, not because I want to spend the next two years disrupting something that works just to apply a NoSQL experiment that may crash on us. @=) I'm sure my boss wouldn't see the ROI in trying such a thing since we're not hurting.
What I'd do, Brandie, is POC this. I'd take one report, maybe one that's run a lot, and then get the minimum data I needed for this. I'd put together a small, side ETL pipeline to pull that data into some format. Maybe use JSON, but have it ready.
Now use that as a source to pull into Mongo. Or Couchbase, or Neo4j, or something. Maybe multiple items. See what it's like to move the data. Get help from people on that part.
Now, can I report from that? How easy? Make that a separate POC.
Start small and play. I often see people thinking they need to make some big, complex change, especially when looking at CI/CD. Don't do that. Start small and work in stages. At each point, see what you learn and what's better or worse, and question whether you really understand the new tech.
Thanks for the advice. I appreciate it.
June 10, 2016 at 11:34 am
Ed Wagner (6/10/2016)
WayneS (6/10/2016)
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
I've never seen an existence check for a column done like this before... it's always been more complicated. I like it!
Me either. I've always used:
if (select count(*) from sys.columns where object_id = object_id('dbo.table_name', 'u') and name = 'column_name') = 0
begin
alter table dbo.table_name add column_name integer;
end;
This way looks pretty cool. I'm going to have to play around with this and see if I can find a way to break it. Thanks.
I just use INFORMATION_SCHEMA.
IF (SELECT Column_Name FROM Information_Schema.Columns
WHERE Table_name = 'MyTable' AND Column_name = 'MyCol') IS NULL
ALTER TABLE dbo.MyTable
ADD MyCol <datatype>;
And I use the same code to check the data type and the size if I'm going to alter an existing column (say from VARCHAR(10) to VARCHAR(50)), just adding the appropriate columns.
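The same-code-for-alters idea above can be sketched with the same INFORMATION_SCHEMA view, just adding the data type and size columns. This is a minimal, hedged example (table, column, and target size are hypothetical) that widens a column only when it's still narrower than the target, so re-running it is harmless:

```sql
-- Widen dbo.MyTable.MyCol to VARCHAR(50) only if it isn't there yet.
IF EXISTS (
    SELECT 1
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'dbo'
      AND TABLE_NAME   = 'MyTable'
      AND COLUMN_NAME  = 'MyCol'
      AND DATA_TYPE    = 'varchar'
      AND CHARACTER_MAXIMUM_LENGTH < 50
)
BEGIN
    ALTER TABLE dbo.MyTable
    ALTER COLUMN MyCol VARCHAR(50);
END;
```

That keeps the ALTER idempotent in the same spirit as the COL_LENGTH check earlier in the thread.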
June 10, 2016 at 12:23 pm
Brandie Tarvin (6/10/2016)
Ed Wagner (6/10/2016)
WayneS (6/10/2016)
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
I've never seen an existence check for a column done like this before... it's always been more complicated. I like it!
Me either. I've always used:
if (select count(*) from sys.columns where object_id = object_id('dbo.table_name', 'u') and name = 'column_name') = 0
begin
alter table dbo.table_name add column_name integer;
end;
This way looks pretty cool. I'm going to have to play around with this and see if I can find a way to break it. Thanks.
I just use INFORMATION_SCHEMA.
IF (SELECT Column_Name FROM Information_Schema.Columns
WHERE Table_name = 'MyTable' AND Column_name = 'MyCol') IS NULL
ALTER TABLE dbo.MyTable
ADD MyCol <datatype>;
And I use the same code to check the data type and the size if I'm going to alter an existing column (say from VARCHAR(10) to VARCHAR(50)), just adding the appropriate columns.
I tend to steer away from INFORMATION_SCHEMA in favour of catalog views:
Catalog views return information that is used by the SQL Server Database Engine. We recommend that you use catalog views because they are the most general interface to the catalog metadata and provide the most efficient way to obtain, transform, and present customized forms of this information. All user-available catalog metadata is exposed through catalog views.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
June 10, 2016 at 12:40 pm
Phil Parkin (6/10/2016)
Brandie Tarvin (6/10/2016)
Ed Wagner (6/10/2016)
WayneS (6/10/2016)
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
I've never seen an existence check for a column done like this before... it's always been more complicated. I like it!
Me either. I've always used:
if (select count(*) from sys.columns where object_id = object_id('dbo.table_name', 'u') and name = 'column_name') = 0
begin
alter table dbo.table_name add column_name integer;
end;
This way looks pretty cool. I'm going to have to play around with this and see if I can find a way to break it. Thanks.
I just use INFORMATION_SCHEMA.
IF (SELECT Column_Name FROM Information_Schema.Columns
WHERE Table_name = 'MyTable' AND Column_name = 'MyCol') IS NULL
ALTER TABLE dbo.MyTable
ADD MyCol <datatype>;
And I use the same code to check the data type and the size if I'm going to alter an existing column (say from VARCHAR(10) to VARCHAR(50)), just adding the appropriate columns.
I tend to steer away from INFORMATION_SCHEMA in favour of catalog views:
Catalog views return information that is used by the SQL Server Database Engine. We recommend that you use catalog views because they are the most general interface to the catalog metadata and provide the most efficient way to obtain, transform, and present customized forms of this information. All user-available catalog metadata is exposed through catalog views.
Developers here use the INFORMATION_SCHEMA views, COL_LENGTH(...), or a count from sys.columns; whatever suits their fancy. Luckily, each developer uses the same method each time; it's just different between developers.
I use the catalog views myself.
June 10, 2016 at 12:59 pm
Lynn Pettis (6/10/2016)
Phil Parkin (6/10/2016)
Brandie Tarvin (6/10/2016)
Ed Wagner (6/10/2016)
WayneS (6/10/2016)
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
I've never seen an existence check for a column done like this before... it's always been more complicated. I like it!
Me either. I've always used:
if (select count(*) from sys.columns where object_id = object_id('dbo.table_name', 'u') and name = 'column_name') = 0
begin
alter table dbo.table_name add column_name integer;
end;
This way looks pretty cool. I'm going to have to play around with this and see if I can find a way to break it. Thanks.
I just use information_Schema.
IF (SELECT Column_Name FROM Information_Schema.Columns
WHERE Table_name = 'MyTable' AND Column_name = 'MyCol') IS NULL
ALTER TABLE dbo.MyTable
ADD MyCol <datatype>;
And I use the same code to check the data type and the size if I'm going to alter an existing column (say from VARCHAR(10) to VARCHAR(50)), just adding the appropriate columns.
I tend to steer away from INFORMATION_SCHEMA in favour of catalog views:
Catalog views return information that is used by the SQL Server Database Engine. We recommend that you use catalog views because they are the most general interface to the catalog metadata and provide the most efficient way to obtain, transform, and present customized forms of this information. All user-available catalog metadata is exposed through catalog views.
Developers here use the INFORMATION_SCHEMA views, COL_LENGTH(...), or a count from sys.columns; whatever suits their fancy. Luckily, each developer uses the same method each time; it's just different between developers.
I use the catalog views myself.
I also use the catalog views, probably because I'm most comfortable with them and have been using them forever.
It's good that the developers are consistent, if not with each other, then at least with themselves.
June 10, 2016 at 1:15 pm
Phil Parkin (6/8/2016)
And why not ask them to make it idempotent while you're at it:
IF COL_LENGTH('dbo.mytable', 'col1') IS NULL
BEGIN
ALTER TABLE dbo.mytable
ADD Col1 INT;
END;
That's the one I use, and most of the folks at work use it as well. We also use a variation with the known "length" when changing datatypes from, say, an INT to a BIGINT.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 13, 2016 at 3:25 am
Steve Jones - SSC Editor (6/10/2016)
BTW, if you want to play with Mongo, Redgate invested in 3T, and you can use their tools to work with the db: http://3t.io/mongochef/
A series of good tutorials for MongoDB can be found at MongoDB University - https://university.mongodb.com/
June 13, 2016 at 3:35 am
Anyone coming to InsideSQL tomorrow?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability