February 7, 2017 at 2:54 pm
OK, I know this sounds a bit wonky, but I have a client that has a very large database (about 2 TB), and they have multiple teams with multiple servers that need read-only access to it for reporting. Right now they use replication to copy it out to the required servers, so we end up with 5 to 7 servers, each with its own copy of this database, which takes up 10 to 14 TB of space. That is just wasteful, but each team has different uses for the data and the teams are spread across various locations, so they can't all use the same server. Not to mention the load would kill it.
Are there any features in SQL 2014 or SQL 2016 that would allow multiple servers to see and read from a single set of data files located on a SAN or in the cloud? I have heard a bit about data lakes, and on the surface that sounds like what I am looking for: a big old lake of data that can be served up to multiple SQL instances and users.
Or am I just out of luck and delusional?
Thanks!!!
Jim
February 7, 2017 at 5:26 pm
Jim Youmans-439383 - Tuesday, February 7, 2017 2:54 PM
OK, I know this sounds a bit wonky, but I have a client that has a very large database (about 2 TB), and they have multiple teams with multiple servers that need read-only access to it for reporting. Right now they use replication to copy it out to the required servers, so we end up with 5 to 7 servers, each with its own copy of this database, which takes up 10 to 14 TB of space. That is just wasteful, but each team has different uses for the data and the teams are spread across various locations, so they can't all use the same server. Not to mention the load would kill it.
Are there any features in SQL 2014 or SQL 2016 that would allow multiple servers to see and read from a single set of data files located on a SAN or in the cloud? I have heard a bit about data lakes, and on the surface that sounds like what I am looking for: a big old lake of data that can be served up to multiple SQL instances and users.
Or am I just out of luck and delusional?
Thanks!!!
Jim
1) If you think the load would kill a local real server, what do you think will happen to that shared storage? And what about the latency to move that data over some wire to the compute resources?
2) You can buy EXTRAORDINARILY POWERFUL Windows servers now. Multi-terabytes of RAM and Flash-based IO with incredible bandwidth. This is the simplest and best solution by far.
3) You can set up an Always On Availability Group (or old-school database mirroring with read-only database snapshots) and have readable secondaries; a minimal sketch follows this list. Unfortunately, readable secondaries rely on row versioning, which adds a 14-byte version pointer to every row modified on the primary. Potentially BAD stuff there.
4) Thinking a bit outside the box, I wonder if you could use Redgate's new product SQL Clone to make this work. Theoretically you could, although a clone doesn't pick up changes made to the source database after it is created.
5) Sounds like some interesting stuff though! I'm envious!!
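For what it's worth, here is that sketch for option 3. The availability group and server names (ReportingAG, SQLPRIMARY, SQLREPORT1, SQLREPORT2) are purely illustrative, and the group itself is assumed to exist already:

-- Run on the primary replica.
-- Allow read-only connections directly against a secondary replica.
ALTER AVAILABILITY GROUP ReportingAG
    MODIFY REPLICA ON N'SQLREPORT1'
    WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

-- Publish a routing URL for that secondary.
ALTER AVAILABILITY GROUP ReportingAG
    MODIFY REPLICA ON N'SQLREPORT1'
    WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SQLREPORT1.contoso.local:1433'));

-- Tell the primary where to route read-intent connections.
ALTER AVAILABILITY GROUP ReportingAG
    MODIFY REPLICA ON N'SQLPRIMARY'
    WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'SQLREPORT1', N'SQLREPORT2')));

Reporting clients would then add ApplicationIntent=ReadOnly to their connection strings so that read-only traffic gets routed to a secondary.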
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 8, 2017 at 12:11 am
Jim Youmans-439383 - Tuesday, February 7, 2017 2:54 PM
Are there any features in SQL 2014 or SQL 2016 that would allow multiple servers to see and read from a single set of data files located on a SAN or in the cloud?
In short, no.
The data lake is a completely different architecture, it's the Hadoop file system, designed for unstructured data and lots of it.
The problem with SQL Server and multiple machines isn't that the same drive can't be exposed to multiple machines; it can. The 'problem' is that SQL Server opens the database files exclusively, and hence only one instance can access a single set of DB files at a time.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 8, 2017 at 12:46 am
There is a little-known feature called Scalable Shared Databases. It lets several servers use the same database, provided the database is read-only. I have to admit that not only have I never used it, I've never heard of anyone who has (and I suspect there is a good reason for that), but you can search the net for more details.
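Purely as a sketch of the idea, and assuming the data files sit on a SAN volume that has already been marked read-only and mounted under the same path on every reporting server (database name and paths below are made up):

-- Run on EACH reporting instance; the volume must already be read-only.
CREATE DATABASE BigReportingDB
    ON (FILENAME = N'R:\Data\BigReportingDB.mdf'),
       (FILENAME = N'R:\Data\BigReportingDB_log.ldf')
    FOR ATTACH;

Because the underlying volume is read-only, every instance attaches the same set of files without taking an exclusive write lock on them.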
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting answers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
February 8, 2017 at 4:12 am
Two options:
(1) Redgate SQL Clone - currently in beta, so it might not be stable or ready for you; on the other hand, I'm sure they'd be happy to talk to you about it. There is a 2 TB limit on database size; you mention that your database is "about 2 TB", so this might not be suitable if you're just over that limit.
(2) Scalable Shared Databases. An enterprise feature since SQL 2005, but not very often used or documented... some material on TechNet, SSC, and SQLBlog. Relies on the database files being on read-only volumes.
The update / refresh procedures for both are likely to be a bit of a faff; however, nothing that you shouldn't be able to get round with some scripting, depending on how often you're going to want to go through a refresh cycle...
Good luck. Sounds as though there's blog fodder in here for you, if you decide to go down that route. 🙂
Thomas Rushton
blog: https://thelonedba.wordpress.com
February 8, 2017 at 5:07 am
Adi Cohn-120898 - Wednesday, February 8, 2017 12:46 AM
There is a little-known feature called Scalable Shared Databases. It lets several servers use the same database, provided the database is read-only.
The key being that ALL of them have to be read-only. If the data has to change, all of the reporting servers have to detach the DB, the SAN volume has to be dismounted from all but one server, the volume and DB made read-write on that one, the change made, and then the entire topology set up again.
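A refresh cycle along those lines, with made-up names and with the volume-level steps done outside SQL Server via SAN tooling or diskpart, might look roughly like this:

-- 1) On every reporting instance: detach the shared database.
EXEC sp_detach_db @dbname = N'BigReportingDB';

-- 2) Outside SQL Server: dismount the LUN from the reporting servers,
--    clear its read-only flag, and present it read-write to one server.

-- 3) On that server: attach read-write, apply the changes, then flip back.
CREATE DATABASE BigReportingDB
    ON (FILENAME = N'R:\Data\BigReportingDB.mdf'),
       (FILENAME = N'R:\Data\BigReportingDB_log.ldf')
    FOR ATTACH;
-- ... run the data refresh here ...
ALTER DATABASE BigReportingDB SET READ_ONLY WITH ROLLBACK IMMEDIATE;
EXEC sp_detach_db @dbname = N'BigReportingDB';

-- 4) Outside SQL Server: mark the volume read-only again, re-present it to
--    the reporting servers, and re-attach the database on each of them.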
Also worth noting that I can't seem to find reference to Scalable Shared Databases in the 2012, 2014 or 2016 documentation.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 8, 2017 at 9:47 am
Jim Youmans-439383 - Tuesday, February 7, 2017 2:54 PM
OK, I know this sounds a bit wonky, but I have a client that has a very large database (about 2 TB), and they have multiple teams with multiple servers that need read-only access to it for reporting. Right now they use replication to copy it out to the required servers, so we end up with 5 to 7 servers, each with its own copy of this database, which takes up 10 to 14 TB of space. That is just wasteful, but each team has different uses for the data and the teams are spread across various locations, so they can't all use the same server. Not to mention the load would kill it.
Are there any features in SQL 2014 or SQL 2016 that would allow multiple servers to see and read from a single set of data files located on a SAN or in the cloud? I have heard a bit about data lakes, and on the surface that sounds like what I am looking for: a big old lake of data that can be served up to multiple SQL instances and users.
Or am I just out of luck and delusional?
Thanks!!!
Jim
Just throwing an idea in here... since each team has different needs, would there be any benefit in setting up a data mart for each region and loading only the data each team actually needs via ETL, without wasting too much space? A sketch of what I mean is below.
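Something along these lines, where all the database, table, and column names are hypothetical:

-- Nightly refresh of one team's mart with only the rows and columns they report on.
TRUNCATE TABLE SalesMart_East.dbo.FactOrders;

INSERT INTO SalesMart_East.dbo.FactOrders (OrderID, OrderDate, CustomerID, Amount)
SELECT o.OrderID, o.OrderDate, o.CustomerID, o.Amount
FROM SourceDB.dbo.Orders AS o
WHERE o.Region = 'East'
  AND o.OrderDate >= DATEADD(MONTH, -24, GETDATE());

Each regional mart would carry only a fraction of the 2 TB, rather than a full copy per server.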
--------------------------------------------------------------
This thing is addressing problems that dont exist. Its solution-ism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle