October 23, 2017 at 6:47 am
Jeff Moden - Saturday, October 21, 2017 6:06 PM
patrickmcginnis59 10839 - Friday, October 20, 2017 12:12 PM
Shifting gears, people will cite things like the fact that EBAY uses no stored procedures at all, forgetting or not knowing that EBAY maintains more than 600 servers. No one has ever tested there to see whether or not stored procedures would decrease that ridiculous number. My personal opinion is that they could get away with about 1/10th of that if they used stored procedures.
I would love to know more about this. Do you have numbers or citations to back up your assertion?
edit: slight rephrasing 🙂
Sure... no problem.
This is the link that I saw years ago. It's from 2008 (damn... time flies... it seems like yesterday that I first saw that article during a heated debate about whether or not to use stored procedures).
A lot can happen in nearly a decade. They now use a technique known as "MicroServices", which sounds very similar to what they were doing a decade ago, before it had a name. Here's a later article on the subject.
To the best of my knowledge, they still use no stored procedures, no referential integrity, no ACID, and continue to use an "eventually synchronized" model. They don't even use JOINs in their code. It's all done via the front end. They basically use the database quite literally as "just a place to store data".
I did google and got what looks to be pretty much the same info you got, but nowhere did I find any citation saying they could reduce their server count tenfold with stored procedures. It seems really unlikely and hard to believe.
October 23, 2017 at 7:34 am
Jeff Moden - Saturday, October 21, 2017 6:06 PM
Reading the article, it seems the bulk of their data is stored in Hadoop clusters (hundreds of them) and they leverage a product called Apache Kylin to stack a SQL query and OLAP cube layer on top of it.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
October 26, 2017 at 8:08 pm
patrickmcginnis59 10839 - Monday, October 23, 2017 6:47 AM
I did google and got what looks to be pretty much the same info you got, but nowhere did I find any citation saying they could reduce their server count tenfold with stored procedures. It seems really unlikely and hard to believe.
Correct. No one else said that... I said it as a speculation based on what their configuration is and what I think it could be.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 27, 2017 at 6:44 am
Jeff Moden - Thursday, October 26, 2017 8:08 PM
Correct. No one else said that... I said it as a speculation based on what their configuration is and what I think it could be.
It was speculation that these guys had a "ridiculous" number of servers and that if they used stored procedures they could reduce their server count by up to 90 percent, but that seems like a counterintuitive assertion in itself. EBay is big on serving up connections. You yourself are constantly asserting how costly "RBAR" is. When you serve customers' connections with content, you aren't returning tens of thousands of rows; chances are the row count is going to be low enough that the software stack / per-call overhead starts to factor in.
Think of what a high-volume website looks like: many, many connections each ordering up a rather low number of rows, maybe a shopping cart, etc. That gets close to tripping the "RBAR" penalty, which with SQL Server means that the fewer rows returned per call to the database, the more the T-SQL overhead rears its head.
Heck, folks that put a high-volume database on the web are often advised to put a web cache in front and serve content from that instead of hitting the database for everything. Now, possibly you could regain some performance with SQL Server's memory-optimized features, but why not suggest that directly instead of the stored procedure speculation? I just don't think your speculation is based on any viable factors, but I would be happy to be convinced otherwise 🙂
October 28, 2017 at 10:52 am
patrickmcginnis59 10839 - Friday, October 27, 2017 6:44 AM
Yep. Understood. I thought I made it clear that it was speculation on my part but maybe not.
Just to clear things up a bit: although we certainly aren't operating anywhere near the scale that EBAY does, my "speculation" isn't based on hearsay or even on supposed "Best Practice" (which frequently isn't) recommendations or arguments. It's based on the personal experience of my own small world.
When I first reported to my current job 6 years ago, the two main databases were at about 65GB each, the system had a total of less than a dozen databases, and there were only about 20 or so concurrent users with a couple dozen more connections sleeping. Even though the company is in the mortgage insurance business, there was little in the way of document management, etc., and what little there was lived on a separate system written by a 3rd party.
The system had 16 cores, 128GB of RAM, and SQL Server 2005. That's a fairly robust system for such a light load, and yet CPU typically bounced around between 40 and 60% during the 10-hour "work day", screen returns typically took between 5 and 30 seconds (horrible), and there were multiple 10-minute outages every day. There was also a more than merely noticeable delay just in tabbing from one field to another.
Because of the data we handle, the previous regime of front-end Developers made it so that the current user was checked for role membership on every focus change on screen. They had written a stored procedure that would check for "users in roles" and called it on every focus change. Seemingly contrary to my "speculation" about EBAY, we did a quick "fix" (it wasn't a fix, it was a "patch"). We realized that the list of users and roles wouldn't change much (perhaps once or twice a day), so the Developers elected to cache the data on the web servers and only refresh it once an hour. It worked. CPU dropped by about 5%, IO dropped a bit more, and on-screen focus changes worked a lot faster with no perceptible delay. They also had to bring another web server online because the two they had were now screaming from the elevated CPU and disk usage.
The bottom line for this particular problem was that they didn't actually fix anything other than the time between focus changes. They simply moved the problem to the web servers, and now we had an extra web server to license and maintain. Between that and other problems with the front-end code as our customer base grew, we ended up having to add more web servers over the next two years to keep the ones we had from melting down. That got expensive for "fixing" such a relatively simple problem. I'm no front-ender, so I can't tell you what they did to finally fix it, but they did. I suspect they simply eliminated the role verification on every focus change and limited it to screen changes only.
In the meantime, we instituted a "Continuous Improvement Program" where I'd find the worst performing queries (which should NOT be confused with the longest running queries) and we either fixed them by adding the correct index or by moving the queries from managed and (ugh!) ORM code into well-written stored procedures (training the Developers in the process). The activity on the web servers decreased substantially, and the database server's CPU usage and I/O plummeted.
Fast forward to April of 2016: that same server had grown to over 2 Terabytes, we had moved several other systems/applications onto it, created several new applications including some heavy-duty document management, added hundreds of new clients, and the active user connections had grown to 400 in any given second with hundreds more "sleeping". Yet our CPU time was nearly at "idle" at about 5-8%, we had eliminated nearly 100 Terabytes of logical I/O during the normal 10-hour day and, except for another front-end related problem that reared its ugly head (which we found and fixed), we've had none of the outages we used to, even though we run a great deal of ETL-related batch processing on the same box day and night. And, no... we didn't cache more things on the web servers. Yes, it did take a while, because we had to work on a system that needed to stay active all the time and we only had one system. We couldn't upgrade 1 of 600 and move on to the next.
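For anyone curious what I mean by "worst performing" as opposed to "longest running": I typically rank by total logical reads rather than by elapsed time. Something along these lines is the kind of starting point I use (just a sketch; the columns and the TOP threshold are personal habit, not gospel):

-- Top offenders by total logical reads since each plan was cached
-- (NOT the same thing as the longest-running queries).
SELECT TOP (20)
        qs.total_logical_reads,
        qs.execution_count,
        qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
        qs.total_elapsed_time / 1000 AS total_elapsed_ms,
        SUBSTRING(st.text, (qs.statement_start_offset/2) + 1,
                  ((CASE qs.statement_end_offset
                        WHEN -1 THEN DATALENGTH(st.text)
                        ELSE qs.statement_end_offset
                    END - qs.statement_start_offset)/2) + 1) AS statement_text
FROM    sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;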
I don't know what EBAY actually did, but one of their engineers made a discovery (circa 2016, IIRC) and a fix that allowed them to take 400 servers offline without a hiccup, and performance improved. According to them, it saved them a megawatt of electricity for the servers and the related supporting systems such as cooling and connectivity devices. It probably wasn't due to writing a stored procedure, but the point is that even their wonderful system benefited from a relatively simple tweak.
So, yes, because I'm not privy to what EBAY actually has going on in any great detail (especially in code), my comments are "speculation"... but, given the personal experience I've had with our smaller system and its substantial growth, it's not a totally unqualified speculation. It would truly be interesting to see what would happen if EBAY took the same approach that we did (mostly with stored procedures) in a parallel system. I'm speculating they could save another couple of megawatts of juice and seriously decrease their licensing costs with no impact on, or possibly an increase in, performance.
Unfortunately, we'll never know, because it's not likely that they'll give that a try, and so folks like me are relegated to "speculation" based on personal experience.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 29, 2017 at 5:52 pm
I've seen the security advantages of using parameters fed into stored procedures, so I won't expand on that. The dangers of string concatenation being vulnerable to SQL injection are well known, as is the free news coverage your company gets when it's hacked.
What I'll hit on instead is the plan cache. When the application builds SQL statements by concatenating strings, it creates a statement and submits it to the database. The database then builds a plan to execute the statement, caches it, and executes the query. When another user does the same thing with a slightly different value, the SQL statement is different. Because it's not parameterized, the statement produces a different query hash and is seen as a different query, even though the only things that differ are the values that should have been parameters. This generates a different plan. As this continues, the plan cache fills up with single-use plans; queries whose plans should already be cached are seen as new, and a new plan gets compiled every time.
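To make that concrete, here's a quick sketch of the difference as SQL Server sees it (the table and values are made-up stand-ins):

-- Ad hoc, concatenated statements: each literal yields a different query hash,
-- so each one compiles and caches its own single-use plan.
SELECT OrderID, OrderTotal FROM dbo.Orders WHERE CustomerID = 42;
SELECT OrderID, OrderTotal FROM dbo.Orders WHERE CustomerID = 43;

-- Parameterized call: one plan, reused for every value.
EXEC sys.sp_executesql
     N'SELECT OrderID, OrderTotal FROM dbo.Orders WHERE CustomerID = @CustomerID',
     N'@CustomerID int',
     @CustomerID = 42;

A stored procedure that takes @CustomerID as a parameter gets you the same plan reuse, plus the security benefits mentioned above.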
October 30, 2017 at 10:32 am
Jeff Moden - Saturday, October 28, 2017 10:52 AM
Well my counter speculations are based on simple reading and experience with MS SQL, and I'm just not going to believe that stored procedures will cut EBay's server cost by 90 percent, even if you're the one writing them all 🙂
Even Stack Exchange, a very famous Microsoft-stack user, seems intent on removing their stored procedures, and they too are big on caching; they admit their workload in some cases is over 90 percent read-only. And I'm going to hazard a guess that Stack Exchange's workload is a fraction of EBay's.
https://nickcraver.com/blog/2016/02/17/stack-overflow-the-architecture-2016-edition/
October 30, 2017 at 10:40 am
patrickmcginnis59 10839 - Monday, October 30, 2017 10:32 AM
Well my counter speculations are based on simple reading and experience with MS SQL, and I'm just not going to believe that stored procedures will cut EBay's server cost by 90 percent, even if you're the one writing them all 🙂
Yep. Understood there, as well. But that does make my point. No one has tried and they're not likely to. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
October 30, 2017 at 10:56 am
Jeff Moden - Monday, October 30, 2017 10:40 AM
Yep. Understood there, as well. But that does make my point. No one has tried and they're not likely to. 😉
I don't believe this assertion either. Given that EBay has the microservices architecture going for them, nothing prevents them from standing up a SQL Server installation for one of their thousand databases and seeing how it works out. Heck, we all talk about measuring for performance here, so I think it would be pretty arrogant to imagine that EBay, as successful as they are, would be unable to do this measurement or evaluation, especially considering the 90 percent savings you estimate they could conceivably recover 😉
Heck, they could simply load up a standalone SQL Server, run some simulations, and get some back-of-the-envelope measurements; it would probably be stupidly easy.
edit: for that matter, maybe EBay looked at Stack Exchange's experiences with stored procedures and decided they didn't have to test LOL
October 31, 2017 at 1:00 am
Ed Wagner - Sunday, October 29, 2017 5:52 PM
Thanks, Ed Wagner, for this new "ace". I think that if a new plan is generated every time a slightly different statement is run against the database, then the constant compilation hurts performance as well, apart from the plan cache growth... What can be the result of plan cache growth?
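edit: for what it's worth, a rough way to see how much of the cache is tied up in single-use plans on your own server (just a sketch using the standard DMV; the grouping is arbitrary):

-- How much plan cache memory is consumed by plans that have only ever run once?
SELECT  objtype,
        COUNT(*) AS single_use_plans,
        SUM(CAST(size_in_bytes AS BIGINT)) / 1048576 AS total_mb
FROM    sys.dm_exec_cached_plans
WHERE   usecounts = 1
GROUP BY objtype
ORDER BY total_mb DESC;

As I understand it, that memory competes with memory for data pages, and every one of those plans cost CPU to compile; the "optimize for ad hoc workloads" server option is one common mitigation.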
October 31, 2017 at 2:55 am
patrickmcginnis59 10839 - Monday, October 30, 2017 10:56 AM
From my experience, I would say that other perspectives are in play here. We are technicians and engineers and talk about performance... but commercial commitments may tip the balance the other way... and so can friendship with one or another big shareholder. I've seen some decisions in my company made on an "emotional" basis rather than a "logical" one. We're only human, after all 😉
edit: typo
October 31, 2017 at 8:09 am
rot-717018 - Tuesday, October 31, 2017 2:55 AM
Oooohhh.... you said a mouthful with that "emotional" basis thing. We spent $350K USD on something that isn't much more powerful than SSIS because of that silliness, and that doesn't include the yearly maintenance fees or the FTE hours to support the damned thing. It also takes a lot longer to do things like import a spreadsheet instead of just loading up the ACE drivers and "doing it simple" from T-SQL. It takes someone a day to do it in the product I speak of, when it should only take a half hour or so to import a flat, single-tab spreadsheet that has all the qualities of a table.
It's mostly because of an emotional decision by someone who got burned when MS refused to make 64-bit Jet drivers and hadn't yet come out with the ACE drivers. They simply didn't know the workarounds.
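Just to illustrate what I mean by "doing it simple" from T-SQL, something along these lines is usually all it takes (the path, sheet name, and staging table are placeholders; the ACE provider has to be installed and "Ad Hoc Distributed Queries" enabled):

-- One-off import of a flat, single-tab spreadsheet via the ACE OLE DB provider.
SELECT *
INTO   dbo.SpreadsheetStaging  -- placeholder staging table
FROM   OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                  'Excel 12.0;Database=C:\Import\SomeWorkbook.xlsx;HDR=YES',
                  'SELECT * FROM [Sheet1$]');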
--Jeff Moden
Change is inevitable... Change for the better is not.
November 1, 2017 at 6:41 am
Jeff Moden - Tuesday, October 31, 2017 8:09 AM
Does the name of this something happen to start with an "I" and end with an "a"? 🙂
November 1, 2017 at 7:45 am
dmbaker - Wednesday, November 1, 2017 6:41 AM
Does the name of this something happen to start with an "I" and end with an "a"? 🙂
No... it's "Web Methods". What product are you talking about? It might make me feel better that we're not the only ones paying from deep pockets because of an emotional decision.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 1, 2017 at 7:52 am
Jeff Moden - Wednesday, November 1, 2017 7:45 AM
No... it's "Web Methods". What product are you talking about? It might make me feel better that we're not the only ones paying from deep pockets because of an emotional decision.
I'm talking about Informatica. There are some things I really like about it (e.g. the Workflow Monitor... SSIS really needs something like that built in), but sometimes it's a real PITA; it makes some simple things really difficult.