It's the Engineers

  • Comments posted to this topic are about the item It's the Engineers

  • Good article but I really don't understand why you say that relational databases are really only good for small scale stuff.  Is there no one that you would consider "large" that uses SQL Server, Oracle, or some other RDBMS?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Saturday, January 6, 2018 7:22 PM

    Good article but I really don't understand why you say that relational databases are really only good for small scale stuff.  Is there no one that you would consider "large" that uses SQL Server, Oracle, or some other RDBMS?

    I think he is being relative here. I could be wrong, but that's how I read it: what counts as small in this context may be large for you.

    There are plenty of case studies showing that both SQL Server and Oracle can solve very large problems. However, as the article highlights, when you don't really have a data model or don't attempt to define a proper one, with or without the expertise to tune the tech you have, you're left with a pretty large problem that isn't just about an increase in data size. It could simply be a poor business process. So picking a piece of tech on the assumption that it's easy to use, that your business can easily pivot to it, and that you can just throw large quantities of data at it seems like a good choice until you realize it's not that simple.

    That kind of reaffirms the notion that it may not be a tech issue so much as a business process issue. I.e., maybe your RDBMS could handle that large problem after all; you're just doing it wrong or don't want to do it right. Which is sadly true in a large number of cases, I bet.

    Regardless, it all comes down to using the right tools for the job, with one added caveat: the right tools your team knows how to use. Not what some hot new marketing blog is telling you to use.

  • xsevensinzx - Saturday, January 6, 2018 9:32 PM

    Jeff Moden - Saturday, January 6, 2018 7:22 PM

    Good article but I really don't understand why you say that relational databases are really only good for small scale stuff.  Is there no one that you would consider "large" that uses SQL Server, Oracle, or some other RDBMS?

    I think he is being relative here. I could be wrong, but that's how I read it: what counts as small in this context may be large for you.

    There are plenty of case studies showing that both SQL Server and Oracle can solve very large problems. However, as the article highlights, when you don't really have a data model or don't attempt to define a proper one, with or without the expertise to tune the tech you have, you're left with a pretty large problem that isn't just about an increase in data size. It could simply be a poor business process. So picking a piece of tech on the assumption that it's easy to use, that your business can easily pivot to it, and that you can just throw large quantities of data at it seems like a good choice until you realize it's not that simple.

    That kind of reaffirms the notion that it may not be a tech issue so much as a business process issue. I.e., maybe your RDBMS could handle that large problem after all; you're just doing it wrong or don't want to do it right. Which is sadly true in a large number of cases, I bet.

    Regardless, it all comes down to using the right tools for the job, with one added caveat: the right tools your team knows how to use. Not what some hot new marketing blog is telling you to use.

    Heh... of course Steve is being "relative" here and so was the article he cited. I absolutely get that what people think is large (for example, some of the larger health insurance companies) really isn't compared to some of the juggernauts of the computational world (NASDAQ, for example, which suffers "Billions of transactions per day", has "Multiple Petabyte Databases", and has single tables with "Quintillions of records": https://www.youtube.com/watch?v=AW87RzZJ0Z0). Certainly, most of the stuff that most of us (denizens of SQLServerCentral.com) deal with is, indeed, quite "small to medium scale" compared to NASDAQ or even the companies in the following list ( https://customers.microsoft.com/en-us/search?sq=&ff=&p=0&so=story_publish_date%20desc ).

    I also understand what is being said in the following quote from the article...


    There's an article that I think illustrates this well, and might be worth passing along to your developers. It's called Why Amazon DynamoDB isn't for everyone and it's worth a few minutes of your time to read. The gist of the article is that a NoSQL system, like DynamoDB, isn't as simple and easy as you expect. It also says that for many applications, a relational database is a better choice because it's often better understood and will solve most problems at small scale. For most of us, we don't really get past what I'd consider small to medium scale, and so we should stick with relational systems. 

    The cited article starts off by saying that...

    By 2004, Amazon’s business was already stretching the limits of their Oracle database infrastructure. In order to scale the growing business, AWS designed an award winning internal key-value store — Amazon Dynamo — to meet their performance, scalability, and reliability requirements.

    And, yes... despite the fact that there were "only" "hundreds of millions of products" from Thanksgiving through Cyber Monday, there were also a huge number of transactions for each product sold that control the warehouse robots and "smart" conveyor systems ( http://money.cnn.com/2017/11/29/technology/amazon-cyber-monday/index.html ), very likely (IMHO) bringing the number of related data transactions to the "billions" per day that the NASDAQ sees.

    That cited article and Steve's good article on the same subject seem to be suggesting that when you actually do approach or achieve truly "large" scale, an RDBMS isn't the way to go and, yet, there are huge systems, such as what is powering NASDAQ, that have been using an RDBMS for a long time.

    That's my only point of contention with the article. Size doesn't actually matter.

    That being said, and on to what DOES matter: I very much DO agree with the main thrust of the article. We have computational juggernauts like NASDAQ and Amazon that have similar scale (again, IMHO). The former uses an RDBMS and the latter uses something that (apparently) isn't one. Why is that? The answer is indeed "The Engineers" but, again, to highlight my point of contention with the article, scale had nothing to do with whether or not an RDBMS was used. It was the know-how of "The Engineers" that allowed one giant of the industry to use an RDBMS and the other to use something else.

    Heh... and to rub it in a bit for the people (and especially the self-proclaimed "engineers") that I've had to interview in the past, I'm thinking that the Engineers that built both the NASDAQ and Amazon each knew how to get the bloody current date and time from the system being used. 😉

    --Jeff Moden



  • I wouldn't put it all on the engineers though. There are plenty of cases where the business process is not defined and therefore a proper model can't be defined. With that, no amount of RDBMS will likely save you. It's true, there are solutions out there that allow you to get away without defining that model or process. That makes it easier for you to push forward without having to define relationships, constraints, indexes and so forth. Choosing those solutions may not come down to just the engineers; it may simply be that it seemed like the best solution for the problem.

    Like you said, if one can do it why can't we all? Is SQL Server really so complicated that one man (or woman) can't duplicate what another company did? In my mind, I don't think so. But there is more to a solution than just the tech. A lot of things have to align to make something that shone for one company work for another, and those things have nothing to do with the whole RDBMS versus NoSQL argument.

  • xsevensinzx - Sunday, January 7, 2018 9:10 AM

    I wouldn't put it all on the engineers though. There are plenty of cases where the business process is not defined and therefore a proper model can't be defined. With that, no amount of RDBMS will likely save you. It's true, there are solutions out there that allow you to get away without defining that model or process. That makes it easier for you to push forward without having to define relationships, constraints, indexes and so forth. Choosing those solutions may not come down to just the engineers; it may simply be that it seemed like the best solution for the problem.

    Like you said, if one can do it why can't we all? Is SQL Server really so complicated that one man (or woman) can't duplicate what another company did? In my mind, I don't think so. But there is more to a solution than just the tech. A lot of things have to align to make something that shone for one company work for another, and those things have nothing to do with the whole RDBMS versus NoSQL argument.

    There's no question that the requirements a system must meet need a proper definition so that the Engineers can design things properly but... I will put it all on the Engineers. 😉 Good Engineers also understand when they don't actually have the necessary requirements (both happy path and the alternatives) and won't put fingers to keyboard unless it's a Proof-of-Principle to help determine the requirements and other possibilities. Certainly, good Engineers won't pass something on just to "get it off their plate" to give the semblance of being "productive". They also know when "good enough" actually isn't, even when there's a looming deadline.

    And, no... I'm not saying that Engineers will/should always choose to use an RDBMS or choose not to use one. I do agree that their job is to figure out what "seemed like the best solution for the problem". The key here is that the choice isn't based on whether or not an RDBMS can handle the problem, nor on the size of the problem. Rather, it's a choice made by a given set of Engineers on how they can handle the problem and, as we can see from the incredibly successful examples of NASDAQ and Amazon, the Engineers for both companies did one hell of a good job on all fronts.

    Here's to all Engineers that say "What if" and "We can"!

    --Jeff Moden



  • When you start talking about over 100,000 nodes to handle data, you have probably scaled beyond the size of an RDBMS.

    412-977-3526 call/text

  • robert.sterbal 56890 - Monday, January 8, 2018 8:43 AM

    When you start talking about over 100,000 nodes to handle data, you have probably scaled beyond the size of an RDBMS.

    So, what does NASDAQ have in that area?

    --Jeff Moden



  • Do you have a link to their architecture? They have a need for ACID that Facebook, Google, and Dropbox do not.

    412-977-3526 call/text

  • xsevensinzx - Saturday, January 6, 2018 9:32 PM

    Jeff Moden - Saturday, January 6, 2018 7:22 PM

    Good article but I really don't understand why you say that relational databases are really only good for small scale stuff.  Is there no one that you would consider "large" that uses SQL Server, Oracle, or some other RDBMS?

    I think he is being relative here. I could be wrong, but that's how I read it: what counts as small in this context may be large for you.

    There are plenty of case studies showing that both SQL Server and Oracle can solve very large problems. However, as the article highlights, when you don't really have a data model or don't attempt to define a proper one, with or without the expertise to tune the tech you have, you're left with a pretty large problem that isn't just about an increase in data size. It could simply be a poor business process. So picking a piece of tech on the assumption that it's easy to use, that your business can easily pivot to it, and that you can just throw large quantities of data at it seems like a good choice until you realize it's not that simple.

    That kind of reaffirms the notion that it may not be a tech issue so much as a business process issue. I.e., maybe your RDBMS could handle that large problem after all; you're just doing it wrong or don't want to do it right. Which is sadly true in a large number of cases, I bet.

    Regardless, it all comes down to using the right tools for the job, with one added caveat: the right tools your team knows how to use. Not what some hot new marketing blog is telling you to use.

    This is really what I meant. Certainly there are lots of large-scale RDBMS systems out there. Verizon or T-Mobile had a huge SQL Server system at one point, and there are more. It's just that it isn't technology that makes or breaks most applications. It's the engineers.

  • robert.sterbal 56890 - Monday, January 8, 2018 8:53 AM

    Do you have a link to their architecture? They have a need for ACID that Facebook, Google, and Dropbox do not.

    I do not.  That's why I asked the question.

    --Jeff Moden



  • Jeff Moden - Monday, January 8, 2018 10:57 AM

    robert.sterbal 56890 - Monday, January 8, 2018 8:53 AM

    Do you have a link to their architecture? They have a need for ACID that Facebook, Google, and Dropbox do not.

    I do not.  That's why I asked the question.

    Here is one article about their setup:

    https://aws.amazon.com/blogs/big-data/nasdaqs-architecture-using-amazon-emr-and-amazon-s3-for-ad-hoc-access-to-a-massive-data-set/

    412-977-3526 call/text

  • robert.sterbal 56890 - Monday, January 8, 2018 11:30 AM

    Jeff Moden - Monday, January 8, 2018 10:57 AM

    robert.sterbal 56890 - Monday, January 8, 2018 8:53 AM

    Do you have a link to their architecture? They have a need for ACID that Facebook, Google, and Dropbox do not.

    I do not.  That's why I asked the question.

    Here is one article about their setup:

    https://aws.amazon.com/blogs/big-data/nasdaqs-architecture-using-amazon-emr-and-amazon-s3-for-ad-hoc-access-to-a-massive-data-set/

    Nice article. I suspect SQL Server (or the relational databases) sits in front of the NASDAQ data warehouse. The Data Ingestion section of the article is where I would expect to see it if he had elaborated a little more. It's probably part of one of the many (30,000) workflow operations he refers to that feed the warehouse. It sounds like they pull this data from their affiliated merchants, and that would explain the diverse inputs and ingestion. A lot of those merchants/affiliates are probably running SQL Server but NASDAQ as a whole isn't. No?

  • It's interesting, I think, that at the same time as procedures for high-volume data flow are being worked on, protocols for low-volume data flows are being worked on too. Zigbee is one that comes to mind. As bandwidth increases and storage costs fall with improved processing speed, I think the scope of relational databases will expand towards both the low and the high end. I guess there will be an argument that the expansion in data at the top end will outpace this expansion.

    Financial transactions are interesting because the variety of attributes is well established and, I would suggest, somewhat limited, and because ACID compliance is so important. I would suggest that, for the near future, this will keep a large arena within the scope of relational databases.

    cloudydatablog.net

  • An interesting view and one I would agree with to some extent. We recently purchased a system based on Progress OpenEdge (a rather niche solution) that has its own 4GL language. Unfortunately, we still have to produce reports, extract data, import data, etc. As we no longer have a development team, the few of us who know SQL use it to perform the day-to-day tasks that the new system cannot provide. The problem is that NoSQL databases are not native to SQL, so performance can be a huge problem, not to mention the amount of time we have to dedicate to getting data out of a system that is not exactly relational and where some SQL relational rules just don't work the same way.

    You also have to add the fact that those kinds of off-the-shelf systems do not always provide the level of functionality that is required, and training people on the new "stuff" that comes in is something managers are not willing to consider. Low cost/budget based on outsourcing is the only factor used to search for and implement solutions for the business.

