Do it all in C# instead?

  • Possibly a bit of a Friday afternoon question.
    I've been working on a product for about 10 years. The architecture consists of one database in which just about everything is done - specifically, validation and processing - and a number of relatively thin "access applications", most of which both write and read. These are written mostly in C#. Although this arrangement has been pretty successful (commercially at least), it's regarded as a bit old school.
    We've been adding new functionality using the existing architecture for some time, but now it has been decided to start adopting a new architecture - one in which the database just provides pretty dumb storage and the calculation is done in a new C# layer. Although it hasn't been decided yet, I imagine this is going to be achieved by:
    1) Reading from the database.
    2) Creating objects.
    3) Getting the objects to do the processing.
    4) Writing back to the database (either one object at a time or batched in some way).
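    Something like this, as a minimal sketch (the table, column and class names below are all invented - this is just the shape of the thing, not our actual schema):

        // Rough sketch of the proposed C# layer: read rows, build objects,
        // let the objects do the processing, then write the results back.
        // All names (dbo.Charge, Recalculate, etc.) are invented placeholders.
        using System.Collections.Generic;
        using System.Data.SqlClient;

        public class Charge
        {
            public int Id { get; set; }
            public decimal Cost { get; set; }
            public decimal Amount { get; set; }

            // 3) The object does the processing (a stand-in for the real rules).
            public void Recalculate()
            {
                Amount = Cost;
            }
        }

        public static class ChargeProcessor
        {
            public static void Run(string connectionString)
            {
                var charges = new List<Charge>();

                using (var conn = new SqlConnection(connectionString))
                {
                    conn.Open();

                    // 1) Read from the database.
                    using (var cmd = new SqlCommand("SELECT Id, Cost FROM dbo.Charge", conn))
                    using (var reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            // 2) Create objects.
                            charges.Add(new Charge
                            {
                                Id = reader.GetInt32(0),
                                Cost = reader.GetDecimal(1)
                            });
                        }
                    }

                    // 3) Get the objects to do the processing.
                    foreach (var charge in charges)
                    {
                        charge.Recalculate();
                    }

                    // 4) Write back (one object at a time here; batching is the obvious variation).
                    foreach (var charge in charges)
                    {
                        using (var update = new SqlCommand(
                            "UPDATE dbo.Charge SET Amount = @Amount WHERE Id = @Id", conn))
                        {
                            update.Parameters.AddWithValue("@Amount", charge.Amount);
                            update.Parameters.AddWithValue("@Id", charge.Id);
                            update.ExecuteNonQuery();
                        }
                    }
                }
            }
        }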
    My question is this: has anybody had any experience of a similar approach and, if so, did it work? Or perhaps - how did it compare to the "do it all in the database" architecture?

  • julian.fletcher - Friday, March 31, 2017 8:17 AM


    Both approaches work, though DBAs generally dislike the C# option because it's harder for them to fix performance problems.

    May I ask what the driver is for changing? Ability to run on multiple DB platforms, perhaps? Sounds like everything works well as it is.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Treating a relational data storage engine as dumb storage wastes time, money and effort. If you don't need and won't use the functionality that the database offers, why use it? Go to NoSQL or just a flat file and be done: less overhead, less cost, etc. On the other hand, if you're going to use the database, have the database do what it does best and let the code do what the code does best. That's how I've always seen successful projects operate. It may be old school, but so is the wheel, ball bearings, indoor plumbing...

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Ah yes, electricity, running water ...
    I think the driver is that the processing aspect is considered to be quite complicated and that T-SQL isn't as good a language as C#, with all of its classes, inheritance, reflection, better development environment and so on. The hope is that development will be easier, as will future improvements, modifications, etc.
    The DB platform isn't an issue as the "new architecture" is going to piggyback on the old one (at least to start with), so there will already be a SQL database there.
    As you can probably tell, it's a bit of a departure from our existing approach and, as such, I was wondering if anybody knew of any likely problems, pitfalls, etc.

  • I think it depends on what the processing is actually doing.  SQL does some things really well, others poorly.  C# does some things really well, and some things poorly.

    Doing it in SQL ensures that the process is not affected by the end user's machine.  That is, what happens if the application crashes halfway through processing the data?  Or if the user's computer processes the data slowly?
    Doing it in SQL also ensures data consistency.  Hypothetically, you pull the data from SQL and start processing it in C#.  I come along, run the app and start doing the same thing before your app finishes processing.  I have now pulled stale data from SQL and started processing it locally.  You then write to SQL as your processing has completed.  My data is no longer valid, but I have no way to know.  I make changes, push them to SQL from the application, and I clobber your changes.
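    (A C# app can at least detect that clobbering with optimistic concurrency - a sketch below, assuming a hypothetical ROWVERSION column on an invented dbo.Charge table - but somebody has to remember to build it into every write path.)

        // Sketch of optimistic concurrency: the UPDATE is conditional on the
        // row version we read, so a clobber becomes a detectable failure
        // instead of a silent one. Table and column names are invented.
        using System;
        using System.Data.SqlClient;

        public static class OptimisticUpdate
        {
            public static void Save(SqlConnection conn, int id, decimal amount, byte[] rowVerWhenRead)
            {
                using (var cmd = new SqlCommand(
                    "UPDATE dbo.Charge SET Amount = @Amount " +
                    "WHERE Id = @Id AND RowVer = @RowVer", conn))
                {
                    cmd.Parameters.AddWithValue("@Amount", amount);
                    cmd.Parameters.AddWithValue("@Id", id);
                    cmd.Parameters.AddWithValue("@RowVer", rowVerWhenRead);

                    // Zero rows affected means someone else changed the row since
                    // we read it - surface the conflict rather than overwrite.
                    if (cmd.ExecuteNonQuery() == 0)
                        throw new InvalidOperationException(
                            "Row was modified by another user; re-read and retry.");
                }
            }
        }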

    Now, let's throw all of that into SQL for the processing.  You start the C# app, which calls the stored procedure; it does a BEGIN TRANSACTION and grabs a lock on the table it is going to be working with.  I run the same app and it just sits and waits because yours is running.  Yours completes, the lock gets released, my version pulls the data, and I have fresh, live data to work with and no clobbering.
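    (From the application side, that whole flow is just a stored procedure call - a sketch, with an invented proc name:)

        // Sketch of the serialised flow above: the C# app calls a stored
        // procedure (hypothetical name dbo.ProcessCharges) and the locking,
        // transaction and processing all happen inside the database.
        using System.Data;
        using System.Data.SqlClient;

        public static class SqlSideProcessing
        {
            public static void Run(string connectionString)
            {
                using (var conn = new SqlConnection(connectionString))
                using (var cmd = new SqlCommand("dbo.ProcessCharges", conn))
                {
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.CommandTimeout = 0; // wait indefinitely if another caller holds the lock

                    conn.Open();

                    // Inside the proc (T-SQL, sketched as a comment):
                    //   BEGIN TRANSACTION;
                    //   SELECT ... FROM dbo.Charge WITH (TABLOCKX, HOLDLOCK);
                    //   ... set-based processing ...
                    //   COMMIT;
                    // A second caller blocks on the lock until the first commits.
                    cmd.ExecuteNonQuery();
                }
            }
        }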

    I think it is one of those "pick the right tool for the job" questions.  Do you (or will you) have multiple concurrent users attempting to manipulate the data?  If so, do you have some method to "lock" the data so you don't clobber it?  Or does clobbering not matter?

    And it depends, too, on what you consider "quite complicated" in SQL.  In SQL, there are a lot of different ways to perform the same tasks.  The same applies to C#.
    Another thing is what your work practices are.  Where I work, the DBA (i.e. me) doesn't like it when people run dynamic SQL from a C# application, as it makes a bunch of one-use query plans on the server.  Stored procedures are the recommended method unless there is a good reason not to use them (and I can't think of one), as they are compiled and reused and, in the event something starts running slowly, I can easily verify whether it is the SP that is slow or the application.  Ad-hoc queries are a lot harder to optimize from the DBA side of things.
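    (From the C# side, the difference looks like this - a sketch, with invented table and proc names:)

        // Contrast between ad-hoc concatenated SQL and a parameterised stored
        // procedure call. The table and proc names are invented.
        using System.Data;
        using System.Data.SqlClient;

        public static class QueryStyles
        {
            // Bad: concatenated ad-hoc SQL. Every distinct literal compiles to
            // a new one-use plan in the cache (and it invites SQL injection).
            public static SqlCommand AdHoc(SqlConnection conn, int clientId)
            {
                return new SqlCommand(
                    "SELECT * FROM dbo.Charge WHERE ClientId = " + clientId, conn);
            }

            // Better: a parameterised stored procedure call. One plan, compiled
            // and reused, and easy for the DBA to find and tune on the server.
            public static SqlCommand ViaProc(SqlConnection conn, int clientId)
            {
                var cmd = new SqlCommand("dbo.GetClientCharges", conn)
                {
                    CommandType = CommandType.StoredProcedure
                };
                cmd.Parameters.AddWithValue("@ClientId", clientId);
                return cmd;
            }
        }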

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • It really depends on what processing they are trying to move from the DB to the application; an all-or-nothing approach is probably the wrong idea.  There are certain strengths of a DB that would be hard to replicate in the application, and vice versa.  The strengths of a DB are things like enforcing constraints (relational, data types, etc.), quick data access (indexes), backups, robust transactions and batch processing; certain reports and aggregations also work well in the DB.  Trying to do all of those things at the application layer by, say, pulling in entire data sets makes no sense.

    On the other hand, things like persisted interaction with an end user and presentation work much better in the application.

  • I'm curious.  What is this supposedly complicated process?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've seen what it looks like to have everything done in the application tier and treat SQL Server like dumb storage.  The client projects become large and suffer a bit in the area of maintainability, but they can certainly be managed by developers who know what they're doing. The downside is in the area of performance.  The queries get slower and slower until stored procedures are written to perform some of the processing.  Done properly, the performance improvements have been huge.

    All that being said, I have to agree with what BMG said - it comes down to picking the right tool for the job.  SQL is superior for some tasks and .NET is superior for others.  I use both in my applications.  If you focus entirely on one part (.NET or SQL) and ignore the other, you may miss performance tuning opportunities in the other.  You should also consider keeping people with the skills in house to address performance in both areas.

    You should also ask why, after 10 years of working on a product, such a fundamental change is being considered.  Is there a problem with scale?  Is the project no longer manageable?  Is it hard to find good T-SQL programmers?  Is there one process that's driving the question?  Or am I completely missing the point?

    The last thing to consider is the cost and time required to rewrite and test everything in .NET.

    I don't think there's a clear-cut and easy answer here.

  • julian.fletcher - Friday, March 31, 2017 9:44 AM


    If it's just down to calculations, etc., yeah, T-SQL sucks and there are lots of better ways to take care of that type of thing. It just didn't sound like that's all you were talking about. If we're just talking about data retrieval & storage being done in the database and then calculations & reporting being done externally, yeah, that's very common and generally the right way to go about stuff. The only thing to watch for is a tendency for people to think they need ALL the data to do calculations so you end up with SELECT * queries with no WHERE clause and everyone complaining that "the database sucks" when in fact it's what they're doing to it.
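    (In other words, make the app ask only for what it needs - a sketch, names invented:)

        // The complaint above, in code. Table and column names are invented.
        using System.Data.SqlClient;

        public static class FetchExamples
        {
            // The anti-pattern: drag the entire table over the wire and
            // filter it in C#.
            public static SqlCommand EntireTable(SqlConnection conn)
            {
                return new SqlCommand("SELECT * FROM dbo.Charge", conn);
            }

            // The fix: let the database filter and project first.
            public static SqlCommand JustWhatYouNeed(SqlConnection conn, int clientId)
            {
                var cmd = new SqlCommand(
                    "SELECT Id, Cost FROM dbo.Charge WHERE ClientId = @ClientId", conn);
                cmd.Parameters.AddWithValue("@ClientId", clientId);
                return cmd;
            }
        }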

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • julian.fletcher - Friday, March 31, 2017 8:17 AM


    Who told you there is anything new in this kind of architecture?
    It was new to me in 1998-1999, when I moved from MS Access to Informix and then to SQL Server.
    Mostly because I did not have much experience with any kind of architecture at the time. 🙂
    As far as I know, the whole success story of Oracle was about the advantages of that "old school" in-database SQL processing compared to the "dumb storage" approach used in most pre-Oracle data applications.

    Since then I have had extensive experience with projects using a similar approach.
    And the only reason I've been involved in those projects is that they wanted me to ditch the approach and develop proper data processing - processing which would not need the whole night plus a couple of hours in the morning to complete, as the "effective" front-end code did.
    Basically, for 10+ years I was doing just that - taking front-end code (VB6, Java, C++, etc.) and implementing equivalent functionality in T-SQL.

    Sure, C++ is way more effective than T-SQL in calculations, especially in string processing, but it's not the calculations that take all of the time in data processing.
    Filtering the data, grouping it, mapping, joining the right recordsets to each other, figuring out the most effective way of fetching the required data - that's what takes all the time, and that's where SQL Server does not leave any space for competition.

    Pretty soon you're gonna hit the wall and have to choose between two options:
    - if you're capable of admitting you were sold cheap c..p about "the new architecture", you move back to in-database processing, probably hiring a professional DB developer who will implement proper data processing, not just translate the C# code into T-SQL syntax;
    - if you're not capable of that, you're gonna move to an even newer architecture, namely "Big Data", and pay through the nose: per storage unit, per CPU clock, per query, etc.

    _____________
    Code for TallyGenerator

  • Thanks for all of your replies and their many follow-up questions. Here are answers to some, as best I can.

    I'm curious. What is this supposedly complicated process?

    It’s all to do with how people are charged for the various services they receive in the area of social care. The charges depend on the cost (naturally) but also on their ability to pay. It’s all regulated by the government so, needless to say, it’s terrifically complicated.
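    (As a caricature only - the real statutory rules are far more involved, and the names and numbers here are invented - the core shape is something like this:)

        // Deliberately oversimplified caricature of a means-tested charge.
        // The real rules are far more involved; everything here is invented.
        using System;

        public static class ChargeCalculator
        {
            public static decimal WeeklyCharge(decimal weeklyCostOfCare,
                                               decimal assessedAbilityToPay)
            {
                // Charge the cost of the service, capped at what the person
                // has been assessed as able to afford.
                return Math.Min(weeklyCostOfCare, assessedAbilityToPay);
            }
        }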

    Who told you there is anything new in this kind of architecture?

    Nobody at all! I’m simply saying it’s an approach which is new to our particular product.

    Doing it in SQL ensures that the process is not affected by the end user's machine.

    The “new” architecture will be one in which the processing (now implemented in super duper C#) is done in a single, central Web Service which is accessed by the various end user machines, so this particular problem shouldn’t be an issue.

    You should also ask why, after 10 years of working on a product, such a fundamental change is being considered. Is there a problem with scale? Is the project no longer manageable? Is it hard to find good T-SQL programmers? Is there one process that's driving the question? Or am I completely missing the point?

    Not at all, and these are very good questions. We’re certainly finding it difficult to recruit T-SQL developers. It’s also felt that automated testing will be far, far easier in a C# environment. And that the C# development environment and language are generally superior. There is also the question of performance, which is an ongoing, if intermittent, issue with the existing product for some customers.
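    (That testing point is easy to illustrate - once the calculation lives in a plain C# class like the caricature ChargeCalculator above, a unit test needs no database at all. NUnit is assumed here:)

        // Sketch of a unit test against the (invented) caricature calculator.
        // No database, no test data scripts - just a plain method call.
        using NUnit.Framework;

        [TestFixture]
        public class ChargeCalculatorTests
        {
            [Test]
            public void Charge_Is_Capped_At_Ability_To_Pay()
            {
                var charge = ChargeCalculator.WeeklyCharge(
                    weeklyCostOfCare: 120m, assessedAbilityToPay: 50m);

                Assert.AreEqual(50m, charge);
            }
        }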

    Apologies if I’ve been a bit vague. I was just interested in people’s thoughts on the idea generally – I don’t particularly want to get in to the nitty-gritty details as that could lead down any number of rabbit holes!

  • Another option to consider is putting the calculations into CLR within the database. Again, not knowing the full extent, this is just a suggestion. I do know of an investment firm that has done this extremely successfully. The CLR within the database outperformed both T-SQL and native C# code in the client since they basically crossed the streams and got the best of both worlds. Again, not saying this is THE solution, but just another wrinkle in the puzzle.
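    (For the unfamiliar: a SQLCLR scalar function is just an attributed static C# method, deployed to the database with CREATE ASSEMBLY and CREATE FUNCTION ... EXTERNAL NAME, and then callable from T-SQL like any built-in function. A minimal sketch - the function and its rule are invented:)

        // Minimal SQLCLR scalar function sketch: C# calculation logic compiled
        // into an assembly and called from T-SQL. The rule itself is invented.
        using System.Data.SqlTypes;
        using Microsoft.SqlServer.Server;

        public class ChargeFunctions
        {
            [SqlFunction(IsDeterministic = true, IsPrecise = true)]
            public static SqlMoney WeeklyCharge(SqlMoney weeklyCostOfCare,
                                                SqlMoney assessedAbilityToPay)
            {
                if (weeklyCostOfCare.IsNull || assessedAbilityToPay.IsNull)
                    return SqlMoney.Null;

                // Same caricature rule as above: cost capped at ability to pay.
                return weeklyCostOfCare.Value < assessedAbilityToPay.Value
                    ? weeklyCostOfCare
                    : assessedAbilityToPay;
            }
        }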

    By the way, it really does sound like a shop that has had some database performance issues, possibly related to fundamentals, and is now running away from the database as a "bad thing." Not saying it is at all. Just sounds a bit that way from what you're describing. Decisions based on "oh, the database is just a pain in the bottom" are seldom good decisions. I've seen projects fail because of that. Just passing it along.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • For all the talk about C# being so superior, I have to question the judgement of anyone who perceives this to be so.   Let's take a closer look at some inconvenient facts about the language:

    1.) It is NOT AT ALL GOOD at what I'll refer to as "SELF-DOCUMENTING".   The average corporate non-programmer will have pretty much NO IDEA what a segment of C# is doing, because the language just doesn't lend itself to that.   This matters because the average C# programmer will likely have the exact same problem, so you lose one good developer and you could well be in a world of hurt.   And I'm talking about car accidents and heart attacks as opposed to someone quitting or getting fired by a jerk of a manager (not that those things don't happen, but at least those are things you could at least have a shot at controlling).

    2.) Number 1 is exacerbated by so-called C# experts, who then program what I'll refer to as "ROCKET SCIENCE" code that only they understand, as this almost always involves the most arcane aspects of the language.   Lose that programmer or consultant, and maintainability takes an instant nose-dive to zero.   Mind you, I've seen SQL developers do the same thing with SQL, but much less often, because most places have some level of database coding standards enforced.  Such standards often disappear in the C# world.   Especially when deadlines are looming.

    3.) When it comes to "complex health-care" stuff, it's rarely actually all that complicated, and what is usually true is that the database was never actually fully and properly designed to handle what it's supposed to handle.   The complexity usually only arises BECAUSE the database just isn't designed to deal with it.   When you take the database element and reduce it to just "dumb storage", then you probably also lose out on most of what a database can do, so frankly, this idea of doing mostly C# says that management thinks this kind of thing is far easier than reality will eventually prove.   Unfortunately, by the time they realize it, it may be too late.   Take the money out of where it really should be spent, and you can count on reduced functionality, reduced performance, and eventually, someone else produces a better product at a lower price.

    My opinion...   Leave the C# on the side of the road where it belongs.   You want good data processing, you let the database handle what it's good at, and the user interface handle the rest.   When you don't do it that way, your product and your reputation as a company WILL SUFFER.   And it's not a question of if, but of when.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)
