April 11, 2008 at 6:33 am
It looks like we might be facing a large project shifting over to using ORM methods through nHibernate. I'm trying to get a read from the database community on what exactly I should expect in terms of issues, challenges and headaches during the development process. I'm also interested in any long term maintenance issues, troubleshooting problems, etc. If your developers implemented ORM all the way down to storing object data on the database in a non-normalized/object oriented fashion, how did that affect you? Did it muck up reporting? What benefits did you realize on the database side of the house?
I'm really looking for real-world, hands on information. Complaints or speculation about how stupid a lowest common denominator set of dynamic queries might be... well, I've got that complaint well in hand. I need as much hard data as I can collect so that I communicate enough information to my boss, his boss and his boss in order for them to make informed decisions about this and go into it with their eyes fully pinned open.
Any and all help in this area would be appreciated.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 15, 2008 at 7:28 am
It all depends on requirements, database model and frontend tool.
If you're porting an existing project to ORM, you should first create low level objects that match database objects and implement database storage then use them to build higher level objects.
For example: customer can be broken to several several sub-entities as needed like bare data and addresses. Address is a low level object, a customer can have more than one (snail mail, email, phone,...) so it naturally fits on address detail table. Another object is customer name - it may change, so you might need history, so if used with an old invoice, it can pull the customer name active at that time, otherwise it uses current name.
Objects can just fine fit into a normalized data model, but need not to and details are encapsulated on low level, so users/developers don't care much about storage implementation.
Some ORM use database as storage only, no foreign keys, no triggers, no stored procs. No database logic, all logic in application. I'd recommend against this. It gives you the flexibility to change the backend database without hassle, but that's the only advantage. In this scenario all database access can be done exclusively though object model, which can have serious performance impact on bigger transactions like reports.
So, how do reports perform, depends solely on database design. You can design the database model and object model on top of it, or design objects and let them design the database. I prefer and recommend the first. It has some portability issues depending on used servers, but ensures optimal work regardless of going through object model or not.
Ops. Here I assumed ORM to be "Object-relational mapping". Google reveals it's the acronym for many other things. Also reveals there's a lot of info in wikipedia.
Hope it helps.
April 15, 2008 at 7:43 am
Thanks for the reply. Yes, ORM, in this case, meant Object Relational Mapping.
We are talking about new projects and new databases. Although actually, we've had databases and projects running, but the development architects have decided to toss it all in favor of this new approach.
Your statements coincide well with the research I've done so far. Do you have anything more quantifiable about problems with performance because of the object designed database? I'm not questioning you on that issue at all. I'm assuming it to 100% true. I just wanted to be able to present some math when we sit down for our discussions on this. Same question applies to the issues around reporting.
Do you have any other quantifiable problems around code maintenance, database maintenance, production troubleshooting...? Something I haven't thought of?
Thanks again for the response.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 15, 2008 at 9:34 am
No, not without specific questions.
I could write a book about this and still not add something you haven't thought of.
ORM is excellent for models that don't fit well into matrices (tables).
Since you mentioned report performance, which are usually lists or tables, I'd guess your model is not of this kind, so ORM only logic is not recommended.
At first, you should balance ORM and database logic, leaving at least full integrity in database. If it later proves inflexible, you can move parts of it or all to ORM. The other way is a lot more difficult. If object code does not perform well, you can bypass it and access database directly. OO zealots will strongly oppose this, but if database takes care of its integrity, you can do it safely. If ORM takes care of integrity, you can't do it.
It's just like in OO languages when you have to access external non-OO API.
Code maintainability is simpler with ORM, as access to database is encapsulated in one layer, so, if you change/add/remove a column, application needs a change in one place. In classic app you can have several queries and code places affected by a change like this. In short, it's like discussion of OOP vs. PP, OOP is better, simpler, but requires a lot more planning and coding discipline. Planning phase gets significantly shorter with experience.
April 22, 2008 at 1:57 am
Seems like nobody is able or willing to share some hard experience. What was your decision/rationale with your boss?
April 22, 2008 at 5:46 am
Thanks for getting back.
At this point, to everyone except the development architect, we're still gathering information prior to making a decision. My two biggest issues come down to 1) not making the database "stupid" and 2) not eliminating the ability to troubleshoot this once its in production. The dev guy wants to make the database structure match the object structure so that his mapping code is minimized. That's just contrary to all my experience and most of the research I've done. We may have a fight there. The other concern is a bit more esoteric, but it's been my experience that when there is a TSQL issue, I'm able to get a fix out to production a heck of a lot faster than the developers are able to rewrite & deploy their code.
I've come up with a list of concerns/issues/questions that we'll be presenting when we start the final meetings prior to a decision. Have a look and please feel free to comment (formatting was a bit of a hash after the paste):
• Scalability Performance and Tuning
? I/O increase due to "chattier" applications
§ Most documentation indicates a lot more, smaller transactions, not to mention the possibility of frequent requests to verify structure (the app checking to see if the database has changed) prior to generating & running queries.
§ Mitigation is to ensure appropriate configuration & use of nHibernate. Monitoring can be done with Profiler
? I/O increase due to loading larger data sets more frequently
§ Ensure the use of "lazy" collections to reduce data moved
§ No other mitigation possible.
? General performance issues due to "generic" procedures using less efficient access methods
§ Generic data types used in queries can lead to indexes not being used
? Zero possibility to tune queries in a production setting
§ Any, all, changes require re-coding and re-deployment. There are no methods available for database only tuning except applying an index or forcing parameterization.
• Object-centric approach to building databases
? Reporting
§ This includes transactional level reporting as well as moving data between a model driven design and a more flexible design (normalized or star schema or whatever) that better supports reporting. Coding time reduced on the front end is, to some degree, tacked on to the back-end.
§ A data cleansing mechanism may be required.
§ Redundancy of data an authoritative sources for data may require some refinement.
? Data integrity
§ Generated structures are dependent on discovery to determine the proper constraints required on the data, or all constraints are assumed to be in the code.
§ Without integrity maintained with data the possibility of "dirty" data is increased ("USA", "U.S.A.","US" all values entered through app).
• Security
? We will have to give over full read/write privileges at the table level to the application. I think, it's possible, we'd have to give it 'dbo' in production. Based on a few statements in some of the research, it's even possible we'd have to give it 'sa' (although that is completely unproven currently).
• Integration with other systems at the data level
? Depending on the application, this may not be required. But if it is required at any point, it will entail a larger than normal effort to convert the data to a more normalized structure.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 22, 2008 at 11:02 am
I get the feeling that you're not sure what are you going into.
As I mentioned, it's like shift from procedural programming to OOP. You build several layers.
For example in windows programming, you have OS layer encapsulated in API, API is encapsulated on higher layer like MFC (or VCL or .net). Few programmers need to know inner details of MFC, even less about API, they're mostly MFC users. If hardware changes, there are some changes needed on OS layer, maybe some on MFC layer and virtually none on higher layer. So, it increases overall complexity, but hugely decreases it for each layer. It also adds some overhead, ie direct access to hardware can be a lot faster.
Moving analogy to database, ORM is MFC, you build hardware (database) and OS (database mapping). The majority of programmers focus on putting together objects, higher level logic and GUI.
Reports would need direct access to "hardware" for optimal performance, unless ORM has something like DirectX.
April 22, 2008 at 11:16 am
Damn straight I'm not sure what I'm getting into.
I recognize the issue that ORM tools are attempting to solve. I acknowledge that the various tools I've investigated do a pretty good job solving it (to a degree). What I'm having serious trouble getting straight in my head, as a dba, not a developer, how does this change my life? I'm surrendering stored procedures and granular security. Some of that isn't an issue. Performance, for example is identical between a parameterized query and a stored procedure. But, if there is a tweak available to the TSQL that will improve performance when it's problematic, I can apply it to the procedure, but I can't to the query. That concerns me.
I'm being asked to surrender relational data design. Yet, I'm still required to integrate this data with the rest of the enterprise, provide reporting, yada-yada-yada. As near as I can tell, we're just shifting a ton of work into other spaces in order to reduce development time by 10-20%. I just need to understand if that shift is 1-1, for an hour of developer work saved, we're adding an hour to data migration work or if it's 1-.5 or 1-5 or worse?
So,yeah, I'm not sure what I'm getting into. That's why I've been trying to get someone, anyone, to point me to a white paper or case study or decent set of blog entries, about a reasonably large scale application's development, reporting and maintenance over a period of time. I keep finding all kinds of "boy, it sure made development SOOOO easy" stories. That's nice, but development is only part of the pie in a functioning enterprise, especially one that is focused on something other than developing applications.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 23, 2008 at 6:28 am
Speaking as one who's normally on the developer side of this question, but also does some database development, I don't think ORM has to be as scary as it seems. The big thing is the attitude of the development group. If they want objects to rule the world and have no consideration for relational data constructs, then yes it could be scary.
OTOH, using the example you reference which is NHibernate/Hibernate, these products were developed to work with a very wide range of relational schemas and are very flexible. If your develop team is rational (their attitude, not the methodology), then they understand that the data must be stored relationally and therefore designing a solid relational store is incredibly important to the long term health and performance of their application. I'll be a lot of ORM failures come down to not properly considering this.
If you're developing from scratch, then you really shouldn't have a problem. At times ORM can be difficult to sit on an existing database that was designed without any consideration of OO concepts (in that scenario I'd highly recommend iBatis.NET, though N/Hibernate is very flexible), but if you are designing a database from scratch it's easy to design a database that is solid from a relational perspective as well as friendly to being translated into an OO world.
From my experience, NHibernate/Hibernate does not in any way prohibit the database from being well designed. You will run into certain things (typically minor) where designing the table a certain way will make the development team's life much easier because it flows more with the default N/Hibernate way of thinking, but remember that there are a lot of people at an enterprise level using this product and those people can't get away with bad relational designs. In fact, google just contributed a module to Hibernate to allow sharding so that tells you they are using Hibernate with some incredibly large datasets.
On nice thing about a good ORM is that the queries will be consistent. This should allow you to profile the app and index pretty effectively. In addition, Hibernate can log all the queries to show you what it's doing. What's nice about N/Hibernate also is that it's hql language basically lets the developers define a query using sql syntax but using their objects in the query. (A little like linq, but looks much more like real sql). Basically if you have a particularly inefficient query for some reason, you can use hql to express the query like sql thereby forcing the ORM to query they way you want to hit an index or something. Also, I'm assuming this has made it's way into NHibernate, but Hibernate has the ability to define a query and then run it so you can manually define your 1 or 2 odd queries or execute a stored proc to do some large data manipulation that is not as efficient in the app layer.
The other thing is the "one size fits all" mentality. ORM is excellent for more transactional applications, but does not always fit a datawarehouse type scenario. It might might sense to code 90% of the app using ORM and then use something else for reporting or warehousing. This would allow you to develop the tables differently as well, since reporting type tables are typically designed very differently from transactional ones. This is something where the dev team might need to give to understand that the reporting type data needs to be structured differently and they might need to write a little sql, using a reporting tool, or consider an ORM that is more flexible like iBatis.NET.
I think if the db architects and the development architects will work together and both agree to consider the other's needs, this won't be that big a deal. Developers need to realize at the end of the data they are acting on relational data even though that's being abstracted for them. Must the same as though they don't need to do memory management in .NET or Java, they still need to think about what their code is causing the runtime to do behind the scenes. Dba's need to understand the huge benefits that ORM offers the coders and work with them to model their data in a relationally efficient as well as ORM efficient (i.e. minimum amount of mapping) way. If both sides stay involved through the whole process then nobody gets blindsided. Also, if either side is willing to give on some things (for example, don't quote me on this but IIRC, Hibernate really likes Identity/sequence type primary keys over natural keys although it handles both) then the result can be positive for both sides.
I'll bet a lot of the "ORM is slow, Hibernate stinks" comments come from developers who never considered the relational impacts of their object design. Those same developers then produce things that make dbas thing "this ORM stuff is junk, I can't support a database like this." ORM is a huge help, but you still have to understand what your ORM is doing behind the curtain. If you do, you can get the code benefits and still have a nice, well performing database.
The SQL vs. stored procs argument is far too deep for me to get into, but I can say I used to be a stored procedure zealot, but over time I've softened up to using stored procs mostly only where manipulation on large sets of data was needed and it didn't make any sense to do any logic out of the database and allowing all other logic to live in the application. As time has gone by, that has ended up making a lot more sense to me. That's just my opinion though, I don't want to hijack your thread into a dynamic sql vs stored procs flame.
I also don't see how any decent ORM would require dbo privileges much less sa. Sa would be insane. I'm assuming many just give dbo to the user because that's an easy way to make sure it has access to all objects in the database. I know to some dbas table access is scary, but that seems a little draconian to me. The application is going to get at the data one way or another. Creating walls in front of the data is not always justifiable IMHO. One nice thing is that when you are using an ORM that you are comfortable with you at least know the tool is generating reasonable sql rather than having a developer code everything hit or miss by hand.
Anyway, hope my ramble was of some value.
April 23, 2008 at 6:52 am
Excellent points. Very well delivered. You've summarized where I find myself. I think the only real remaining issue I have with my team is that they do want to toss relational data design and I just can not get behind that. Thanks for the feedback. It was very helpful. Interesting information about Google too.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 23, 2008 at 6:55 am
If they really want to toss relational design, tell them they need an object database.
Then tell them when they find one that can perform as quickly as Oracle/SQL Server, etc you'll be happy to consider supporting it.;)
January 2, 2009 at 12:45 pm
Hi Grant - I just came across this discussion and was wondering how your project is going.
I'm on a team that is reengineering a very large application (was overgrown 2tier RAD tool based and relational, lot's of very complex procedures, lacking documentation, etc). We will be redesigning the DB and application from the ground up and using NHibernate at the core of the DAL. I share some of your concerns. But I'm not too worried.
I'm the data architect and have worked with the app architects before, elsewhere. On our former projects we've always enjoyed a nice balance of respect between the DB design and the OO needs for the application. Our database design is a first class citizen, as is the object model. For us, data is the most valuable asset and it needs to have integrity and quality. It will be used by downstream systems like enterprise reporting and out external customers.
We work together as a team on a continuous basis; no over the wall or ivory tower mentality from either side. That makes all the difference in the world.
So, what have you learned from your ORM project so far?
January 2, 2009 at 1:03 pm
JoeA (1/2/2009)
Hi Grant - I just came across this discussion and was wondering how your project is going.I'm on a team that is reengineering a very large application (was overgrown 2tier RAD tool based and relational, lot's of very complex procedures, lacking documentation, etc). We will be redesigning the DB and application from the ground up and using NHibernate at the core of the DAL. I share some of your concerns. But I'm not too worried.
I'm the data architect and have worked with the app architects before, elsewhere. On our former projects we've always enjoyed a nice balance of respect between the DB design and the OO needs for the application. Our database design is a first class citizen, as is the object model. For us, data is the most valuable asset and it needs to have integrity and quality. It will be used by downstream systems like enterprise reporting and out external customers.
We work together as a team on a continuous basis; no over the wall or ivory tower mentality from either side. That makes all the difference in the world.
So, what have you learned from your ORM project so far?
Hello,
Sounds like heaven. Where do I apply?
So far... They started coding in November, but have yet to do anything that touches a database. We haven't seen any of the generated code or structures yet. So far, I don't have anything to complain about, but then, so far, I haven't seen anything.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 2, 2009 at 1:32 pm
We're in Minnesota, so this time of year the term "heaven" is debatable.
Sounds like you've still got a bit of "us vs. them" going on... they might be chucking a database design over the wall at you.
For what it's worth, we're generating our persistent object model from the database design (with plenty of app team input) which keeps things in pretty good alignment and helps to ensure that we can migrate the existing data from the legacy system.
Good luck!
January 5, 2009 at 5:53 am
It sounds like you're taking the approach I would have expected based on all my readings on this topic. Logical and straight forward, playing against the strengths of each technology. Makes sense. Unfortunately, a sub-set of the industry is taking the approach our dev team is taking.
Yes, I'm standing on the other side of a wall with a net and I'm hoping to catch the database as it flies over.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply