April 9, 2024 at 2:18 pm
Just looking for some background on whether anything has changed around "code first" for building schema. I've always preferred designing first, then building the schema (usually in a database project) and handling schema changes that way. I'm now in a position where, on one project, the developers appear to have picked EF and code first for all of their schema. I imagine this means the C# devs will handle all schema changes and migrations themselves using EF? I'm also thinking about how best to support them, and with my DBA hat on I expect I should at least look at how the server caches ad-hoc query plans, since they aren't building out stored procedures or anything. Everything seems to run out of their app, and it looks like a basic CRUD workflow.
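If what I'm after is SQL Server's instance-level "optimize for ad hoc workloads" option (my assumption about what applies here), the change itself is just sp_configure:

-- Sketch: enable "optimize for ad hoc workloads" at the instance level.
-- With this on, the first execution of an ad-hoc statement caches a small
-- stub instead of the full plan, so single-use plans don't bloat the cache.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;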
Thoughts?
April 9, 2024 at 10:00 pm
Personally, I would avoid EF for database design. Use SSMS, a whiteboard, and a DBA (or two). That way you won't end up with over-normalized data (I once saw someone normalize a date column: the date plus an ID went into its own table, and everything else that used dates just held IDs pointing back to it) or under-normalized data (that's how you end up with tables that have 255 columns). EF and code first are fine for prototypes and proofs of concept, but for production-ready work I much prefer an expert getting their hands dirty over an automated tool (EF) making a "best guess". I've seen projects that used EF for the database work complain about database performance when it had nothing to do with the database being "slow"; it was EF (or similar tools) building bad stuff.
For example, EF is very good at generating a pile of ad-hoc queries against a database. Each of those queries has to be compiled and run through the optimizer to produce a new query plan; if you can reuse a plan instead, things get faster. On "slow" systems (not many DB hits per second) you may not notice the difference, but once you scale up to millions of requests per second, plan reuse is a very nice feature to have. And even on "slow" systems, if a query plan takes a long time to build, compiling it over and over is a waste of resources.
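You can see this on the project's server with the standard plan cache DMVs. Here's a rough query (nothing project-specific in it) that lists cached ad-hoc plans and how often each was reused; a pile of usecounts = 1 entries with nearly identical statement text is the classic EF ad-hoc signature:

-- Sketch: list cached ad-hoc plans and their reuse counts.
SELECT cp.usecounts,
       cp.size_in_bytes,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Adhoc'
ORDER BY cp.usecounts ASC;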
It's like all those "no code" tools out there: they work great for prototyping, POCs, and small scale, but once you start scaling up, you quickly realize how much better the code can be if you throw an expert at it.
IF the project is definitely going live (and a prototype or POC is not needed), then I would NOT do code first or EF. I would start with the paperwork side: data flow diagrams, process maps, whiteboarding, required data points, brainstorming sessions, documentation, requirements docs, spec docs, and standards documentation (if you are building on top of other standards). Once you know what you are building and which data points you need to capture, start thinking about the database schema and how you want to structure, link, and normalize the data. I've seen some crazy designs where the developer put code first and feature creep caused odd database decisions, such as table A, table B, and a "lookup" table (table C) just to map data from A to B. Had they done schema-first design and locked the feature set once coding started, a single foreign key constraint between A and B would have done the job (see the sketch below). We have also had cases where table A has a 1:1 relationship to table B (one row in A maps to exactly one row in B) just to store one or two additional data points that would make more sense sitting in table A.
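To make that lookup-table example concrete, here is a sketch with made-up tables (the names and columns are hypothetical): the foreign key column on TableB replaces the separate A-to-B mapping table entirely.

-- Sketch: the FK on TableB makes the "lookup" table (table C) unnecessary.
CREATE TABLE dbo.TableA
(
    TableAId  INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_TableA PRIMARY KEY,
    SomeValue NVARCHAR(100) NOT NULL  -- hypothetical payload column
);

CREATE TABLE dbo.TableB
(
    TableBId   INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_TableB PRIMARY KEY,
    TableAId   INT NOT NULL
        CONSTRAINT FK_TableB_TableA FOREIGN KEY REFERENCES dbo.TableA (TableAId),
    OtherValue NVARCHAR(100) NOT NULL  -- hypothetical payload column
);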
What is REALLY hard is re-designing the database after go-live. Optimizing the database as you approach go-live is a challenge too, since nobody wants to touch application code that close to release except for bug fixes, so with code-first approaches the bad database design tends to hit production. Imagine you have built the code to pull data directly from a table, and then the DBA tells you the table is too wide and inefficient, needs to be normalized, and hands you a list of new tables to use instead because they are about to change the table definition for performance reasons. Or they hand you a stored procedure or view that changes the column order or names, and your application starts failing. Code first usually means the database is frozen once the objects are created, and your DBA will get blamed when things are slow because EF decided everything should be NVARCHAR(MAX) and performance tanks, or you get errors because a row grows too big (too much data in a single row).
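One quick sanity check a DBA can run against a code-first database (standard catalog views, nothing EF-specific): find every NVARCHAR(MAX) column, since max_length = -1 is how sys.columns flags the MAX types.

-- Sketch: flag NVARCHAR(MAX) columns on user tables in the current database.
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS SchemaName,
       OBJECT_NAME(c.object_id)        AS TableName,
       c.name                          AS ColumnName
FROM sys.columns AS c
JOIN sys.tables  AS tb ON tb.object_id = c.object_id
JOIN sys.types   AS t  ON t.user_type_id = c.user_type_id
WHERE t.name = 'nvarchar'
  AND c.max_length = -1   -- -1 marks the MAX types
ORDER BY SchemaName, TableName, ColumnName;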
Just my 2 cents. I know a lot of people like EF (and similar tools); as a DBA, it has always left a bad taste in my mouth, much like hard-coded SQL inside application code. Applications should call stored procedures, not access the data directly. That gives the DBA a chance to tune things, and it also means schema-level changes can happen without impacting the application: update the SP, and as long as the result is the same, the application doesn't care.
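As a sketch of what I mean (the procedure, tables, and columns here are invented): the app only ever calls the procedure, so the tables behind it can be split, renamed, or re-indexed later, and the app never notices as long as the result shape stays the same.

-- Sketch: the application calls this procedure, never the tables.
-- The tables behind it can be refactored later as long as the columns
-- returned here keep the same names, types, and order.
CREATE PROCEDURE dbo.GetCustomerById
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT c.CustomerId,
           c.FirstName,
           c.LastName,
           c.Email
    FROM dbo.Customer AS c
    WHERE c.CustomerId = @CustomerId;
END;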
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.