March 2, 2016 at 1:18 am
I have created this post based on a discussion in my original question A column to identify one record? that went in a completely different direction and rightly so.
As this subject implies I am trying to model Departments, Teams, Roles and Services for a database that will have many clients. By that I mean I am not modelling a specific organisation's structure but for multiple, different organisations. These tables are just one part of a bigger database (product).
The primary table is Roles as it is needed to link to Services and other tables. A Role's Department and/or Team is basically a fact about that Role however they are not required. The reason is that I have to account for the possibility that an organisation doesn't care about Departments and Teams when using this product so they might just create a series of roles.
Here are some examples of Roles:
Organisation does not use Departments and Teams:
- Account Executive of Product XYZ Sales
- Manager of Product XYZ Sales
- Payroll Supervisor
Organisation does use Departments and Teams:
- Account Executive (Department = Sales, Team = Product XYZ)
- Manager (Department = Sales, Team = Product XYZ)
- Supervisor (Department = Payroll)
Since there is more functionality if the organisation provides departments and teams, I will provide rules as I see them for that. Please feel free to expand or clarify if you see I am missing something
- Departments may have member teams (1:M)
- Departments may have member roles, either directly or via a team relationship (1:M)
- Teams may be members of departments (M:M)
- Teams may have member roles (M:M)
- Roles may be a member of teams and/or departments. With department either directly or indirectly via teams (M:M).
Here are some examples:
A Manager (Role = Manager) of the all sales (Department = Sales) , it is unlikely they would be part of a team yet the there could be multiple teams in that department (Team = Product A, Product B ) that would have multiple roles (Role = Team Leader, Sales Executive).
Account Executive (Role = Account Executive) is responsible for existing customers of various products (Team = Product A, Product B, Department = Customer Service )
I didn't mention Services above as that is just a reminder that a Role needs to be would a Service.
I have created a very basic ERD and attached it. I should add that I always use surrogate keys but for clarity have left them out. One of the problems I have with this model is that the Department and Teams names are both optional, e.g. Role = Account Executive of Product XYZ Sales, but they are part of the primary key or what would be my alternate key. A far as I know that is a no-no.
All thoughts welcome.
Thanks
Steve
March 2, 2016 at 3:05 am
Hi Steve,
First, thanks for not including surrogate keys in your ERD. They are an implementation choice, so they should never ever be in any ERD. (And let me stress the word "choice" in the previous sentence, so your statement that you normally always use them makes me rather sad).
Your current design has (at least??) one technical flaw - the Roles table has three non-nullable columns, but your description shows that at least one and optionally two of them can be NULL. And you cannot change them to be nullable because they are the primary key.
I think the most common error in data modeling is to start thinking about entities or even tables way too soon. What we are modeling is information about the world that is relevant for our customers/employers. Our task is to identify and defined the structures in that information, and to defined the rules that distinguis valid information from invalid information.
I always say that (even though it never really happens) the end result of my data modeling might be a well-designed set of paper forms and an archive solution for those forms.
Normally the way to get to a good data model is to talk, a lot, with end users and/or domain experts. I don't know if you have done that sufficiently. If you have, then I suspect that there may have been communication gaps. The problem is that, when you start talking entities and relationships they have no idea what you mean. But when they start spewing their jargon they are equally lost. In my experience, the best way to overcome this gap is to use simple concrete examples. When I design a data model for e.g. a bank, I would use (fake but realistic) bank statements or wire order forms, askk the domain expert how they would read the information on it to a colleague, modify them and ask the domain expert if they are still valid after the modification, and so on. (I have a data base design course on Pluralsight that shows in a bit more detail how I do this).
In this case I do not have access to your domain experts and end users, so I will have to speculate.
I think the basic error in your design is that you have already decided that everything is determined based on roles. So you have a single role for Account Executive of Product XYZ Sales (either with or without the teams and departments, that's not really relevant here). And then a separate role for Account Executive of Product ABC Sales. But are they really separate roles? Are they not both an account executive, just working for different teams?
The way I would think about this subject matter is that your data actually revolves around positions. Account Executive of Product XYZ Sales is a position, and Account Executive of Product ABC Sales is another position. Both positions are in the role of Account Executive, both are in the Sales department, but they are in different teams. The readings of the examples I present to a domain expert would probably look like "position Account Executive of Product XYZ Sales is in the role Account Executive", "position Account Executive of Product XYZ Sales is assigned to the Sales department", and "position Account Executive of Product XYZ Sales works for team Product XYZ".
The rest of your description suggest to me that some of these facts may be optional (e.g. when a customer does not work with departments), and that some of these may be implied (e.g. when the Product XYZ team is part of the Sales department, assigning a position to that team implies assigning it to that department). And there may be other facts between the teams, departments, and roles that act more as constraints. For instance, you could have a collection of facts that read as "Role Account Executive can be assigned to the Sales department", "Role Order Picker can be assigned to the Fulfillment department", etc - and then when you try to define a position as an Account Executive in Fullfillment, you are told that this combination is invalid.
Note that in the above, I discuss information (facts and fact readings), not entities, tables, relationships, etc. After collecting all the fact readings, finding their fact forms, and identifying the functional dependencies between attributes in fact forms, the entities and relationships will follow from that.
When you do that, please do not make the mistake to take anything I said for granted. All of the above is based on speculation, on how I think your organizations sees the world of their customers. I am probably wrong. Also, even if I am not wrong, I will probably use different terminology then your organization. Perhaps everyone in your organization uses the term "Role" for exactly the concept I call "Position", and they use a different term for what I call a "Role". To find that out, you will -again!- have to talk with end users and domain experts from the organization. They will tell you -if you interview them correctly!!- how they normally call these concepts, and those terms should end up as the names for your entities and tables.
Good luck!
March 2, 2016 at 3:58 am
Hi Hugo,
Thank again, I will take all you say onboard as I agree 100% with everything.
I should mention that the text that follows doesn't really have anything to do with the data structures mentioned in the subject so no need to to read further if that is the only interest 🙂
The challenge I have is I am creating a database to be used in a product that currently doesn't exist for customers I currently don't have. Surely this scenario does happen? I do know that these customers will be organisations that have departments, teams and roles however I don't know how they implement them internally nor since this product doesn't exist yet how they would use this in relation to this product. Initially the customer base would be councils in the UK.
Currently my organisation, being run from my home office, consists of myself and my partner (not business) who is the domain expert for what the product is targeting - Data Protection and Records management. The product I am working toward will hopefully fulfil a need that she know's exists because she has been trying to find an affordable solution for years. Unfortunately she cannot help in the requirements of future customers in industries other than her own.
In terms of this specific requirements, Depts etc..., my only choice is try to model the database as best I can to fulfil the needs of my current customer, which is basically me and my partner. FWIW, her organisation does have roles without teams and/or departments that provide services, potentially temporarily, that need to be captured to be reported on to the end users, which is the ultimate purpose of this product. Providing data to users to make business decisions.
I will revisit the design again keeping all your advice in mind but I may have to resign myself to starting out with a non-ideal design during this phase/version until I can get access to more customer data.
Thanks again for your time Hugo
Steve
March 2, 2016 at 4:34 am
Hi Steve,
In your case, you and your partners are the domain experts. And even if your partner cannot answer the questions, it can help to get in a sort of roleplay mode where you alternate between the roles of data modeler and domain expert.
The domain experts will have to make decisions based on what they expect of their future customers. Which partly is determined by, partly determines your set of possible customers.
The hardest decisions will be on flexibility versus cost and maintainability. Do you want to offer a solution that offers everything to everybody? In that case you will end up with either a very complex data model or a lot of dynamic stuff. Or do you prefer to build a good enough solution that works for a large enough part of the customers? In that case, you will have to set up rules that may not work for some potential customers.
Good luck with making thise hard decisions!
March 3, 2016 at 3:36 am
Hi Hugo,
Thought I would just let you know what I finally decided, also in case anyone else is interested. I don't think you'll like it. However since I am the only domain expert available at this time I am going with it. When I am in a position to do some refactoring , e.g. an actual customer base whom I can interview , I will definitely revisit the decision.
I have decided that each of the tables will have a 'No Record' record, e.g. DepartmentName = 'No Department', TeamName = 'No Team' and RoleName = 'No Role'. In order for the system to identify these records I am adding a 'IsNoRecord' bit column to each table.
I have decide on this because:
1 ) Considering NULL is...
A special value that is used to indicate the absence of any data value
... I am not 100% convinced that fact a Role is not a member of a team is not a data value. I will concede that this is probably one of the many things that I just don't get - oh there are so many 🙂
2) It provides me with the greatest flexibility in a situation where I don't have access to enough data to make an informed decision.
3) I can enforce the not null rule on the table's primary key
4) It provides (future) customers the greater flexibility in how they use/set up my product.
Thank again for all the time you have spent on this.
Regards
Steve
March 4, 2016 at 7:28 am
SteveD SQL (3/3/2016)
I don't think you'll like it.
I am sorry to say that you are absoltuely right on that.
It's your choice obviously, but I will comment on some of your arguments.
I have decide on this because:
1 ) Considering NULL is...
A special value that is used to indicate the absence of any data value
... I am not 100% convinced that fact a Role is not a member of a team is not a data value. I will concede that this is probably one of the many things that I just don't get - oh there are so many 🙂
The values you store in table columns represent facts. So the column "Team" in the "Roles" tables actually is used to store facts of the type "Role xyz is member of team Abcd", and if a role cannot be a member of more than a single team, then this column can indeed go into the "Roles" table.
Now if role xyz is not a member of any team, your solution appears to be to represent the fact "Role xyz is member of team No Team", which sounds silly, and will result in a lot of code in your database where you have to program special cases for the value "No Team". A normal (i.e. non-IT-geek/nerd type such as we) person would simply not read any fact that starts with "Role xyz is member of team ...". (They might read a fact with a different form, but that should then also go into a different column if it needs to be stored at all).
2) It provides me with the greatest flexibility in a situation where I don't have access to enough data to make an informed decision.
I fail to see how "No Team" is more flexible than NULL.
3) I can enforce the not null rule on the table's primary key
This makes no sense at all. In the roles table, the primary key would be RoleName. The Team column is a foreign key, not a primary key nor even part of it.
If the relationship between Roles and Teams is many-to-many then you have three tables, Roles and Teams each with their primary key and RoleMembership with two foreign keys to those tables and a composite key on the both; both columns would no not nullable, and for team xyz in the example above you would simply not add any rows to this table.
4) It provides (future) customers the greater flexibility in how they use/set up my product.
Again, "No Team" is no more flexible than NULL. I could even argue that it's less flexible because depending on how you set up the case sensitivity, I could run into problems in your system if I decide to have a team dedicated to new operations and call it the "NO team".
Also, many operations in SQL Server have a default handling of NULL that is usually correct; for "No Team", you will have to code exceptions in the logic every time.
March 4, 2016 at 9:19 am
Hi Hugo,
and if a role cannot be a member of more than a single team, then this column can indeed go into the "Roles" table.
A Role can be a member of 0 or N teams.
A teams can be a member of 0 or N departments
A team can have 0 or N roles member
A department can have 0 or N team member
Some sample data might be:
Dept. - Team - Role
Sales - Product A - Sales Person
Sales - Product B - Sales Person
Sales - No Team - Manager
Customer Service - Product A - Account Executive
Customer Service - Product B - Account Executive
No Department - Special Project Team - Project Leader
No Department - Special Project Team - Analyst
If a customer didn't want to set-up departments, teams or roles
No Department - No Team - Sales Person for Product A
No Department - Product A - No Role
Sales - No Team - No Role
The DeptTeam and TeamRole tables are for holding these links.
I know this is not ideal but it does provide me with the functionality that I want .
and will result in a lot of code in your database where you have to program special cases for the value "No Team".
The role table will have a single record that is the 'No Role' record, designated by column IsNoRole = 1
The team and departments tables each have a similar record is designated by an 'IsNo...' column
As a side note since I use surrogate keys there is no repeated text in code of TeamName = 'Not Team'. Also since it is in fact a value in a table the customers have the choice of the text to use to designate 'No Team', e.g. 'N/A' for instance
If I want to get a list of all roles that don't have a team I would use the following:
SELECT r.RoleName
FROM dbo.TeamRoles tr
JOIN dbo.OrgRoles r ON r.pkOrgRoles = tr.fkOrgRoles
JOIN dbo.OrgTeams ot ON ot.pkOrgTeams = tr.fkOrgTeams
AND ot.IsNoTeam = 1
3) I can enforce the not null rule on the table's primary key
This makes no sense at all.
But to me it does. The DeptTeam and TeamRole tables both use the surrogate key from the tables they are linking as their primary keys. (Departments and Teams, Teams and Roles). If I use NULL instead of pointing to a record I cannot enforce the NO NULL for PK rule. Granted I am bypassing it but again it does give me what I want for now.
I fail to see how "No Team" is more flexible than NULL.
At this time having those records is the only way I can see to fulfil the requirements I have laid out:
A Role can be a member of 0 or N teams.
A teams can be a member of 0 or N departments
A team can have 0 or N roles member
A department can have 0 or N team member
Also, not a database requirement one required for all User Facing functionality , e.g. forms and report:
Customers will be able to control over how the fact that there is No Department, No Team and/or No Role is displayed
As mentioned I will revisit all these when I have more/any customers but for now it is me.
In all seriousness Hugo, I have taken all you have said on board for this and future projects and I do thank you again for your time.
Take Care
Steve
March 5, 2016 at 9:28 am
SteveD SQL (3/4/2016)
Hi Hugo,and if a role cannot be a member of more than a single team, then this column can indeed go into the "Roles" table.
A Role can be a member of 0 or N teams.
A teams can be a member of 0 or N departments
A team can have 0 or N roles member
A department can have 0 or N team member
Some sample data might be:
Dept. - Team - Role
Sales - Product A - Sales Person
Sales - Product B - Sales Person
Sales - No Team - Manager
Customer Service - Product A - Account Executive
Customer Service - Product B - Account Executive
No Department - Special Project Team - Project Leader
No Department - Special Project Team - Analyst
This part of your posts exposes what I think is the flaw in your logic. You describe cardinalities (all many-to-many) for two relationships between the three entity types you have:
"A Role can be a member of 0 or N teams / A team can have 0 or N roles member" - that's a many-to-many between role and team.
"A teams can be a member of 0 or N departments / A department can have 0 or N team member" - that's a many-to-many between team and department.
In the ER diagram, I would draw three entities (Department, Team, Role), and two many-to-many relationships.
In the technical implementation, each many-to-many relationship becomes a table. So I would first create the three tables Departments, Teams, and Roles; and then add two more tables to implement the relationships.
In the example above, the line "Customer Service - Product B - Account Executive" would result in one row in each of those relationship tables - a row "Customer Service - Product B" in the table DeptHasTeams, and a row "Product B - Account Executive" in TeamHasRoles. (The table names might need some work).
The line "No Department - Special Project Team - Analyst" would be recorded only as "Special Project Team - Analyst" table; in this case there is no fact of the type "Team xxx belong to department yyy" to record so we simply do not add anything in the DeptHasTeams table.
The line "Sales - No Team - Manager" shows that the design is incomplete. This line suggests that there are also cases where a department has no teams, and that in such a case roles can belong directly to that department. So we need to add one more relationship to the ER diagram, find the cardinalities, and assuming that this one is also many-to-many add yet another table DeptHasRoles. Now this row can be entered in that table alone and not in any of the other two linking tables.
Returning to "Customer Service - Product B - Account Executive", you now might want to add "Customer Service - Account Executive" to that third table, but this requires some thought. This role-to-dept membership can be inferred from the data we already entered. If every role-to-dept membership that can be infered does indeed actually exist, then there is no need to store this, and we can choose to store only the role-to-dept memberships that cannot be infered because the department does not have teams.
So to wrap up the example, the flaw in your design is trying to combine three separate relationships into a single table. That, as I already said in a previous reply, violates fourth and/or fifth normal form. And in this case, because some of the relationships are optional, introduces a problem with nullable primary key columns.
Break up that table into three tables for the three separate relationships, and your problem is solved.
I hope that I was able to explain this better than the previous time...
March 5, 2016 at 11:55 pm
Thanks Hugo, I did have something similar however the challenge I came across was linking the Department-Team-Role combination to the Services table and other tables that need to link a Department-Team-Role. I wanted a single table which bring it all together and currently the TeamRoles table does that for me.
I will revisit the design considering the points you bring up once I go through the first round of refactoring after I actually get some customers 🙂 With the current design it would not be difficult to move the data around to a more precise, nor would a data access layer be that difficult to update.
Take care,
Steve
March 6, 2016 at 2:51 am
SteveD SQL (3/5/2016)
the challenge I came across was linking the Department-Team-Role combination to the Services table and other tables that need to link a Department-Team-Role. I wanted a single table which bring it all together and currently the TeamRoles table does that for me.
Sounds like an implementation-drive choice. You are thinking from the perpective of tables and relationships.
What exactly is represented by rows in that table? What words would you use to connect the values in a single row to form a valid sentence, that you would use when sharing that information with someone over the phone?
And the same questions apply to the link you want between the TeamRoles table and the Services (and other) tables. What exactly are you representing. What is the significance of a link between a Service and a "TeamRole-thingy" in the real world?
My hunch at this time is that you are overlooking a piece of the puzzle, probably because you already have a set of tables that appears to do the work even if you do not exactly know what they represent. I think that you have not yet the entity type that the services (and other entitiy types) actually relate to. Find this entity type (and with that I mean: understand what it represents in the real world and derive a good name for it from that understanding), then identify how it relates to teams, departments, and roles - perhaps you might find that there is a subtype/supertype structure at play, perhaps it's something else. If my hunch is right and you do manage to find the missing entity type, I expect a lot of your current issues to clear out automatically.
Anyway, I wish you good luck on your project. And I truly hope that either your design simply turns out to work flawlessly (despite its normalization violations), or that you find a better way in such an early stage that you can still rebuild. Far too often, mistakes such as this are only found when so much data has already been loaded, and so much code already written, that it becomes too expensive to fix. And I unfortunately inherited more of those systems than I care to remember...
Thanks for the interesting discussions!
March 6, 2016 at 3:08 am
Thanks to you too Hugo for the interesting and informative discussion. I think what this discussion really needs is a few napkins over a drink or two, at least from my POV 😀 Often I find it difficult to articulate and/or fully understand an idea or concept properly without actually having a conversation.
Take Care
Steve
March 6, 2016 at 4:25 pm
Hugo Kornelis (3/2/2016)
First, thanks for not including surrogate keys in your ERD. They are an implementation choice, so they should never ever be in any ERD.
That's one opinion. Mine is quite the contrary. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
March 7, 2016 at 1:14 pm
Jeff Moden (3/6/2016)
Hugo Kornelis (3/2/2016)
First, thanks for not including surrogate keys in your ERD. They are an implementation choice, so they should never ever be in any ERD.That's one opinion. Mine is quite the contrary. 😉
You believe that surrogate keys should be included in the ERD, the logical data model?
Why?
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply