July 5, 2022 at 8:33 am
Hi,
I have a scenario where a worker can do multiple training courses each year (max 4).
Is it better to create a table this way:
WorkerID |Year |Courses
Worker01 |2021 |CourseCode01
Worker01 |2022 |CourseCode01
Worker01 |2022 |CourseCode02
Worker01 |2022 |CourseCode03
Or this way:
WorkerID |Year|course1 |Course2 |Course3 |Course4
Worker01 |2021|CourseCode01| | |
Worker01 |2022|CourseCode01| CourseCode02 | CourseCode03 |
Thank you
July 5, 2022 at 8:45 am
Depends on what you will do with it. Do you quickly need to answer what is the second course Worker01 took in Year Y?
Usually the first form WorkerId Year CourseCode as it is more flexible. If you need what is the second course worker x took in year Y, you'll need some counter-field
WorkId Year CourseCode Counter
July 5, 2022 at 9:55 am
As Jo mentioned, it depends on your requirements
I would most likely use the following
tb_Worker
WorkerID
FName
LName
-- Other Worker Attributes
tb_Course
CourseID
CourseDescription
-- Other Course Attributes
tb_WorkerCourse
WorkerID
CourseID
DateCompleted
July 5, 2022 at 11:03 am
I'd personally recommend using the normalised approach, with 1 Course
column. If you needed the data in the denormalised format (too), then you could use a VIEW
with conditional aggregation to put the values into (up to) 4 columns.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
July 5, 2022 at 1:14 pm
You should only ever use the first method, period, no matter what your current requirements are.
The second approach would be a maintenance and query nightmare.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 5, 2022 at 1:25 pm
First option is the correct relational way to design it.
July 5, 2022 at 2:02 pm
Consider, for example, with design #1, a unique constraint will insure that the same course is not entered twice for the same year.
But for design #2, you'd have to insure yourself that Course1, Course2, etc., don't duplicate any other CourseCode. Just try writing the CHECK constraint(s?) that would accurately enforce that basic restriction!
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 5, 2022 at 7:06 pm
Consider, for example, with design #1, a unique constraint will insure that the same course is not entered twice for the same year.
But for design #2, you'd have to insure yourself that Course1, Course2, etc., don't duplicate any other CourseCode. Just try writing the CHECK constraint(s?) that would accurately enforce that basic restriction!
And - as happens in almost every case - when you need to track 5 courses instead of 4 you have to add a new column and modify code everywhere to account for that new column. Or - when you now want to only track 2 but have 4 columns you now have the issue of which columns are actually used and the end users will use columns 3 and 4 instead of 1 and 2...because they can.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
July 7, 2022 at 12:17 am
(Notes from the field: I worked in a place where they didn't know how to normalize... As a result, I was there for 8 months instead of maybe a week. It was Access, but you would run into the same querying problem. About the only thing you can do after the fact is use CROSS APPLY to create a view that returns a normalized result and query that, but why even put yourself through that?)
Spend the time to normalize properly. Your future self will thank you, I promise.
Of course, if you want to learn by "escarmiento" (the hard way), that's good too. Make a dummy database and create both tables. Then write down a bunch of questions you want to answer from the data and try writing the SQL for each. Proof is in the pudding.
I was doing simple data summaries...
1. "what's the severest grade for each symptom for each enrolled patient?"
2. how many are there of each symptom/grade? (Count based on query 1)
If you create the "Reported Symptom" table like this:
CREATE TABLE PatientSymptom (PatientID INT NOT NULL, SymptomID INT NOT NULL, Grade TINYINT NOT NULL...);
and then have a table of Symptoms, like this:
CREATE TABLE SymptomDictionary (SymptomID INT NOT NULL PRIMARY KEY,
Category VARCHAR(255) NOT NULL,
Term VARCHAR(255) NOT NULL,
Grade TINYINT NOT NULL);
then this is a stupid simple query.
Join the two tables, group by PatientID, Category, Term... get MAX(Grade)
then count (wrap that query in another, and just do COUNT(*).
With effectively repeating fields, it gets ugly fast. Requires CROSS APPLY to stack the repeating groups... BLECH.
But since experience is a much better teacher than a lecture, try it yourself. (Seriously, it's absolutely the best way to learn. turn on the timer stuff, (I think it's SET TIME ON), and see how fast your queries are. But remember to clear the cache between tests. (Kevin Kline has some really funny stories about that!).
if you get stuck, post back. But take my word (or better, TRY IT!, learn from objective tests, not someone's opinion!)
Happy learning!
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply