redesigning table to accommodate new data types

  • hlsc1983 - Friday, March 24, 2017 10:13 AM

    can anyone come in here?

    What do you mean? I am pretty sure that anybody can view but you have to be logged in to comment.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange - Friday, March 24, 2017 10:31 AM

    hlsc1983 - Friday, March 24, 2017 10:13 AM

    can anyone come in here?

    What do you mean? I am pretty sure that anybody can view but you have to be logged in to comment.

    i mean i still need some input from the experts here.

  • You can sum a nullable column. Rows where the column is null will not contribute to the total, so leaving the marks null when the student was absent (for whatever reason) will give you what you want.

  • This might be heresy 😛 but this sounds like like you could use flag values that are outside the allowed domain of Marks.

    I'm not sure what a range of valid marks is (1 to 100? 1 to 815?) but as a poster above noted 0 might be a valid domain value for an exam "missed without valid excuse". Otherwise you could use -1, -2, and -3 for flag values for the three conditions you need to store in the database. This wouldn't require modification to the table, only the check constraint.

    On the front end app you could have 4 (or 5?) radio buttons, one of which is "took exam", the Marks box that would only allow direct user entry for "took exam", the other three buttons (or four) would enter a specific flag value. Lastly you could have a display only text field that showed what the value in the Marks field actually means, from "missed exam" to "failed" to "A++" (or whatever the descriptions are for your grading system).

    Having said all that, which design you choose is a holistic choice: it depends on how the database and the application interact, how the flag value approach would affect queries and calculations vs how a seperate column would, what your stance on null values is, what the standards are for your organization, and so on.

    Flag values aren't always the right answer, but they aren't always wrong either.

  • thanks everyone for your input.

  • roger.plowman - Wednesday, March 29, 2017 6:49 AM

    This might be heresy 😛 but this sounds like like you could use flag values that are outside the allowed domain of Marks.

    I'm not sure what a range of valid marks is (1 to 100? 1 to 815?) but as a poster above noted 0 might be a valid domain value for an exam "missed without valid excuse". Otherwise you could use -1, -2, and -3 for flag values for the three conditions you need to store in the database. This wouldn't require modification to the table, only the check constraint.

    On the front end app you could have 4 (or 5?) radio buttons, one of which is "took exam", the Marks box that would only allow direct user entry for "took exam", the other three buttons (or four) would enter a specific flag value. Lastly you could have a display only text field that showed what the value in the Marks field actually means, from "missed exam" to "failed" to "A++" (or whatever the descriptions are for your grading system).

    Having said all that, which design you choose is a holistic choice: it depends on how the database and the application interact, how the flag value approach would affect queries and calculations vs how a seperate column would, what your stance on null values is, what the standards are for your organization, and so on.

    Flag values aren't always the right answer, but they aren't always wrong either.

    is this what you meant? In the marks column i can have marks scored by  a student or a flag value. (Btw,  there are three possible situations: Present , Absent and Not Allowed) So if a student is present, obviously ,the marks scored by a student will be stored. If he is absent, -1 will be stored and if he is Not Allowed to appear for the exam then -2 will be stored in the column Marks. Did i get it right?

    The main reason why  i need this is because i need to print marksheets containing marks . The marksheet must also have the values 'Absent' or 'Not allowed' if the student was absent or not allowed respectively.

  • hlsc1983 - Tuesday, April 4, 2017 12:55 PM

    roger.plowman - Wednesday, March 29, 2017 6:49 AM

    This might be heresy 😛 but this sounds like like you could use flag values that are outside the allowed domain of Marks.

    I'm not sure what a range of valid marks is (1 to 100? 1 to 815?) but as a poster above noted 0 might be a valid domain value for an exam "missed without valid excuse". Otherwise you could use -1, -2, and -3 for flag values for the three conditions you need to store in the database. This wouldn't require modification to the table, only the check constraint.

    On the front end app you could have 4 (or 5?) radio buttons, one of which is "took exam", the Marks box that would only allow direct user entry for "took exam", the other three buttons (or four) would enter a specific flag value. Lastly you could have a display only text field that showed what the value in the Marks field actually means, from "missed exam" to "failed" to "A++" (or whatever the descriptions are for your grading system).

    Having said all that, which design you choose is a holistic choice: it depends on how the database and the application interact, how the flag value approach would affect queries and calculations vs how a seperate column would, what your stance on null values is, what the standards are for your organization, and so on.

    Flag values aren't always the right answer, but they aren't always wrong either.

    is this what you meant? In the marks column i can have marks scored by  a student or a flag value. (Btw,  there are three possible situations: Present , Absent and Not Allowed) So if a student is present, obviously ,the marks scored by a student will be stored. If he is absent, -1 will be stored and if he is Not Allowed to appear for the exam then -2 will be stored in the column Marks. Did i get it right?

    The main reason why  i need this is because i need to print marksheets containing marks . The marksheet must also have the values 'Absent' or 'Not allowed' if the student was absent or not allowed respectively.

    That's right. On an app form you can have a display to show that, or on a form you could probably use a case expression to do it. Either way, you can still exclude non-mark flag values if you want to average them or whatever.

    Of course, as I said, it does depend on your solution as a whole. Sometimes flag values are perfect, and sometimes they introduce intractable problems. But I've always found them useful.

  • Embedded "magical" values are a terrible idea.  It violates 1st normal form, because you're trying to store two different pieces of info in one column.  That inevitably leads to query issues and data consistency problems.  Two separate columns isn't that big of a deal, and I demonstrated one approach for it earlier:

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Tuesday, April 4, 2017 1:41 PM

    Embedded "magical" values are a terrible idea.  It violates 1st normal form, because you're trying to store two different pieces of info in one column.  That inevitably leads to query issues and data consistency problems.  Two separate columns isn't that big of a deal, and I demonstrated one approach for it earlier:

    In this case it can be argued the "magical" values are part of the domain. In that sense they are no different from Null, which is also a "magical" value, in that it is not a numeric value and certainly not a test mark.

    A second point is that, in effect, the marks domain is actually a finite set of discrete values. In the US, for instance, the grade is typically 0 to 100, with 0 usually meaning "no correct answers". Adding 3 more values to the domain (-3, -2, -1) meaning "not present", "absent" or "not allowed" does not change the fact they are a necessary port of the domain definition.

    In that case -3 has no more special meaning than 97, which in this case means "97% of questions answered correctly" instead of "student was not present for test".

    As I said, it depends on what the domain is, what mathematical operations need to be performed on a subset of data in that domain, and how you plan to handle all cases in the domain, not just the major subset.

  • roger.plowman - Tuesday, April 4, 2017 2:05 PM

    ScottPletcher - Tuesday, April 4, 2017 1:41 PM

    Embedded "magical" values are a terrible idea.  It violates 1st normal form, because you're trying to store two different pieces of info in one column.  That inevitably leads to query issues and data consistency problems.  Two separate columns isn't that big of a deal, and I demonstrated one approach for it earlier:

    In this case it can be argued the "magical" values are part of the domain. In that sense they are no different from Null, which is also a "magical" value, in that it is not a numeric value and certainly not a test mark.

    A second point is that, in effect, the marks domain is actually a finite set of discrete values. In the US, for instance, the grade is typically 0 to 100, with 0 usually meaning "no correct answers". Adding 3 more values to the domain (-3, -2, -1) meaning "not present", "absent" or "not allowed" does not change the fact they are a necessary port of the domain definition.

    In that case -3 has no more special meaning than 97, which in this case means "97% of questions answered correctly" instead of "student was not present for test".

    As I said, it depends on what the domain is, what mathematical operations need to be performed on a subset of data in that domain, and how you plan to handle all cases in the domain, not just the major subset.

    Why intentionally violate 1NF for no gain?  There's a reason that the process of normalization was created -- because it produces better data stores. 

    Also, what happens if in the future the school decides to deduct a point or two for certain missed classes and/or tests?  (Maybe certain lessons/tests are deemed to be especially critical learning moments.)  You've got a royal mess and pain on your hands if you've corrupted the existing data column with "magic" negative numbers.

    Or suppose one needed to differentiate between in-state students and out-of-state students (for tuition rating, etc.)?  Would you do that by storing the phone number as negative for out-of-state and positive for in-state?  I don't think so; you'd store a separate attribute that denoted it.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Monday, February 20, 2017 10:26 AM

    You don't want to store a string, use a numeric code instead; strings would be a huge waste of resources and are harder to change  Also, I'd separate the present/not present from the reasons for not present to properly model the data and simplify changes later.

    I'd suggest something like below.  After you fill in these values for current rows, you may want to change the columns to "NOT NULL" and adjust the check constraints accordingly.

    ALTER TABLE <table_name> ADD was_present bit NULL, 
        reason_not_present tinyint NULL /*valid values are: 0=was present; 1=ABSENT; 2=DISALLOWED*/
            CONSTRAINT <table_name__CK_reason_not_present> CHECK(reason_not_present IS NULL OR reason_not_present <= 2);
    ALTER TABLE <table_name> ADD CONSTRAINT <table_name__CK_present_and_reason_not_present
        CHECK( (was_present = 1 AND reason_not_present IS NULL) OR
                      (was_present = 0 AND reason_not_present IS NOT NULL) OR /*optional: you may not want to force a "reason_not_present" to be given*/
                      (was_present IS NULL AND reason_not_present IS NULL) ) /*this can be removed once you adjust current data to not have NULL in present & reason*/

    You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).  You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).

    thanks for the input. one doubt i have is this. If there is 'was_present' bit, why is there a need for the value zero to represent 'was_present'  for the  'reason_not_null' column?  is this not redundant data?

  • hlsc1983 - Thursday, April 6, 2017 12:44 PM

    ScottPletcher - Monday, February 20, 2017 10:26 AM

    You don't want to store a string, use a numeric code instead; strings would be a huge waste of resources and are harder to change  Also, I'd separate the present/not present from the reasons for not present to properly model the data and simplify changes later.

    I'd suggest something like below.  After you fill in these values for current rows, you may want to change the columns to "NOT NULL" and adjust the check constraints accordingly.

    ALTER TABLE <table_name> ADD was_present bit NULL, 
        reason_not_present tinyint NULL /*valid values are: 0=was present; 1=ABSENT; 2=DISALLOWED*/
            CONSTRAINT <table_name__CK_reason_not_present> CHECK(reason_not_present IS NULL OR reason_not_present <= 2);
    ALTER TABLE <table_name> ADD CONSTRAINT <table_name__CK_present_and_reason_not_present
        CHECK( (was_present = 1 AND reason_not_present IS NULL) OR
                      (was_present = 0 AND reason_not_present IS NOT NULL) OR /*optional: you may not want to force a "reason_not_present" to be given*/
                      (was_present IS NULL AND reason_not_present IS NULL) ) /*this can be removed once you adjust current data to not have NULL in present & reason*/

    You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).  You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).

    thanks for the input. one doubt i have is this. If there is 'was_present' bit, why is there a need for the value zero to represent 'was_present'  for the  'reason_not_null' column?  is this not redundant data?

    It's just to make it more self-evident / clearer, and it only costs 1 bit.  Sure, you could just use "reason_not_present", but then you'd need to explicitly tell new developers that "0" in that column means the student really was present.  That's counter-intuitive, that the reason for not being present represents actually being present!  But I would expect any developer, even a new one, to be able to understand a bit column of "was_present", and a separate "reason_not_present" code, without any explanation at all.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Thursday, April 6, 2017 1:17 PM

    hlsc1983 - Thursday, April 6, 2017 12:44 PM

    ScottPletcher - Monday, February 20, 2017 10:26 AM

    You don't want to store a string, use a numeric code instead; strings would be a huge waste of resources and are harder to change  Also, I'd separate the present/not present from the reasons for not present to properly model the data and simplify changes later.

    I'd suggest something like below.  After you fill in these values for current rows, you may want to change the columns to "NOT NULL" and adjust the check constraints accordingly.

    ALTER TABLE <table_name> ADD was_present bit NULL, 
        reason_not_present tinyint NULL /*valid values are: 0=was present; 1=ABSENT; 2=DISALLOWED*/
            CONSTRAINT <table_name__CK_reason_not_present> CHECK(reason_not_present IS NULL OR reason_not_present <= 2);
    ALTER TABLE <table_name> ADD CONSTRAINT <table_name__CK_present_and_reason_not_present
        CHECK( (was_present = 1 AND reason_not_present IS NULL) OR
                      (was_present = 0 AND reason_not_present IS NOT NULL) OR /*optional: you may not want to force a "reason_not_present" to be given*/
                      (was_present IS NULL AND reason_not_present IS NULL) ) /*this can be removed once you adjust current data to not have NULL in present & reason*/

    You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).  You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).

    thanks for the input. one doubt i have is this. If there is 'was_present' bit, why is there a need for the value zero to represent 'was_present'  for the  'reason_not_null' column?  is this not redundant data?

    It's just to make it more self-evident / clearer, and it only costs 1 bit.  Sure, you could just use "reason_not_present", but then you'd need to explicitly tell new developers that "0" in that column means the student really was present.  That's counter-intuitive, that the reason for not being present represents actually being present!  But I would expect any developer, even a new one, to be able to understand a bit column of "was_present", and a separate "reason_not_present" code, without any explanation at all.

    k i have created a table:

    create table ExamStatus(
            Status_id tinyint primary key,
            Status_name varchar(15) not null,
                    );

    Inserted three rows:
    insert into ExamStatus(Status_id,Status_name) values (0,'Present'),(1,'AB'),(2,'DC')

    And added a foreign key:
    alter table Students_Subjects_junction
    add constraint FK_inSSJ_fromExamStatus
    foreign key (reason_not_present) references ExamStatus(status_id)

    i need the values AB (meaning absent) and DC( meaning discollegiate/disallowed) since these strings have to be printed out in the marksheet if the student is absent or disallowed.if a student appears for the exam, then the grade scored will be printed in the marksheet.

    Now there arises another query. When i print the marksheet the latest marks for a particular subject must be printed out in the marksheet. There is no problem if a student sits for a particular course say Physics only once and clears it. But if a student fails in this paper or is Absent or Disallowed, then in this case there will be more than one row for a student - course pair.   Example. In the first attempt, student fails with score 8 out of 100. So he appears again with a score of 50. In this case , there is a row each for two attempts. Another example is a student may be DC candidate, and then he may score 30 next time. Here too,there is a row each for two attempts.
    Keeping these possibilities, i need to extract the latest row for a student-course pair in the Students_Subjects_junction table . How can i get latest values for generating the marksheet?  do i need to introduce another column for storing the date of entry?

  • roger.plowman - Tuesday, April 4, 2017 2:05 PM

    ScottPletcher - Tuesday, April 4, 2017 1:41 PM

    Embedded "magical" values are a terrible idea.  It violates 1st normal form, because you're trying to store two different pieces of info in one column.  That inevitably leads to query issues and data consistency problems.  Two separate columns isn't that big of a deal, and I demonstrated one approach for it earlier:

    In this case it can be argued the "magical" values are part of the domain. In that sense they are no different from Null, which is also a "magical" value, in that it is not a numeric value and certainly not a test mark.

    A second point is that, in effect, the marks domain is actually a finite set of discrete values. In the US, for instance, the grade is typically 0 to 100, with 0 usually meaning "no correct answers". Adding 3 more values to the domain (-3, -2, -1) meaning "not present", "absent" or "not allowed" does not change the fact they are a necessary port of the domain definition.

    In that case -3 has no more special meaning than 97, which in this case means "97% of questions answered correctly" instead of "student was not present for test".

    As I said, it depends on what the domain is, what mathematical operations need to be performed on a subset of data in that domain, and how you plan to handle all cases in the domain, not just the major subset.

    Say you need to get an average for marks, which seems like a common-enough request.  With the "magic" values in place, AVG(marks) would give you an invalid result.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • hlsc1983 - Friday, April 7, 2017 2:13 PM

    ScottPletcher - Thursday, April 6, 2017 1:17 PM

    hlsc1983 - Thursday, April 6, 2017 12:44 PM

    ScottPletcher - Monday, February 20, 2017 10:26 AM

    You don't want to store a string, use a numeric code instead; strings would be a huge waste of resources and are harder to change  Also, I'd separate the present/not present from the reasons for not present to properly model the data and simplify changes later.

    I'd suggest something like below.  After you fill in these values for current rows, you may want to change the columns to "NOT NULL" and adjust the check constraints accordingly.

    ALTER TABLE <table_name> ADD was_present bit NULL, 
        reason_not_present tinyint NULL /*valid values are: 0=was present; 1=ABSENT; 2=DISALLOWED*/
            CONSTRAINT <table_name__CK_reason_not_present> CHECK(reason_not_present IS NULL OR reason_not_present <= 2);
    ALTER TABLE <table_name> ADD CONSTRAINT <table_name__CK_present_and_reason_not_present
        CHECK( (was_present = 1 AND reason_not_present IS NULL) OR
                      (was_present = 0 AND reason_not_present IS NOT NULL) OR /*optional: you may not want to force a "reason_not_present" to be given*/
                      (was_present IS NULL AND reason_not_present IS NULL) ) /*this can be removed once you adjust current data to not have NULL in present & reason*/

    You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).  You can also add a FK and a table for the "reason_not_present" values, or translate them in a view (a FK relationship is better from a design standpoint).

    thanks for the input. one doubt i have is this. If there is 'was_present' bit, why is there a need for the value zero to represent 'was_present'  for the  'reason_not_null' column?  is this not redundant data?

    It's just to make it more self-evident / clearer, and it only costs 1 bit.  Sure, you could just use "reason_not_present", but then you'd need to explicitly tell new developers that "0" in that column means the student really was present.  That's counter-intuitive, that the reason for not being present represents actually being present!  But I would expect any developer, even a new one, to be able to understand a bit column of "was_present", and a separate "reason_not_present" code, without any explanation at all.

    k i have created a table:

    create table ExamStatus(
            Status_id tinyint primary key,
            Status_name varchar(15) not null,
                    );

    Inserted three rows:
    insert into ExamStatus(Status_id,Status_name) values (0,'Present'),(1,'AB'),(2,'DC')

    And added a foreign key:
    alter table Students_Subjects_junction
    add constraint FK_inSSJ_fromExamStatus
    foreign key (reason_not_present) references ExamStatus(status_id)

    i need the values AB (meaning absent) and DC( meaning discollegiate/disallowed) since these strings have to be printed out in the marksheet if the student is absent or disallowed.if a student appears for the exam, then the grade scored will be printed in the marksheet.

    Now there arises another query. When i print the marksheet the latest marks for a particular subject must be printed out in the marksheet. There is no problem if a student sits for a particular course say Physics only once and clears it. But if a student fails in this paper or is Absent or Disallowed, then in this case there will be more than one row for a student - course pair.   Example. In the first attempt, student fails with score 8 out of 100. So he appears again with a score of 50. In this case , there is a row each for two attempts. Another example is a student may be DC candidate, and then he may score 30 next time. Here too,there is a row each for two attempts.
    Keeping these possibilities, i need to extract the latest row for a student-course pair in the Students_Subjects_junction table . How can i get latest values for generating the marksheet?  do i need to introduce another column for storing the date of entry?

    If it's taken as part of a class / course, the term (such as Fall 2017) should (must) be stored as part of the class, yes. so either the subject_id and term (preferably) or the date (if you couldn't use term) would be stored.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 15 posts - 16 through 30 (of 32 total)

You must be logged in to reply to this topic. Login to reply