Identity field for batch migrating data

  • sqlfriends (2/9/2016)


    so is it going to be something like this?

    Note table: Noteid, NoteDetail, NoteDate, CreateDate, createdby

    NoteBridgeTable:

    columns: NoteID, NoteTypeID, Objectid (cannot find a good name for it)

    Data:

    Noteid, notetypid, hearingid

    Noteid, notetypid, conferenceid,

    Noteid, notetypeid, workshopid

    Noteid, notetypeid, Interventionid

    Should NoteType in Note table or in NoteBridge table?

    Thanks,

    No. This violates the rules of first normal form, and you have no way to enforce relational consistency because the ObjectID is a single column that can reference multiple tables.

    If a note belongs to a single object (either a hearing or a conference or a workshop or an intervention), then use:

    Notes table: Noteid, NoteDetail, NoteDate, CreateDate, createdby, HearingID, ConferenceID, WorkshopID, InterventionID

    (where each of the last four columns is nullable and check constraints to ensure that exactly three of them are NULL - also consider if you really need a NoteID column)

    If a note can belong to several objects, but not to mulitple objects of the same object type (e.g. to both a hearing and an intervention, but not to two interventions), then use the same table structure but change the check constraints to ensure that at most three of them are NUL - and obviously exclude specific combinations that are invalid)

    If a single note can belong to several objects, including mutliple objects of the same type (e.g. two interventions), then use this design instead:

    Note table: Noteid, NoteDetail, NoteDate, CreateDate, createdby

    NoteBridgeTable:

    columns: NoteID, NoteBridgeNo, HearingID, ConferenceID, WorkshopID, InterventionID

    (where the primary key is on the first two columns combined, and each of the last four columns is nullable and check constraints to ensure that exactly three of them are NULL; also please change the name to something that described the function of the data in this table in terms that are normally used by your users)

    EDIT: I later saw your next post where you explore the alternative of having separate tables for ConferenceNotes, WorkshopNotes, etc. This is the prefered method by far, and I strongly suggest that you take that path - except (as indicated in my post yesterday) when you have a lot of queries that relate to the collection of all notes and you have a lot of notes that need to be shared between workshops and conferences.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • If a note belongs to a single object (either a hearing or a conference or a workshop or an intervention), then use:

    Notes table: Noteid, NoteDetail, NoteDate, CreateDate, createdby, HearingID, ConferenceID, WorkshopID, InterventionID

    (where each of the last four columns is nullable and check constraints to ensure that exactly three of them are NULL - also consider if you really need a NoteID column)

    If a note can belong to several objects, but not to mulitple objects of the same object type (e.g. to both a hearing and an intervention, but not to two interventions), then use the same table structure but change the check constraints to ensure that at most three of them are NUL - and obviously exclude specific combinations that are invalid)

    If a single note can belong to several objects, including mutliple objects of the same type (e.g. two interventions), then use this design instead:

    Note table: Noteid, NoteDetail, NoteDate, CreateDate, createdby

    NoteBridgeTable:

    columns: NoteID, NoteBridgeNo, HearingID, ConferenceID, WorkshopID, InterventionID

    (where the primary key is on the first two columns combined, and each of the last four columns is nullable and check constraints to ensure that exactly three of them are NULL; also please change the name to something that described the function of the data in this table in terms that are normally used by your users)

    .

    Our design for new application is for each record of each object like hearing will have a note or more notes, for example in the hearing table it may have a note of the hearing and also have noteType like hearing attendees, or hearing results may have a note, so that is a notetype inside one object.

    so there is not a big issue for new application if we use the one note tabled design you mentioned above, we need to also consider our existing data migration, because the previous bad UI design only has one field of note is available, so users just put all the notes for all above listed interventions into the one note field. New application will make each intervention has a note field.

    So for both existing data and new data, it seems using the Note table and NoteBridge table is better way.

    My further question is in the Notebridge table : is NoteBridgeNo an identity field,(1,1) and it plus the noteid becomes composite PK?

    Thanks much!

  • You can do that.

    But remember - pretty soon other developers will catch you in a dark corner with a strong intention to beat the c..p out of you. :w00t:

    Thanks much,

    Sorry, I am not getting above, so better solution is using conferenceid, hearingID, instead of a common name like objectID, correct?

    The only reason I thought using the common name is when doing join they don't need to change the column name for another joins? maybe easier for entity framework, which I don't know much?

    Thanks

  • First, thanks every one for your patient and detailed answers. All your answers are very helpful. I really learn a lot from this.

    We are not going to have many queries to the collection of all notes or note that shared between objects. We mainly associate each notes with each object.

    So I think now I came to the two approaches by your posts:

    1. create a noteTable for each parent Object. I heard this is proved preferred design.

    The only thing is that I see all the columns are the same,(except the hearingID, or conferenceID column), so even this is repeatable for all the notes tables, still consider a good design?

    NoteHearing: Noteid, notetype, createddate,hearingid,

    NoteConference: Noteid, notetype, createddate,Conferenceid,

    2. to create a Note table and a NoteBridge table.

    Note: NoteID, NoteDetail, createDate, updateDate

    NoteBridge: (or maybe call it NoteRelationship)

    columns: NoteBridgeID(identitycolumn), Noteid, hearingId, conferenceid, interventionid.

    The noteBridgeID+Noteid is the PK.

    Am I correct? Thank you very much

  • sqlfriends (2/10/2016)


    So I think now I came to the two approaches by your posts:

    1. create a noteTable for each parent Object. I heard this is proved preferred design.

    The only thing is that I see all the columns are the same,(except the hearingID, or conferenceID column), so even this is repeatable for all the notes tables, still consider a good design?

    NoteHearing: Noteid, notetype, createddate,hearingid,

    NoteConference: Noteid, notetype, createddate,Conferenceid,

    Yes, this design is good. Not sure why you think you need a NoteID. Just create the primary key on (HearingID, CreatedDate) / (ConferenceID, CreatedDate) / etc. (And perhaps swap the order of those columns, makes no difference in the logical design but can make a performance difference in a SQL Server implementation).

    2. to create a Note table and a NoteBridge table.

    Note: NoteID, NoteDetail, createDate, updateDate

    NoteBridge: (or maybe call it NoteRelationship)

    columns: NoteBridgeID(identitycolumn), Noteid, hearingId, conferenceid, interventionid.

    The noteBridgeID+Noteid is the PK.

    Am I correct? Thank you very much

    Only use the second design when you (a) fully understand why you cannot use the first design in your specific situation, and (b) fully understand why this specific design is indeed required.

    Based on your posts so far, I am 99% sure that the first design is best in your case.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thank you, a further question about the database.

    For the new application data, it will be good and clean, if we create a hearing note, in the table of HearingNote, we will just use hearingID + CreateDate.

    HearingNote table: HearingID, noteType, NoteDetail, CreateDate, createdby

    But for the data migration, because some defect of previous UI design, I have some little trouble in migrating existing data. I would like to explore what is the workaround, even we are not required to convert data 100% accurately, I would like to be close as possible.

    The tricky part is this:

    I do have some existing records that is the hearingID, and createdDate are the same, so it violates the PK.

    In the old app Note table, they have, noteid, notetype, and sequenceNumber, noteDetail, and createdDate, createdby

    I like the new table design by just using hearingID+ createDate, it is cleaner,

    But for migrating existing data, it seems I have to add the NoteType into the PK, hearingID+Notetype+createdate, it may later find I have to add sequ,

    Any work around? I even think to make the second duplicate PK record add 1 second to the createddate to make it unique, but donot know how.

    any recommendations?

    THanks,

  • When you decide on table design don't be driven away by technical details.

    Follow the business logic.

    Should you distinguish different types of notes?

    Forget about the implementation in the old system.

    Is the logic of the system requires separating notes by different types?

    If yes - then you need to have NoteTypeID.

    Now - Note sequence. Do you need to bind notes in some sequences?

    Or following chronological order defined by CreateDate would be enough?

    The answer will tell you if you need NoteSequence column.

    From technical point of view you do not need it for a primary key, as HearingID+NoteID should define a unique key. You do not need to have the same Note allocated to the same Hearing more than once.

    Another question - will you have "follow up" note? The once which are "in reply" on an existing note?

    If yes, than you need to have some form of hierarchy implemented.

    And so on.

    About data migration.

    Use mapping tables.

    Create tables which hold the key values from old system mapped to the keys given to the entities by the new system. It will help you easily replace the old relations with new ones.

    You may drop those tables after migration is completed.

    _____________
    Code for TallyGenerator

  • Whether you "like" a design or not is actually irrelevant. The only question that matters is if it's correct.

    If you have existing data that violates the primary key, then either that data is incorrect, or the primary key is incorrect. In the first case, you'll have to fix the data. If a bunch of notes for which no date was known was inserted with a made-up date, then you can freely change that made-up date to something different for each note - it'll be an incorrect date, but so is the date you have now. If in the past you actually had multiple notes for the same hearing created on the same date, then HearingID + CreateDate is simply not a good candidate key.

    It sounds as if the sequencenumber that is used in the old app actually had a good reason for its existence. Perhaps you need to have a sequencenumber in your table as well.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thank you, good advice.

    One more question, they also need to add a notification table, basically for each item too, such as notification for Hearing, notification for workshop, notification for invervention. In old app, it is in only hearing table.

    It sounds it likes notes too, shall we add another set of tables for each objects just like notes, for example, HearingNotification: hearingID, notifytype, createdate, createdby

    WorkshopNotification: workshopid, notifytype, createddate, created by etc.

    Or if we can somehow avoid creating these extra tables.

    Thanks!

  • This looks like mostly a repeat of your earlier question, so the same answers apply. Short version: it deppends on what exactly the functional requjirements are. Find them, and the correct database design follows automatically. Then consider if you should implement that design as is, or make it functionally slightly less correct in order to gain other advantages (like better performance, easier maintainability, etc). But beware - never sacrifice too much, staying close to the optimal database design always pays off in the long run.

    Further, please try to get all the requirements on the table before you start building anything. Get the people who are in charge to sign off on them (which they should only do if you present the requirements in a form that they can understand). And then, and that is perhaps the most important, get full backing of your project manager in freezing the specifications - nobody, not even the CEO, should be allowed to make changes after the requirements have been frozen, because it might result in a complete redesign. (Not all changes will, in fact most don't, but that kind of message works well for managing expectations).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thank you so much Hugo for your detailed advice, I learn a lot.

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply