Normalisation of location data

  • Hi,

    I am trying to normalise some data and am unsure how best to deal with Location data.

    There are many clients and each client has many pieces of equipment which are tested and the results recorded. A (simplified) sample of data is below.

    Note that the type of location data varies within each column, so LocA may be a site number, a building name, a city etc, depending on the client requirements. Similarly with the other columns, although Loc D is usually a room identifier ("managers office" etc). There is a potential for repetition across location colums as well as within them.

    eg

    Client Name, LocA, LocB, LocC, LocD, Test Data

    AAAA, London, 34 The strand, Floor 1, Managers office, Test Data1

    AAAA, London, 34 The strand, Floor 1, Managers office, Test Data2

    AAAA, London, 34 The strand, Floor 1, Managers office, Test Data3

    AAAA, London, 34 The strand, Floor 1, Managers office, Test Data4

    AAAA, London, 34 The strand, Floor 1, Dep Managers office, Test Data5

    AAAA, London, 34 The strand, Floor 1, Dep Managers office, Test Data6

    AAAA, London, 34 The strand, Floor 2, Open Plan office, Test Data7

    AAAA, London, 34 The strand, Floor 2, Open Plan office, Test Data8

    AAAA, London, 34 The strand, Floor 2, Open Plan office, Test Data9

    AAAA, London, 10 Old Kent Road, Open Plan office, ,Test Data10

    AAAA, London, 10 Old Kent Road, Open Plan office, , Test Data11

    BBBB, 1234, London, The Font Building, Kitchen, Test Data 12

    Queries on the data will include: Where was this particular test done? What locations exist for this client?

    I have considered a single table for all the location fields, and four references from the test data table but I'm not sure that is right.

    Any suggestions would be welcome.

    Thanks 🙂

  • Normalizing is great if you have a way to guarantee that you know all locations ahead of time...

    How are these locations entered? If they are currently just free-typed into a form, you're out of luck without re-architecting the whole app.

    Thinking about it actually...this table isn't really set up for normalization anyway, since there is no way to verify which piece of data is where. You really need to look into starting over and have a Street, City, Zip, Floor, Office, Other format, or something along those lines.

    Depending how many addresses you have...you could probably keep an AddressBook table or something along those lines and then enter each possible address once (using country/floor/state codes if you really want to normalize it), and then link the TestData to the AddressBookID.

  • Thanks for the reply Derrick. I appreciate your time.

    The locations are entered into a PDA device, in fields called LocationA, B, C and Room. As I described before, the content of each field is variable (unfortunately). The data is currently stored in an access database, which I am seeking to normalise and move to SQLServer.

    I think what you said about 'no way to verify which piece of data is where' is the source of my unease with this. I had thought about something like this:

    Table Client: Client_id, Location_id

    Table Location: Location_id, SLocationA_id, SLocationB_id, SLocationC_id, SLocationD_id

    Table SingleLocation: SLocation_Id, Location, Parent_id (ref to another location_id or null)

    But now I think just sicking with unique combinations of Location fields in a Location Table might be the way to go:

    Table Client: Client_id, Location_id

    Table Location: Location_id, LocationA, LocationB, LocationC, LocationD.

    It would still reduce a significant amount of repetition, even though its not ideal.

    What do you think?

  • I suspect the location data is not strictly required to be in a given format, hence the relaxed design, so just an idea: when designing your reports/queries, why not join all location columns into one string field and display it as such?


    Regards,

    goodguy

    Experience is a bad teacher whose exams precede its lessons

  • Hi goodguy. Yes, combining all the locations at a query level would be an option.

    What I am trying to express here is that the location data for each client is like a tree structure. For each client, If I could represent each node only once that would save a lot of repetition. Re: my earlier sample data.

    Client AAAA has locations with the following structure:

    london is the parent of 34 the stand and 10 old kent road

    34 the strand is the parent of Floor 1 and Floor 2

    Floor 1 is the parent of Managers office and Dep managers office

    Floor 2 is the parent of open plan office

    10 old kent road is the parent of open plan office

    Client BBBB has locations with the following structure:

    1234 is the parent of London

    London is the parent of The Font Building

    The font building is the parent of Kitchen

    Thanks:-)

  • This is how I would have implemented the front-end:

    Offer the user auto-complete suggestions by concatenating location text entered so far and retrieving ALL records from db where columns match, to ensure data accuracy and user speed). This will also ensure that a particular location entry is ALWAYS saved to a specific column and not scattered across different columns.

    Hope I have helped.


    Regards,

    goodguy

    Experience is a bad teacher whose exams precede its lessons

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply