best way to store values with different numbers of columns in a single table

  • I have an application I am working on that pulls data from multiple different applications that can have a different number of columns. (Specifically related to the application's Security groups and permission schemes .)

    I am trying to figure out the best way to store this without having to create new tables in the database for each different application that is added, as the end-user has the ability to add new applications to the list of things to pull data from.

     

    One approach that occurred to me was to use a pipe delimited string , with another pipe delimited string that has the header values and then do a split on the pipe delimited strings, then pivot back to columns and rows, but I am having a hard time actually pivoting it back without having to hardcode each applications pivot.

     

    Another approach that I considered was using FOR XML PATH , but this looks like another case where I have to hard code the structure to parse back from XML to columns and rows.

     

    Is there something I am missing in terms of a better way? Ideally, once the table structure & the method for parsing back to columns and rows are created in the database, I can avoid having to make database changes, and just have to create new reports based on the application being audited.

    with a delimited string approach, I end up with a Recordid, applicationID, Value and key after splitting out the audit record string and the header string.

     

     

  • I think in this case "best" is a relative term.

    There are several ways to do it.  long character separated string values, key value pairs, very wide tables with non-descript column names, possibly other ways I am not thinking of.

    In the end, it depends on how you plan to use the data.  Long character separated string values mean you need to have some way to parse that data when you get it before it is useful.  Key value pairs mean you could end with a lot of rows to dig through to find what you are looking for.

    Very wide non-descript tables I think is probably going to be the worst approach, but there are some cases where you may want that route.  It is a bad approach as you are limiting the number of possible values that can be entered per application.

    We have an application that is used as an "app launcher" where I work and we went with the key value pair approach.  Then to determine if a user has access to run an application, we select applicationName from the table where username = abc.  All applications that come back the user has permissions to run.  There is a bit more logic in it than that as we are using the users AD credentials and  we use AD groups as well as AD users, but that is the basic methodology for it.  If we need a report based on it, we can filter by application or by user/group pretty easily.  Our list of applications is pretty small and doesn't look like it will be growing too quickly anytime soon, so this approach works fine for us.

    For application data (including app data inside the app launcher) we have one or more tables per application.  Sometimes there is table overlap, but if the data is not related, we don't try to force it to be related.  Using the above example with the key value pair, we have one table for application config information (where the app is located, the icon it should have, etc) and one table for app permissions.  We tend to design the table around the purpose.  So in your example, since it is used for application security, we would have that as an apppermission or appsecurity table and store things in key value pairs that would have a foreign key pointing to the application table.  Where possible and where it makes sense, we normalize our data.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • The best way?  Normalize your data instead of trying to store it denormalized.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • We have a system that has something similar. It a database that stores Appliance Efficiency data. There is an ApplianceTypeColumns table that stores the Column Names and data type. There can be an unlimited number of rows that define columns for a specific Appliance Type. Each of the PK values for each column in this table then references over to another table that actually holds data for each column. Lets call this table ApplianceData. Each row in ApplianceData will have its own PK, then it will have a FK called ApplianceTypeColumnID that points to the ApplianceTypeColumns PK. In the ApplianceData table there will be  columns called ValueChar, ValueNumeric, ValueBool, ValueDataTime that will hold the actual data. I basically call this setup a virtual table because it defines logical tables within physical tables.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply