How to study the layout of a database?

  • primitivefuture2006 - Thursday, November 15, 2018 8:26 AM

    Jeff Moden - Tuesday, November 13, 2018 7:24 PM

    primitivefuture2006 - Tuesday, September 18, 2018 6:14 PM

    Please. I am a noob but I really want to learn. Sorry if my question is funny, but I really would appreciate real advice. I am dreaming of landing a new career

    IF there are views and stored procedures that come with the database, read those and map out how the tables are joined.

    Also, if someone did a decent job on the database, check the Foreign Keys (pure gold if they exist) for relationships among the tables.  Although a bit limited and does take a bit of getting used to, you expand a given database (in the Explorer window) and use "Database Diagrams" to help get pictures of tables and their relationships when FKs are present.  If you're really lucky, you might find some that have already been drawn out by the people that created the database.  It's not the best tool in the world but it's a start.
    

    Thanks for the advice! I will explore Data Diagrams today. Are you working with massive databases professionally? I mean, are the databases you work with involve many tables (50+), some tables with hundreds of thousand to millions of rows? Thanks again

    I don't work with databases as large as some folks do but I do work with what some might call "huge" databases (they're not but that's what some call them) of 2+TB.  I have Non Clustered Indexes that are larger than some folks entire database and tables that are larger than some folks entire enterprise server. 😀  There are a whole lot of folks that fit that same category and some that work with databases in the Petabyte range.  This is a good place for someone to learn a whole lot because of the solutions and advice to problems that people post.

    Looking at your other post about "fundamentals" (and I'll get back to that one, hopefully after work), I have to say that you have precisely the correct attitude and level of humility to do very well in the world of SQL Server and T-SQL.  The fact that you've already correctly identified that you "Don't know what you don't know" (even the really good heavy hitters in SQL Server embrace that understanding) and that you practice every day after work (which many of us still do even after working with it all for decades) will help you excel in your studies and in your career.

    There's a caution I forgot to add about the use of data diagrams in SQL Server.  If you make a change there (other than moving tables around on the diagrams to make understanding a bit easier) and save them, you WILL change the underlying objects in the database.  Be careful.

    Like I said, I'll try to get back to your other post on "fundamentals" tonight and add a bit more over the next several days. 

    As for where you're at right now, it's how many of us started.  Most of us don't have the time to become a full time private mentor but this site has some incredible talent to learn from.  It's not just a group of individuals... it's a true community.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Assuming that the relationships have been properly baked into the SQL database schema, then there are tools out there to help you visualise the connections / relationships.  For example, Redgate's Dependency Tracker - https://www.red-gate.com/products/sql-development/sql-dependency-tracker/ - has a 14 day free trial.

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • I think it’s an excellent question 🙂, and I wish more people would ask it. The first obvious question is “what the heck is this thing supposed to do?” If you know that, then you can make some guesses things that should be there. For example, if this schema deals with vehicles, then you’d expect to see a VIN. If it deals with addresses, then you’d expect to see CASS standards applied to the addresses. If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    I also have a suggestion that while you’re inspecting the schema, you do what should have been done at the start and build at least a data dictionary. The thing that will drive you nuts is when the same data element has multiple names. Sometimes this guy is a “customer_id”, sometimes “cust_id”, sometimes “ry1234” or worse. I truly wish people would learn the basic ISO 11179 naming rules; it won’t prevent bad names, but it will at least minimize them. You

    The other thing that will drive you nuts is that the same data element under the various pseudonyms and aliases will be cast in different datatypes. Sometimes he’s CHAR(10), or maybe VARCHAR (50) or maybe he’s numeric. This gets be real cute problem when you try to use views and at various levels of nesting, the data element name changed.
    After that, I’d like to see at least a high-level ER diagram. There are various tools for this.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    I think it’s an excellent question 🙂, and I wish more people would ask it. The first obvious question is “what the heck is this thing supposed to do?†If you know that, then you can make some guesses things that should be there. For example, if this schema deals with vehicles, then you’d expect to see a VIN. If it deals with addresses, then you’d expect to see CASS standards applied to the addresses. If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    I also have a suggestion that while you’re inspecting the schema, you do what should have been done at the start and build at least a data dictionary. The thing that will drive you nuts is when the same data element has multiple names. Sometimes this guy is a “customer_idâ€, sometimes “cust_idâ€, sometimes “ry1234†or worse. I truly wish people would learn the basic ISO 11179 naming rules; it won’t prevent bad names, but it will at least minimize them. You

    The other thing that will drive you nuts is that the same data element under the various pseudonyms and aliases will be cast in different datatypes. Sometimes he’s CHAR(10), or maybe VARCHAR (50) or maybe he’s numeric. This gets be real cute problem when you try to use views and at various levels of nesting, the data element name changed.
    After that, I’d like to see at least a high-level ER diagram. There are various tools for this.

    Wow, this is gold! I had no idea these are some things I must look out for. So far my knowledge and practice is based entirely on sample databases, so I have not encountered any of the issues you described. Do you know of any resources that provides a "check-list" of what I need to look for to ensure the entire database tables and columns are consistent? If there isn't anything like that then I will make such a list on my own and share on this forum for other members to add to the list. Thanks so much for your advises!

  • you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    @JoeCelko, I'd be interested to see a reference which backs this up, if you have one.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations..

    So anybody using a digital calculator for money and currencies is committing a crime ?   😀:sick:

    Ben

  • It's entirely possible the DBA can't help much when it comes to understanding the logical model of the database, especially if it's part of a 3rd party vendor application, because the DBA simply had no role in developing it in the first place. Even the most popular CRM and accounting packages that organizations spend millions of dollars per year to license have database designs that are incomprehensible when it comes to naming conventions and normalization. You may even find that most of the tables in the database are either legacy or just there to support a feature set that isn't relevant to the version of the application you're running.

    But even if the logical database model is deficient in terms of self documenting, the SQL language itself is very declarative, so you can learn a lot about the form and function of a database model by examining the SQL queries contained in views and stored procedures. Also, if you want to know where something like the month end report gets it's data, you can run a SQL Profiler trace while the report is running and capture the SQL queries that way.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Tuesday, November 20, 2018 8:40 AM

    It's entirely possible the DBA can't help much when it comes to understanding the logical model of the database, especially if it's part of a 3rd party vendor application, because the DBA simply had no role in developing it in the first place. Even the most popular CRM and accounting packages that organizations spend millions of dollars per year to license have database designs that are incomprehensible when it comes to naming conventions and normalization. You may even find that most of the tables in the database are either legacy or just there to support a feature set that isn't relevant to the version of the application you're running.

    But even if the logical database model is deficient in terms of self documenting, the SQL language itself is very declarative, so you can learn a lot about the form and function of a database model by examining the SQL queries contained in views and stored procedures. Also, if you want to know where something like the month end report gets it's data, you can run a SQL Profiler trace while the report is running and capture the SQL queries that way.

    Hi Eric,

    Do you know what type of SQL queries I can run to help with this? I tried searching the Microsoft documentation page, but am not sure where I should look for this. Any resource or advice you may have will be highly appreciated

  • ben.brugman - Tuesday, November 20, 2018 5:26 AM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    So anybody using a digital calculator for money and currencies is committing a crime?   😀:sick:

    Rounding might actually be worth writing an article. Good idea! I haven't looked at the current rules in a while, so that would make me do some research. Actually, I've seen someone get in trouble for using a pocket calculator.

    He rounded every item in a long list either up or down (I camber which direction), instead of varying it from item to item. Then he did his summation. The errors accumulated in one direction and messed up the final results. Suddenly, what had been done on the mainframe didn't match his calculations.

    ToEven is also known as 'Banking Rules', it the default used in IEEE 754 floating-point standards, which is why it's the default in .NET.

    Conversely, AwayFromZero is also known as 'Commercial Rounding'. I don't know why it is the default of SQL Server. It's very widely known and SQL has been used in commercial environments.

    Euro conversion gets even weirder. If you go from one non-euro currency to another, you have to convert the first currency to euros, then convert the euros to the second currency. It's called triangulation. The problem is that transactions occur so fast that the rates can change during the calculations - ARRGH :sick:!

    I'm going to assume by now everybody seen the problems with the old Sybase MONEY datatypes. Basically, if you do multiplication and division, it rounds too soon and only carries things out to four decimal places.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Tuesday, November 20, 2018 10:28 AM

    ben.brugman - Tuesday, November 20, 2018 5:26 AM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    So anybody using a digital calculator for money and currencies is committing a crime?   😀:sick:

    Rounding might actually be worth writing an article. Good idea! I haven't looked at the current rules in a while, so that would make me do some research. Actually, I've seen someone get in trouble for using a pocket calculator.

    He rounded every item in a long list either up or down (I camber which direction), instead of varying it from item to item. Then he did his summation. The errors accumulated in one direction and messed up the final results. Suddenly, what had been done on the mainframe didn't match his calculations.

    ToEven is also known as 'Banking Rules', it the default used in IEEE 754 floating-point standards, which is why it's the default in .NET.

    Conversely, AwayFromZero is also known as 'Commercial Rounding'. I don't know why it is the default of SQL Server. It's very widely known and SQL has been used in commercial environments.

    Euro conversion gets even weirder. If you go from one non-euro currency to another, you have to convert the first currency to euros, then convert the euros to the second currency. It's called triangulation. The problem is that transactions occur so fast that the rates can change during the calculations - ARRGH :sick:!

    I'm going to assume by now everybody seen the problems with the old Sybase MONEY datatypes. Basically, if you do multiplication and division, it rounds too soon and only carries things out to four decimal places.

    That's great information and I absolutely agree about the problems with using the MONEY datatype in currency calculations but what does this have to do with the "floating-point" calculations you spoke of previously?  What do you use, instead?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

  • primitivefuture2006 - Tuesday, November 20, 2018 10:08 AM

    Eric M Russell - Tuesday, November 20, 2018 8:40 AM

    It's entirely possible the DBA can't help much when it comes to understanding the logical model of the database, especially if it's part of a 3rd party vendor application, because the DBA simply had no role in developing it in the first place. Even the most popular CRM and accounting packages that organizations spend millions of dollars per year to license have database designs that are incomprehensible when it comes to naming conventions and normalization. You may even find that most of the tables in the database are either legacy or just there to support a feature set that isn't relevant to the version of the application you're running.

    But even if the logical database model is deficient in terms of self documenting, the SQL language itself is very declarative, so you can learn a lot about the form and function of a database model by examining the SQL queries contained in views and stored procedures. Also, if you want to know where something like the month end report gets it's data, you can run a SQL Profiler trace while the report is running and capture the SQL queries that way.

    Hi Eric,

    Do you know what type of SQL queries I can run to help with this? I tried searching the Microsoft documentation page, but am not sure where I should look for this. Any resource or advice you may have will be highly appreciated

    Seeing what a view or stored procedure contains, code-wise, is actually fairly easy and doesn't require any software beyond Management Studio (SSMS)  Simply pick a view or stored procedure, right-click it, and choose the option "Script {Item} as -> Create To -> New Query window"  This will generate the code necessary to create the object in question, which you can then review at your leisure.

    So, with a view (which is basically a saved select query) you can see what tables it's joining together and on what column(s), which will give you some idea as to the relationship between those two tables.

    One suggestion I've not seen as well, if you want to be very paranoid about poking around in a "live" database, take a backup of it and restore it to a separate server, so that if you do break something in the database (oops, I dropped the logins table) you didn't break production.  (Note:  Check with your management, etc about this first, it may not be allowed!)

  • frederico_fonseca - Tuesday, November 20, 2018 12:43 PM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

    I'm going to give Joe a pass on this one (lol like he cares whether I give him a pass or not). I think its generally agreed not to store money in floats, unless you restrict the number to having no decimal points, and even then, why not just use integers if you're going to count pennies, etc 

    The closest I found to an actual regulation was this bit on the web:

    https://opendata.stackexchange.com/questions/10346/what-specifications-are-out-there-for-the-precision-required-to-store-money

    I doubt that'll satisfy the folks who just want to catch Mr. Celko typing something wrong on the internet but at least the link chats about the subject.

  • patrickmcginnis59 10839 - Wednesday, November 21, 2018 6:47 AM

    frederico_fonseca - Tuesday, November 20, 2018 12:43 PM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

    I'm going to give Joe a pass on this one (lol like he cares whether I give him a pass or not). I think its generally agreed not to store money in floats, unless you restrict the number to having no decimal points, and even then, why not just use integers if you're going to count pennies, etc 

    The closest I found to an actual regulation was this bit on the web:

    https://opendata.stackexchange.com/questions/10346/what-specifications-are-out-there-for-the-precision-required-to-store-money

    I doubt that'll satisfy the folks who just want to catch Mr. Celko typing something wrong on the internet but at least the link chats about the subject.

    Absolutely agreed on this one.  That's why I asked Joe what the specs actually were (a reference link or two from him would be great) and what he used in such cases.  Way too many people "format" the interim answers way too early in a series of calculations and then wonder why Granny with her 4 function calculator is able to catch "penny errors" on things like mortgage interest calculations, etc, .  Add to that the silent automatic severe reduction in scale that can occur with the Decimal and Numeric datatypes  (which some folks aren't aware of) in SQL Server and you can end up with some real problems. 
    https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-2017

    Of course, there are also problems with FLOAT datatypes, as well, and so it would be interesting and helpful if he let us know how he solves these types of problems, especially when it comes to meeting the requirements of EU and GAAP regulations.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ben.brugman - Tuesday, November 20, 2018 5:26 AM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations..

    So anybody using a digital calculator for money and currencies is committing a crime ?   😀:sick:

    Ben

    Yes. Once you turn your calculator off, it loses the account balances. This goes against GAAP at least!

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply