February 8, 2023 at 3:42 pm
Jonathan AC Roberts wrote:ScottPletcher wrote:RonKyle wrote:I would not prefix a view with tbl. In any case I rarely use views in an OLTP environment. I do make extensive use of them in my OLAP designs per Kimball's suggestion from his 3rd edition book.
No, of course you don't deliberately prefix a view with "tbl". You create a table, wrongly prefix it with "tbl", and then later that table needs to become a view instead. No way you're going to rename everything, so now the name is misleading from then on.
Or vice versa: a "view_whatever" needs to become a table instead.
It's a terrible idea to put "tbl" or "view" in the table name (ditto for "proc" on general principles). It is a very poor naming practice.
I agree about not prefixing table names with anything but sometimes it could be useful to prefix views with something, maybe "v_". Then if you are looking at a complex query someone else has written you would know where the data is coming from. Microsoft prefix most of their system stored procedures with "sp_".
"V_" still a terrible idea.
I prefix procs I need to put in the master db with "sp_", otherwise they don't the "special" property that the "sp_" stands for. "sp_" does not stand for "stored procedure" as everyone assumes. It allows procs in the master db to by default run in the current db context, for example, sp_help.
I never prefix a view with anything when I design it but I have worked on systems where all views have the same prefixed and I've never found it to be much of an issue.
Thanks, I didn't know that about the sp_ procedures.
February 8, 2023 at 5:37 pm
>> I'd say ZERO chance that the professor id is some type of tax number. The security implications of that alone would prevent it. For many, their tax number would be their Soc Sec Number -- only an IDiot would want to use that a key that the entire IT staff would be privy to. Unless they were issued a fed tax number (EIN), but I don't see why most university profs would need those. <<
Having actually been a college professor (several times at several different schools) , I know that we do use the SSN in personnel systems. The issue of security is handled in a different level. The entire IT staff is not privy to personnel records. We create views, masking, other levels of access. Nobody spends any learning DCL; a lot of SQL programmers don't even know that it's the third sub language! In fact, the first draft of X3H2's SQL allowed you to GRANT, but not to REVOKE privileges from users. We were just trying to get out a standard as fast as possible, to keep SQL from mutating into dialects like BASIC had done.
One of my favorites from my AIRMICS days was an approach created by the U.S. Navy. It actively lies to users depending on their clearance. Jimmy Olsen is shown information that leads him to believe that "Clark Kent is a mild-mannered reporter for a great metropolitan newspaper." However, Lois Lane can find out that "Clark Kent is secretly Superman!" For financial systems, the security system allowed only aggregations to be reported and it would also mess them up. Thus, nobody can get to their own payroll information and give themselves an unexpected raise.
There was an old Doonesbury cartoon strip where one of their characters was working for a company was very proud of the fact that when they embezzled a certain amount of money, the system would automatically move it to a Swiss bank account and additionally book a flight to Zürich without human intervention. AI at its finest!
Please post DDL and follow ANSI/ISO standards when asking for help.
February 8, 2023 at 6:51 pm
>> ... joining hundreds of millions or billions of rows, dealing with big Cartesian products or when involving things like column store indexes. I've experienced it myself extracting data from a database that used GUIDs for primary keys, artificially generating an integer to replace the guid keys, then extracted into another table greatly sped up queries with many joins. <<
People seem to forget that GUID or UUID stand for "global" or "universal" identifiers. We never intended them to be used inside the database. They were supposed to reference objects outside of the database. But when your design stinks and you need a kludge, you can use one of them without any thought, planning or design. When I first moved to Austin, I worked for a company that used GUID in some of their software products instead of a true relational key. All they wanted was a "unique-ifier", like a pointer was in the old databases., The results quickly became unworkable because you're always having to do garbage collection. Before you knew it you had exceeded disk space. Your trick of replacing the GUID with some integer just adds another layer of complexity – – how did you tie each integer back to a particular entity? How did you make sure you never use the same integer twice?
The IDMS and IMS database systems that you're imitating this way had utility programs to rebuild the pointer chains, and do garbage collection on them. If you worked with the C programming language, you probably have seen a simple version of this with malloc. In a multiuser database, it can get very complicated.
Please post DDL and follow ANSI/ISO standards when asking for help.
February 8, 2023 at 7:46 pm
>> This is not a design error. It does not violate the normal forms. <<
Normal forms are not the only kind of design error. ISO 11179 and the Metadata Committee have other things that they don't like. In particular, the idea is that a data element name should tell you what the element is by its nature. It should not identify how it is used in one particular instance. As Chris Date says, in RDBMS we want to know "What" and not "how".
>> I might have preferred tblNameOfTable rather than TBL_NameOfTable. So-called tibbling helps avoid reserved words and keeps from having to identify a table as a lookup or some such to avoid another table that would otherwise have the same name but a different function. <<
These data element names should be used in places other than SQL as well. That means they have to fit into software packages, programming languages and whatever else comes along. They will all have different rules for capitalization and allowed characters. Thanks to Unicode, we pretty much agreed that a name should be made up of Latin letters, digits, and a minimal set of punctuation marks that include the underscore.
When the University of Maryland was doing readability studies for the US Army, we found that "tblNameOfTable" is a bad idea. Your eye has been trained from the Latin alphabet to look for an uppercase letter as the start of something – a sentence or word or paragraph. Using that format you just introduced an eye tick into scanning the word. Thanks to centuries of reading typeset material and using lined paper, we are not bothered by an underscore.
The ISO preferred format is <attribute>_<attribute property>, with some other options that I discussed in articles and books. For example, we might have "student_id", "student_hatsize", "student_admission_date" etc. but we know that we can't have "date_of_birth" since it's too generic. We also know that "_date" is an attribute property of something. It's dull as hell, the get a copy of ISO 11179.
Please post DDL and follow ANSI/ISO standards when asking for help.
February 8, 2023 at 8:18 pm
The use of the prefixed "TBL_" is a design error so bad it has a name; it's called the tibble.
This is not a design error. It does not violate the normal forms. I might have preferred tblNameOfTable rather than TBL_NameOfTable. So-called tibbling helps avoid reserved words and keeps from having to identify a table as a lookup or some such to avoid another table that would otherwise have the same name but a different function.
It is a design error. It's just a physical design error rather than a logical design (such as are used to derive normal forms) error.
Give other tables a proper, meaningful name without using tbl. It's really not that hard to do.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 8, 2023 at 8:46 pm
>> ... joining hundreds of millions or billions of rows, dealing with big Cartesian products or when involving things like column store indexes. I've experienced it myself extracting data from a database that used GUIDs for primary keys, artificially generating an integer to replace the guid keys, then extracted into another table greatly sped up queries with many joins. <<
People seem to forget that GUID or UUID stand for "global" or "universal" identifiers. We never intended them to be used inside the database. They were supposed to reference objects outside of the database. But when your design stinks and you need a kludge, you can use one of them without any thought, planning or design. When I first moved to Austin, I worked for a company that used GUID in some of their software products instead of a true relational key. All they wanted was a "unique-ifier", like a pointer was in the old databases., The results quickly became unworkable because you're always having to do garbage collection. Before you knew it you had exceeded disk space. Your trick of replacing the GUID with some integer just adds another layer of complexity – – how did you tie each integer back to a particular entity? How did you make sure you never use the same integer twice?
The IDMS and IMS database systems that you're imitating this way had utility programs to rebuild the pointer chains, and do garbage collection on them. If you worked with the C programming language, you probably have seen a simple version of this with malloc. In a multiuser database, it can get very complicated.
Using GUIDs (randomly generated, not sequential) as a clustered primary key used to be a trick to get SQL to write data randomly within the clustered index for applications that heavily fragment their data in order to reduce bad page splits. Over time, the clustered index will reach an approximately neutral state of fragmentation then whenever a new record is added, it has a chance of writing to empty extents in the database. This was certainly the case with this app. Each incident would be inserted totally empty other than audit attributes, many incidents could be started empty at the same time then data rapidly and regularly updated, some fields set to a wide value then empty, etc before eventually being closed. I doubt that it matters much anymore with SSD.
For generating the integer to use as a primary key to replace the guids - it never needed to be able to able to go back to the source system 1:1. The data product was used for some statistical analysis and for mapping in GIS, only used one time for daily analysis, the report saved and then the data discarded. If the analysts wanted to drill down, they could go back to the system to look up the incident by incident number, but it wasn't perfect. Their primary need was to make a map and do analysis on incident time and type. During daylight savings time, at least a few incidents would get created with incorrect incident numbers and during test failovers to DR sometimes the time of the application server would get off then create incorrect incident numbers. Eventually the incident numbers would get fixed but the analysis of how the incidents in a 24 hours period related to each other were more important than the precise details of each incident.
Viewing 6 posts - 31 through 35 (of 35 total)
You must be logged in to reply to this topic. Login to reply