Naming Conventions - Table and Column Names

Question

Naming Conventions - Table and Column Names

BethF

Mr or Mrs. 500

Points: 596
More actions
September 26, 2017 at 1:18 pm

#333016

I'm pretty old school and I've always created tables and columns following a specified naming conventions that doesn't include spaces or underscores and uses camel case. As I set out to design a new data mart that will be used primarily by non-tech people using Power Pivot, Power BI, and other assorted BI tools, I'm wondering if it is time to change the approach. In the past, I've made SSAS my "presentation layer" with business names but with the abundance of all of these new-fangled BI and reporting tools and pretty smart users, I'm wondering if I need to bring it down to the mart/star schema level. Most of the columns would then end up with a name that includes a space. I know it might be more difficult for us dinosaurs who still write T-SQL but we don't code for us, we code for our stakeholders.
I'm interested to hear others' opinion about this. Particularly if there are any performance implications to using column names with spaces.
Thanks in advance for your feedback!

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply

below86 SSChampion Points: 11582 More actions · Answer 1

IMHO, never ever use spaces. Use underscores to make it more readable to the users. I prefer underscores to camel, but I usually follow what the 'standard' is for the company I'm working for. Most users I've dealt with never complained about the name have an underscore versus spaces.

-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.

xsevensinzx One Orange Chip Points: 25560 More actions · Answer 2

I personally use camel case on everything. I really dislike underscores and trying to break the habit of hyphens mostly because all the veteran SQL Server guys hate having to use square brackets to query data (even though I secretly still use square brackets to square up my code anyways).

UserID, ClientID, FactSales, DimProduct, DMClient, I love it all and see no issue using whatever you like on the backend. When it comes to the presentation layer, I normally name it to whatever the end user is looking for. That means, ProductName is going to be converted to Product Name. Metrics may go from TotalRevenue to Total Revenue By Day or whatever they like. It matters to me not because that's the front-end and I'm responsible mostly for the back-end.

In my experience, there is always big debates on what approach you take. The one thing that will always remain true to all, remaining consistent is the most important part to your decision. Chaotic naming conventions weaken the strongest man (or woman).

P.S

Consistently using square brackets didn't save me. This is likely one of the exceptions to the rule. But, I like to watch the world burn. :Whistling:

Arsh SSCertifiable Points: 6381 More actions · Answer 3

Spaces should never ever be used ... Though they don't make a difference technically , underscores make it more readable. Assigning one fixed length module-name as a prefix (Fin_<tablename/viewname>) makes a good practice, the benefits of which can't be anticipated sometimes. Its a lot of relief when one has a need to develop an application for audits /security / licensing purposes especially where you can easily identify objects which follow one naming convention.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 4

I agree with the idea of avoiding both spaces and underscores for most things. Avoiding spaces should be a "commandment". As with spaces, I don't ever use underscores to merely to separate words. I do, sometimes, use a logical grouping "prefix" separated by an underscore from the rest of the column name.

The bottom line for me is to never format column names based on what people want to see on a report or a screen. That's why there are such things as column aliases.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 5

As for the square brackets, I have gotten in the habit of using them more. And, since I write a lot of dynamic SQL, I have made it even more important to use them in the dynamic code. Some of it is due to column names used in quite a few tables. Some weren't reserved words back in SQL Server 2005 but are now. But more importantly are the bizarre names that I have found for some indexes that were created elsewhere and some of code that worked perfectly fine suddenly failed. Imagine an index name in production that looks something like this: <index_name, sysname, ind_test>.

TomThomson SSC Guru Points: 104768 More actions · Answer 6

I try to avoid have anything that needs quoting to be a valid name, and except in extremely rare circumstances (eg something used as a name suddenly becomes a reserved word) that comletely rules out spaces within table and column names (as well as eliminating any need for square brackets or similar punctuation). I've sometimes given table names prefixes that indicate the general area they are relevant to and seperate that from the rest of the name by an underscore - so for example in an an in-room entertainment and other services system for hotels there could be prefixes like Movie_, Music_, OffficeTools_, Restaurant_,RoomService_, Internet_, Directory_, Telephone_, Messaging_, Email_ and so on. As you can see, I will use CamelCase to indicate word boundaries within a prefix, and also for boundaries within the rest of a name. I also sometimes use the same sort of prefixing for column names, and always have a rule that if the same column occurs in two or more tables (by the same column I mean a column representing the same real-world attribue) all those tables must use the same name for it; this forces the use of table aliases in DML statements involving more than one table unless the writer decides to use full table names to qualify column names (which can take up a lot of space on the page or screen in non-trivial DML statements). It also tends to avoid crazy errors resulting from joining on mismatched columns.

Tom

Ed Wagner SSC Guru Points: 287026 More actions · Answer 7

I avoid both spaces and underscores. Back when I did Oracle, underscores were the norm and everything was converted to upper case anyway. Now, I don't have to use them and don't. I really dislike spaces. When I need to write for the end user, I'll use aliases.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

Ed Wagner SSC Guru Points: 287026 More actions · Answer 8

Lynn Pettis - Wednesday, September 27, 2017 10:26 AM
Imagine an index name in production that looks something like this: <index_name, sysname, ind_test>.

It looks like someone used a template to create an index and didn't bother replacing the name. Sigh.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

Steve Jones - SSC Editor SSC Guru Points: 741290 More actions · Answer 9

I would never allow spaces. It creates far too much confusion and potential issues with "ProductID" and "ProductID ". Even "Product ID" can be misread.
Underscores I dislike, but if that's what's there, I wouldn't refactor.

I did see a design presentation that I liked. This made every column name unique, so that we always knew were data was. This was something like:

create table product ( Productkey int , Productname varchar , productdescription ) create table order (orderkey int , OrderProductKey int references Product(productkey) , qty int )

I'm torn on this. Knowing every item is interseting. Not sure of the value, and if you didn't have DRI this could be a mess, but it was interesting food for thought.

I am a bit fan of describing columns more completely with intellisense and 128 char identifiers. Make it easy to understand the column.

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 10

Ed Wagner - Wednesday, September 27, 2017 11:13 AM
Lynn Pettis - Wednesday, September 27, 2017 10:26 AM
Imagine an index name in production that looks something like this: <index_name, sysname, ind_test>.
It looks like someone used a template to create an index and didn't bother replacing the name. Sigh.

Actually, iirc, it was a bit worse than this but it was a good example. The database was in our lab but has since been deleted from the VMs as it was no longer needed so I can't get the actual name. I seem to recall it also contained unmatched square brackets (at least 1). But yes, someone messed up creating that index so I had to modify my code to handle that possibility in the future.

Michael L John One Orange Chip Points: 27079 More actions · Answer 11

Simply, whatever you pick, make sure it's consistent.

I tend to "think of my fellow man". Someone else is going to look at this.

If we've made up a bunch of abbreviations and acronyms as column names, you've created a situation where incorrect assumptions are likely to cause errors, especially if there are non-techies writing reports. Stick to English words, assuming that English is your company's language. You then eliminate the need for a data dictionary, and reduce the number of phone calls asking "where is the phone number for the clients home stored"? Is that in the hm_phn, home, or phhome column?

Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/

drew.allen SSC Guru Points: 77014 More actions · Answer 12

Steve Jones - SSC Editor - Wednesday, September 27, 2017 11:49 AM
I would never allow spaces. It creates far too much confusion and potential issues with "ProductID" and "ProductID ". Even "Product ID" can be misread.
Underscores I dislike, but if that's what's there, I wouldn't refactor.
I did see a design presentation that I liked. This made every column name unique, so that we always knew were data was. This was something like:
create table product ( Productkey int , Productname varchar , productdescription ) create table order (orderkey int , OrderProductKey int references Product(productkey) , qty int )
I'm torn on this. Knowing every item is interseting. Not sure of the value, and if you didn't have DRI this could be a mess, but it was interesting food for thought.
I am a bit fan of describing columns more completely with intellisense and 128 char identifiers. Make it easy to understand the column.

This certainly has advantages, but it also has disadvantages. When creating JOINs, RedGate SQL Prompt will (by default) recommend columns with the same name, which was very handy. If all of the field names are unique, your option is to either not have suggestions or suggestions based on data type, both of which are less than useful.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

TomThomson SSC Guru Points: 104768 More actions · Answer 13

Michael L John - Wednesday, September 27, 2017 1:40 PM
Simply, whatever you pick, make sure it's consistent.
I tend to "think of my fellow man". Someone else is going to look at this.
If we've made up a bunch of abbreviations and acronyms as column names, you've created a situation where incorrect assumptions are likely to cause errors, especially if there are non-techies writing reports. Stick to English words, assuming that English is your company's language. You then eliminate the need for a data dictionary, and reduce the number of phone calls asking "where is the phone number for the clients home stored"? Is that in the hm_phn, home, or phhome column?

That's good advice, but some may find it hard to follow.

There are still people around who learnt to write code when the typical "high level language" restricted them to identfiers consisting of 6 characters, the first of which must be alphabetic, and the remainder alphabetic or numeric (numerics 0 to 9, alphabetic AngloAmerican A to Z and a to z). And even some who began where programming was spaghetti-weaving (that's wiring a plug board, not writing spaghetti code in an early programming language which was easy to use like that (eg Fortran or Cobol or Algol of C of C++ or any assemly language). Even in modern times, most developers have never learnt a declarative language or even an pseudo-declarative language like SQL, and have been brought up on badly designed languages like C++ or C. or VB.

Tom

Steve Jones - SSC Editor SSC Guru Points: 741290 More actions · Answer 14

drew.allen - Wednesday, September 27, 2017 1:53 PM
This certainly has advantages, but it also has disadvantages. When creating JOINs, RedGate SQL Prompt will (by default) recommend columns with the same name, which was very handy. If all of the field names are unique, your option is to either not have suggestions or suggestions based on data type, both of which are less than useful.
Drew

One of the downsides. I tend to want columns that mean the same thing (FKs) to be named the same. I did like the idea of using the table name with the column.