What Counts for a DBA: Skill

Question

What Counts for a DBA: Skill

Louis Davidson (@drsql)

SSCommitted

Points: 1664
More actions
August 14, 2021 at 12:00 am

#3919398

Comments posted to this topic are about the item What Counts for a DBA: Skill

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

ian.jones-776848 Newbie Points: 2 More actions · Answer 1

Nice article Louis, but it's missing one of the other essentials steps on the road to becoming an expert (or just less dangerous) i.e. test whether somebody else's brilliant idea actually works. There's nothing like the internet for spreading a really bad idea, and coding (of all sorts) is no exception. But real knowledge comes when you take something that works and push it (apply it everywhere inadvisably) until it breaks to find out when it's useful and when it's not. Experimentation like this isn't encouraged on any taught course (that I know of/have encountered), or in most work places, but you can sneak it into a vague specification, or testing...

Rod at work SSC-Dedicated Points: 34109 More actions · Answer 2

Excellent article, Louis. I especially like the idea you're presented of the difficulty in finding good training on a skill. Too often, when I'm trying to learn a new skill (to me) I'll come across some websites that claim to train you in the skill, promising their instruction to be the best on the planet. But then you just learn what somebody knows of it and nobody else uses that technique. Not helpful.

Kindest Regards, Rod Connect with me on LinkedIn.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 3

Great article, Louis.

I've found that you have to be very, very careful in studying. It's really spooky out there.

For example... I've seen several articles from people that supposedly know what they're doing that said using recursive CTEs is a best practice set-based method for replacing while loops that operated based on single-row incremental computations. I've seen people create "Holy Grail" articles that "prove" that XML splitters will beat Tally Table-like splitters hands down. And almost the whole world (including the previous me) adopted some "Best Practices" for index maintenance more than 20 years ago because of some super-simple, super-unfortunate miswording in official Microsoft-supported documentation that has also mistakenly poo-poo'd the use of a remarkable datatype, its remarkable uses, and some incredible methods for performance. How about the "Best Practice" of doing things like lowering a Fill Factor until fragmentation stops, always using IDENTITY columns for virtually everything. or even defragging indexes just because they have logical fragmentation (which should never be done if they have a "0" Fill Factor, BTW). Of course, it's a "Best Practice" to not use Random GUIDs on a Clustered Index, right? Use NEWSEQUENTIALID() instead, right?

And then there are the "expert" changes that have been made by Microsoft such as the equivalent of the permanent invocation of Trace Flag 1117 on TempDB with no way to disable it even on a necessary temporary basis or how about LOBs defaulting to in-row which permanently destroys page density of the Clustered Index. Yeah... both of those are just two examples of how supposed "Best Practices" have put the screws to people without them even knowing about it.

I agree that forum sites like SSC (not SO) are wonderful for learning because you get the best of both worlds and those are real life problems and multiple solutions along with discussions that explain the pro's and con's of everything, even the original question itself.

With all that, the only way to flourish is to remember to doubt opinions enough to test them with code and doubt other people's tests and the data they used to conduct the test. Remember that if the test data doesn't look like production data, the code might fail miserably when it does go to production no matter how many tests you've run. A classic example of this is poor people that have been duped into thinking the use of what people call "XML Splitters" to split (for example) is better than Tally-based splitters. Several "Holy Grail" articles have been written on the subject where the tests "prove" it. The problem is, the test data used even on million row test table has a cardinality of somewhere between 1 and 10 because the authors either didn't know how to build life-like test data or were too lazy to. It turns out that XML absolutely shines on large rowcounts that have a super low cardinality. That's usually not anything like what your real data looks like.

So, question everything, learn how to do your own tests, and learn how to build lots of test data correctly. That will help you to develop two of the most important skills a DBA can ever have... very strong bullshit baffles and the ability to prove it in no uncertain terms. 😀 There's only one way to get there... taking the time to practice.

If you don't spend the time, then you should add "victim" to your resume. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

will 58232 SSC-Addicted Points: 479 More actions · Answer 4

There is a great deal of misunderstanding about relational DBMSs and you will find a lot of this misunderstanding out there on the internet.

For example, people talk about a "relational store". This doesn't make sense. One of the most important innovations that the RDBMS introduced is that the logical level is completely separated from the physical implemention. You can change the method of physically representing the data completely without changing the logical level. SQL is a logical API to the underlying physical structure.

Some programmers struggle with learning to think declaratively. I understood the advantages very quickly but it took me while for it to become my natural way of thinking about programming. If you have programmed your own nested loops in COBOL then a join statement in SQL immediately looks like a huge productivity benefit.

Likewise it was a struggle at first to understand the difference between a variable in the accustomed programming sense - and one in the declarative sense - it seems at first a little strange to think about a variable that can't change.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 5

will 58232 wrote:

There is a great deal of misunderstanding about relational DBMSs and you will find a lot of this misunderstanding out there on the internet.
For example, people talk about a "relational store". This doesn't make sense. One of the most important innovations that the RDBMS introduced is that the logical level is completely separated from the physical implemention. You can change the method of physically representing the data completely without changing the logical level. SQL is a logical API to the underlying physical structure.
Some programmers struggle with learning to think declaratively. I understood the advantages very quickly but it took me while for it to become my natural way of thinking about programming. If you have programmed your own nested loops in COBOL then a join statement in SQL immediately looks like a huge productivity benefit.
Likewise it was a struggle at first to understand the difference between a variable in the accustomed programming sense - and one in the declarative sense - it seems at first a little strange to think about a variable that can't change.

You said a mouthful there!

I wrote my own "mini database" a long time ago. I didn't know it at the time but I'd written kind of a "Clustered Index" (non-B-Tree but similarly keyed and contained the data) and kind of several Non-Clustered Indexes that referred back to the "Clustered Index". I also wrote a special sort that beat the hell out of "Shell Sort" and all the other sorts that I could find. I based it on how the old IBM card sorter worked. At the time, that type of sort didn't even show up in books or on the early internet.

Then my boss hired a consultant on another project and he introduced me to SQL Server (6.5 at the time) and I made a very early realization that has made things a whole lot easier for me than what most go through. I made the realization that a SELECT was a "macro" for opening a file, setting up what I wanted to read from the file, starting a loop, and reading from the file until it ran out of rows that didn't meet the condition of what I wanted to read.

I was immediately in 7th heaven because I no longer had to write actual code or loops or anything like I used to have to do. You should have seen the joy in my expression when I did my first ORDER BY. And then there was the dates! I had previously written code to convert a date to a "date serial number" with 1900-01-01 as the base date of "0". Heh... and there it was in SQL along with some date functions to do all the stuff I'd written code to do! Oh, the things I could now do without having to write ton's of code for!

As an "ex-data-programmer", I have no idea why people have such a difficult time in understanding that a SELECT, at it's most basic, is nothing more than a "macro" for all the junk we used to have to do to read data from files. R.Barry Young, a former champion from this site, referred to them as "Pseudo-Cursors" and I've embraced that term since the first time he spoke it on this site.

I think that a huge part of the reason why new and even some experienced people have such a difficult time with SQL is because they either don't know how to flowchart at the functional level in their minds or they don't make the connection that SELECT is just a box on one for previously complicated code that would open a file, do a whole bunch of loops to do something, and then close the file.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)