After years of waiting, a book about Biml has been published! It’s conveniently titled “The Biml Book” (subtitled Business Intelligence and Data Warehouse Automation) and I can tell you immediately it’s the best Biml book I’ve ever read! (haha, it will take a while until this joke gets old). In all seriousness though, it’s a great book and I happily recommend it to anyone who wants to learn more about Biml. In other words, everyone who has to develop more than one SSIS package.
The book has multiple authors (the cover lists 11 authors!) and that is both the biggest strength and the biggest weakness of the book. The strength lies in the fact that this book gathers Biml experts from all over the world, and this is reflected in the book: all of the concepts are clearly explained, from the introductory topics right through to the more advanced ones. And there are plenty of advanced topics to choose from: documentation generation, metadata handling, building custom frameworks and so on. I’ve been using Biml for quite some years now (typically just to load staging environments; I’ve been on Biml since the beginning, I just never went deeper into the rabbit hole) and I definitely learned some new tricks by reading this book (and not only that it is Biml and not BIML).
The downside of having so many authors is that inconsistencies can creep into the book. Most of them are of the “nitpicky” type: some authors use C#, some use Visual Basic, others give examples in both languages. Some use different naming conventions for the Biml files, and some format their SQL code differently than other authors (some don’t even use the semicolon as a statement delimiter. Shame! :) ). The most “annoying” inconsistency is the use of sample databases. In some chapters AdventureWorks is used, other chapters use Contoso. Personally, I’d hoped they would all use WideWorldImporters.
That being said, the book is expertly written (only one typo found) and I’ve only found a couple of bugs in the Biml code. The largest criticism I have has nothing to do with the actual content: at the time of writing, there are no downloadable files for this book. This means that if you are reading the e-book, you need to copy-paste the Biml code into Visual Studio. Copy-pasting from a book can lead to several issues, such as one line of code suddenly being split over multiple lines, page numbers being copied along or whitespace being inserted where it doesn’t belong. For example, copy-pasting <Biml> resulted in < Biml> (with an extra space), which breaks the code. After that, you have to format the XML code as well (but there are tools for that). It’s even worse if you are reading a print copy, because that means you have to type all of the code yourself. I’ve talked with the authors though, and they are working on this, so this is hopefully only a temporary grievance.
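In the meantime, the stray-whitespace problem can be patched up with a small script. The following is a minimal Python sketch and not something from the book: the function name `fix_tag_whitespace` is hypothetical, and it only handles the one artifact described above (whitespace between `<` or `</` and the tag name). Review the result before trusting it, because it will also rewrite a C# comparison such as `a < b` inside a code nugget (to `a <b`, which still compiles).

```python
import re

# Matches "<" or "</" followed by stray whitespace and then a letter,
# i.e. the start of a tag name that got separated from its bracket.
_STRAY_SPACE = re.compile(r"<(/?)\s+(?=[A-Za-z])")


def fix_tag_whitespace(text: str) -> str:
    """Remove whitespace that copy-pasting inserted right after '<' or '</'."""
    return _STRAY_SPACE.sub(r"<\1", text)


if __name__ == "__main__":
    # Paste the copied code into a string (or read it from a file) and clean it.
    print(fix_tag_whitespace("< Biml>\n</ Biml>"))
```

Line breaks and page numbers that ended up in the middle of a statement still have to be fixed by hand; this only addresses the extra space.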
Enough about editorial stuff, let’s look at the actual content. The book explains Biml in three parts:
- An introduction to the Biml language
- Building frameworks with Biml (the most interesting part if you have already written some Biml)
- Other advanced topics
Some chapters include content about BimlStudio. Personally I don’t use the tool, so this meant I could skip some pages. If you do use the tool, there are some very nice features waiting for you. An overview of the chapters:
- Introduction. An explanation of what Biml is and what you can use it for. There are some code examples to show you what Biml can do. If you are already familiar with Biml, you can skip it.
- Chapter 1 – Biml Tools. An overview of BimlExpress (the free Visual Studio plug-in), BimlOnline and BimlStudio. Very useful if you’d like to know what you can do with BimlStudio and if it is worth the purchase (spoiler: the features are impressive).
- Chapter 2 – Introduction to the Biml Language. Still useful if you’ve worked with Biml, as it also explains code nuggets and code directives.
- Chapter 3 – Basic Staging Operations. Useful for people starting out with Biml, as building staging environments is straightforward and the number one use case for getting started with Biml. The chapter is divided into two pieces: one example with BimlStudio and one with BimlExpress. If you don’t use BimlStudio, the chapter is only half the length.
- Chapter 4 – Importing Metadata. A crucial operation in your Biml framework. It explains the differences between ImportTableNodes, ImportDB and GetDatabaseSchema really well. It also touches on loading metadata for flat files and Excel files.
- Chapter 5 – Reusing Code, Helper Classes, and Methods. An overview of how to make your Biml code more reusable. Excellent chapter for the intermediate Biml developer.
- Chapter 6 – A Custom Biml Framework. One of the most important chapters of the book. It explains how you can store your metadata in a database and use this to generate all your SSIS packages through Biml. It dives into the concept of annotations and how they can make your solutions more robust.
- Chapter 7 – Using Biml as an SSIS Design Patterns Engine. A very useful chapter. It explains how you can store separate patterns (such as “truncate&load” and “incremental load”) in separate Biml files. By using metadata, you can apply a different pattern on a package. Very useful.
- Chapter 8 – Integration with a Custom SSIS Execution Framework. This chapter mainly builds on the open-source SSIS Framework Community Edition as an execution framework. If you don’t use it (I have my own framework for example), you can skip this chapter.
- Chapter 9 – Metadata Automation. This chapter is mostly about the metadata models/instances you can create inside Biml. I’m not 100% convinced that storing the metadata twice (once in Biml, once in your database or whatever framework you use) is such an advantage. Maybe for generating documentation? Read the chapter and decide for yourself.
- Chapter 10 – Advanced Biml Frameworks and BimlFlex. Mostly about bundles and BimlFlex, which are both BimlStudio features. I skimmed this chapter.
- Chapter 11 – Biml and Analysis Services. Also a BimlStudio feature, so I skipped the whole chapter.
- Chapter 12 – Biml for T-SQL and Other Little Helpers. This chapter talks about how you can use your Biml metadata to generate all kinds of T-SQL statements. Interesting.
- Chapter 13 – Documenting your Biml Solution. Interesting chapter, but if you use BimlExpress you have to generate the documentation yourself. If you want HTML, this means translating the Biml metadata into XML-type documents. There are some examples of various methods for this in the chapter.
- Chapter 14 – Troubleshooting Metadata. I’m not sure why it’s called troubleshooting as it’s all about collecting metadata. It’s about how you can use the INFORMATION_SCHEMA views to collect metadata of the RDBMS for example. In my opinion, this chapter should be combined with chapter 4.
- Chapter 15 – Troubleshooting Biml. A short chapter on how you can debug your Biml code. It appears I’m the only one who uses MessageBoxes. Very useful chapter, must read.
- Appendix A – Source Control. I already know how to use TFS in Visual Studio, so I skipped this appendix.
- Appendix B – Parallel Load Patterns in Biml. Basically a short example of how to use the Balanced Data Distributor in Biml.
- Appendix C – Metadata Persistence. A very long piece that builds upon the metadata models described in chapter 9. Lots of code. As mentioned before, I was not convinced of the short-term usefulness of this, so I really skimmed this appendix.
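The common thread in chapters 6, 7 and 12 is that a small set of metadata drives everything that gets generated. To make that idea concrete outside of Biml itself, here is a minimal Python sketch (the book does this with Biml and C# or Visual Basic, not Python). The metadata rows, the pattern names and the function `staging_statements` are all hypothetical and made up for illustration.

```python
# Hypothetical metadata; in a real framework this would come from a database.
TABLES = [
    {"schema": "stg", "table": "Customer", "pattern": "truncate_load"},
    {"schema": "stg", "table": "Orders", "pattern": "incremental"},
]


def quote(name: str) -> str:
    """Bracket-quote a T-SQL identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def staging_statements(tables):
    """Return one TRUNCATE statement per table that uses the truncate-and-load pattern."""
    return [
        f"TRUNCATE TABLE {quote(t['schema'])}.{quote(t['table'])};"
        for t in tables
        if t["pattern"] == "truncate_load"
    ]


if __name__ == "__main__":
    for statement in staging_statements(TABLES):
        print(statement)
```

Only the table flagged with the truncate-and-load pattern gets a statement; changing a row in the metadata changes the generated output without touching the generator, which is the point of a metadata-driven framework.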
As you can see, a wide variety of topics are covered in this book, so there’s something for every level of Biml developer. The book is mainly about “how does this work in Biml” and there aren’t that many examples of “how can we achieve this task in Biml” (with chapter 3 being an exception). For example, you won’t find an example of how to load a star schema in this book. The book has all the building blocks, but it’s up to you to put them together. Maybe a follow-up book, “Data warehouse patterns with Biml”, would be a good addition? A book that shows how to load SCD type 2 dimensions, how to generate data vaults, how to handle large fact tables and changing metadata and so on.
Conclusion: a very good book that lays a great foundation for developing Biml frameworks. A must-read for every SSIS/ETL developer.