February 26, 2025 at 12:00 am
Comments posted to this topic are about the item Why use Tally Tables in the Fabric Warehouse? Data Engineering in Fabric
John F. Miner III
Microsoft Data Platform MVP
a.k.a. - The Crafty DBA
February 26, 2025 at 2:20 am
Hi, John. Gosh... it's been a long time since we've seen each other in person.
Nice article and thank you for the honorable mention.
Just to add a bit of a clarification, I haven't used an actual Tally Table in a very, very long time and I haven't populated one "in production" using references to any tables in even a longer period of time. Instead, I've used a "Moden-ized" variation of Itzik Ben-Gan's GetNums iTVF (inline Table Value Function), which is based on what I call "cCTEs" (Cascading CTEs and not to be confused with recursive CTEs or rCTEs), which also have the nice property of being read-less.
My big question, though, is any sequence generator made by "us" still a requirement? Since FABRIC is supposed to be the latest and greatest, are you telling us that it doesn't have anything like a built-in "Generate_Series()" function?
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2025 at 9:10 am
Forgive (unwashed!) isn't generate_series() available?
February 26, 2025 at 3:53 pm
Yes, Generate Series() function was add in SQL Server 2022. Waiting for Steve Jones to add a PS at the bottom to include this detail. I did not do testing with it since I wanted to show that systems tables can not be used in distributed queries. I suggest you do your own research on timings and report back. Thanks - J
John F. Miner III
Microsoft Data Platform MVP
a.k.a. - The Crafty DBA
February 26, 2025 at 3:58 pm
Hi Jeff,
I think the last time we chatted in person, Grant, you and I were in a class room talking about how to manipulate bit patterns form a Application Specific Integrated Circuit that was stored as a var binary data type.
I actually have been doing a lot more work with Data Lakes and Data Warehouses now adays. The Generate Series was an oversight of mine since it has been a while since they allowed me to work with SQL Server.
I am curious how it performs compared to the tally table. Regardless, RBAR is extremely painful in Fabric since each insert is translated into a parquet and json file (delta file format).
If you are ever in New England, give me a ring. We will grab a beer and seafood down in East Greenwich. It is beautiful place in the summer.
Take Care,
John Miner
The Crafty DBA
John F. Miner III
Microsoft Data Platform MVP
a.k.a. - The Crafty DBA
February 26, 2025 at 8:50 pm
I am curious how it (GENERATE_SERIES function)performs compared to the tally table.
It's nasty fast. It's not as flexible as you'll find in other flavors of SQL because it "only" returns numeric sequences instead of other things like a sequence of dates and/or times but that's an easy leap to make for most folks to make.
The is one very nasty problem with it for me because I build a whole lot of test data. Please see the following article on the subjects of test data generation and "Minimal Logging", which can also cause issues in production if you try to use it with "Minimal Logging".
https://www.sqlservercentral.com/articles/generate_series-breaks-minimal-logging
Needless to say, I won't be dumping my fnTally function anytime soon.
Regardless, RBAR is extremely painful in Fabric since each insert is translated into a parquet and json file (delta file format)
Heh... gotta love it. What a great way to punish those that use RBAR.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy