SQLServerCentral.com Community Update for {date}

Problems displaying this newsletter? View online.

Featured Contents

Question of the Day

Data Migration Assistant Targets

The Voice of the DBA

The Load of Real Time Data Warehouses

If you have a data warehouse, what do you think your ratio of reads to writes is on any given day? Do you think 1:1, as in one read for each write? Is it 10:1, with 10 reads for each write? 100:1? Do you track this in any way?

One would think that most of the databases we work on in the transactional world have many more reads than writes. I'd have assumed the ratios might be higher for data warehouses, where we load data that is queried (read) as the primary use case. After all, I expect that there are lots of people querying data that is loaded into this warehouse, with relatively few changes.

I saw a presentation recently of a paper from Amazon that analyzed Redshift workloads (Why TPC is Not Enough). Redshift is the Amazon AWS data warehousing platform that has proven to be very popular in many companies as a cloud warehouse. One interesting piece of analysis is that there are more reads than writes, but barely. There is a lot of insert/update/delete activity, which is different than the TPC benchmarks for warehouses (TPC-H and TPC-DS). Those benchmarks tend to be 75+% reads. Hence the paper and analysis.

However, Redshift isn't like that in the real world. I had a chance to chat with one of the authors, and with another ETL specialist and they noted that the current pipelines that many companies use aren't running in batch mode, but run in constant mode, moving data almost constantly. Combining this with relatively few queries from smaller groups of people results in a fairly close ratio of 1:1 for reads to writes. That was quite surprising to me.

Presumably, some of this is because people will run relatively large queries for a report, and then spend time analyzing the data while the ETL/ELT processes continue to move data to the warehouse. Much of the design for storing data in warehouses, especially columnar ones, is with the idea that the data doesn't change much. Certainly, columnstore indexes perform better when they are being read, not necessarily when they are updated.

I wonder how much of this architecture of constant writes and updates has driven the world towards a Lakehouse architecture where data is written out and then transformed into cleaner versions that are copies, albeit some of them smaller than the original. Often this data is also written to separate delta/parquet files as well, which means new writes usually occur to discrete objects rather than a managed database structure. From a logical thought standpoint, that seems to make more sense to me, even though I still tend to think most of us could use a relational structure for warehousing data.

If you have a real-time warehouse being updated, perhaps you want to measure your ratio of reads to writes and possibly rethink how to manage the system. If you don't have a specific warehouse, which is most of us, you probably like querying the real-time updates of your transactional data in the same place as it's stored. Either way, it's interesting to think about the impact of a real-time workload on the performance your users experience.

Steve Jones - SSC Editor

Join the debate, and respond to today's editorial on the forums

Featured Contents

How to Delete Large Amounts of Data in Microsoft SQL Server

AR from SQLServerCentral

A short look at some of the options for deleting lots of data from a SQL Server table.

Introducing the PASS Data Community Summit Keynotes from Microsoft, Redgate, and a Community Star

Additional Articles from PASS

After meeting the stellar pre-con speakers and exploring the 5 learning pathways at #PASSDataSummit 2024, we’re excited to reveal the keynote lineup. Each day will kick off with inspiring keynotes from Microsoft, Redgate, and a special community-driven session led by industry experts. Don’t miss out on these insightful sessions designed to energize and empower you throughout PASS Summit 2024.

Uniqueness, Keys, and Identity

Additional Articles from SimpleTalk

If you’ve ever had a traditional logic course, you’ll have run into “The Law of Identity” as the founding principle of all Western thought. It says that: “To be is to be something in particular; to be nothing in particular or anything in general, is to be nothing at all.”

From the SQL Server Central Blogs - Getting Started with Azure SQL Database – A Beginners Guide

John Morehouse from John Morehouse | Sqlrus.com

Last Updated on September 24, 2024 by John Morehouse Years ago, I saw the wave of the future and that wave was Cloud. The Cloud is not a fad...

From the SQL Server Central Blogs - SQL Query Stats via Powershell

alevyinroc from FLX SQL

Sometimes, work tasks and questions that come up in the SQL Server Community Slack coincide.
A couple weeks ago, DrData asked
With SET STATISTICS IO ON, is there really no way...

The Definitive Guide to DAX: Business Intelligence for Microsoft Power BI, SQL Server Analysis Services, and Excel

Site Owners from SQLServerCentral

Now expanded and updated with modern best practices, this is the most complete guide to Microsoft’s DAX language for business intelligence, data modeling, and analytics. Expert Microsoft BI consultants Marco Russo and Alberto Ferrari help you master everything from table functions through advanced code and model optimization.

Question of the Day

Today's question (by Steve Jones - SSC Editor):
	Data Migration Assistant Targets
In the current (2024) version of the Data Migration Assistant, which on-premises targets are available?
Think you know the answer? Click here, and find out if you are right.

Yesterday's Question of the Day (by Steve Jones - SSC Editor)

How are statistics stored?

How are the statistics for query optimization stored in SQL Server?

Answer: As a BLOB

Explanation: Statistics are stored as a BLOB, and include a histogram in the object. Ref: Statistics Components and concepts - https://learn.microsoft.com/en-us/sql/relational-databases/statistics/statistics?view=sql-server-ver16#DefinitionQOStatistics

Discuss this question and answer on the forums

Database Pros Who Need Your Help

Here's a few of the new posts today on the forums. To see more, visit the forums.

SQL Server 2016 - Development and T-SQL

Help with index rebuild - Hi there! This should make some eyes roll for some of you DBA's out there. Disclaimer: I am not a DBA and shouldn't be trying to play one. I desperately need help in trying to remedy a problem I caused on our 2016 SQL Server (Std Edition). We realized that the weekly maintenance job hadn't […]

SQL Server 2019 - Administration

sql service account - We are trying to update a sqlservice account from a user account to a domain account. The service runs fine under the user account but when we update it to a domain account we face the following error and the sqlservice does not start. Error: SQL Server is attempting to register a Service Principal Name […]

Calling Invoke-Sqlcmd from powershell and passing an object containing a '-' - Invoke-Sqlcmd -Query "ALTER AVAILABILITY GROUP $AAG REMOVE DATABASE $DB " -ServerInstance $Instance Invoke-Sqlcmd : Incorrect syntax near '-'. Msg 102, Level 15, State 1, Procedure , Line 1. Hello all, As you can see from the sample above $AAG contains 'MY-AG' (not the real AG name but the hyphen is in the name) results in […]

SSMS V20 encryption vs SQL Server SSL Encryption - I assume SSMS V20 encryption and SQL Server SSL encryption are different. I have a few queries. Please see below. Do these two encryptions conflict? Can we make 'SQL Server SSL Encryption' optional? So we can control which applications to bypass SSL and force others? Is there any other things we should be aware of […]

List of objects that require VIEW SERVER STATE permission - Hi all, I need a list of objects (DMV's and probably other objects) that require VIEW SERVER STATE permission. Our DB security team is asking me why I need it. I can only recall sys.dm_exec_requests from the top of my head. Will appreciate of somebody would paste a whole list. Thanks

sql monitoring tool - Has anybody had any experience with Nagios as a SQLServer monitoring tool. What were the pros and cons you had faced while using it. Please advise.

SQL Server 2019 - Development

max and min dates - Hi I am trying to get the first and last days for the following where date_next_compare is less than 28 but having more than that would mean the max date for that range. Sorry if isn't clear. I currently have this: I want to see two rows like below. I tried doing min and […]

Pivot Mistake - Hi all, I have this query where I make a pivot to extract the data segmented by Hours: select ‘Amount’ as Amount, [0] ,[1] ,[2] ,[2] ,[3] ,[4] ,[5] ,[6] ,[7] ,[8] ,[9] ,[10] ,[11] ,[12] ,[13] ,[14] ,[15] ,[16] ,[17] ,[18] ,[19] ,[20] ,[21] ,[22] ,[23] from (select sum(efeneg) ‘Amount’,left(hormsg,2) ‘Time’,left(hormsg,2) ‘Time’,left(hormsg,2) ‘Time’. from […]

SQL Azure - Administration

Azure Managed Instance Backup Schedule with SQL Job - Lets do configure backups in my own in Azure Managed Instance......My Lead has directed me to configure regular backups scheduled as like in traditional SQL Server...Please suggest how to make this possible to create Job that will take backups daily/weekly in Azure Managed Instance.

General

What is Azure Data Studio or GitHub asking me? - I have an Azure Data Studio question (or maybe it's a GitHub question, I'm not sure), but I couldn't find a forum here to ask it. So, please excuse my asking it here. I am busy writing a T-SQL script, using Azure Data Studio. I committed my changes locally, then I tried pushing them to […]

Analysis Services

Cannot connect to Analysis Services via Excel - For myself this seems to work and for other users they receive this error: Errors in the OLE DB provide. We could not establish a connection to the "Server/Instance" Analysis Services server. Please verify that you can connect to this server, and that the SQL Browser redirector service is enabled. This a 2016 SSAS Tabluar […]

SQL Server 2022 - Administration

Memory grants pending root cause - Recently I got sql alerts from one of QA servers, every 2 or 3 hours. DESCRIPTION: The SQL Server performance counter 'Memory Grants Pending' (instance 'N/A') of object 'Memory Manager' is now above the threshold of 0.00 (the current value is 1.00).COMMENT: Ideally there should be no processes waiting for memory and this should = […]

SQLExternalMonitoring what is it in sp_who2 result - If I run sp_who2, one of item in the list is showing using program SQLExternalMonitoring , what is it? it uses quite a bit cpu and ram. 70 sleeping NT AUTHORITY\SYSTEM SNOCSMASQP01 . master AWAITING COMMAND 170846212 57458731 09/26 10:31:41 SQLExternalMonitoring 70 0 Thanks,

Linked Server to SAP HANA database. - Greetings all. It's been a very long time, but I'm back. Does anyone have experience with setting up a linked server to a SAP HANA database? I have hanaclient-2.8.20-windows-x64 installed and a DSN created using the HDBODBC driver that comes with the client install. I am able to use the DSN to connect with generic […]

SQL Server 2022 - Development

Check sequential order of status of items - There is a table tblItems with below structure ( i cannot change the below table structure) Objective is to check if the sequential order of the statuses is correct or not. Order of the statuses should be in this sequence : QualityCheck, Printing, Labelling, Packing, Dispatch If the status is not in this order for […]

This email has been sent to {email}. To be removed from this list, please click here. If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com. This newsletter was sent to you because you signed up at SQLServerCentral.com.