| The Complete Weekly Roundup of SQL Server News by SQLServerCentral.com | Hand-picked content to sharpen your professional edge |
| Analyzing Breached Data A few of you out there might be data scientists who profile data regularly. Probably a fair number of you do import/export work and learn to check data values, perhaps with counts, distincts, or other aggregates. I don't know if the performance tuners out there look at the skew of data or the details of what is in a query that needs improvement. However, all of you are likely familiar with data and trying to query it for some type of meaning. One of the largest data breaches occurred with National Public Data. Troy Hunt analyzed the breach as a part of his work with haveIbeenpwned. The piece is an interesting analysis of the data, trying to determine both it's legitimacy as well as what is actually included in the breach. It's a fascinating read and I encourage you to look at it not just from the data analysis side, but also to be aware of what data about you is being aggregated and sold by companies. The read is interesting as it is a bit of a detective story, digging through data in a folder, which is something I've had to do. I've had people in previous jobs just dump a bunch of data on me and ask me to load it into a database. Or a table. Often without them knowing what type of data it is, what formats, do files relate to each other? Are there multiple tables worth of data in a file? All questions I've had to ask myself (and answer), and similar to what Troy did to analyze the breach. Data is very important to many of us, in different ways, but I'm often amazed at how few people actually understand how to organize data and ensure others can track the metadata about their data (what their data represents). I'm guessing this is why every person that gets an extract of data to load into Excel formats it in different ways. In many cases, people want the ability to query data, but they prefer to just focus on one table that contains a lot of information. They don't want to know how to "join" data together. I think this might be the reason we see so many views in databases, and why we have views built on views. Each new client of the database needs their own view structure. The world of data is a mess, even inside an organization. Once we start moving data between organizations, it's truly a mess. We might bemoan all the inefficiencies and work we do to move, change, and re-load data as custom, human ETL machines, but there is one great thing about this tangled web. It provides for steady, secure jobs for many of us with no end of work in sight. Steve Jones - SSC Editor Join the debate, and respond to the editorial on the forums |
The Weekly News | All the headlines and interesting SQL Server information that we've collected over the past week, and sometimes even a few repeats if we think they fit. |
Vendors/3rd Party Products |
Data containers can be created, destroyed and reset remarkably quickly, making them very effective for test-driven database development. Each time we reset a data container, it will have the same host, but both the port and password will be different. This article demonstrates how we can use simple PowerShell automation to smooth the process of dealing with ephemeral connection details in our IDEs and command line tasks. |
AI/Machine Learning/Cognitive Services |
Explore a PowerShell script that allows users to interact with a Word document using natural language queries, inspired by Microsoft’s Copilot. |
Many beginners will initially rely on the train-test method to evaluate their models. This method is straightforward and seems to give a clear indication of how well a model performs on unseen data. However, this approach can often lead to an incomplete understanding of a model’s capabilities. In this blog, we’ll discuss why it’s important to go beyond the basic train-test split and how cross-validation can offer a more thorough evaluation of model performance. |
One of the most important questions about using AI responsibly has very little to do with data, models, or anything technical. It has to do with the power of... |
Interesting paper: “Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data“: Generative, multimodal artificial intelligence (GenAI) offers transformative potential across industries, but its misuse poses significant risks.... |
Administration of SQL Server |
Whether you are backing up your SQL Server databases with Ola Hallengren’s scripts, Maintenance Plans, or some third-party software, your backups are your lifeline for recovering your data in... |
Sure, you can right click on a running session for Extended Events and open the Live Data window, but are there other ways of observing what Extended Events is... |
Azure Databricks, Spark and Snowflake |
You can use Databricks for a vast range of applications these days. From handling streaming datasets, running Deep Learning models to populating data model fact tables with complex transformations, choosing the correct compute option can seem a lot like a stab in the dark followed by (potentially expensive) trial end error. |
Now that we know so much about getting data from Snowflake using the query history, we should be able to get all kinds of data that will be helpful for us. In the next few queries, I want to provide some queries I use that should make your Snowflake usage a bit more useful. |
This article offers technical guidance on automating updates and tracking changes of data examples using Snowflake’s CDC capabilities. The examples will be using healthcare type data, but the basic... |
What are Dev Containers? A development container essentially packages up your project's development environment using the Development Container Specification (devcontainer.json). This specification enriches your container with metadata and content necessary... |
Azure SQL Managed Instance |
Previous posts discussed what CLR is, how we can import 3rd party DLLs and how we can use CLR to invoke REST APIs directly from Azure SQL MI. Today, we will touch upon another pain point that we’ve observed – transferring CLR assemblies from on-prem. to cloud; and we will do that by creating a brand-new MI link. |
This capability simplifies the migration to SQL Managed Instance and unblock the migration of legacy applications that are tied to windows logins. This feature plays a vital role for... |
Learn how to create a backup of an AWS RDS for SQL... |
Conferences, Classes, Events, and Webinars |
In celebration of their 25th anniversary in 2024, Redgate - host of PASS Summit - is thrilled to introduce the PASS Summit Futures Scholarship. This initiative aims to empower the next generation of data professionals from diverse backgrounds. Applications are open internationally to students and early-career professionals. Each of the 10 winners will be awarded one complimentary 3-day ticket to PASS Summit 2024 and US$2000 payment towards flights, accommodation and expenses. Applicants should ensure they can comply with the full Terms and Conditions before applying, and the deadline to submit applications is August 23, 2024. |
This year, PASS is introducing scholarships to support individuals who might not typically have the opportunity to attend the PASS Data Community Summit. These scholarships are available to two... |
Spring Training This September, I’ll be presenting my full day training session The Foundations Of SQL Server Performance Tuning for Data Saturday Dallas. All attendees will get free access for... |
Atlanta BI fans, please join us in person for the next meeting on Monday, September 3th at 6:30 PM ET. Your humble correspondent will show you how to use... |
In this blog post, I will discuss when you need to add classes with Add-Type and when you can provide classes with New-Object. I would say, let’s get started. |
It’s time to follow up on a post from a few months ago where I said I’d give my thoughts on Start-DbaMigration from dbatools after picking up more experience.... |
PowerShell expert Brien Posey demonstrates a technique for managing long-running scripts. |
Data Privacy, Compliance, and Governance |
Texas is suing General Motors for collecting driver data without consent and then selling it to insurance companies: From CNN: In car models from 2015 and later, the Detroit-based car manufacturer... |
We live in a data-driven culture, which means that as a business leader, you probably have more data than you know what to do with. To gain control over... |
Database Design, Theory and Development |
This month’s #tsql2sday is hosted by Malathi, a.k.a Mala, a.k.a diligentdba (b), and asks us about how we manage our database… Read More »T-SQL Tuesday 177: Managing database code The post T-SQL Tuesday... |
DevOps and Continuous Delivery (CI/CD) |
Have you ever experienced a scenario as a developer when a seemingly simple weekend bug fix throws your entire project into disarray? You push the changes, excited to resolve the issue, only to be met with a cascade of errors and a frustrated team come Monday morning. This scenario, unfortunately, is a common problem for projects that rely on manual deployments. But there’s a way to automate these processes, ensuring smoother releases, fewer errors, and a more efficient development workflow. The hack is it to use CI/CD (continuous integration/continuous delivery) pipelines to solve this problem and achieve the desired outcome. |
In August 2024's T-SQL Tuesday, Mala Mahadevan asks us to reflect on how we manage database code. Thanks for hosting, Mala! |
In this article, we walk through how to set up a simple Excel interface to work with SQL Server data to insert, update, delete, and select data. |
Earlier this month, AMD launched the first two desktop CPUs using their latest Zen 5 microarchitecture: the Ryzen 7 9700X and the Ryzen 5 9600X. As part of the new... |
Microsoft Fabric ( Azure Synapse Analytics, OneLake, ADLS, Data Science) |
Here is table that is getting more and more difficult to find as searching for Fabric capacity limits returns results about CU compute units (for the most part meaningless... |
As I mentioned in my Power BI and Fabric Capacities: Thinking Outside the Box, memory limits of Fabric capacities could be rather restrictive for large semantic models with imported... |
Oracle/PostgreSQL/MySQL/other RDBMS |
When I joined a growing startup company as a backend developer, we were at crossroads choosing between MySQL and PostgreSQL for our backend. Our team was divided: some favored MySQL for its speed and simplicity, while others leaned towards PostgreSQL for its advanced features and robustness. |
When I joined a growing startup company as a backend developer, we were at crossroads choosing between MySQL and PostgreSQL for our backend. Our team was divided: some favored... |
Performance Tuning SQL Server |
Five Reasons Why Your SQL Server Is Slow Right Now Thanks for watching! Going Further If this is the kind of SQL Server stuff you love learning about, you’ll... |
A Little About Out Of Date Statistics In SQL Server Thanks for watching! Going Further If this is the kind of SQL Server stuff you love learning about, you’ll... |
In this article, we look at the steps to change a ... |
Over the last week, I’ve been working on putting together a Postgres version of the Stack Overflow database, just like the SQL Server one that I’ve distributed for almost... |
This is the story of an unexpected challenge I encountered and a tiny but fearless response to address the Postgres optimiser underestimations caused by a data skew, miss in statistics or inconsistency between statistics... |
PowerPivot/PowerQuery/PowerBI |
DESCRIPTION Many of us learn through action an... |
Product Reviews and Articles |
I had some clear time this morning so I read a recent book called Deciphering Data Architectures (Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data... |
Redgate Test Data Manager allows developers to save each new version of a database as a data container revision. After making local development changes to the container, or running... |
Product Upgrades and Releases |
Introduction These are my SQL Server Diagnostic Information Queries for August 2024, aka my DMV Diagnostic Queries. They allow you to get a very comprehensive view of the configuration... |
On August 1, 2024, Microsoft released SQL Server 2019 Cumulative Update 28. This is Build 15.0.4385.2. By Microsoft’s count, there are 12 public fixes or improvements in this CU,... |
We are excited to announce that Hyperscale named replicas now support configuring a specific maintenance windows. You can now choose from predefined time slots for maintenance and setup alerts to be... |
Pyspark has many flexible syntaxes which are not so common to other languages. One of these syntaxes is the loop format. Loops in PySpark can be used to build... |
String concatenation is a fundamental operation in data manipulation and cleaning. If you are working in R, mastering string concatenation will significantly enhance your data processing capabilities. This blog post will cover different ways to concatenate strings using base R, the stringr, stringi, and glue packages. |
SQL Server Security and Auditing |
Making It Easy To Tell When Data Changed With Change Data Capture In SQL Server Thanks for watching! Going Further If this is the kind of SQL Server stuff... |
From the Federal Register: After three rounds of evaluation and analysis, NIST selected four algorithms it will standardize as a result of the PQC Standardization Process. The public-key encapsulation mechanism... |
T-SQL and Query Languages |
This article will compare the SQL Server XACT_STATE and @@TRANCOUNT functions and illustrate the differences with examples. The goal is to reduce the confusion between these two functions, allowing... |
This Query Exercise was very different: I didn’t ask you to solve a particular problem. I pointed out that I’ve heard advice that SELECT MAX is faster than SELECT... |
Microsoft encourages us not to use the datetime data type: |
Open-source licenses give you rights to copy, and in some cases, they define patent rights and responsibilities, but when they mention trademarks at all, most of them exclude rights... |
Publishers say blocking the company's AI bot could also prevent their sites from showing up in search. |
Businesses today rely heavily on data and strive to harness its full potential while navigating a constantly changing landscape. The secret is to combine smart analytics with a strong... |
This action would mark Washington's first attempt to break up a company for illegal monopolization since the failed bid to dismantle Microsoft two decades ago. |
Follow these four steps to proactively identify and eliminate risks from problematic open source packages. |
We all know that building SQL queries via string c... |
This email has been sent to {email}. To be removed from this list, please click here. If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com. This newsletter was sent to you because you signed up at SQLServerCentral.com. Note: This is not the SQLServerCentral.com daily newsletter list, and unsubscribing to this newsletter will not stop you receiving the SQL Server Central daily newsletters. If you want to be removed from that list, you can follow the instructions on the daily newsletter. |
|
|