| The Complete Weekly Roundup of SQL Server News | Hand-picked content to sharpen your professional edge |
| Using Big Data to Improve Health One of the frustrations of medical research is in getting good representative data. It should, you might suppose, be easy: you just scoop up all the medical records into a huge medical database, anonymize the data, and then just sharpen up your Python skills. Sigh; nope. The problem with the main 'big-data' medical sets such as the IBM Watson/Truven database is that they largely represent only people with medical insurance. This doesn't always cause problems, provided all conclusions bear in mind that it is a skewed population. This is why the medical records of a country such as the UK are potentially so valuable for research: they represent an entire population. With the help of a database that provides information on this scale, we can do a great deal more to advance medical and pharmaceutical science. What, if anything, stands in our way? The first problem is that such records can't reliably be pseudonymized or 'de-identified'. If you are researching individuals, and you know quite a bit about them, there are likely to be a couple of unusual characteristics that will allow you to match the records to the individual. Researchers have shown that it can result in a match rate of over 90%. The second problem is a regulatory problem. Medical records in Europe are not owned by the entity that collects or stores the data but to the patient. Although few people will disagree with the intention of allowing their records to be used, many will refuse consent because there are doubts about security. If there is a breach, you can't change your medical history as you would your password. The GDPR's guidance is that confidential patient information can only be used by hospital and university researchers, medical royal colleges and pharmaceutical companies researching new treatments. The third problem is the poor general understanding of the constraints of statistics. Statistical methods should come with the same warnings as a chain-saw. We still haven't reversed out of the statistical nightmares that presented a false '40%' conclusion about the value of statins in reducing cholesterol, certain types of which we now discover we need in spadesful for good health, unlike statins. The fourth problem is that there is no central database for medical records. It is difficult for this to happen because of the poor "interoperability" of the data, and the inability of the many health information systems to work together to join up the many separate, and sometimes warring 'care settings' making up the NHS (National Health Service). A fifth problem is bad data; by which I mean poor data quality, completeness and accuracy. There have been many reports of potentially incorrect codes being used to record illnesses and treatments, as well as missing or invalid identifiers, such as NHS numbers. It is typical of projects of this sort that few, if any, problems involve database technology. We have all the analysis tools we need. The tasks we face are mainly organizational and they'll take time to resolve. Phil Factor Join the debate, and respond to the editorial on the forums |
The Weekly News | All the headlines and interesting SQL Server information that we've collected over the past week, and sometimes even a few repeats if we think they fit. |
Vendors/3rd Party Products |
Redgate invite you to look around their newly refurbished University. If you are new to their tools there are simple, easy to follow courses to get you up and running, and if you are a long-time customer you can check out the recently added courses which take you through some more technical tips, tricks and troubleshooting. |
SQL Change Automation's Development component for developing new databases and modifying existing databases, using migrations, now integrates directly into SQL Server Management Studio as well Visual Studio. It allows teams to collaborate effectively during development, regardless of their preferred IDE, and in a way that integrates easily with common build/integration servers and release management tools. |
Phil Factor describes the freedom of being able to "self-serve" databases, during testing and development, and explains how it works with SQL Clone. |
Database deployments, like the sheep of exasperated hill-farmers, often find strange and unexpected ways to self-destruct. Phil Factor describes the most common things that can go wrong, and how a reliable automated database build process can prevent messy accidents. |
Whether you are a long time Redgate customer or have recently made a purchase you can find helpful and easy to follow video courses on Redgate University. With the brand-new look and the recently added courses you will soon be getting the most from your Redgate tools. |
Are you working with Oracle databases or know somebody that is? Does a company or client that you are working with also have an Oracle counterpart? If so, we want to talk to you/them about how they are currently working. We are currently running an Early Access Program (EAP) for a migrations-based solution for Oracle users. Sign up today, get the beta release and let us know your feedback! |
Pick your specialty, pick your learning method... |
Brent Ozar's not-so-secret SQL Server setup checklist. |
If you’re up in the billion row range, automatic stats creation and updates might just start to hurt. On SQL Server 2019, you’ll have some evidence for when refreshes take a long time, but still nothing for when the initial creation takes a long time. |
Azure Databricks and Spark |
Applying Linux patches for the TCP SACKs vulnerabilities caused some regressions when running the Databricks Runtime (DBR) on the Amazon Web Services (AWS) platform. This article describes the analysis that determined the TCP stack was the source of the degradation. |
A quick summary post of the various different types of Azure SQL Database. |
Accelerated Database Recovery for Azure SQL Databases and SQL Server 2019 will potentially cut the recovery time for canceled/rolled-back transactions and the time spent on database recovery. |
On the various ways to start SQL Server in single user mode, for example when you need to restore a master database or other system databases from backup. |
Computing in the Cloud (Azure, Google, AWS) |
Having seen a few improperly configured VMs lately, Joey D'Antoni explains how storage works in Azure, versus the way we traditionally did things on-premises. |
Conferences, Classes, Events, and Webinars |
OnWednesday October 2 16.00-17.00 BST/ 10.00-11.00 Central join John Sterrett, CEO of Procure SQL, to learn the difference between Azure SQL Database, SQL Managed Instances, Elastic Pools, and SQL Virtual Machines. |
Redgate is hosting an upcoming series of exclusive Summits, in cities worldwide, to help senior data professionals deliver value quicker, while keeping their data safe. Find out more here and register for your nearest event |
Aaron Nelson describes one of his favorite ways to take control and make PowerShell easier to work with: Argument Completers. |
New features and bug fixes in the latest preview of PowerShell 7, currently targeting December 2019 for a Release Candidate and January 2020 for General Availability. |
Every DBA should have basic PowerShell skills. In this article, Greg Moore explains how to write a PowerShell script that takes parameters. |
ETL/SSIS/Azure Data Factory/Biml |
When you build transformations that need to handle changing source schemas, your logic becomes tricky. This article discusses the possible approaches, their pros and cons. |
Tim Mitchell explains the basics of how project connections in SSIS can reduce the number of distinct connection managers in a single project. |
Andy Leonard demonstrates the challenges of scripting SSIS Catalog Environments, Environment Variables, References, Reference Mappings, and Literals for deployment to another SSIS Catalog. |
On Sept 28, 2019, a free day of SQL Server training in Orlando, FL, US. |
On Sept 28, 2019, a free day of SQL Server training in Sao Paulo, Brazil. |
On Sept 28, 2019, a free day of SQL Server training in Kharkiv, Ukraine. |
Sept 28, 2019, a free day of SQL Server training in Madrid, Spain. |
Sept 28, 2019, a free day of SQL Server training in Torun, Poland. |
Glenn Berry's build guide for a Mid-Range 1440P AMD Gaming PC ($750.00 to $1500.00 for the main components). |
With physical servers using NUMA configuration - or vms using vNUMA - a number of factors can lead to uneven CPU utilization among the NUMA nodes. |
This article describes what weak relationships are and the differences between strong and weak relationship in Power BI and DAX. |
Performance Tuning SQL Server |
A T-SQL tool to help application programmers analyze application script performance issues based on their debug log. |
Forced parameterization is a valuable feature in SQL Server, reducing plan cache bloat by not introducing a new plan for every single combination of predicate literals. However, if you use filtered indexes, those indexes could be rendered useless. |
A reminder that keeping your statistics up to date is one of the more critical components of maintaining good SQL Server query performance. |
PowerPivot/PowerQuery/PowerBI |
A Power BI Comparer-tool makes it super easy to compare all properties of 2 Power BI files. |
Chris Webb explains some useful date and time formats that aren’t covered well by the examples built into Power BI Desktop, and highlights a few things that aren’t possible right now. |
When you need to explain to a manager, or product stakeholders what the IIS web server logs say, you need visuals, or a dashboard, not command line output. Enter Power BI. |
Some Power BI datasets have been successfully refreshed but there have also been some inconsistent refresh failures. |
There is no DAX Quarter function in Power BI, but Reza Rad explains some easy methods for calculating it. |
How to prevent a refresh failure caused by duplicates appearing in a lookup table after the report has been built, the model has been loaded to PowerBI.com and the scheduled refresh has been set up using a gateway. |
Power BI reports can get slow if you have huge amount of data. One of the ways to speed up the performance is to load only the part that is needed for analysis. |
This article will cover many areas that database administrators need to be aware of in order to properly license, recover, and tune a Reporting Services installation. These topics apply to both SQL Server Reporting Services as well as Power BI Report Server. |
Tory Hunt states the case against arbitrary password restrictions on banks such as short max lengths and disallowed characters, but also why they might not matter. |
Context switching refers to modifying the context in which code operates from one user to another. The same code can function differently across user contexts because database role memberships and permissions may differ between users. |
On investigating the cause of inexplicable version store growth, Josh the Coder discovers that when you update the values in the column being altered, during an online column operation, the tempdb version store is used to store the new values until the operation completes. |
One of the biggest and undead myths in SQL is that COUNT(*) is faster than COUNT(1). Or was it that COUNT(1) is faster than COUNT(*)? Impossible to remember, because there’s really no reason at all why one should be faster than the other. But is the myth justified? Let’s measure! |
A bug when you use OBJECTPROPERTYEX to get the cardinality of a clustered columnstore index. |
Suddenly, the SSMS window will not register any mouse-clicks although it will seemingly look fine. The reason is because there is a Modal Window somewhere hidden. |
A fresh take at computing the time-to-payment on a large number of invoices, with multiple, overlapping, partial payments. |
Is it possible to read a simple JSON file using T-SQL? It certainly is. |
Bob Walker explains some techniques to make blue/green deployments with a database achievable and straightforward. |
Virtualization and Containers/Kubernetes |
When working with SQL Server running containers the Error Log is written to standard out. Kubernetes will expose that information to you via kubectl. Let’s check out how it works. |
SQL Server 2019 containers are now safer, because they start the SQL Server process as a non-root user, by default. |
The new SQL Server dockerfile, in the mssql-server-linux-non-root directory of the mssql-docker repository on Github, allows SQL Server containers to run as non-root. Andrew Pruski explains why this is a good thing. |
This email has been sent to {email}. To be removed from this list, please click here. If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com. This newsletter was sent to you because you signed up at SQLServerCentral.com. Note: This is not the SQLServerCentral.com daily newsletter list, and unsubscribing to this newsletter will not stop you receiving the SQL Server Central daily newsletters. If you want to be removed from that list, you can follow the instructions on the daily newsletter. |
|
|