There were so many good presentations at Microsoft Ignite, all of which can be viewed on-demand. I wanted to list the big data-related presentations that I found the most useful. It’s a lot of stuff to watch, and with our busy schedules it can be quite challenging to view them all. What I do is set aside 40 minutes every day to watch half a session (they are 75 minutes). It may take a few weeks, but if you consistently watch you will be rewarded by a much better understanding of all the product options and their use cases, and my last blog post (Use cases of various products for a big data cloud solution) can be used as a summary of all these options:
Modernize your on-premises applications with SQL Database Managed Instances: More and more customers who are looking to modernize their data centers need to lift and shift their fleet of databases to the public cloud with low effort and cost. We’ve developed Azure SQL Database to be the ideal destination, with enterprise security, full application compatibility and unique intelligent PaaS capabilities that reduce overall TCO. In this session, through preview stories and demos, learn what SQL Database Managed Instances are, and how you can use them to speed up and simplify your journey to the cloud.
Architect your big data solutions with SQL Data Warehouse and Azure Analysis Services: Have you ever wondered what’s the secret sauce that allows a company to use their data effectively? How do they ingest all their data, analyze it, and then make it available to thousands of end users? What happens if you need to scale the solution? Come find out how some of the top companies in the world are building big data solutions with Azure Data Lake, Azure HDInsight, Azure SQL Data Warehouse, and Azure Analysis Services. We cover some of the reference architectures of these companies, best practices, and sample some of the new features that enable insight at the speed of thought.
Database migration roadmap with Microsoft: Today’s organizations must adapt quickly to change, using new technologies to fuel competitive advantage, or risk getting left behind. Organizations understand that data is a key strategic asset which, when combined with the scale and intelligence of cloud, can provide the opportunity to automate, innovate, and increase the speed of business. But every migration journey is unique, so knowing the tricks of the trade will make your journey far easier. In this session, we use real-world case studies to provide details about how to perform large-scale migrations. We also share information about how Microsoft is investing in making this journey simpler with Azure Database Migration Service and related tools.
What’s new with Azure SQL Database: Focus on your business, not on the database: Azure SQL Database is Microsoft’s fully managed, database-as-a-service offering based on the world’s top relational database management system, SQL Server. In this session, learn about the latest innovations in Azure SQL Database and how customers are using our managed service to modernize their applications. Our most recent version combines advanced intelligence, enterprise-grade performance, high-availability, and industry-leading security in one easy-to-use database. Thanks to innovations such as In-Memory OLTP, Columnstore indexes, and our most recent Adaptive Query Processing feature family, customers can rely on Azure SQL DB for their relational data management needs, from managing just a few megabytes of transactional data.
Deep dive into SQL Server Integration Services (SSIS) 2017 and beyond: See how to use the latest SSIS 2017 to modernize traditional on-premises ETL workflows, transforming them into scalable hybrid ETL/ELT workflows. We showcase the latest additions to SSIS Azure Feature Pack, introducing/improving Azure connectivity components, and take a deep dive into SSIS Scale-Out feature, guiding you end-to-end from cluster installation to parallel execution, to help reduce the overall runtime of your workflows. Finally, we show you how to orchestrate/schedule SSIS executions using Azure Data Factory (ADF) and share our cloud-first product roadmap towards SSIS Platform as a Service (PaaS).
New capabilities for data integration in the cloud: This session focuses on the needs of the data integrator whether that be for data warehousing/BI, advanced analytics or preparation of data for SaaS applications. We walk through, by example, a comprehensive set of new additions to Azure Data Factory to make moving and integrating data across on-premises and cloud simple, scalable and reliable. Topics covered include: how to lift SSIS packages to the cloud via first-class SSIS support in data factory, a new serverless data factory application model and runtime capabilities, parallel data movement to/from the cloud, a new code-free experience for building and monitoring data pipelines and more.
Understanding big data on Azure – structured, unstructured and streaming: Data is the new Electricity, and Big Data technologies are helping organizations leverage this new phenomenon to foster their businesses in innovative ways. In this session, we show how you can leverage the big data services such as Data Warehousing, Hadoop, Spark, Machine Learning, and Real Time Analytics on Azure and how you can make the most of these for your business scenarios. This is a foundational session to ground your understanding of the technology, its use cases, patterns, and customer scenarios. You will see a lot of these technologies in action and get a good view of the breadth. Join this session if you want to get a real understanding of Big Data on Azure, and how the services are structured to achieve your desired outcome.
Building Petabyte scale Interactive Data warehouse in Azure HDInsight: Come learn about the real-world challenges associated with building a complex, large-scale data warehouse in the cloud. Learn how technologies such as Low Latency Analytical Processing [LLAP] and Hive 2.x are making it better through dramatically improved performance and a simplified architecture that suits the public cloud. In this session, we go deep into LLAP’s performance and architecture benefits and how it compares with Spark and Presto. We also look at how business analysts can use familiar tools such as Microsoft Excel and Power BI to run interactive queries over their data lake without moving data outside the data lake.
Building modern data pipelines with Spark on Azure HDInsight: You are already familiar with the key value propositions of Apache Spark. In this session, we cover new capabilities coming in the latest versions of Spark. More importantly, we cover how customers are using Apache Spark to build end-to-end data analytics pipelines. It starts from ingestion and Spark streaming, then goes into the details of data manipulation, and finally covers getting your data ready for serving to your BI analysts.
Azure Blob Storage: Scalable, efficient storage for PBs of unstructured data: Azure Blob Storage is the exa-scale object storage service for Microsoft Azure. In this session, we cover new services and features including the brand new Archival Storage tier, dramatically larger storage accounts, throughput and latency improvements and more. We give you an overview of the new features, present use cases and customer success stories with Blob Storage, and help you get started with these exciting new improvements.
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platform, and intelligent: Increasingly, customers looking to modernize their analytics needs are exploring the data lake approach. They are challenged by poorly-integrated technologies, a variety of data formats, and inconvenient data types. We explore a modern ETL pipeline through the lens of Azure Data Lake. This approach allows pipelines to scale to thousands of nodes instantly and lets pipelines integrate code written in .NET, Python, and R. This degree of extensibility allows pipelines to handle formats such as CSV, XML, JSON, Images, etc. Finally, we explore how the next generation of ETL scenarios are enabled by integrating intelligence in the data layer in the form of built-in cognitive capabilities.
Azure Cosmos DB: The globally distributed, multi-model database: Earlier this year, we announced Azure Cosmos DB – the first and only globally distributed, multi-model database system. The service is designed to allow customers to elastically and horizontally scale both throughput and storage across any number of geographical regions. It offers developers guaranteed <10 ms latencies at the 99th percentile, 99.99% high availability, and five well-defined consistency models. It’s been powering Microsoft’s internet-scale services for years. In this session, we present an overview of Azure Cosmos DB, from global distribution to scaling out throughput and storage, enabling you to build highly scalable mission-critical applications.
First look at What’s New in Azure Machine Learning: Take in the huge set of capabilities announced at Ignite for the next generation of the Azure Machine Learning platform. Build and deploy ML applications in the cloud, on-premises, and at the edge. Get started by wrangling your data into shape easily and efficiently, then take advantage of popular tools like Cognitive Toolkit, Jupyter, and TensorFlow to build advanced ML models and train them locally or at large scale in the cloud. Learn how to deploy models with a powerful, new, Docker-based hosting service complete with the ability to monitor and manage everything in production.
Delivering enterprise BI with Azure Analysis Services: Learn how to deliver analytics at the speed of thought with Azure Analysis Services on top of a petabyte-scale SQL Data Warehouse and Azure HDInsight Spark implementation. This session covers best practices for managing processing and query acceleration at scale, implementing change management for data governance, and designing for performance and security. These advanced techniques are demonstrated through an actual implementation including architecture, code, data flows, and tips and tricks.
Creating enterprise grade BI models with Azure Analysis Services: Microsoft Analysis Services enables you to build comprehensive, enterprise-scale analytic solutions that deliver actionable insights through familiar data visualization tools such as Microsoft Power BI and Microsoft Excel. Analysis Services enables consistent data across reports and users of Power BI. This session covers new features such as improved Power BI Desktop feature integration, Power Query connectivity, and techniques for modeling and data loading which enable the best reporting experiences. Various modeling enhancements are included, such as Detail Rows allowing users to easily see transactional records; and deployment and application-lifecycle management (ALM) features to bridge the gap between self-service and corporate BI.
Streaming big data on Azure with HDInsight, Kafka, Storm, and Spark: Implementing big data streaming pipelines for robust, enterprise use cases is hard. Doing so with open source technologies is even harder. To help with this, HDInsight recently added Kafka as a managed service to complete a scalable, big data streaming scenario on Azure. This service processes millions+ of events/sec and petabytes of data/day to power scenarios like Toyota’s connected car, Office 365’s clickstream analytics, fraud detection for large banks, etc. We will discuss the streaming landscape and the challenges in building production-ready streaming services, and build an enterprise-grade real-time pipeline. We will then discuss the learnings and future investments on managed open source streaming through Azure HDInsight.