(2019-Dec-15) Working on data integration projects with Azure Data Factory as your main orchestration tool helps you develop strategic, forward thinking about your development tasks: you ponder what your data sources are, which destinations will land this information in a new data model, and which transformation steps will shape the data on its way from source to destination. It's just like playing chess, where you have to plan several moves ahead.
Along with this structural thinking to develop and execute your data flows, timely notifications when something goes wrong (or right) give you additional peace of mind.
There are several ways to have Data Factory communicate back to you: (1) Email, (2) Internal Alerts, (3) Log Analytics Alerts. This list is not complete, and you can always use other Azure messaging services; creativity is your limit.
(1) Email - a simple way to notify end users about successful or failed Data Factory pipeline runs. To create email notifications in Azure Data Factory, you need a separate Logic App as the transport layer for your emails and an ADF Web activity to call this Logic App when needed.
Steps to create this communication channel:
1) Logic App. Trigger: When an HTTP POST request is received
2) Logic App. Action: Send Email (Office 365, SMTP or even Gmail would work)
3) ADF Web Activity. Copy the HTTP POST URL from the Logic App in the Azure portal and paste its value into the URL field of the Web activity
4) ADF Web Activity. Construct the content of your POST request to the Logic App
5) ADF. Connect your new ADF Web activity to the success, failure or completion output of other activities in your ADF pipeline.
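Steps 3 and 4 can also be sketched outside of ADF. The Python snippet below builds the same kind of POST request that the Web activity would send to the Logic App. This is only a sketch: the URL is a placeholder, and the function name `build_email_request`, the pipeline name and the JSON property names (`EmailTo`, `Subject`, `Body`) are my assumptions; they have to match whatever request schema you define in your own Logic App trigger.

```python
import json
import urllib.request

# Hypothetical placeholder: paste the HTTP POST URL copied from your Logic App trigger.
LOGIC_APP_URL = "https://example.invalid/workflows/your-logic-app/triggers/manual/paths/invoke"


def build_email_request(url, recipients, subject, body):
    """Build (but do not send) the POST request for the Logic App trigger."""
    # Assumed property names; they must match the schema of your Logic App trigger.
    payload = {
        "EmailTo": ";".join(recipients),
        "Subject": subject,
        "Body": body,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_email_request(
    LOGIC_APP_URL,
    ["data.team@example.com"],
    "ADF pipeline failed",
    "Pipeline 'LoadSales' finished with status Failed.",  # made-up pipeline name
)
# Sending is left out on purpose; urllib.request.urlopen(request) would post it.
```

In ADF itself the same JSON goes into the Body of the Web activity, so recipients, subject and message text stay under the control of the pipeline.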
SSIS flash from the past:
If you have worked with SQL Server Integration Services (SSIS), then the Azure Data Factory email capability will remind you of the "Send Mail" task there.
I have also blogged about the use of Email notifications in Azure Data Factory last year - http://datanrg.blogspot.com/2018/11/email-notifications-in-azure-data.html
Pros:
- Easy and fast way to create a Logic App engine to send emails
- Most email notification settings, such as the list of recipients, email subject and body, can be constructed in ADF and passed to your Logic App for delivery.
Cons:
- Logic App becomes an additional component to maintain and support in your development cycle
- Email (Web Activity / Logic App) is not an approach I would recommend, even for small-scale ADF pipeline development: it's a bit cumbersome, and it gets complicated for more complex notification scenarios at a single pipeline or data factory level.
(2) Internal Alerts - a Data Factory built-in mechanism to increase your operational productivity by raising alerts on various metric-driven data integration events. The Azure Alerts infrastructure is used for this.
Steps to create this communication channel:
1) Go to ADF Monitor and click "New Alert Rule" to create a new alert
2) Set the Alert rule name and its severity:
- Sev 0 = Critical
- Sev 1 = Error
- Sev 2 = Warning
- Sev 3 = Informational
- Sev 4 = Verbose
3) Set the Alert criteria using one of the available metrics:
- Cancelled pipeline runs metrics
- Cancelled trigger runs metrics
- Failed activity runs metrics
- Failed pipeline runs metrics
- Failed trigger runs metrics
- Integration runtime available memory
- Integration runtime available node count
- Integration runtime CPU utilization
- Integration runtime queue duration
- Integration runtime queue length
- Maximum allowed entities count
- Maximum allowed factory size (GB unit)
- Succeeded activity runs metrics
- Succeeded pipeline runs metrics
- Succeeded trigger runs metrics
- Total entities count
- Total factory size (GB unit)
4) Create a new Action group or use an existing one and choose a notification type (Email, SMS, Phone Call or Azure app Push notifications).
5) Click "Create Alert Rule" button.
If you get the "The subscription is not registered to use namespace 'microsoft.insights'" error message, run this PowerShell command in your Cloud Shell and then click the "Create Alert Rule" button again.
Register-AzResourceProvider -ProviderNamespace microsoft.insights
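To make steps 2 and 3 a bit more tangible, here is a small Python sketch of what such a metric-driven rule boils down to: a name, a severity from the list above, a metric and a threshold. The `MetricAlertRule` class and its fields are purely illustrative names of mine; this is not an Azure SDK type, and the real evaluation happens inside the Azure Alerts infrastructure.

```python
from dataclasses import dataclass

# Severity levels as listed in step 2.
SEVERITIES = {0: "Critical", 1: "Error", 2: "Warning", 3: "Informational", 4: "Verbose"}


@dataclass(frozen=True)
class MetricAlertRule:
    """Illustrative stand-in for an alert rule; not an Azure SDK type."""
    name: str
    severity: int
    metric: str        # one of the metrics from step 3
    threshold: float   # the rule fires when the metric value exceeds this

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {sorted(SEVERITIES)}")

    def fires(self, metric_value):
        """Return True when the observed metric value is above the threshold."""
        return metric_value > self.threshold

    def describe(self):
        return f"Sev {self.severity} ({SEVERITIES[self.severity]}): {self.name}"


rule = MetricAlertRule(
    name="Failed pipeline runs",
    severity=1,
    metric="Failed pipeline runs metrics",
    threshold=0,
)
print(rule.describe())  # Sev 1 (Error): Failed pipeline runs
print(rule.fires(2))    # True: two failed runs exceed the threshold of 0
print(rule.fires(0))    # False: nothing failed
```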
When one of your ADF pipelines fails you will be automatically notified. In this example I received this email:
SSIS flash from the past:
If you worked with SSIS then internal Azure Data Factory alerts will remind you of the "Event Handlers" there.
Pros:
- Built-in feature of Azure Data Factory
- Very easy and intuitive user interface to create and configure alerts
- Action groups once defined could be used in other alerts of your data factory
Cons:
- Alerts are created within a single data factory and don't cover scenarios with multiple data factories, unless you recreate similar alerts in the other data factories as well.
- I understand that there is a way to programmatically create alerts within a data factory, however, I couldn't find an ARM template for them in order to save their definition in my source control for further deployment to other environments.
(3) Log Analytics Alerts - When you monitor Data Factory pipelines using Azure Monitor and Log Analytics, you not only get access to a very large dataset of logged information generated by various events, but you can also create alerts based on the results of custom log queries.
Steps to create this communication channel:
1) Create a new Log Analytics workspace to collect Azure Monitor log data, or you can re-use an existing Log Analytics workspace if you have one.
2) Add the "Azure Data Factory Analytics" solution to your Log Analytics workspace from the Azure Marketplace
3) Go to Azure Monitor, turn on diagnostics for the data factories you are interested in, and set the output to your Log Analytics workspace.
4) Validate Log information in Azure Monitor
Three ADF datasets become available for you to query:
- ADFActivityRun
- ADFPipelineRun
- ADFTriggerRun
4.a) General log information for executed pipelines using the Kusto query language; in this example, we're looking at the last 10 ADF pipeline runs:
4.b) This Kusto query gives me the list of ADF pipelines that failed during the last 5 minutes, which will help me create notification alerts:
ADFPipelineRun
| where Status == "Failed"
| where TimeGenerated > ago(5m)
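If Kusto is new to you, the logic of steps 4.a and 4.b can be sketched in plain Python. The sample rows below are made up for illustration, the helper names `latest_runs` and `failed_runs` are mine, and I'm assuming a `PipelineName` column next to the `Status` and `TimeGenerated` columns used in the query above.

```python
from datetime import datetime, timedelta, timezone


def latest_runs(rows, count=10):
    """Rough equivalent of sorting by TimeGenerated descending and taking `count` rows."""
    return sorted(rows, key=lambda r: r["TimeGenerated"], reverse=True)[:count]


def failed_runs(rows, now, window=timedelta(minutes=5)):
    """Mirror of: where Status == "Failed" | where TimeGenerated > ago(5m)."""
    cutoff = now - window
    return [r for r in rows if r["Status"] == "Failed" and r["TimeGenerated"] > cutoff]


# Made-up sample rows, for illustration only.
now = datetime(2019, 12, 15, 12, 0, tzinfo=timezone.utc)
rows = [
    {"PipelineName": "pl_a", "Status": "Succeeded", "TimeGenerated": now - timedelta(minutes=1)},
    {"PipelineName": "pl_b", "Status": "Failed", "TimeGenerated": now - timedelta(minutes=3)},
    {"PipelineName": "pl_c", "Status": "Failed", "TimeGenerated": now - timedelta(minutes=20)},
]

recent_failures = failed_runs(rows, now)
print([r["PipelineName"] for r in recent_failures])       # ['pl_b']
print([r["PipelineName"] for r in latest_runs(rows, 2)])  # ['pl_a', 'pl_b']
```

Note that `pl_c` failed too, but 20 minutes ago, so it falls outside the 5-minute window and would not trigger a new notification.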
5) Azure alerts can be created either directly in Azure Monitor or from your Kusto log query by clicking the "New alert rule" button:
5.a) Choose Log Analytics workspace
5.b) Set a triggering condition for your alert rule
5.c) Create an action group and configure it with a necessary action type (delivery method):
In my case, I'm reusing the action group that I created for my internal ADF alerts
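Steps 5.b and 5.c can be pictured with a short Python sketch as well: the rule compares the number of rows returned by the log query with a threshold and, when it is exceeded, hands a message over to the receivers of its action group. The `ActionGroup` and `LogAlertRule` classes, the group name and the email address are hypothetical; nothing is actually sent.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGroup:
    """Illustrative model of an action group: a named list of delivery targets."""
    name: str
    receivers: list = field(default_factory=list)  # (delivery method, address) pairs

    def notify(self, message):
        # Return the messages instead of sending them, to keep the sketch offline.
        return [f"{method} -> {address}: {message}" for method, address in self.receivers]


@dataclass
class LogAlertRule:
    """Fires when the log query returns more rows than `threshold`."""
    name: str
    action_group: ActionGroup
    threshold: int = 0

    def evaluate(self, result_count):
        if result_count > self.threshold:
            return self.action_group.notify(f"{self.name}: {result_count} result(s)")
        return []


ops = ActionGroup("adf-ops", [("Email", "data.team@example.com")])
log_rule = LogAlertRule("Failed ADF pipelines (last 5 min)", ops)
print(log_rule.evaluate(0))  # []
print(log_rule.evaluate(2))  # one email message for the two failed runs
```

Because the action group is a separate object, the same `ops` group could be attached to any number of other rules, which is exactly what reusing an action group means.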
Then when one of your ADF pipelines fails you will be automatically notified. In this example I received this email:
SSIS flash from the past:
If you worked with SSIS then Log Analytics based alerts for Azure Data Factory will remind you of the SSISDB Catalog or Integration Services logging using external data providers, such as Tibco messaging services.
Pros:
- Azure Monitor with Logs, Metrics, and Alerts is used.
- Log Analytics provides you with a very rich dataset to create alerts driven by metrics or by query results.
- An action group can be reused in both internal ADF alerts and Log Analytics alerts
- Log Analytics can be configured for multiple data factories, potentially across multiple Azure subscriptions; this enables you to run a single Kusto query against the logs of several data factories at once
- Action groups once defined could be used in other alerts of your data factory
- Since both internal ADF alerts and Log Analytics alerts are based on the same Azure Alerts infrastructure, you can manage them all in one place. Here, I can see both alerts created in Azure Data Factory and Log Analytics workspace:
Cons:
- The only disadvantage of this approach is the time and some complexity of the initial setup, but looking at the volume of statistics that can be logged and the custom logic you can use to query it, this counter-argument becomes an additional point in favor of Log Analytics.
Therefore, it's not hard to guess that Log Analytics alerts for Azure Data Factory are my favorite ones!