
Azure Data Factory Cookbook
By :

In this recipe, we shall implement error-handling logic for our pipeline similar to the previous recipe, but with a more sophisticated design: we shall isolate the error-handling flow in its own pipeline. Our main parent pipeline will then call the child pipeline. This recipe also introduces three very useful activities to the user: Lookup, Web, and Execute Pipeline. The recipe will illustrate how to retrieve information from an Azure SQL table and how to invoke other Azure services from the pipeline.
We shall be using all the Azure services mentioned in the Technical requirements section at the beginning of the chapter. In addition, this recipe requires a table to store the email addresses of the status email recipients. Please refer to the Technical requirements section for the table creation scripts and instructions.
We shall be building a pipeline that sends an email in the case of failure. There is no activity in ADF capable of sending emails, so we shall be using the Azure Logic Apps service. Follow these steps to create an instance of this service:
ADF-Email-Logic-App
and fill in the Subscription, Resource Group, and Region information fields.Figure 2.27: HTTP trigger
{
“subject”: “<subject of the email message>”,
“messageBody”: “<body of the email message >”,
“emailAddress”: “<email-address>”
}
Enter the code in the box as shown in the following figure:
Figure 2.28: Configuring a logic app – The capture message body
NOTE
Even though we use Gmail for the purposes of this tutorial, you can also send emails using Office 365 Outlook or Outlook.com. In the See also section of this recipe, we include a link to a tutorial on how to send emails using those providers.
Figure 2.29: Configuring a logic app – Specifying an email service
Figure 2.30: Configuring a logic app – specifying the Body, Subject, and Recipient fields
@{triggerBody()['emailAddress']}
.@{triggerBody()['subject']}
in the Subject text field.@{triggerBody()['messageBody']}
. You should end up with something similar to the following screenshot:
Figure 2.31: Configuring a logic app – Specifying the To, Subject, and Body values
First, we shall create the child pipeline to retrieve the email addresses of the email recipients and send the status email:
pl_orchestration_recipe_5_child
.SELECT * FROM [dbo].[EmailRecipient]
into the text box. Make sure to uncheck the First row
only
checkbox at the bottom. Your Settings tab should look similar to the following figure:
Figure 2.32: The Get Email Recipients activity settings
In the Settings tab, enter @activity('Get Email Recipients').output.value
into the Items textbox.
We shall now configure the Web activity. First, go to the General tab, and rename it Send Email
. Then, in the URL text field, paste the URL for the logic app (which you created in the Getting ready section):
application/json
into the Value textbox.@json(concat(‘{“emailAddress”: “‘, item().emailAddress, ‘”, “subject”: “ADF Pipeline Failure”, “messageBody”: “ADF Pipeline Failed”}’))
Your Settings tab should look similar to Figure 2.33:
Figure 2.33: The Send Email activity settings
EmailRecipients
table in order to test your pipeline. You can also verify that the email was sent out by going to the ADF-Email-LogicApp UI in the Azure portal and examining the run in the Overview pane:
Figure 2.34: Logic Apps portal view
pl_orchestration_recipe_5_parent
.Send Email On Failure
.pl_orchestration_recipe_5_child
.Figure 2.35: Parent pipeline after modifying the On Failure activity
In this recipe, we introduced the concept of parent and child pipelines and used the pipeline hierarchy to incorporate the error-handling functionality. This technique offers several benefits:
To craft the child pipeline, we started by adding a Lookup activity to retrieve a list of email recipients from the database table. This is a very common use for the Lookup activity: fetching a dataset for subsequent processing. In the configuration, we specified a query for the dataset retrieval: SELECT * from [dbo].[EmailRecipient]
. We can also use a more sophisticated query to filter the email recipients, or we can retrieve all the data by selecting the Table radio button. The ability to specify a query gives users a lot of choice and flexibility in filtering a dataset or using field projections with very little effort.
The list of email recipients was processed by the ForEach activity. We encountered the ForEach activity in the previous recipe. However, inside the ForEach activity, we introduced a new kind of activity: the Web activity, which we configured to invoke a simple logic app. This illustrates the power of the Web activity: it enables the user to invoke external REST APIs without leaving the Data Factory pipeline.
There is another ADF activity that offers the user an option to integrate external APIs into a pipeline: the Webhook activity. It has a lot of similarities to the Web activity, with two major differences:
callBackUri
property to the external service, along with the other parameters you specify in the request body. It expects to receive a response from the invoked web application. If the response is not received within the configurable timeout period, the Webhook activity fails. The Web activity does not have a callBackUri
property, and, while it does have a timeout period, it is not configurable and is limited to 1 minute.
This feature of the Webhook activity can be used to control the execution flow of the pipeline – for example, to wait for user input into a web form before proceeding with further steps.
Change the font size
Change margin width
Change background colour