Running the aggregation pipeline
In production, the aggregation pipeline will be invoked automatically once the curation pipeline completes successfully. Since we are in the unit testing phase, we will trigger it manually for now. Let's get started:
- You must invoke the following commands on Azure Cloud Shell. They will invoke electroniz_aggregation_pipeline based on the parameters we pass. Please make sure you edit EXT_TAB_LOCATION as per the instructions provided. Using the Azure portal, navigate to your Azure Data Lake Storage account by going to Home > All Resources > traininglakehouse > Endpoints.
Note the URL of Blob Service and use this to edit
EXT_TAB_LOCATION
:RESOURCEGROUPNAME="training_rg" DATAFACTORYNAME="traininglakehousedf" PIPELINENAME="electroniz_batch_aggregation_pipeline" az datafactory pipeline create-run --factory-name $DATAFACTORYNAME --name $PIPELINENAME --resource-group $RESOURCEGROUPNAME \ --parameters "...