We use Azure DevOps for running our build (CI), retraining trigger, and release (CD) pipelines. If you don't already have an Azure DevOps account, create one by following the instructions here.
If you already have an Azure DevOps account, create a new project.
To create a service principal, register an application in Azure Active Directory (Azure AD) and grant it the Contributor or Owner role on the subscription or the resource group containing the web service. See how to create a service principal and assign permissions to manage Azure resources.
After creating the service principal, make a note of the following values; we will need them in subsequent steps:
- Application (client) ID
- Directory (tenant) ID
- Application Secret
Note: You must have sufficient permissions to register an application with your Azure AD tenant and to assign the application to a role in your Azure subscription. If you don't have these permissions, contact your subscription administrator; normally a subscription admin can create a service principal and provide you the details.
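If you create the service principal with the Azure CLI (`az ad sp create-for-rbac`), its JSON output contains the three values listed above. Below is a small sketch, using placeholder output values, of how the CLI fields map to the names used in this guide:

```python
import json

# Example JSON output of `az ad sp create-for-rbac` (all values are placeholders).
sp_output = '''
{
  "appId": "00000000-0000-0000-0000-000000000000",
  "displayName": "mlops-sp",
  "password": "<application-secret>",
  "tenant": "11111111-1111-1111-1111-111111111111"
}
'''

sp = json.loads(sp_output)

# Map the CLI output fields to the values we need to note down.
credentials = {
    "Application (client) ID": sp["appId"],
    "Directory (tenant) ID": sp["tenant"],
    "Application Secret": sp["password"],
}

for name, value in credentials.items():
    print(f"{name}: {value}")
```

Keep the secret somewhere safe (e.g. Azure Key Vault); it cannot be retrieved again after creation.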
We use a variable group in Azure DevOps to store variables and their values that we want to make available across multiple pipelines. You can either store the values directly in Azure DevOps or connect to an Azure Key Vault in your subscription. Refer to the documentation here to learn more about how to create a variable group and link it to your pipeline. Click on Library in the Pipelines section as indicated below:
Name your variable group devopsforai-aml-vg, as this name is used within the build YAML file.
The variable group should contain the following required variables:
| Variable Name | Suggested Value |
|---|---|
| BASE_NAME | [unique base name] |
| LOCATION | centralus |
| SP_APP_ID | |
| SP_APP_SECRET | |
| SUBSCRIPTION_ID | |
| TENANT_ID | |
| RESOURCE_GROUP | |
| WORKSPACE_NAME | mlops-AML-WS |
Mark the SP_APP_SECRET variable as secret.
Note:
The WORKSPACE_NAME parameter is used for the Azure Machine Learning Workspace creation. You can provide an existing AML Workspace here if you have one.
The BASE_NAME parameter is used throughout the solution for naming Azure resources. When the solution is used in a shared subscription, there can be naming collisions with resources that require globally unique names, such as Azure Blob Storage accounts and registry DNS names. Make sure to give a unique value to the BASE_NAME variable (e.g. MyUniqueML), so that the created resources will have unique names (e.g. MyUniqueML-AML-RG, MyUniqueML-AML-KV, etc.). The length of the BASE_NAME value should not exceed 10 characters.
Make sure to select the Allow access to all pipelines checkbox in the variable group configuration.
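The BASE_NAME constraints above can be sketched as a small validation helper. This is illustrative only; the derived name patterns follow the `MyUniqueML-AML-RG` / `MyUniqueML-AML-KV` examples from the note, and the alphanumeric check is an assumption based on the naming rules for globally unique resources such as storage accounts:

```python
def validate_base_name(base_name: str) -> None:
    """Check the BASE_NAME rules described in the note above."""
    if not base_name:
        raise ValueError("BASE_NAME must not be empty")
    if len(base_name) > 10:
        raise ValueError("BASE_NAME must not exceed 10 characters")
    # Globally unique resources (storage account, container registry) are
    # derived from BASE_NAME, so keep it to letters and digits only.
    if not base_name.isalnum():
        raise ValueError("BASE_NAME should contain only letters and digits")


def derived_names(base_name: str) -> dict:
    """Return example resource names derived from BASE_NAME."""
    validate_base_name(base_name)
    return {
        "resource_group": f"{base_name}-AML-RG",
        "key_vault": f"{base_name}-AML-KV",
    }


print(derived_names("MyUniqueML"))
```

Running the check before creating the variable group avoids failed IaC runs caused by over-long or colliding names.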
There are more variables used in the project. They're defined in two places: one for local execution and one for Azure DevOps Pipelines.
To configure the project locally, copy .env.example to the root of the repository and name it .env. Fill out all missing values and adjust the existing ones to your needs. Be aware that the local environment also needs access to the Azure subscription, so you have to provide the credentials of your service principal and Azure account information here as well.
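A minimal sketch of how such a `.env` file can be read into the process environment, assuming simple `KEY=VALUE` lines as in `.env.example` (the variable values below are placeholders, and a library such as python-dotenv could be used instead):

```python
import os


def load_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


# Placeholder snippet in the style of .env.example.
sample = """
# Service principal credentials
SP_APP_ID=00000000-0000-0000-0000-000000000000
SP_APP_SECRET="not-a-real-secret"
"""

env = load_env(sample)
os.environ.update(env)  # make the values visible to local scripts
```

Keep `.env` out of version control, since it contains the service principal secret.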
For Azure DevOps Pipelines, all other variables are stored in the file .pipelines/azdo-variables.yml. Adjust the variables as needed; the defaults will give you an easy jump start.
By now you should have:
- Forked (or cloned) the repo
- Created an Azure DevOps account or used an existing one
- Obtained the service principal details and subscription ID
- Created a variable group with all configuration values
The easiest way to create all required resources (Resource Group, ML Workspace, Container Registry, Storage Account, etc.) is to leverage an "Infrastructure as Code" pipeline in this repository. This IaC pipeline takes care of setting up all required resources based on these ARM templates.
To set up this pipeline, you will need to do the following steps:
- Create an Azure Resource Manager Service Connection
- Create a Build IaC Pipeline
The pipeline requires an Azure Resource Manager service connection. Given this service connection, you will be able to run the IaC pipeline and have the required permissions to generate resources.
Use AzureResourceConnection as the connection name, since it is used
in the IaC pipeline definition. Leave the Resource Group field empty.
In your DevOps project, create a build pipeline from your forked GitHub repository:
Then, refer to an Existing Azure Pipelines YAML file:
Having done that, run the pipeline:
Check out created resources in the Azure Portal:
Alternatively, you can use a cleaning pipeline that removes the resources created for this project, or you can simply delete the resource group in the Azure Portal.
Once this resource group is created, make sure that the service principal you created has access to it.
Now that you have all the required resources created from the IaC pipeline, you can set up the rest of the pipelines necessary for deploying your ML model to production. These are the pipelines that you will be setting up:
- Build pipeline: triggered on code change to master branch on GitHub, performs linting, unit testing, publishing a training pipeline, and runs the published training pipeline to train, evaluate, and register a model.
- Release Deployment pipeline: deploys a model to QA (ACI) and Prod (AKS) environments.
In your Azure DevOps project, create and run a new build pipeline referring to the azdo-ci-build-train.yml pipeline in your forked GitHub repository:
Name the pipeline ci-build. Once the pipeline has finished, explore the execution logs:
and check out the published training pipeline in the mlops-AML-WS workspace in the Azure Portal:
Great, you now have a build pipeline set up that automatically triggers on every change to the master branch. The pipeline performs linting and unit testing, and builds, publishes, and executes an ML training pipeline in the ML Workspace.
Note: The build pipeline contains disabled steps to build and publish ML
pipelines using R to train a model. Enable these steps if you want to try
this approach by changing the build-train-script pipeline variable to either build_train_pipeline_with_r.py or build_train_pipeline_with_r_on_dbricks.py. For the pipeline training a model with R on Databricks, you have
to manually create a Databricks cluster and attach it to the ML Workspace as a
compute target (the DB_CLUSTER_ID and DATABRICKS_COMPUTE_NAME variables should be
specified).
The training pipeline will train, evaluate, and register a new model. Wait until it is finished and make sure there is a new model in the ML Workspace:
To disable the automatic trigger of the training pipeline, set the auto-trigger-training variable in the .pipelines/azdo-ci-build-train.yml pipeline to false. This can also be overridden at runtime.
The final step is to deploy the model across environments with a release
pipeline. There will be a QA environment running on
Azure Container Instances
and a Prod environment running on
Azure Kubernetes Service.
This is the final picture of what your release pipeline should look like:
The pipeline consumes two artifacts:
- the result of the Build Pipeline as it contains configuration files
- the model trained and registered by the ML training pipeline
Install the Azure Machine Learning extension to your organization from the marketplace, so that you can set up a service connection to your AML workspace.
To configure a model artifact, there must be a service connection to the mlops-AML-WS workspace. To set it up, go to the project settings (by clicking on the cog wheel at the bottom left of the screen), and then click on Service connections under the Pipelines section:
Note: Creating a service connection using the Azure Machine Learning extension requires 'Owner' or 'User Access Administrator' permissions on the Workspace.
Add an artifact to the pipeline and select the AzureML Model Artifact source type. Select the Service Endpoint and Model Names from the drop-down lists. Service Endpoint refers to the service connection created in the previous step:
Go to the new Releases Pipelines section, and click New to create a new release pipeline. A first stage is created automatically; choose to start with an Empty job. Name the stage QA (ACI) and add a single Azure ML Model Deploy task to the job. Make sure that the Agent Specification is ubuntu-16.04 under the Agent Job:
Specify the task parameters as shown in the table below:
| Parameter | Value |
|---|---|
| Display Name | Azure ML Model Deploy |
| Azure ML Workspace | mlops-AML-WS |
| Inference config Path | $(System.DefaultWorkingDirectory)/_ci-build/mlops-pipelines/code/scoring/inference_config.yml |
| Model Deployment Target | Azure Container Instance |
| Deployment Name | mlopspython-aci |
| Deployment Configuration file | $(System.DefaultWorkingDirectory)/_ci-build/mlops-pipelines/code/scoring/deployment_config_aci.yml |
| Overwrite existing deployment | X |
In a similar way, create a stage Prod (AKS) and add a single Azure ML Model Deploy task to the job. Make sure that the Agent Specification is ubuntu-16.04 under the Agent Job:
Specify the task parameters as shown in the table below:
| Parameter | Value |
|---|---|
| Display Name | Azure ML Model Deploy |
| Azure ML Workspace | mlops-AML-WS |
| Inference config Path | $(System.DefaultWorkingDirectory)/_ci-build/mlops-pipelines/code/scoring/inference_config.yml |
| Model Deployment Target | Azure Kubernetes Service |
| Select AKS Cluster for Deployment | YOUR_DEPLOYMENT_K8S_CLUSTER |
| Deployment Name | mlopspython-aks |
| Deployment Configuration file | $(System.DefaultWorkingDirectory)/_ci-build/mlops-pipelines/code/scoring/deployment_config_aks.yml |
| Overwrite existing deployment | X |
Note: Creating a Kubernetes cluster on AKS is out of scope of this tutorial, but you can find setup information in the docs here.
As with the previously created Invoke Training Pipeline release pipeline, to enable continuous deployment, click on the lightning bolt icon, make sure the Continuous deployment trigger is enabled, and save the trigger:
Congratulations! You have three pipelines set up end to end:
- Build pipeline: triggered on code change to master branch on GitHub, performs linting, unit testing and publishing a training pipeline.
- Release Trigger pipeline: runs a published training pipeline to train, evaluate and register a model.
- Release Deployment pipeline: deploys a model to QA (ACI) and Prod (AKS) environments.
Note: This is an optional step and can be used only if you are deploying your scoring service on Azure Web Apps.
The Create Image Script can be used to create a scoring image from the release pipeline. The image created by this script will be registered under the Azure Container Registry (ACR) instance that belongs to the Azure Machine Learning service. Any dependencies the scoring file needs can also be packaged with the container via the image config. To learn more about how to create a container with the AML SDK, click here.
Below is a release pipeline with two tasks: one to create an image using the above script, and a second to deploy the image to Web App for Containers.
For the bash script task to invoke the Create Image Script, specify the following task parameters:
| Parameter | Value |
|---|---|
| Display Name | Create Scoring Image |
| Script | python3 $(System.DefaultWorkingDirectory)/_MLOpsPythonRepo/ml_service/util/create_scoring_image.py |
Finally, for the Azure WebApp on Container task, specify the following task parameters as shown in the table below:
| Parameter | Value |
|---|---|
| Azure subscription | Subscription used to deploy Web App |
| App name | Web App for Containers name |
| Image name | Specify the fully qualified container image name. For example, 'myregistry.azurecr.io/nginx:latest' |
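The fully qualified image name follows the registry/repository:tag pattern from the example above. A small sketch of composing it, assuming an ACR-hosted image (the registry and repository names are placeholders):

```python
def full_image_name(registry: str, repository: str, tag: str = "latest") -> str:
    """Compose a fully qualified ACR image name, e.g. myregistry.azurecr.io/nginx:latest."""
    return f"{registry}.azurecr.io/{repository}:{tag}"


print(full_image_name("myregistry", "nginx"))  # myregistry.azurecr.io/nginx:latest
```

For the scoring image created earlier, substitute the ACR instance attached to your AML workspace and the image name produced by the Create Image Script.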
Save the pipeline and create a release to trigger it manually. To do so, click on the Create release button at the top right of your screen, leave the fields blank, and click Create at the bottom of the screen. Once the pipeline execution has finished, check out the deployments in the mlops-AML-WS workspace.