Since ADF doesn’t provide a built-in way to automate deployment, you have to write a custom script, which you can then run on a build server such as VSTS or TeamCity.
If you’re in interactive mode, you need to log in and select the corresponding subscription. If you run the script on a build server, the subscription should be selected for you automatically.
Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionName YourSubName
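On a build server the login has to be non-interactive. Here is a minimal sketch using a service principal; the environment variable names are placeholders for the credentials of your AAD app:
$key = ConvertTo-SecureString $env:ADF_APP_KEY -AsPlainText -Force
$cred = New-Object System.Management.Automation.PSCredential($env:ADF_APP_ID, $key)
Login-AzureRmAccount -ServicePrincipal -Credential $cred -TenantId $env:ADF_TENANT_ID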
Here are the parameters you can’t get programmatically and so have to specify explicitly. The last parameter is the path to the JSON files that represent the resources to be deployed: linked services, datasets, and pipelines.
param(
[Parameter(Mandatory=$true)][string]$ResourceGroupName,
[Parameter(Mandatory=$true)][string]$Location,
[Parameter(Mandatory=$true)][string]$DataFactoryName,
[Parameter(Mandatory=$true)][string]$InputFolderPath
)
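For example, assuming the script is saved as Deploy-AdfResources.ps1 (the name and values below are made up), the invocation would look like:
.\Deploy-AdfResources.ps1 -ResourceGroupName MyRg -Location "West US" -DataFactoryName MyFactory -InputFolderPath .\bin\Debug\Dev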
First, create the factory itself. The -Force flag helps to continue on error in case it already exists. Then load it into a variable:
New-AzureRmDataFactory -ResourceGroupName $ResourceGroupName -Name $DataFactoryName -Location $Location -Force -ErrorAction Stop
$dataFactory = Get-AzureRmDataFactory -ResourceGroupName $ResourceGroupName -Name $DataFactoryName -ErrorAction Stop
The next three loops read files based on the following convention:
- Linked services start with LinkedService, e.g. LinkedService-SQL.json
- Datasets start with Dataset, e.g. Dataset-Orders.json
- Pipelines start with Pipeline, e.g. Pipeline-CopyOrders.json
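For example, an input folder following this convention might look like this (a made-up listing):
LinkedService-SQL.json
LinkedService-BlobStorage.json
Dataset-Orders.json
Pipeline-CopyOrders.json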
First, create the linked services. If any of them references a gateway, create it as well. Since the gateway cmdlet doesn’t support the -Force flag, we use -ErrorAction Continue to continue on error in case it already exists:
$files = Get-ChildItem $InputFolderPath -Recurse -Include *.json -Filter LinkedService* -ErrorAction Stop
foreach ($file in $files)
{
Write-Output "Creating linked services from $($file.FullName)"
New-AzureRmDataFactoryLinkedService -DataFactory $dataFactory -File $file.FullName -Force -ErrorAction Stop
$json = Get-Content $file.FullName -Raw -ErrorAction Stop
$svc = $json | ConvertFrom-Json
$gwName = $svc.properties.typeProperties.gatewayName
if ($gwName)
{
Write-Output "Creating gateway $($gwName) from $($file.FullName)"
New-AzureRmDataFactoryGateway -DataFactory $dataFactory -Name $gwName -ErrorAction Continue
}
}
Then create datasets:
$files = Get-ChildItem $InputFolderPath -Recurse -Include *.json -Filter Dataset* -ErrorAction Stop
foreach ($file in $files)
{
Write-Output "Creating dataset from $($file.FullName)"
New-AzureRmDataFactoryDataset -DataFactory $dataFactory -File $file.FullName -Force -ErrorAction Stop
}
And finally, pipelines:
$files = Get-ChildItem $InputFolderPath -Recurse -Include *.json -Filter Pipeline* -ErrorAction Stop
foreach ($file in $files)
{
Write-Output "Creating pipeline from $($file.FullName)"
New-AzureRmDataFactoryPipeline -DataFactory $dataFactory -File $file.FullName -Force -ErrorAction Stop
}
That’s it. By this time all pipelines should be deployed, verified, and started. Happy data movement!
Hello,
I need to set up continuous integration using VSTS and use this PowerShell script for deploying my linked services, datasets, pipelines, etc. But what input folder path do I need to specify in the PowerShell script, given that my ADF files are in VSTS? I tried giving my local system path and it was throwing an error.
First you need to run msbuild on your ADF project (adfproj); this will merge the environment configs and produce a respective folder, e.g. bin\Debug\Dev. Then take the content of Dev and deploy it using the PowerShell script. Its path is one of the input parameters.
I’m currently building my ADF code using VSTS. Can you let me know how to build using MSBuild?
VSTS has a dedicated task to build using msbuild, named accordingly.
In my PowerShell script for deployment, how will I refer to my ADF JSON files which are built using VSTS?
The path to the said folder is the
$InputFolderPath
parameter.
This InputFolderPath I can specify if I’m deploying manually, but since I’m doing this through VSTS using PowerShell and setting up continuous deployment, I’m not sure how it will pick up the ADF files. Any pointers on this would help.
There is no difference between building locally and using VSTS. It just executes commands in the order you configure via the web UI. The path is on the build agent. All the same. I’m using this very script at work to build and deploy, on the same VSTS you use.
Thanks for your reply. This is the PowerShell script I’m using for deploying from my local PowerShell:
If I need to do the same thing using VSTS continuous deployment, what should the path be? How will I find the path on the build agent? I’m struggling to figure this out as I’m new to VSTS and Azure.
I’m not sure about building my project using MSBuild. Do you have some links which provide the steps for this?
Use the dedicated task, point it to either the sln or the adfproj, not sure which one. The only caveat is that you need to check in (put under source control) the ADF build target file which is installed by VS somewhere under %LocalAppData%.
Hello, this is my build log file:
Do I need to refer to this path "D:\a\1\s\ADFAutomation\" in my PowerShell script?
Correct me if I’m wrong. Thanks for your help.
Use the built-in variables, see https://docs.microsoft.com/en-us/vsts/build-release/concepts/definitions/build/variables?tabs=batch
Thanks. I tried this yesterday but with no luck. See if you can provide any help with this.
I built the ADF files with MSBuild and used the Copy Files task to copy them into "Build.ArtifactStagingDirectory", then published the artifacts.
I have this PowerShell script as another project in my solution and am trying to use this path, Build.ArtifactStagingDirectory, but it throws an error stating:
Please let me know how to proceed in this case. Thanks.
Don’t put VSTS variables into the PowerShell script’s body as it’s not aware of them. Put them when you invoke the said script in the VSTS task. That’s it, see the initial post: the script takes the path from a parameter, and the parameter is populated from the task.
Note that the path to the script is adjusted.
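For example, the Script Arguments field of the VSTS PowerShell task could look like this (the resource group, location, and factory name are placeholders); VSTS expands the variable before the script ever runs:
-ResourceGroupName MyRg -Location "West US" -DataFactoryName MyFactory -InputFolderPath "$(Build.ArtifactStagingDirectory)"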
Thanks for the guidance. Please let me know if you can help in this.
I need to do continuous integration and deployment for my Azure Data Factory. For this I have two projects in a Visual Studio solution: one for the ADF JSON files (linked services, datasets, etc.) and another for the PowerShell script that deploys this ADF into an Azure subscription.
Steps followed –
Built the ADF code with MSBuild and used the Copy Files task to copy it into $(Build.ArtifactStagingDirectory). Used the Publish Artifacts task to publish in VSTS.
Published the artifacts for the PowerShell script as a separate build.
In my release I have an Azure PowerShell script which will invoke these ADF files and deploy them into the Azure subscription. I’m using "Build.ArtifactStagingDirectory" to refer to my ADF files. But I’m getting the below error –
The term 'Build.ArtifactStagingDirectory' is not recognized as the name of a cmdlet, function, script file, or operable program
foreach($file in Get-ChildItem "$(Build.ArtifactStagingDirectory)" -filter "LinkedService") { New-AzureRmDataFactoryLinkedService -ResourceGroupName "ADFAutomationResource" -DataFactoryName "ADFCICD190218" -Name $file.BaseName -File $file.FullName -Force | Format-List }
Let me know how to proceed in this case as there aren’t sufficient links explaining this.
Thanks for your reply. These are the first few lines of my PowerShell script:
And in the Script Arguments I have specified the below:
But it throws me an error stating:
I have tried other options as well for the Script Arguments, "'D:\a\1\a\ADFCICD1402\", with no luck. Looking forward to your help with this. I’m not able to find any links which explain this. Sorry for the trouble.
It should be
$(Build.ArtifactStagingDirectory)
from the VSTS web UI.
Thanks. Still no luck, getting the same error.
Now I have my ADF code and PowerShell scripts in a single build, and in the release I’m trying to invoke this PowerShell script with the arguments -InputFolderPath $(Build.ArtifactStagingDirectory). Any help in this regard would be great. Thanks.
In the build these are the steps I followed:
MSBuild
Copy Files – source folder: $(Build.SourcesDirectory), contents: **, target folder: $(Build.ArtifactStagingDirectory)
Publish Artifacts – path to publish: $(Build.ArtifactStagingDirectory)
My script.ps1 is available along with the ADF code.
Is it Build or Release? Release has its own set of variables: https://docs.microsoft.com/en-us/vsts/build-release/concepts/definitions/release/variables
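For instance, in a Release the artifacts are downloaded under $(System.DefaultWorkingDirectory), so the argument would look something like this (the artifact alias and folder name are guesses, check your own definition):
-InputFolderPath "$(System.DefaultWorkingDirectory)/MyBuildArtifact/drop"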
Any luck?
Where can I get detailed help on CI through VSTS for Azure Data Factory? I need the details on CI, not CD.
Do you mean building ADF on VSTS? I was using the plain msbuild task; that’s all that was available at the time, a year ago. What version are you using, v1 or v2?
I want CI for the JSON files (ADF) which are developed in Visual Studio 2015 by developers. These should go to the artifacts folder of VSTS. I tried MSBuild but failed. Now I’m simply using the copy task; is this sufficient? I thought CD would take care of the deployment, and that in the case of Data Factory a build (MSBuild) task is not needed, unlike with U-SQL files. Please suggest.
Please note that the ADF version is very important as there were significant changes, basically a rewrite. I can speak only about v1. Also, I have nothing to do with the ADF team, so I speak only from my experience with ADF.
When you say failed, what does that mean? What was the error? Did you try locally or in VSTS? Try locally first and let me know.
Also please note that adfproj references a target file from your %LocalAppData% or somewhere like that. For it to work in VSTS you have to put it under source control alongside your project, or create a NuGet package and reference it from the packages folder.
Why won’t just copying work? Because you likely have environment configuration files, and some settings like connection strings vary from environment to environment. So you need to merge the configs into the linked services, and that’s what msbuild does.
I used the copy task instead of MSBuild and it works now. But I have two concerns here:
1. It is just copying the artifacts (JSON files) of the ADF (version 1) projects to the artifact folder, not actually building anything. So it won’t be possible to catch any syntactical errors in the JSON files.
2. Is it OK to use only the copy task during the build, instead of MSBuild?
If you have only one environment, and so keep all the settings such as connection strings inside the linked service file, then you don’t need the msbuild task. Because, as I said, all it does is merge the environment configs into the actual files per environment, e.g. Dev, Test, Prod.
If that’s the case, the only thing you need to do is run the PowerShell script to deploy the files you’ve got directly.
I have multiple environments. I want to make sure that the JSON which goes for deployment is syntactically OK, so I want to build or parse it before deploying to validate the JSON. What do you suggest?
Msbuild does both the validation and the environment config merge. You can see this by running it locally, i.e.:
>msbuild MyADF\MyADF.adfproj
Then replicate that as a step in the VSTS build.
Thanks Abatishchev for your help on this. Really appreciate it. I was able to resolve the problem after adding this to my code:
$ScriptFolder = $PSScriptRoot
Write-Output "PSScriptRoot is: $PSScriptRoot"
# the script is published alongside the ADF JSON files, so its own folder is the input folder
$InputFolderPath = $PSScriptRoot
Hello, have you done CI/CD for Azure Data Lake (U-SQL)? If yes, can you provide some links for it?
Merge all your scripts into one file and then execute this PowerShell command:
Submit-AzureRmDataLakeAnalyticsJob -Account $AnalyticsAccountName -Name $scriptName -ScriptPath $file -DegreeOfParallelism $DegreeOfParallelism
Otherwise it might take hours and cost hundreds of dollars to deploy one object per script per job. However, beware of dependencies: you may need to deploy tables first, then types, then procs.
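A minimal sketch of the merge step, assuming all the scripts are .usql files sitting in one folder (the folder and names are made up):
# Mind the dependency order: tables and types must precede the procs that use them,
# e.g. by prefixing the file names so the default sort works.
$merged = Join-Path $InputFolderPath "merged.usql"
Get-ChildItem $InputFolderPath -Filter *.usql | Sort-Object Name | Get-Content | Set-Content $merged
Submit-AzureRmDataLakeAnalyticsJob -Account $AnalyticsAccountName -Name "DeployAll" -ScriptPath $merged -DegreeOfParallelism $DegreeOfParallelism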
Thanks. Have you built a U-SQL project? I’m facing build issues while building using VSBuild. Have you faced any kind of error before while building U-SQL code –
Do you get this error locally or in VSTS? If the latter, it means that you didn’t bring the target and its dependencies in with your source code and are relying on them being installed on the build agent. But they’re not.
Thanks Abatishchev. I’m able to build the U-SQL code now. Thanks for your reply.
While deploying this release into the Azure subscription I’m getting the below error: "The user is not authorised to perform this operation on storage." What kind of access do I need to give to resolve this issue?
This is related to endpoint management in VSTS, see https://docs.microsoft.com/en-us/vsts/build-release/concepts/library/service-endpoints?view=vsts#sep-azure-rm. You’ll need an AAD app, granted the Contributor permissions in your subscription.
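Granting the role can be done in the portal or with a one-liner like this (a sketch, $appId and $subscriptionId are placeholders for your AAD app and subscription):
New-AzureRmRoleAssignment -ServicePrincipalName $appId -RoleDefinitionName Contributor -Scope "/subscriptions/$subscriptionId"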
I have created a Service Endpoint and given permissions for the Azure Data Lake storage. But I’m still getting the error.
Sorry, it’s really hard to tell: the VSTS authentication model is complex and bizarre, and so is Azure AD. ADLS has hierarchical permissions: every change has to be re-applied to every object (folder and file), which can easily take an hour depending on the number of files. It takes just seconds on an almost empty account though. Also check what role you assigned to the said app ID, whether it’s Contributor or something else.
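Note that for ADLS the RBAC role alone doesn’t grant data access; the app also needs ACL entries on the folders and files, something like this (a sketch, the account name and $appObjectId are placeholders):
# Grants the AAD app full access to the root; existing children need their own entries,
# which is why re-applying permissions can take a while on a large account.
Set-AzureRmDataLakeStoreItemAclEntry -AccountName myadls -Path / -AceType User -Id $appObjectId -Permissions All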
Thanks Abatish. I gave Reader access to the app ID I created.
Hi abatishchev,
Scenario:
In my ADF project, I have multiple pipelines, linked services, and datasets.
The PowerShell script has the -Force flag, which means existing linked services, datasets, and pipelines will be replaced without a prompt.
Problem
I want only one pipeline and its datasets to be deployed to ADF using PowerShell. Is it possible?
Basically, it should compare the changes in the linked services, datasets and pipelines and then deploy only the changed objects to ADF.
Reason
I don’t want all of my pipelines, linked services, and datasets to be re-deployed again, as I am not working on them.
Hi, sorry for not replying right away.
When a deployment script with the -Force option recreates the resources, have you observed any actual interruption in slices? We’ve been using the exact same script for 1.5 years now and I don’t think we’ve seen any. Just want to make sure before going deeper into this issue.
Thanks Abatish. I don’t see any interruption in slices, but my point is that we deploy only the required pipeline. Right?
Sure, this makes sense. But be prepared for the added complexity.
What I was told other people do is use the Git repository history to analyze whether there were changes, and if yes, where exactly.
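For instance, a rough sketch using a git tag to find the JSON files changed since the last deployment (the tag name LastDeployed is made up):
$changed = git diff --name-only LastDeployed HEAD -- "*.json"
foreach ($path in $changed)
{
    Write-Output "Redeploying $path"
    # dispatch to New-AzureRmDataFactoryLinkedService / -Dataset / -Pipeline based on the file name prefix
}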
I see they’ve added some more useful docs compared to what was available a year ago. See https://azure.microsoft.com/en-us/blog/continuous-integration-and-deployment-using-data-factory/ and https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment
P.S.
I’m Alexander, or just Alex for short.