If you have a large repository, the pipeline can spend a long time checking it out, only to use a few files. In this post, I will guide you through using PowerShell to download only the files you need with either the Azure DevOps REST API or the Azure CLI.
API Service
In both instances we will be using the REST API endpoint Get Items.
This endpoint returns items from a Git repository within Azure DevOps, including their metadata, file content and/or the files themselves. Several query parameters affect the response of the API request. Below are the query parameters we will use to download the JSON metadata for either files or folders.
ScopePath
This is either the folder path or the file path including the file name. The path should be relative to the root of the repository, not the file system.
For example:
File: /myexamplefolder/filename.txt
Folder: /myexamplefolder
VersionDescriptor.version
This is the name of the branch to source the code from. If you use a dynamic value like the predefined variable Build.SourceBranch, you will need to remove the refs/heads/ prefix to leave just the branch name.
You also cannot use the dynamic pull request merge ref, for example refs/pull/37/merge.
For example: main or feature/my-feature
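The branch rules above can be sketched as a small helper. This is a minimal illustration rather than part of the original script: the function name ConvertTo-BranchName is hypothetical, and it assumes you would rather fail early on a pull request ref than send it to the API.

```powershell
# Hypothetical helper: turns a full ref into the short branch name the API expects.
function ConvertTo-BranchName {
    param([Parameter(Mandatory)][string]$SourceBranch)

    # Pull request merge refs cannot be used as the branch version, so stop early.
    if ($SourceBranch -match '^refs/pull/') {
        throw "Pull request refs such as '$SourceBranch' are not supported; pass a branch instead."
    }

    # Strip the prefix only when it appears at the start of the string.
    return $SourceBranch -replace '^refs/heads/', ''
}

ConvertTo-BranchName 'refs/heads/feature/my-feature'   # feature/my-feature
ConvertTo-BranchName 'main'                            # main
```

Anchoring the pattern with ^ means a branch that happens to contain refs/heads/ further along its name is left untouched.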
RecursionLevel
This parameter tells the request how deep to scan and what to return. We are going to use the value full: for a folder, it keeps recursing until there are no more folders to go through, and for a file it stops at that item.
$format
The $format query parameter sets the format of the returned content. Without it, a folder request automatically returns JSON, but a file request tries to return the binary content of the file. Adding it by default therefore doesn’t affect the folder request and forces the file request to return JSON. The value for this parameter is json.
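To see how the four query parameters fit together, here is a minimal sketch that assembles the Get Items URL. The function name New-GetItemsUri and the contoso organisation in the usage line are placeholders of mine, and URL-escaping each value is my own addition for names that contain spaces or slashes.

```powershell
# Hypothetical helper: builds the Get Items URL from the values described above.
function New-GetItemsUri {
    param(
        [Parameter(Mandatory)][string]$OrganisationUrl,
        [Parameter(Mandatory)][string]$ProjectName,
        [Parameter(Mandatory)][string]$RepositoryName,
        [Parameter(Mandatory)][string]$ResourcePath,
        [Parameter(Mandatory)][string]$BranchName
    )

    # Tolerate an organisation URL with or without a trailing slash.
    $baseUrl = $OrganisationUrl.TrimEnd('/')
    $project = [uri]::EscapeDataString($ProjectName)
    $repo = [uri]::EscapeDataString($RepositoryName)

    # recursionLevel and $format are fixed; single quotes keep $format literal.
    $query = @(
        "scopePath=$([uri]::EscapeDataString($ResourcePath))"
        "versionDescriptor.version=$([uri]::EscapeDataString($BranchName))"
        'recursionLevel=full'
        '$format=json'
    ) -join '&'

    return "$baseUrl/$project/_apis/git/repositories/$repo/items?$query"
}

New-GetItemsUri -OrganisationUrl 'https://dev.azure.com/contoso/' -ProjectName 'My Project' `
    -RepositoryName 'repo' -ResourcePath '/myexamplefolder' -BranchName 'feature/my-feature'
```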
Requesting files using PowerShell
Below is the PowerShell code to download the files or folders. The supporting code is exactly the same for the REST API and the CLI, as the JSON response from both is converted into the same object.
Parameters
These are the parameters we will pass into the whole PowerShell file and use throughout the execution.
param(
    [string]$organisationUrl, #https://dev.azure.com/{organisation name}/
    [string]$repositoryName, #{repository name}
    [string]$projectName, #{project name}
    [string]$sourceBranch, #refs/heads/{branch name}
    [string]$resourcePath, #resource/path or resource/file.txt
    [string]$saveToPath #/home/root/path
)
Organisation URL
This is the URL to your organisation, which is https://dev.azure.com/{organisation-name}/.
Repository Name
This is the Azure DevOps Repository name.
Project Name
This is the Azure DevOps Project name.
Source Branch
This is the branch name, which can include the refs/heads/ prefix, as we will remove it later when preparing the string. This means you can use the predefined variable for the branch within Azure DevOps without needing to pre-process it.
Resource Path
This is the path of the file or folder within the repository. It should be relative to the root of the branch rather than the file system.
Save To Path
This is the path to save the files or folders to. Without it, they would be saved to the current directory where the script is executed, but we might want them saved to a specific place so we know where to find them.
Setting Up
After setting the parameters, we do some setup for the variables we will use later. First, we assume you have already logged in to the Azure CLI, so we can get an access token for the signed-in account. This token is used later in the requests, so make sure the signed-in account has the correct permissions. Alternatively, you could pass the token in as a parameter and use `System.AccessToken`, which carries the permissions of the Build Service identity running the pipeline.
$access = az account get-access-token -o json | ConvertFrom-Json
$accessToken = $access.accessToken
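The alternative mentioned above could look something like the sketch below. It is an illustration under assumptions: Resolve-AccessToken is a hypothetical name, and it assumes your pipeline step maps `System.AccessToken` to an environment variable called SYSTEM_ACCESSTOKEN, which has to be done explicitly in the pipeline definition.

```powershell
# Hypothetical helper: prefer a token that was passed in, otherwise ask the Azure CLI.
function Resolve-AccessToken {
    param([string]$ProvidedToken)

    # Use the supplied token (for example the pipeline's System.AccessToken) when present.
    if (-not [string]::IsNullOrWhiteSpace($ProvidedToken)) {
        return $ProvidedToken
    }

    # Fall back to the signed-in Azure CLI account, as in the script above.
    $access = az account get-access-token -o json | ConvertFrom-Json
    return $access.accessToken
}

# Usage, assuming the environment variable mapping described above:
# $accessToken = Resolve-AccessToken -ProvidedToken $env:SYSTEM_ACCESSTOKEN
```

Either way, the rest of the script only needs $accessToken, so the download code does not have to change.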
Next, as mentioned before, we will process the requested branch name to give the cleaned name and after we can just print out the requested parameters.
$branchName = $sourceBranch -replace "^refs/heads/", ""
Write-Information -InformationAction Continue "Organisation: $organisationUrl"
Write-Information -InformationAction Continue "Project: $projectName"
Write-Information -InformationAction Continue "Repository: $repositoryName"
Write-Information -InformationAction Continue "Branch: $branchName"
Write-Information -InformationAction Continue "Path to Download: $resourcePath"
Process the files
This function assumes it is passed the object of files, which we will request later. It uses this object to validate, then save, the files and folders to the required directory.
First we check whether any files were returned, and if not we error out before attempting to process them. The object we receive has a count property containing the number of items returned, so we can use this to determine whether there are any files.
if ($files.count -eq 0) {
    Write-Information -InformationAction Continue "##[error]No Files/Folders returned"
    throw "There were no files or folders returned from $uri"
}
Next we can process the items by looping through all the blobs, which are the files themselves. If we requested a file, only a single blob item is returned, but if we requested a folder, it will be a mixture of folder and blob items. All these items are held in the value property, which is an array of objects.
foreach ($item in $($files.value | Where-Object { $_.gitObjectType -eq "blob" })) {
Within the loop we make sure the path for each blob item exists. We could look at all the folders first and create them, but it is inefficient to loop through all the items twice. Instead we take each blob item's path and combine it with the root directory we want to save the file to. This gives us a complete path, from which we can get the parent path that doesn’t include the file name.
With this we can check whether that path exists and, if it does not, force create the whole path. This creates not just the last folder but all the folders leading up to where the file is stored.
    $filePath = Join-Path $targetBasePath $item.path
    $fileParentPath = Split-Path $filePath -Parent
    if (!(Test-Path $fileParentPath)) {
        Write-Information -InformationAction Continue "Creating folder $fileParentPath"
        # Discard the directory object so it does not leak into the function's output
        New-Item -ItemType Directory -Force -Path $fileParentPath | Out-Null
    }
Each item is returned with the URL to request its content, held in the item's url property. We call that URL with the access token generated earlier to authenticate the API request. The $filePath variable created earlier, which is the complete path including the file name, is used as the output target where the API request saves the response content.
    Invoke-WebRequest -Uri $item.url -Headers @{"Authorization" = "Bearer $accessToken" } -OutFile $filePath
} # end of the foreach loop over the blob items
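To check the validation and path logic without calling the API, the steps above can be run as a dry run against a hand-written sample. The JSON below is my own illustration of the shape described in this section (count, value, gitObjectType, path, url), not real API output, and Get-DownloadPlan is a hypothetical name. It assumes folders come back with a gitObjectType of tree.

```powershell
# Hand-written sample shaped like the response described above (not real API output).
$sampleJson = @'
{
  "count": 3,
  "value": [
    { "gitObjectType": "tree", "path": "/myexamplefolder", "url": "https://example.invalid/tree" },
    { "gitObjectType": "blob", "path": "/myexamplefolder/filename.txt", "url": "https://example.invalid/blob1" },
    { "gitObjectType": "blob", "path": "/myexamplefolder/sub/other.txt", "url": "https://example.invalid/blob2" }
  ]
}
'@

# Hypothetical dry-run helper: reports what would be downloaded and where, without downloading.
function Get-DownloadPlan {
    param(
        [Parameter(Mandatory)]$Files,
        [Parameter(Mandatory)][string]$TargetBasePath
    )

    # Same guard as the real function: stop if nothing was returned.
    if ($Files.count -eq 0) {
        throw "There were no files or folders returned"
    }

    # Only blobs are files; folder items are skipped.
    foreach ($item in ($Files.value | Where-Object { $_.gitObjectType -eq 'blob' })) {
        $filePath = Join-Path $TargetBasePath $item.path
        [pscustomobject]@{
            Url        = $item.url
            FilePath   = $filePath
            ParentPath = Split-Path $filePath -Parent
        }
    }
}

$files = $sampleJson | ConvertFrom-Json
$targetBasePath = Join-Path ([System.IO.Path]::GetTempPath()) 'download'
$plan = @(Get-DownloadPlan -Files $files -TargetBasePath $targetBasePath)
$plan | Format-Table -AutoSize
```

The plan contains two entries, one per blob, and the folder item is filtered out. Swapping the object creation for the Test-Path, New-Item and Invoke-WebRequest calls gives the real behaviour described above.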
Get Files By REST API
We will start with the REST API method to get the files, which calls the Get Items URL. The URL contains the Organisation URL, Project name and Repository name in the path, then the resource path and branch name in the query parameters. The recursion level and format parameters are hardcoded, as their values never change.
$uri = "$organisationUrl$projectName/_apis/git/repositories/$repositoryName/items?scopePath=$resourcePath&versionDescriptor[version]=$branchName&recursionLevel=full&`$format=json"
With the access token generated or passed in earlier we can then simply request the API endpoint and save the content within the $response variable.
$response = Invoke-WebRequest -Uri $uri -Headers @{"Authorization" = "Bearer $accessToken" }
We can then test whether the response was successful. This checks that the request succeeded, not that any files were returned, so a path that matches 0 files still gives a successful response. As this is an HTTP request, we can check that the status code is 200 and, if not, report an error with the content.
if ($response.StatusCode -ne 200) {
    Write-Information -InformationAction Continue "##[error]API Response Error"
    throw $response.content
}
Finally, before we return the object, we need to convert the JSON response into a PowerShell object.
$files = $response.content | ConvertFrom-Json
return $files;
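One thing to be aware of is that Invoke-WebRequest raises a terminating error for non-success status codes by default, so a failed request may never reach a status code check. A minimal sketch of wrapping the call in try/catch follows. The function name Get-RepositoryItems and its -Request parameter are hypothetical, and the parameter exists only so the request can be swapped out when trying the function offline.

```powershell
# Hypothetical wrapper: requests the items and converts the JSON, reporting request failures.
function Get-RepositoryItems {
    param(
        [Parameter(Mandatory)][string]$Uri,
        [Parameter(Mandatory)][string]$AccessToken,
        # Assumption: a replaceable request step, defaulting to the real web request.
        [scriptblock]$Request = {
            param($u, $h)
            Invoke-WebRequest -Uri $u -Headers $h
        }
    )

    $headers = @{ Authorization = "Bearer $AccessToken" }

    try {
        $response = & $Request $Uri $headers
    }
    catch {
        # Log in the pipeline error format, then rethrow the original error.
        Write-Information -InformationAction Continue "##[error]API Response Error"
        throw
    }

    return $response.Content | ConvertFrom-Json
}

# Offline usage with a stand-in response instead of a real request:
$fake = { param($u, $h) [pscustomobject]@{ StatusCode = 200; Content = '{"count":1,"value":[]}' } }
$result = Get-RepositoryItems -Uri 'https://example.invalid/items' -AccessToken 'token' -Request $fake
$result.count
```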
Get Files by Azure CLI
As mentioned at the start of this post, we are going to use the Get Items API endpoint. When we request this via the REST API, we use the built-in Invoke-WebRequest command to call the URL. Using the parameters mentioned above, this is what the URL looks like.
$url = "$organisationUrl$projectName/_apis/git/repositories/$repositoryName/items?scopePath=$resourcePath&versionDescriptor[version]=$branchName&recursionLevel=full&`$format=json"
This has all the data and the query parameters required to get the response we need for the functionality above.
For the CLI command, we do exactly the same using the URL's information: we set the area to git and the resource to items. These are the components of the URL that enable the CLI to look up the REST endpoint. Behind the scenes, the CLI simply calls the same endpoint as above, which is why we also provide it with all the same route details and query parameters.
Using the CLI, as you will see from the full code further down, makes the authentication easier and the code shorter, but I have noticed the CLI takes longer. It is not a huge difference, but the CLI does some extra checking and looking up compared to calling the API directly.
$files = az devops invoke --http-method GET --org $organisationUrl --area "git" --resource "items" --route-parameters project=$projectName repositoryId=$repositoryName --query-parameters scopePath=$resourcePath versionDescriptor.version=$branchName recursionLevel="full" `$format="json" | ConvertFrom-Json
From both of these, we convert the response from JSON to a PowerShell object to be used in the function above.
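The CLI call is quite long on one line, so one option is to build the arguments as an array and splat them onto az. This is a sketch only: Get-DevOpsInvokeArguments is a hypothetical name, the values in the usage line are placeholders, and it assumes the Azure DevOps extension for the Azure CLI is installed so that az devops invoke is available.

```powershell
# Hypothetical helper: builds the argument list for "az devops invoke" shown above.
function Get-DevOpsInvokeArguments {
    param(
        [Parameter(Mandatory)][string]$OrganisationUrl,
        [Parameter(Mandatory)][string]$ProjectName,
        [Parameter(Mandatory)][string]$RepositoryName,
        [Parameter(Mandatory)][string]$ResourcePath,
        [Parameter(Mandatory)][string]$BranchName
    )

    return @(
        'devops', 'invoke',
        '--http-method', 'GET',
        '--org', $OrganisationUrl,
        '--area', 'git',
        '--resource', 'items',
        '--route-parameters', "project=$ProjectName", "repositoryId=$RepositoryName",
        # Single quotes keep $format literal instead of expanding it as a variable.
        '--query-parameters', "scopePath=$ResourcePath", "versionDescriptor.version=$BranchName", 'recursionLevel=full', '$format=json',
        '--output', 'json'
    )
}

$invokeArgs = Get-DevOpsInvokeArguments -OrganisationUrl 'https://dev.azure.com/contoso/' `
    -ProjectName 'MyProject' -RepositoryName 'repo' -ResourcePath '/myexamplefolder' -BranchName 'main'

# With the Azure CLI signed in, the call would then be:
# $files = az @invokeArgs | ConvertFrom-Json
```

Keeping each route and query parameter as its own array element also avoids quoting problems when a value contains spaces.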
Final Script
This is the final script, combining both the REST API and the CLI methods so you can choose between them.