
Running Pester Tests in Parallel

I’m working on a project which uses Pester tests to validate our deployment and system health.

We’ve accumulated a lot of tests. Nearly all of these sit and wait on things to happen in the deployment, and running them sequentially takes around 20 minutes. Each of our tests is in its own file.

As these are IO-bound tests, waiting on things to trip in the environment or polling HTTP endpoints, what I wanted was to run them all in parallel: one file per thread, or something similar.

This issue discusses the topic. I took a lead from the Invoke-Parallel snippet and started playing; unfortunately, the output from the tests was mangled and overlapping.

Then I realised I could use PowerShell Jobs to schedule the work, poll for the jobs to complete, and then receive each job so its output is displayed nicely.

So now the output looks good:



Note: We’re using Pester v4; you’ll have to do a little bit of fiddling to port this to Pester 5.

Install-Module -Name Pester -RequiredVersion 4.6.0 -Force

$testFilePath = "./tests"

# Start a job running each of the test files
$testFiles = Get-ChildItem $testFilePath
$resultFileNumber = 0
foreach ($testFile in $testFiles)
{
    $resultFileNumber++
    $testName = Split-Path $testFile -Leaf

    # Create the job. Be sure to pass everything the script block needs via
    # ArgumentList; variables are NOT automatically passed into the job.
    Start-Job `
        -ArgumentList $testFile, $resultFileNumber `
        -Name $testName `
        -ScriptBlock {
            param($testFile, $resultFileNumber)

            # Start a transcript for local debugging if TEST_LOGS=true.
            # The traces are written to the ./testlogs folder and the files
            # are updated as the tests run so you can follow along.
            if ($env:TEST_LOGS -eq "true") {
                Start-Transcript -Path "./testlogs/$(Split-Path $testFile -Leaf).integrationtest.log"
            }

            # Run the test file
            Write-Host "$testFile to result file #$resultFileNumber"
            $result = Invoke-Pester "$testFile" -PassThru
            if ($result.FailedCount -gt 0) {
                throw "1 or more assertions failed"
            }
        }
}

# Poll to give insight into which jobs are still running so you can spot long-running ones
do {
    Write-Host ">> Still running tests @ $(Get-Date -Format "HH:mm:ss")" -ForegroundColor Blue
    Get-Job | Where-Object { $_.State -eq "Running" } | Format-Table -AutoSize
    Start-Sleep -Seconds 15
} while ((Get-Job | Where-Object { $_.State -eq "Running" } | Measure-Object).Count -gt 1)

# Catch edge cases by waiting for all of them to finish
Get-Job | Wait-Job

$failedJobs = Get-Job | Where-Object { -not ($_.State -eq "Completed") }

# Receive the results of all the jobs, don't stop at errors
Get-Job | Receive-Job -AutoRemoveJob -Wait -ErrorAction 'Continue'

if ($failedJobs.Count -gt 0) {
    Write-Host "Failed Jobs" -ForegroundColor Red
    $failedJobs
    throw "One or more tests failed"
}

Here is the full repro: https://github.com/lawrencegripper/hack-parallelpester


Azure DevOps: How to run a Task if files have changed since last build

Shout out to the awesome work here from Alex Yates! This post builds on that work and updates a few bits.

What is the aim? I have a file called IMAGETAG.txt which contains a simple version, v1.0.1. It is used to build and push a Docker container as part of the build. If the file is changed in a commit, I want to build and push the Docker image.

Now normally you could use the path filtering built into Azure DevOps triggers, but in this case I have lots of other tasks which I DO want to run and I don’t want multiple builds.

So how does it work? First up, we’re going to create a script based off Alex’s work, updating the params, the API versions used, and the way it matches files.

The result is something we can call like this:

      changesSinceLastBuild.ps1 `
            -outputVariableName 'ML_IMAGE_TAG_CHANGED' `
            -fileMatchExpression 'containers/IMAGETAG.txt' `
            -branch 'refs/heads/main' `
            -buildDefinitionId '20'

This goes and gets the latest successful build from main for the specified build definition (in case you have multiple builds) and compares the changes made between then and the current HEAD commit.

It then uses PowerShell’s -match to find out if the file containers/IMAGETAG.txt was changed.

If it was, it sets the Azure DevOps build variable ML_IMAGE_TAG_CHANGED to true.

We can then use this as a condition on another task in the Job.

So all together it looks like this; in my case the tasks invoke psake targets such as ciml and ciml-docker.

- job: ciml
  displayName: "Machine Learning CI"
  pool:
    vmImage: 'ubuntu-20.04'
  steps:
    - task: PowerShell@2
      displayName: 'Run CI Task from make.ps1 in Devcontainer'
      inputs:
        targetType: 'inline'
        script: 'Install-Module -Name PSake -Force && Invoke-psake ./make.ps1 ciml'
        errorActionPreference: 'stop'
        pwsh: true
        workingDirectory: $(Build.SourcesDirectory)
    - task: PowerShell@2
      displayName: 'Check if IMAGETAG.txt has changed since last build on main'
      env:
        System_AccessToken: $(System.AccessToken)
      inputs:
        targetType: 'filePath'
        filePath: ./scripts/build_scripts/changesSinceLastBuild.ps1
        arguments: >-
          -outputVariableName 'ML_IMAGE_TAG_CHANGED'
          -fileMatchExpression 'containers/IMAGETAG.txt'
          -branch 'refs/heads/main'
          -buildDefinitionId '20'
        errorActionPreference: 'stop'
        pwsh: true
        workingDirectory: $(Build.SourcesDirectory)
    - task: AzurePowerShell@5
      condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'), eq(variables.ML_IMAGE_TAG_CHANGED, 'true'))
      displayName: 'Build ML Container and Push to ACR'
      inputs:
        azureSubscription: $(azure_connection)
        ScriptType: 'inlineScript'
        pwsh: true
        inline: 'Install-Module -Name PSake -Force && Invoke-psake ./make.ps1 ciml-docker; exit (!$psake.build_success)'
        errorActionPreference: 'stop'
        workingDirectory: $(Build.SourcesDirectory)
        azurePowerShellVersion: 'LatestVersion'
And here is the updated changesSinceLastBuild.ps1 script itself:

# This is an updated and tweaked version of a script designed to find whether changes
# have occurred since the last build. We've moved to the latest API and tweaked it to spot changes in a file.
# Original Source: http://workingwithdevs.com/azure-devops-services-api-powershell-hosted-build-agents/
param(
    # The variable in AZDO to set when changes are found
    [string]$outputVariableName = "changesFound",
    # This can be 'containers/IMAGETAG.txt' or '.*/IMAGETAG.txt'. The expression is passed to PowerShell's '-match' so it must be valid regex, NOT minimatch/glob
    [string]$fileMatchExpression = $env:FILE_TO_CHECK,
    # The build definition ID. In this case it's 20; look at the URL when editing a build to find it.
    [string]$buildDefinitionId = "20",
    [string]$branch = "refs/heads/main"
)

# Configuring this PS script to use the Azure DevOps Services REST API
# Attempting to follow steps at: http://donovanbrown.com/post/how-to-call-team-services-rest-api-from-powershell
$pat = "Bearer $env:System_AccessToken"

# Set $env:LocalTest to test things out locally
if ($env:LocalTest) {
    # When testing locally use the below to authenticate
    $pat = $env:PAT_TOKEN # <- set this env var to your PAT token
    $pat = "Basic " + [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes(":$pat"))
    $env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI = "https://dev.azure.com/{ORGNAME}/"
    $env:SYSTEM_TEAMPROJECTID = "{PROJECT NAME}"
    $env:BUILD_BUILDID = 1163 # <- pick the build you'd like to pretend to be
}

Write-Host $env:System_AccessToken

# The script authenticates using the system access token on the Azure DevOps build agent.
# Note: Remember to set the agent job to "Allow scripts to access OAuth token" (under "Additional options")

# FINDING THE PREVIOUS SUCCESSFUL BUILD
$env:SYSTEM_TEAMPROJECTID = [uri]::EscapeUriString($env:SYSTEM_TEAMPROJECTID)
$lastGreenBuildUrl = "$($env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI)$env:SYSTEM_TEAMPROJECTID/_apis/build/builds?branchName=$branch&definitions=$buildDefinitionId&queryOrder=finishTimeDescending&resultFilter=succeeded&api-version=6.0"
Write-Host "url: $lastGreenBuildUrl"
$data = Invoke-RestMethod -Uri "$lastGreenBuildUrl" -Headers @{Authorization = $pat}
Write-Host "Raw data returned from Green Build API call: $data"
$lastGreenBuild = $data.value.id | Select-Object -First 1
Write-Host "Last successful build was $lastGreenBuild"

# FINDING ALL COMMITS SINCE THE LAST SUCCESSFUL BUILD
$from = $lastGreenBuild
$to = $env:BUILD_BUILDID
Write-Host "from: $from"
Write-Host "to: $to"
$url = "$($env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI)$env:SYSTEM_TEAMPROJECTID/_apis/build/changes?fromBuildId=$from&toBuildId=$to&api-version=5.0-preview.2"
Write-Host "url: $url"
$commits = Invoke-RestMethod -Uri $url -Headers @{Authorization = $pat}
Write-Host "Raw data returned from API call: $commits"

# Saving all commit ids into a list
$commitHashes = @()
foreach ($id in $commits.value) {
    $rawData = Get-Member -InputObject $id -Name "id" | Out-String
    # $rawData is in the format "string id=<hash> ". We need to clean it up.
    $label, $hash = $rawData.Split("=")
    $hash = $hash.Trim()
    # Adding $hash to $commitHashes
    $commitHashes += $hash
}
Write-Host "Commit hashes associated with this build:"
foreach ($hash in $commitHashes) {
    Write-Host $hash
}

# Work out the commit range to diff: the oldest commit covered by the last successful build vs the current HEAD
$lastCommit = $commitHashes | Select-Object -Last 1
$currentCommit = git rev-parse HEAD
$currentCommit = $currentCommit.Trim()
Write-Host "Finding files changed between $lastCommit and $currentCommit"
# Show the diff between the two commits
# See: https://stackoverflow.com/questions/3368590/show-diff-between-commits
$changedFiles = git diff "$lastCommit^..$currentCommit" --name-only
Write-Host "Changed files"
$changedFiles

# Check if the file we're interested in changed
$relevantChanges = $changedFiles -match $fileMatchExpression
if ($relevantChanges) {
    Write-Host ">>>>> CHANGES FOUND matching $fileMatchExpression since last successful build $from. Build var $outputVariableName set to 'true'"
    # Updating the build variable accordingly
    Write-Output ("##vso[task.setvariable variable=$outputVariableName;]true")
} else {
    Write-Host "No relevant changes found; build var $outputVariableName not set"
}


K8s Operator with dynamic CRDs using controller runtime (no structs)

Warning: This is a bit of a brain dump.

I’m working on a project at the moment which dynamically creates a set of CRDs in Kubernetes and an operator to manage them based off a schema which is provided by the user at runtime.

When the code is being built it doesn’t know the schema/shape of the CRDs. This means the standard approach used in Kubebuilder with controller-gen isn’t going to work.

Now, for those that haven’t played with Kubebuilder, it gives you a few super useful things for building a K8s operator in Go:

  1. controller-gen creates all the structs and templated controllers and keeps all those type files in sync for you, so if you change a CRD’s struct the CRD YAML spec is updated, etc. These are all build-time tools, so we can’t use them.
  2. A nice abstraction around how to interact with K8s as a controller – controller-runtime. As the name suggests, we can use this one at runtime.

So while we can’t use the build-time controller-gen, we can still use all the goodness of controller-runtime. In theory.

This is where the fun came in: there aren’t any docs on interacting with a dynamic/unstructured object type using controller-runtime, so I did a bit of playing around.

(Code samples are for illustration – if you want an end-to-end running example, skip to the bottom.)

To get started on this journey we need a helping hand. Kubernetes has an API for working with objects which don’t have a Golang struct defined. This is how we can start: let’s check out the Go docs for unstructured.

"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"

Ok so this gives us some nice ways to work with a CRD which doesn’t have a struct defined.

To use this meaningfully we’re going to have to tell it the type it represents – in K8s this means telling it its Group, Version and Kind. These are all wrapped up nicely in the schema.GroupVersionKind struct. Let’s look at the docs:

"k8s.io/apimachinery/pkg/runtime/schema"

Great, so hooking these two up together we can create an Unstructured instance that represents a CRD, like so!
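A minimal sketch (the helper name newUnstructuredFor and the example GVK values below are placeholders, not the project’s code):

package controllers

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// newUnstructuredFor returns an empty Unstructured instance stamped with the
// Group, Version and Kind of the dynamically created CRD.
// Example (placeholder values): schema.GroupVersionKind{Group: "example.com", Version: "v1alpha1", Kind: "Widget"}
func newUnstructuredFor(gvk schema.GroupVersionKind) *unstructured.Unstructured {
	resource := &unstructured.Unstructured{}
	resource.SetGroupVersionKind(gvk)
	return resource
}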

Cool, so what can we do from here? Well, controller-runtime uses the runtime.Object interface for all its interactions, and guess what we have now? Yup, a runtime.Object. A small wrapper method makes this obvious.
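A sketch of what that wrapper might look like (asRuntimeObject is an illustrative name, not from the project):

package controllers

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// asRuntimeObject makes it explicit that an Unstructured stamped with a GVK
// satisfies runtime.Object and can be handed to controller-runtime.
func asRuntimeObject(gvk schema.GroupVersionKind) runtime.Object {
	u := &unstructured.Unstructured{}
	u.SetGroupVersionKind(gvk)
	return u // *unstructured.Unstructured implements runtime.Object
}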

Now we can create an instance of the controller for our unstructured CRD.
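Here is a hedged sketch of that wiring using controller-runtime’s builder; CrdReconciler, its fields and setupController are illustrative names rather than the project’s exact code, and the signatures assume a recent controller-runtime release:

package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// CrdReconciler reconciles one dynamically created CRD, identified only by its GVK.
type CrdReconciler struct {
	Client client.Client
	gvk    schema.GroupVersionKind
}

// Reconcile is where the work for the dynamic CRD happens (see the client usage below).
func (r *CrdReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}

// setupController registers the reconciler with the manager, watching the
// dynamic CRD via an Unstructured stamped with its GroupVersionKind.
func setupController(mgr ctrl.Manager, gvk schema.GroupVersionKind) error {
	watched := &unstructured.Unstructured{}
	watched.SetGroupVersionKind(gvk)
	return ctrl.NewControllerManagedBy(mgr).
		For(watched).
		Complete(&CrdReconciler{
			Client: mgr.GetClient(),
			gvk:    gvk,
		})
}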

Notice that I’m passing the GroupVersionKind into the controller struct – this will be useful when we come to make changes to a CRD we’re handling.

In the same way that you can use r.Client on a Kubebuilder controller, you can now use the client with the unstructured resource. We use the GVK again here to set the type so that the client knows how to work with it.
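Continuing the illustrative CrdReconciler above, fetching the instance inside Reconcile might look like this:

package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	ctrl "sigs.k8s.io/controller-runtime"
)

// fetchResource loads the dynamic CRD instance for this request. The GVK stored
// on the reconciler tells the client what type it is working with.
func (r *CrdReconciler) fetchResource(ctx context.Context, req ctrl.Request) (*unstructured.Unstructured, error) {
	resource := &unstructured.Unstructured{}
	resource.SetGroupVersionKind(r.gvk)
	err := r.Client.Get(ctx, req.NamespacedName, resource)
	return resource, err
}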

Now you might be thinking – wow isn’t it going to be painful working without the strongly typed CRD structs?

Yes, it’s harder, but there are some helper methods in the unstructured API which make things much easier. For example, the following let you easily retrieve or set a string which is nested inside the object.

unstructured.NestedString(resource.Object, "status", "_tfoperator", "tfState")

unstructured.SetNestedField(resource.Object, string(serializedState), "status", "_tfoperator", "tfState")

Here is the end result hooking up the controller runtime to a set of dynamically created and managed CRDs. It’s very much a work in progress and I’d love feedback if there are easier ways to tackle this or things that I’ve got wrong.
