Azure: Workbook without hard coded resources for automated deployment

Brain dump post; excuse typos and loose writing, I’m getting this out of my head while I still remember it.

In this post I’m going to go through how I used query-based parameters to set up an Application Insights workbook so it does not have hard-coded resource IDs in its definition.

This means it’s much easier to use for automated deployments where these IDs aren’t known, and as new resources are deployed the existing workbook picks them up automatically without requiring manual changes.

In this scenario we’re deploying a Resource Group with ARM/Terraform. Each group has its own Application Insights instance deployed. In the group are some App Service Plans, Cosmos DBs and Service Bus Namespaces. We want the workbook to deploy into the Application Insights instance in the group and graph the App Service Plan, Cosmos DB and Service Bus resources.

I’m new to workbooks so be aware there may be a simpler way to do this that I’ve not yet found!

First up, what does it look like if you don’t use this approach and just use the GUI to add metrics to your workbook?

Along the top we use the “Resource Type” and other dropdowns to select the resource we want to graph, then we add our “CPU” metric.

If you click into the “Advanced editor” you’ll see something like the following; notice that the “resourceIds” field now has a hard-coded reference to the resource we selected.
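
Here’s a trimmed-down sketch of the kind of JSON the advanced editor shows (illustrative only; the subscription and plan names are placeholders and the real item contains more settings):

{
  "type": 10,
  "content": {
    "version": "MetricsItem/2.0",
    "resourceType": "microsoft.web/serverfarms",
    "resourceIds": [
      "/subscriptions/YOURSUB/resourceGroups/rg-processing/providers/Microsoft.Web/serverFarms/my-app-plan"
    ],
    "metrics": [
      { "namespace": "microsoft.web/serverfarms", "metric": "CpuPercentage", "aggregation": 4 }
    ]
  }
}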

This means if we exported this JSON and deployed it using ARM or Terraform the workbook wouldn’t work. We’d want it to graph the metrics for the resource deployed alongside it, not the resource that is hard-coded.

So how do we fix this?

Well we can use workbook parameters.

Parameters can be simple strings or more complex queries and resource selectors.

The first step is to find out the resource group we’re deployed into. This can be done by creating a parameter which finds the “Owned Resources”.

The “Owned Resources” for this Application Insights instance is the instance itself, and the query returns its full Azure resource ID, like: /subscriptions/YOURSUB/resourceGroups/rg-processing/providers/Microsoft.Insights/components/app-insights

We’re going to use this to extract the current resource group’s name.

Next we use an Azure Resource Graph query to pull the resource group name out of this ID:

where id == "{OwningAppInsights}"
| project split(id, "/", 4)[0]

The query finds the Application Insights instance then pulls out the resource group it’s deployed into; this is what the split is doing on the Azure ID (split(id, "/", 4) takes the fifth “/”-separated segment, which is “rg-processing” in the example ID above). We add this as a new parameter called “ResourceGroup”; notice this parameter can depend on the previous parameter “OwningAppInsights” we just created.

Now we can create our last workbook parameter, one which selects the App Service Plans in the resource group. This uses the output of the “ResourceGroup” parameter above to query for all the Plans in the group, filtering on the “type” of each resource.

To find out which type you should use in that query, run the following Azure Resource Graph query and review the results (note: turn “formatted results” off to see the original values, not the cleaned-up ones):

where resourceGroup == "processing-myrg"
| project type, name

So the query

where resourceGroup == "{ResourceGroup}" and type == "microsoft.web/serverfarms"

returns all the resources that are serverfarms… which is internal Azure speak for App Service Plans.

We’ve ticked the box to “Allow multiple selections” and we’ve ticked “Hide parameter in reading mode” as we don’t want users of the workbook to change this manually.
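
In the workbook JSON this parameter ends up looking roughly like the following (a sketch from memory of the parameter schema, so treat the exact field names as illustrative):

{
  "version": "KqlParameterItem/1.0",
  "name": "ResourceApplicationPlans",
  "type": 5,
  "multiSelect": true,
  "isHiddenWhenLocked": true,
  "query": "where resourceGroup == \"{ResourceGroup}\" and type == \"microsoft.web/serverfarms\" | project id",
  "queryType": 1
}

Here type 5 is the resource-picker parameter type, and the query runs against Azure Resource Graph.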

Then we can use this parameter when setting up our metric graph: we select the “ResourceApplicationPlans” parameter from the dropdown and the graph now uses our auto-populated set of App Service Plans.

Now the code/JSON of the workbook no longer contains any hard-coded references to IDs.

You can see the “resourceIds” field is now set by our “ResourceApplicationPlans” parameter, which is dynamically generated and selects all the App Service Plans deployed in the resource group the workbook is deployed in.
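
So the metrics item from earlier now looks something like this (again a trimmed sketch):

{
  "type": 10,
  "content": {
    "version": "MetricsItem/2.0",
    "resourceType": "microsoft.web/serverfarms",
    "resourceIds": [
      "{ResourceApplicationPlans}"
    ]
  }
}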

We can now automate the deployment of the workbook without templating the JSON!

Bonus: if you add a new App Service Plan the workbook will pick it up and start graphing it. You can use the same approach to add parameters detecting other resource types like Cosmos DB and graph those too.


Azure: Automate hosting a Windows Container inside VNET

The quickest and easiest way to start running a Windows Container in Azure is using Azure Container Instances (ACI).

The problem is that they currently (as of 03/21) don’t support running Windows Containers inside a VNET. This blog is about how I worked around this limitation by automating the deployment and management of a Windows Container with PowerShell DSC and Terraform.

As we needed VNET integration for the sensitive data handled on the project, I set out to build an ACI-like experience on a VM connected to a VNET in Azure.

Warning: Please ensure you fully understand how PowerShell DSC works and review the code in full as there is complexity in this approach.

First up it must:

  • Handle restarting the container if things go wrong
  • Give an easy way to retrieve the container logs from the commandline
  • Connect reliably to the VNET
  • Support updating easily (i.e. when I push a new image tag the container is restarted running the new version, and when new environment variables are applied the container is restarted to pick them up)
  • Support authentication to an Azure Container Registry
  • Be runnable as part of a Terraform Deployment

Seems like a pretty long list, right? At this point I reached out to a friend, Marcus Robison, who’d done more Windows admin than me in his time. He suggested looking at PowerShell DSC.

So what is PowerShell DSC and what does it give us?

  • Desired State Configuration for the VM. “I want a VM that looks like x” and it makes that happen. Much like Terraform or a K8s operator. It queries the current state and takes actions that move the current state closer to the desired state
  • Integration with Azure VMs. There is a nice extension in Azure which allows you to submit a DSC config and Azure manages starting it on the VM for you.
  • Secure handling of sensitive variables; with the Azure extension these variables are encrypted

What does this all look like when you have it finished?

dsc_config.ps1

This script is responsible for configuring the machine, logging into ACR and ensuring the container is running. It runs periodically and handles things like restarting the container if a new deployment has been made with updated environment variables.

Each “Script” resource (think: a resource in Terraform) has a Get, Set and Test method. Test checks the current state of things; if they’re not how they’re meant to be, Set is responsible for getting them configured correctly; and Get returns an identifier for the item. A minimal sketch of the pattern is shown below.
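
As a minimal sketch (not the post’s real config, which follows in full below; the file path here is a made-up example), a Script resource looks like this:

Script ExampleFile {
    # Test: return $true if the state is already correct, $false to trigger Set
    TestScript = { Test-Path "C:\example.txt" }
    # Set: take whatever actions are needed to reach the desired state
    SetScript = { Set-Content -Path "C:\example.txt" -Value "desired" }
    # Get: return an identifier/summary of the item's current state
    GetScript = { @{ Result = "C:\example.txt" } }
}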

module.tf

This is the terraform responsible for creating the VM and pushing up the DSC script for it to run.

It takes the dsc_config.ps1 and creates a zip file; this zip is then passed to the PowerShell DSC extension for the Azure VM, which is responsible for applying the configuration to the VM.

As well as this, the module also takes the environment variables you want set for your container. These are provided as a map and converted to a base64-encoded .env file. The DSC config on the VM decodes them and provides the .env file to the docker run command used to start the container.

Worth noting: env.tpl is used in the process of creating the .env file.
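
For example (hypothetical values), a map like

{ COSMOS_KEY = "abc123", COSMOS_ENDPOINT = "https://mycosmos.example.com" }

renders through env.tpl into a .env file containing:

COSMOS_KEY=abc123
COSMOS_ENDPOINT=https://mycosmos.example.com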

usage.tf

This is an example of using the terraform module from module.tf to create a VM which runs a container image on a VNET with a set of environment variables.

getlogs.ps1

Once deployed, this little script demonstrates how you can get the logs out from the container running in the VM. It requires the Azure CLI to be installed, and you need to provide the VM’s Azure resource ID.

You can also hook this up to the outputs of your Terraform to automate it further.

All together now!

dsc_config.ps1

<#
.SYNOPSIS
Uses PowerShell DSC to configure the machine to run the container
.PARAMETER Image
Docker Image to run complete with tag
.PARAMETER Command
Command to run in the docker image
.PARAMETER RegistryUrl
Azure container registry url
.PARAMETER RegistryUsername
.PARAMETER RegistryPassword
.PARAMETER EnvironmentVariables
A base64 encoded string of a .env file to use when running the container
.PARAMETER ContainerName
The name to give the running docker container
.PARAMETER DockerConfigLocation
[Optional, defaults to 'C:\ProgramData\Docker\config\daemon.json'] The location on disk of the docker daemon.json config file
.PARAMETER DockerDataDir
[Optional, defaults to 'D:\\' (JSON-escaped so it can be embedded in daemon.json)] Location on disk for docker to store volumes and images
#>
Configuration DockerImageStart {
[CmdletBinding()]
param
(
[Parameter(Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[String]
$Image,
[Parameter(Mandatory = $true)]
[String]
$Command,
[Parameter(Mandatory = $false)]
[String]
$ContainerName = "dscManagedContainerInstance",
[Parameter(Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[String]
$RegistryUrl,
[Parameter(Mandatory = $true)]
[String]
$RegistryUsername,
[Parameter(Mandatory = $true)]
[string]
$RegistryPassword,
[Parameter(Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[string]
$EnvironmentVariables,
[Parameter(Mandatory = $false)]
[string]
$DockerConfigLocation = "C:\ProgramData\Docker\config\daemon.json",
[Parameter(Mandatory = $false)]
[string]
$DockerDataDir = "D:\\"
)
Import-DscResource -ModuleName 'PSDesiredStateConfiguration'
Node localhost
{
# Have the machine check every 15 mins that config is good.
# See details here: https://devblogs.microsoft.com/powershell/understanding-meta-configuration-in-windows-powershell-desired-state-configuration/
LocalConfigurationManager {
ConfigurationMode = "ApplyAndAutoCorrect"
RefreshFrequencyMins = 30
ConfigurationModeFrequencyMins = 15
RefreshMode = "PUSH"
RebootNodeIfNeeded = $true
}
# Docs: Each 'Script' resource ensures a configuration is set up correctly on the VM
# Get, test and set: https://docs.microsoft.com/en-us/powershell/scripting/dsc/resources/get-test-set?view=powershell-7.1
# Script module: https://docs.microsoft.com/en-us/powershell/scripting/dsc/reference/resources/windows/scriptResource?view=powershell-7.1
# Responsible for configuring the storage to use the ephemeral locally attached disk for docker storage
Script DockerStorageLocation {
SetScript = {
Set-Content -Path $using:DockerConfigLocation -Value "{ `"data-root`": `"$using:DockerDataDir`" }"
# Restart the Daemon so it picks up the new config
Restart-Service -Force Docker
}
TestScript = {
if (!(Test-Path $using:DockerConfigLocation)) {
return $false
}
$dataroot = (Get-Content $using:DockerConfigLocation | ConvertFrom-Json)."data-root"
# Note: the default data dir is JSON-escaped ("D:\\" on disk parses back to "D:\"),
# so normalise trailing backslashes before comparing
if (("$dataroot").TrimEnd('\') -ne ("$using:DockerDataDir").TrimEnd('\')) {
return $false
}
if ((Get-Service Docker).Status -ne "Running") {
return $false
}
return $true
}
GetScript = {
# Return the currently configured docker data-root
@{ Result = (Get-Content $using:DockerConfigLocation | ConvertFrom-Json)."data-root" }
}
}
# Responsible for ensuring that docker is logged into the ACR
Script AzureContainerRepositoryLogin {
DependsOn = "[Script]DockerStorageLocation"
SetScript = {
# Handle running the parameterised docker commands: https://stackoverflow.com/questions/6338015/how-do-you-execute-an-arbitrary-native-command-from-a-string
function Invoke-Login($command) {
Write-Verbose "Running command $command"
$output = $using:RegistryPassword | & 'docker' $command.Split(" ") 2>&1
if (!$? -and -not ($output -like "*Login Succeeded*"))
{
throw "Docker command failed, err: $output"
}
Write-Output $output
}
# Login to the ACR
try {
$output = Invoke-Login "login -u $using:RegistryUsername --password-stdin $using:RegistryUrl"
}
catch {
Write-Error "Failed running login command $_ $output"
}
}
TestScript = {
# Check we're logged into the ACR
if (!(Test-Path ~/.docker/config.json))
{
Write-Verbose "No docker config file found so can't be logged in"
return $false
}
$ConfigSettings = Get-Content ~/.docker/config.json | ConvertFrom-Json
# docker's config.json stores credentials under "auths", base64 encoded as "user:password"
$LoginDetailsBase64 = $ConfigSettings.auths."$using:RegistryUrl".auth
if (!$LoginDetailsBase64) {
Write-Verbose "Didn't find login details for the repo $using:RegistryUrl"
return $false
}
$LoginDetailsRaw = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($LoginDetailsBase64))
if ($LoginDetailsRaw -ne "$($using:RegistryUsername):$using:RegistryPassword") {
Write-Verbose "Current credentials don't match expected creds for $using:RegistryUrl"
return $false
}
return $true
}
GetScript = {
@{ Result = (Get-Content ~/.docker/config.json | ConvertFrom-Json).auths."$using:RegistryUrl".auth }
}
}
# Responsible for starting the container and keeping it running
Script ContainerInstance {
DependsOn = '[Script]AzureContainerRepositoryLogin', '[Script]DockerStorageLocation'
SetScript = {
# Handle running the parameterised docker commands: https://stackoverflow.com/questions/6338015/how-do-you-execute-an-arbitrary-native-command-from-a-string
function Invoke-Executable($command) {
Write-Verbose "Running command $command"
$output = & 'docker' $command.Split(" ") 2>&1
if (!$?)
{
throw "Docker command failed, err: $output"
}
Write-Output $output
}
# Attempt to remove the container if it exists
try {
Write-Verbose "Attempting to remove existing container"
$output = Invoke-Executable "container rm -f $using:ContainerName"
}
catch {
Write-Warning "Failed to remove existing container Error: $_ $output"
# This is allowed to fail, for example the container might not be present
}
# Pull image
try {
$output = Invoke-Executable "pull $using:Image"
}
catch {
Write-Error "An error occurred pulling image: $_ stdOut: $output"
}
# Start the container
try {
$EnvFileContent = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($using:EnvironmentVariables))
$EnvFileLocation = "C:\$($using:ContainerName).env"
Set-Content -Path $EnvFileLocation -Value $EnvFileContent
$EnvFileHash = (Get-FileHash $EnvFileLocation).Hash
$output = Invoke-Executable "container run -d --restart=always --name=$using:ContainerName --env-file=$EnvFileLocation --label EnvFileHash=$EnvFileHash $using:Image $using:Command"
}
catch {
Write-Error "An error occurred starting the container: $_ stdOut: $output"
}
}
TestScript = {
# Track errors from external commands: https://stackoverflow.com/questions/12359427/try-catch-on-executable-exe-in-powershell
$ErrorActionPreference = 'Stop'
# Retrieve all running containers
$RunningContainers = iex 'docker ps --format "{{json . }}"' | ConvertFrom-Json | Where-Object { $_.Names -eq $using:ContainerName }
# Write a "-check" version of the env file provided
$EnvFileContent = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($using:EnvironmentVariables))
$EnvFileLocation = "C:\$($using:ContainerName)-check.env"
Set-Content -Path $EnvFileLocation -Value $EnvFileContent
# Get the files hash
$EnvFileHash = (Get-FileHash $EnvFileLocation).Hash
if (
($RunningContainers | Measure-Object).Count -eq 1 `
-and $RunningContainers[0].Image -eq $using:Image `
-and $RunningContainers[0].Status -like "up *" `
)
{
# Cool so it's running, is it running the right version of the environment variables?
# Compare the hash to the currently set 'envhash' label on the running instance
$DockerInspectResult = iex "docker inspect $($RunningContainers[0].ID)" | ConvertFrom-Json
Write-Verbose "Env hash $($DockerInspectResult.Config.Labels.EnvFileHash)"
if ($DockerInspectResult.Config.Labels.EnvFileHash -eq $EnvFileHash) {
Write-Verbose "Container is in running state with correct image and env file"
return $true
}
}
Write-Verbose "Container does not exist, is not in a running state or has the incorrect image or env file"
return $false
}
GetScript = {
# Return the ID of the current container
@{ Result = (iex 'docker ps --format "{{json . }}"' | ConvertFrom-Json | Where-Object { $_.Names -eq $using:ContainerName }).ID}
}
}
}
}
env.tpl
%{ for config_key, config_value in config ~}
${config_key}=${config_value}
%{ endfor ~}
getlogs.ps1
$output = az vm run-command invoke --ids <machine_id_here> --command-id RunPowerShellScript --scripts "docker logs dscManagedContainerInstance" | ConvertFrom-Json
Write-Host $output.value[0].message
module.tf
variable shared_env {
type = any
}
variable subnet_id {
type = string
}
variable name {
type = string
}
variable image {
type = string
}
variable command {
type = string
}
variable environment_variables {
description = "The environment variables to be set in the container"
default = {}
}
variable docker_registry_url {}
variable docker_registry_username {}
variable docker_registry_password {}
variable releases_storage_account_name {
type = string
}
variable releases_storage_account_key {
type = string
}
variable releases_container_name {
type = string
}
variable releases_storage_sas {
type = string
}
locals {
env_file = base64encode(templatefile(
"${path.module}/env.tpl",
{
config = var.environment_variables
}
))
script_name = "dsc_config.ps1"
script_zip = "dsc_config.zip"
script_hash = filemd5("${path.module}/dsc_config.ps1")
}
resource "random_string" "random" {
length = 5
special = false
upper = false
number = false
}
resource "random_string" "adminpw" {
length = 18
special = true
upper = true
number = true
}
resource "azurerm_network_interface" "nic" {
name = "${var.name}${random_string.random.result}"
resource_group_name = var.shared_env.rg.name
location = var.shared_env.rg.location
ip_configuration {
name = "internal"
subnet_id = var.subnet_id
private_ip_address_allocation = "Dynamic"
}
}
resource "azurerm_windows_virtual_machine" "vm" {
name = "${var.name}${random_string.random.result}-vm"
resource_group_name = var.shared_env.rg.name
location = var.shared_env.rg.location
computer_name = "relimporter"
// 8 cores, 16GB ram and 128GB temp disk for import data to live on
size = "Standard_F8"
admin_username = "adminuser"
admin_password = random_string.adminpw.result
network_interface_ids = [
azurerm_network_interface.nic.id,
]
os_disk {
caching = "ReadWrite"
storage_account_type = "Standard_LRS"
}
source_image_reference {
publisher = "MicrosoftWindowsServer"
offer = "WindowsServer"
sku = "2019-Datacenter-Core-with-Containers"
version = "latest"
}
patch_mode = "AutomaticByOS"
}
data "archive_file" "script_zip" {
type = "zip"
source_file = "${path.module}/${local.script_name}"
output_path = "${path.module}/${local.script_zip}"
}
resource "azurerm_storage_blob" "dscps1" {
name = "dsc${local.script_hash}.zip"
storage_account_name = var.releases_storage_account_name
storage_container_name = var.releases_container_name
type = "Block"
source = "${path.module}/${local.script_zip}"
depends_on = [data.archive_file.script_zip]
}
// Using this https://docs.microsoft.com/en-us/azure/virtual-machines/extensions/dsc-windows
// extension to run a DSC configuration which will make sure the VM is running the container and
// periodically check for any issues and correct them.
// See: https://docs.microsoft.com/en-us/powershell/scripting/dsc/overview/overview?view=powershell-7.1
resource "azurerm_virtual_machine_extension" "dscconfig" {
name = "dscconfig${local.script_hash}"
virtual_machine_id = azurerm_windows_virtual_machine.vm.id
publisher = "Microsoft.Powershell"
type = "DSC"
type_handler_version = "2.77"
auto_upgrade_minor_version = true
depends_on = [azurerm_storage_blob.dscps1]
settings = jsonencode(jsondecode(<<SETTINGS
{
"configuration": {
"url": "https://${var.releases_storage_account_name}.blob.core.windows.net/${var.releases_container_name}/dsc${local.script_hash}.zip",
"script": "dsc_config.ps1",
"function": "DockerImageStart"
}
}
SETTINGS
))
protected_settings = jsonencode(jsondecode(<<JSON
{
"configurationArguments": {
"Image": "${var.image}",
"Command": "${var.command}",
"RegistryUrl": "${var.docker_registry_url}",
"RegistryUsername": "${var.docker_registry_username}",
"RegistryPassword": "${var.docker_registry_password}",
"EnvironmentVariables": "${local.env_file}"
},
"configurationUrlSasToken": "${var.releases_storage_sas}"
}
JSON
))
}
usage.tf
# An example of using the above module
module "container_vm" {
source = "./docker_vm"
shared_env = local.shared_env
subnet_id = var.subnet_id
docker_registry_username = var.docker_registry_username
docker_registry_password = var.docker_registry_password
docker_registry_url = var.docker_registry_url
releases_storage_account_name = module.core.releases_storage_account_name
releases_storage_account_key = module.core.releases_storage_account_key
releases_storage_sas = module.core.releases_account_sas
releases_container_name = module.core.releases_container_name
name = "consolecontainer"
image = "myregistry.azurecr.io/thingy:imagetag"
command = "myConsoleApp.exe"
environment_variables = {
COSMOS_KEY = module.core.cosmos_account_key,
COSMOS_ENDPOINT = module.core.cosmos_account_endpoint,
COSMOS_DB_NAME = module.core.cosmos_db_name,
COSMOS_CONTAINER_NAME = module.core.cosmos_container_name,
}
}

Azure Functions Get Key from Terraform without InternalServerError

So you’re trying to use the Terraform azurerm_function_app_host_keys data source to get the keys from an Azure Function after deployment. Sadly, as of 03/2021, this can fail intermittently 😢 (see issues 1 and 2).

[Edit: Hopefully this issue is resolved by this PR, so it’s worth reviewing again once the change is released.]

These errors can look something like these below:

Error making Read request on AzureRM Function App Hostkeys “***”: web.AppsClient#ListHostKeys: Failure responding to request: StatusCode=400 — Original Error: autorest/azure: Service returned an error. Status=400 Code=”BadRequest” Message=”Encountered an error (ServiceUnavailable) from host runtime”

Error: Error making Read request on AzureRM Function App Hostkeys “somefunx”: web.AppsClient#ListHostKeys: Failure responding to request: StatusCode=400

You can work around this by using my previous workaround with ARM templates, but it’s a bit clunky so I was looking for another way to do it.

There is an AWESOME project by Scott Winkler called the Shell provider. It lets you write a custom Terraform provider using scripts; you can implement data sources and full resources with CRUD support.

Looking into the errors returned by the azurerm_function_app_host_keys data source, they’re intermittent and look like they’re related to a timing issue. Did you know the curl command supports retrying out of the box?

So using the Shell provider we can create a simple script that makes the REST request to the Azure API, using curl’s built-in retry support to retry the request with an exponential back-off until it succeeds or the retry time limit is hit!
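
The shell provider parses the script’s stdout as JSON into the data source’s output map. The listkeys endpoint returns a body shaped roughly like this (an illustrative sketch; masterKey is the field read in output.tf below):

{
  "masterKey": "<redacted>",
  "functionKeys": {
    "default": "<redacted>"
  },
  "systemKeys": {}
}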

Warning: This script uses --retry-all-errors which is only available in curl v7.71 and above. The version shipped with the distro you’re using might not be up to date; use curl --version to check.

Here is a rough example of what you end up with:

main.tf

terraform {
required_providers {
shell = {
source = "scottwinkler/shell"
version = "1.7.7"
}
}
}
resource "azurerm_function_app" "functions" {
name = "${var.function_name}${var.random_string}-premium"
location = var.resource_group_location
resource_group_name = var.resource_group_name
app_service_plan_id = var.app_service_plan_id
version = "~3"
storage_account_name = var.storage_account_name
storage_account_access_key = var.storage_account_key
identity {
type = "SystemAssigned"
}
site_config {
# Ensure we use all the mem on the box and not only 3.5GB of it!
use_32_bit_worker_process = false
pre_warmed_instance_count = 1
}
app_settings = merge({
StorageContainerName = var.test_storage_container_name
https_only = true
FUNCTIONS_WORKER_RUNTIME = "dotnet"
HASH = base64encode(filesha256(local.func_zip_path))
WEBSITE_RUN_FROM_PACKAGE = "https://${var.storage_account_name}.blob.core.windows.net/${var.deployment_container_name}/${azurerm_storage_blob.appcode.name}${var.storage_sas}"
# Route outbound requests over VNET see: https://docs.microsoft.com/en-us/azure/azure-functions/functions-networking-options#regional-virtual-network-integration
WEBSITE_DNS_SERVER = "168.63.129.16"
WEBSITE_VNET_ROUTE_ALL = 1
}, var.app_settings)
}
data "azurerm_subscription" "current" {
}
data "shell_script" "functions_key" {
lifecycle_commands {
read = file("${path.module}/readkeys.sh")
}
environment = {
FUNC_NAME = azurerm_function_app.functions.name
RG_NAME = var.resource_group_name
SUB_ID = data.azurerm_subscription.current.subscription_id
}
depends_on = [azurerm_function_app.functions]
}
output.tf
output "function_master_key" {
# Try is used here to ensure destroy works as expected. On destroy the map will be
# empty so try instead returns an empty string
# See: https://www.terraform.io/docs/language/functions/try.html
value = try(data.shell_script.functions_key.output["masterKey"], "")
}
output "function_hostname" {
value = azurerm_function_app.functions.default_hostname
}
output "function_name" {
value = azurerm_function_app.functions.name
}
readkeys.sh
#!/bin/bash
set -e
# Get a token so we can call the ARM api
TOKEN=$(az account get-access-token -o json | jq -r .accessToken)
# Attempt to list the keys, retrying with exponential backoff for up to 360 seconds
# --fail required, see https://github.com/curl/curl/issues/6712
curl "https://management.azure.com/subscriptions/$SUB_ID/resourceGroups/$RG_NAME/providers/Microsoft.Web/sites/$FUNC_NAME/host/default/listkeys?api-version=2018-11-01" \
--compressed -H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: Bearer $TOKEN" -d "{}" \
--retry 8 --retry-max-time 360 --retry-all-errors --fail --silent
