Apps, Azure

Azure: Automate hosting a Windows Container inside VNET

The quickest and easiest way to start running a Windows Container in Azure is to use Azure Container Instances (ACI).

The problem is that, as of 03/21, ACI doesn’t support running Windows Containers inside a VNET. This post is about how I worked around this limitation by automating the deployment and management of a Windows Container with PowerShell DSC and Terraform.

As we needed VNET integration for the sensitive data handled on the project, I set out to build an ACI-like experience on a VM and have that connected to a VNET in Azure.

Warning: Please ensure you fully understand how PowerShell DSC works and review the code in full as there is complexity in this approach.

First up it must:

  • Handle restarting the container if things go wrong
  • Give an easy way to retrieve the container logs from the command line
  • Connect reliably to the VNET
  • Support updating easily (i.e. when I push a new image tag the container is restarted running the new version, and when new environment variables are applied the container is restarted to pick them up)
  • Support authentication to an Azure Container Registry
  • Be runnable as part of a Terraform Deployment

Seems like a pretty long list, right? At this point I reached out to a friend, Marcus Robison, who’d done more Windows Admin than me in his time. He suggested looking at PowerShell DSC.

So what is PowerShell DSC, and what does it give us?

  • Desired State Configuration for the VM. “I want a VM that looks like x” and it makes that happen. Much like Terraform or a K8s operator. It queries the current state and takes actions that move the current state closer to the desired state
  • Integration with Azure VMs. There is a nice extension in Azure which allows you to submit a DSC config and Azure manages starting it on the VM for you.
  • Secure handling of sensitive variables. With the Azure extension, variables passed in the protected settings are encrypted

What does this all look like when you have it finished?

dsc_config.ps1

This script is responsible for configuring the machine, logging into ACR and ensuring the container is running. It runs periodically and handles things like restarting the container if a new deployment has been made with updated environment variables.

Each Script (think resource in Terraform) has a Get, Set and Test method. Test checks the current state of things. If they’re not how they’re meant to be, Set is responsible for getting them configured correctly. Lastly, Get returns a hashtable describing the current state of the item (in this config, an identifier such as the container ID).
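To make the Test/Set/Get flow concrete, here is a rough emulation in plain PowerShell of what happens on each consistency check. It is only a sketch: the `$machine` hashtable, the script block variables and `Invoke-ConsistencyCheck` are hypothetical stand-ins, not part of DSC, which does this work for you via the Local Configuration Manager.

```powershell
# In-memory stand-in for the real machine state (hypothetical; DSC would inspect the VM itself).
$machine = @{ DataRoot = 'C:\ProgramData\docker' }
$desiredDataRoot = 'D:\'

# Test: is the machine already in the desired state?
$testScript = { $machine.DataRoot -eq $desiredDataRoot }
# Set: moves the machine to the desired state. Only runs when Test returned $false.
$setScript = { $machine.DataRoot = $desiredDataRoot }
# Get: reports the current state as a hashtable.
$getScript = { @{ Result = $machine.DataRoot } }

function Invoke-ConsistencyCheck {
    if (& $testScript) {
        return 'already compliant'
    }
    & $setScript
    return 'corrected'
}

$firstRun = Invoke-ConsistencyCheck    # state has drifted, so Set runs
$secondRun = Invoke-ConsistencyCheck   # nothing to do this time
$current = (& $getScript).Result
```

The useful property is that running the check repeatedly is safe: once Test passes, Set is never called again until something drifts.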

module.tf

This is the Terraform responsible for creating the VM and pushing up the DSC script for it to run.

It takes the dsc_config.ps1 and creates a zip file. This zip is then passed to the PowerShell DSC extension for the Azure VM, which is responsible for applying the configuration to the VM.

As well as this the module also takes the environment variables you want set for your container. These are provided as a map and converted to a base64 encoded .env file. The DSC config on the VM decodes them and provides the .env file to the docker run command used to start the container.

*Worth noting: env.tpl is used in the process of creating the env file.
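The round trip can be sketched in PowerShell alone. This is an illustration under assumptions, not the module's code: the settings in `$config` are made up, the first half mimics what Terraform's `templatefile` and `base64encode` produce, and `Get-EnvHash` is a hypothetical helper showing how a file hash can reveal that the variables changed.

```powershell
# Hypothetical settings; in the post these come from the Terraform environment_variables map.
$config = [ordered]@{ COSMOS_DB_NAME = 'mydb'; LOG_LEVEL = 'debug' }

# Render KEY=VALUE lines, roughly what env.tpl produces.
$envFile = ($config.GetEnumerator() | ForEach-Object { "$($_.Key)=$($_.Value)" }) -join "`n"

# Terraform side: base64 encode so the file survives being passed as a single string argument.
$encoded = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($envFile))

# VM side: decode back to the .env content handed to docker via --env-file.
$decoded = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($encoded))

# Hypothetical helper: write the content to a temp file and hash it.
function Get-EnvHash([string]$Content) {
    $path = [System.IO.Path]::GetTempFileName()
    try {
        Set-Content -Path $path -Value $Content
        (Get-FileHash -Path $path).Hash
    }
    finally {
        Remove-Item -Path $path
    }
}

$deployedHash = Get-EnvHash $decoded
$changedHash = Get-EnvHash ($decoded + "`nNEW_SETTING=1")
$needsRestart = $deployedHash -ne $changedHash
```

Storing the hash as a label on the running container means the next check only has to compare two short strings to decide whether a restart is needed.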

usage.tf

This is an example of using the terraform module from module.tf to create a VM which runs a container image on a VNET with a set of environment variables.

getlogs.ps1

Once deployed, this little script demonstrates how you can get the logs out of the container running in the VM. It requires the Azure CLI to be installed, and you need to provide the VM’s Azure ID.

You can also hook this up to the outputs of your Terraform to automate it further.
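As a sketch of that hookup, assuming your root Terraform configuration exposes the VM's resource ID as an output (the name `vm_id` here is hypothetical, the module as shown does not declare it) and that you run this from the Terraform working directory with Terraform 0.14 or later for `-raw`:

```powershell
# Read the VM's Azure resource ID from the Terraform state.
# 'vm_id' is an assumed output name; add an output for azurerm_windows_virtual_machine.vm.id to use it.
$vmId = terraform output -raw vm_id

# Run 'docker logs' on the VM via the Azure CLI and parse the JSON response.
$result = az vm run-command invoke `
    --ids $vmId `
    --command-id RunPowerShellScript `
    --scripts "docker logs dscManagedContainerInstance" | ConvertFrom-Json

Write-Host $result.value[0].message
```

The container name matches the default `ContainerName` in dsc_config.ps1; change it if you override that parameter.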

All together now!

<#
.SYNOPSIS
Uses PowerShell DSC to configure the machine to run the container
.PARAMETER Image
Docker Image to run complete with tag
.PARAMETER Command
Command to run in the docker image
.PARAMETER RegistryUrl
Azure container registry url
.PARAMETER RegistryUsername
.PARAMETER RegistryPassword
.PARAMETER EnvironmentVariables
A base64 encoded string of a .env file to use when running the container
.PARAMETER ContainerName
[Optional defaults to 'dscManagedContainerInstance'] The name to give the running docker container
.PARAMETER DockerConfigLocation
[Optional defaults to 'C:\ProgramData\Docker\config\daemon.json'] The location on disk of the docker daemon.json config file
.PARAMETER DockerDataDir
[Optional defaults to 'D:\\'] Location on disk for docker to store volumes and images
#>
Configuration DockerImageStart {
[CmdletBinding()]
param
(
[Parameter(Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[String]
$Image,
[Parameter(Mandatory = $true)]
[String]
$Command,
[Parameter(Mandatory = $false)]
[String]
$ContainerName = "dscManagedContainerInstance",
[Parameter(Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[String]
$RegistryUrl,
[Parameter(Mandatory = $true)]
[String]
$RegistryUsername,
[Parameter(Mandatory = $true)]
[string]
$RegistryPassword,
[Parameter(Mandatory = $true)]
[ValidateNotNullOrEmpty()]
[string]
$EnvironmentVariables,
[Parameter(Mandatory = $false)]
[string]
$DockerConfigLocation = "C:\ProgramData\Docker\config\daemon.json",
[Parameter(Mandatory = $false)]
[string]
$DockerDataDir = "D:\\"
)
Import-DscResource -ModuleName 'PSDesiredStateConfiguration'
Node localhost
{
# Have the machine check every 15 mins that config is good.
# See details here: https://devblogs.microsoft.com/powershell/understanding-meta-configuration-in-windows-powershell-desired-state-configuration/
LocalConfigurationManager {
ConfigurationMode = "ApplyAndAutoCorrect"
RefreshFrequencyMins = 30
ConfigurationModeFrequencyMins = 15
RefreshMode = "PUSH"
RebootNodeIfNeeded = $true
}
# Docs: Each 'Script' resource ensures a configuration is set up correctly on the VM
# Get, test and set: https://docs.microsoft.com/en-us/powershell/scripting/dsc/resources/get-test-set?view=powershell-7.1
# Script module: https://docs.microsoft.com/en-us/powershell/scripting/dsc/reference/resources/windows/scriptResource?view=powershell-7.1
# Responsible for configuring the storage to use the ephemeral locally attached disk for docker storage
Script DockerStorageLocation {
SetScript = {
Set-Content -Path $using:DockerConfigLocation -Value "{ `"data-root`": `"$using:DockerDataDir`" }"
# Restart the Daemon so it picks up the new config
Restart-Service -Force Docker
}
TestScript = {
if (!(Test-Path $using:DockerConfigLocation)) {
return $false
}
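$dataroot = (Get-Content $using:DockerConfigLocation | ConvertFrom-Json)."data-root"
# daemon.json stores the path JSON-escaped ("D:\\"), which ConvertFrom-Json returns as "D:\",
# so unescape the expected value before comparing
$expectedDataRoot = ($using:DockerDataDir).Replace('\\', '\')
if ($dataroot -ne $expectedDataRoot) {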
return $false
}
if ((Get-Service Docker).Status -ne "Running") {
return $false
}
return $true
}
GetScript = {
# Return the currently configured docker data-root
@{ Result = (Get-Content $using:DockerConfigLocation | ConvertFrom-Json)."data-root" }
}
}
# Responsible for ensuring that docker is logged into the ACR
Script AzureContainerRepositoryLogin {
DependsOn = "[Script]DockerStorageLocation"
SetScript = {
# Handle running the parameterised docker commands: https://stackoverflow.com/questions/6338015/how-do-you-execute-an-arbitrary-native-command-from-a-string
function Invoke-Login($command) {
Write-Verbose "Running command $command"
$output = $using:RegistryPassword | & 'docker' $command.Split(" ") 2>&1
# -like needs wildcards to match "Login Succeeded" anywhere in the output
if (!$? -and -not ($output -like "*Login Succeeded*"))
{
throw "Docker command failed, err: $output"
}
Write-Output $output
}
# Login to the ACR
try {
$output = Invoke-Login "login -u $using:RegistryUsername --password-stdin $using:RegistryUrl"
}
catch {
Write-Error "Failed running login command $_ $output"
}
}
TestScript = {
# Check we're logged into the ACR
if (!(Test-Path ~/.docker/config.json))
{
Write-Verbose "No docker config file found so can't be logged in"
return $false
}
$ConfigSettings = Get-Content ~/.docker/config.json | ConvertFrom-Json
# Docker stores credentials in config.json under auths.<registry>.auth
$LoginDetailsBase64 = $ConfigSettings.auths."$using:RegistryUrl".auth
if (!$LoginDetailsBase64) {
Write-Verbose "Didn't find login details for the repo $using:RegistryUrl"
return $false
}
$LoginDetailsRaw = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($LoginDetailsBase64))
if ($LoginDetailsRaw -ne "$($using:RegistryUsername):$using:RegistryPassword") {
Write-Verbose "Current credentials don't match expected creds for $using:RegistryUrl"
return $false
}
return $true
}
GetScript = {
@{ Result = (Get-Content ~/.docker/config.json | ConvertFrom-Json).auths."$using:RegistryUrl".auth }
}
}
# Responsible for starting the container and keeping it running
Script ContainerInstance {
DependsOn = '[Script]AzureContainerRepositoryLogin', '[Script]DockerStorageLocation'
SetScript = {
# Handle running the parameterised docker commands: https://stackoverflow.com/questions/6338015/how-do-you-execute-an-arbitrary-native-command-from-a-string
function Invoke-Executable($command) {
Write-Verbose "Running command $command"
$output = & 'docker' $command.Split(" ") 2>&1
if (!$?)
{
throw "Docker command failed, err: $output"
}
Write-Output $output
}
# Attempt to remove the container if it exists
try {
Write-Verbose "Attempting to remove existing container"
$output = Invoke-Executable "container rm -f $using:ContainerName"
}
catch {
Write-Warning "Failed to remove existing container Error: $_ $output"
# This is allowed to fail, for example the container might not be present
}
# Pull image
try {
$output = Invoke-Executable "pull $using:Image"
}
catch {
Write-Error "An error occurred pulling image: $_ stdOut: $output"
}
# Start the container
try {
$EnvFileContent = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($using:EnvironmentVariables))
$EnvFileLocation = "C:\$($using:ContainerName).env"
Set-Content -Path $EnvFileLocation -Value $EnvFileContent
$EnvFileHash = (Get-FileHash $EnvFileLocation).Hash
$output = Invoke-Executable "container run -d --restart=always --name=$using:ContainerName --env-file=$EnvFileLocation --label EnvFileHash=$EnvFileHash $using:Image $using:Command"
}
catch {
Write-Error "An error occurred starting the container: $_ stdOut: $output"
}
}
TestScript = {
# Track errors from external commands: https://stackoverflow.com/questions/12359427/try-catch-on-executable-exe-in-powershell
$ErrorActionPreference = 'Stop'
# Retrieve all running containers
$RunningContainers = iex 'docker ps --format "{{json . }}"' | ConvertFrom-Json | Where-Object { $_.Names -eq $using:ContainerName }
# Write a "-check" version of the env file provided
$EnvFileContent = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($using:EnvironmentVariables))
$EnvFileLocation = "C:\$($using:ContainerName)-check.env"
Set-Content -Path $EnvFileLocation -Value $EnvFileContent
# Get the files hash
$EnvFileHash = (Get-FileHash $EnvFileLocation).Hash
if (
($RunningContainers | Measure-Object).Count -eq 1 `
-and $RunningContainers[0].Image -eq $using:Image `
-and $RunningContainers[0].Status -like "up *" `
)
{
# Cool so it's running, is it running the right version of the environment variables?
# Compare the hash to the currently set 'EnvFileHash' label on the running instance
$DockerInspecResult = iex "docker inspect $($RunningContainers[0].ID)" | ConvertFrom-Json
Write-Verbose "Env hash $($DockerInspecResult.Config.Labels.EnvFileHash)"
if ($DockerInspecResult.Config.Labels.EnvFileHash -eq $EnvFileHash) {
Write-Verbose "Container is in running state with correct image and env file"
return $true
}
}
Write-Verbose "Container does not exist, is not in a running state or has the incorrect image or env file"
return $false
}
GetScript = {
# Return the ID of the current container
@{ Result = (iex 'docker ps --format "{{json . }}"' | ConvertFrom-Json | Where-Object { $_.Names -eq $using:ContainerName }).ID }
}
}
}
}
End of dsc_config.ps1
%{ for config_key, config_value in config ~}
${config_key}=${config_value}
%{ endfor ~}
End of env.tpl
$output = az vm run-command invoke --ids <machine_id_here> --command-id RunPowerShellScript --scripts "docker logs dscManagedContainerInstance" | ConvertFrom-Json
Write-Host $output.value[0].message
End of getlogs.ps1
variable shared_env {
type = any
}
variable subnet_id {
type = string
}
variable name {
type = string
}
variable image {
type = string
}
variable command {
type = string
}
variable environment_variables {
description = "The environment variables to be set in the container"
default = {}
}
variable docker_registry_url {}
variable docker_registry_username {}
variable docker_registry_password {}
variable releases_storage_account_name {
type = string
}
variable releases_storage_account_key {
type = string
}
variable releases_container_name {
type = string
}
variable releases_storage_sas {
type = string
}
locals {
env_file = base64encode(templatefile(
"${path.module}/env.tpl",
{
config = var.environment_variables
}
))
script_name = "dsc_config.ps1"
script_zip = "dsc_config.zip"
script_hash = filemd5("${path.module}/dsc_config.ps1")
}
resource "random_string" "random" {
length = 5
special = false
upper = false
number = false
}
resource "random_string" "adminpw" {
length = 18
special = true
upper = true
number = true
}
resource "azurerm_network_interface" "nic" {
name = "${var.name}${random_string.random.result}"
resource_group_name = var.shared_env.rg.name
location = var.shared_env.rg.location
ip_configuration {
name = "internal"
subnet_id = var.subnet_id
private_ip_address_allocation = "Dynamic"
}
}
resource "azurerm_windows_virtual_machine" "vm" {
name = "${var.name}${random_string.random.result}-vm"
resource_group_name = var.shared_env.rg.name
location = var.shared_env.rg.location
computer_name = "relimporter"
// 8 cores, 16GB ram and 128GB temp disk for import data to live on
size = "Standard_F8"
admin_username = "adminuser"
admin_password = random_string.adminpw.result
network_interface_ids = [
azurerm_network_interface.nic.id,
]
os_disk {
caching = "ReadWrite"
storage_account_type = "Standard_LRS"
}
source_image_reference {
publisher = "MicrosoftWindowsServer"
offer = "WindowsServer"
sku = "2019-Datacenter-Core-with-Containers"
version = "latest"
}
patch_mode = "AutomaticByOS"
}
data "archive_file" "script_zip" {
type = "zip"
source_file = "${path.module}/${local.script_name}"
output_path = "${path.module}/${local.script_zip}"
}
resource "azurerm_storage_blob" "dscps1" {
name = "dsc${local.script_hash}.zip"
storage_account_name = var.releases_storage_account_name
storage_container_name = var.releases_container_name
type = "Block"
source = "${path.module}/${local.script_zip}"
depends_on = [data.archive_file.script_zip]
}
// Using this https://docs.microsoft.com/en-us/azure/virtual-machines/extensions/dsc-windows
// to run a DSC configuration which will make sure the VM is running the container and
// periodically check for any issues and correct them.
// See: https://docs.microsoft.com/en-us/powershell/scripting/dsc/overview/overview?view=powershell-7.1
resource "azurerm_virtual_machine_extension" "dscconfig" {
name = "dscconfig${local.script_hash}"
virtual_machine_id = azurerm_windows_virtual_machine.vm.id
publisher = "Microsoft.Powershell"
type = "DSC"
type_handler_version = "2.77"
auto_upgrade_minor_version = true
depends_on = [azurerm_storage_blob.dscps1]
settings = jsonencode(jsondecode(<<SETTINGS
{
"configuration": {
"url": "https://${var.releases_storage_account_name}.blob.core.windows.net/${var.releases_container_name}/dsc${local.script_hash}.zip",
"script": "dsc_config.ps1",
"function": "DockerImageStart"
}
}
SETTINGS
))
protected_settings = jsonencode(jsondecode(<<JSON
{
"configurationArguments": {
"Image": "${var.image}",
"Command": "${var.command}",
"RegistryUrl": "${var.docker_registry_url}",
"RegistryUsername": "${var.docker_registry_username}",
"RegistryPassword": "${var.docker_registry_password}",
"EnvironmentVariables": "${local.env_file}"
},
"configurationUrlSasToken": "${var.releases_storage_sas}"
}
JSON
))
}
End of module.tf
# An example of using the above module
module "container_vm" {
source = "./docker_vm"
shared_env = local.shared_env
subnet_id = var.subnet_id
docker_registry_username = var.docker_registry_username
docker_registry_password = var.docker_registry_password
docker_registry_url = var.docker_registry_url
releases_storage_account_name = module.core.releases_storage_account_name
releases_storage_account_key = module.core.releases_storage_account_key
releases_storage_sas = module.core.releases_account_sas
releases_container_name = module.core.releases_container_name
name = "consolecontainer"
image = "myregistry.azurecr.io/thingy:imagetag"
command = "myConsoleApp.exe"
environment_variables = {
COSMOS_KEY = module.core.cosmos_account_key,
COSMOS_ENDPOINT = module.core.cosmos_account_endpoint,
COSMOS_DB_NAME = module.core.cosmos_db_name,
COSMOS_CONTAINER_NAME = module.core.cosmos_container_name,
}
}
End of usage.tf
#terraform, Coding, vscode

Terraform, Docker, Ubuntu 20.04, Go 1.14 and MemLock: Down the rabbit hole

I recently upgraded my machine and installed the latest Ubuntu 20.04 as part of that.

Very smugly I fired up the new install and, as I use devcontainers, looked forward to not installing lots of devtools, as the Dockerfile in each project had all the tooling needed for VSCode to spin up and get going.

Sadly it wasn’t that smooth. After spinning up a project which uses Terraform I found an odd message when running terraform plan:

failed to retrieve schema from provider "random": rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: EOF"

error from terraform plan

Terraform has a provider model which uses gRPC to talk between the CLI and the individual providers. Random is one of the HashiCorp-made providers, so it’s a really odd one to see a bug in.

Initially I assumed that the downloaded provider was corrupted. Nope, clearing the download and retrying didn’t help.

So assuming I’d messed something up I:

  1. Tried changing the docker image used by the devcontainer. Nope. Same problem.
  2. Different versions of terraform. Nope. Same problem.
  3. Updated the Docker version I was using. Nope. Same problem.
  4. Restarted the machine. Nope. Same problem.

Now feeling quite frustrated I finally remembered a trick I’d used lots when building my own terraform providers. I enabled debug logging on the terraform CLI.

TF_LOG=DEBUG terraform plan

This is where it gets interesting…

Coding, Quick-post

Docker and Healthchecks outside of Kubernetes

So I’ve been working with a containerized solution recently which runs outside of Kubernetes using an Azure VMSS to scale out. I won’t dive into the reasons why we went down this route, but one really interesting thing came out of it.

How do you automatically healthcheck a container outside of Kubernetes?

Well, it turns out Docker has this covered in newer versions. You can specify a HEALTHCHECK inside the Dockerfile to monitor the container’s state.
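If you can't change the Dockerfile, the same check can be supplied when starting the container. A minimal sketch, assuming a hypothetical image that serves HTTP on port 80 and has curl installed (the image name, container name and health command are placeholders):

```powershell
# Start the container with a health check defined at run time instead of in the Dockerfile.
# The command runs inside the container; a non-zero exit code counts as a failed check.
docker run -d --name web --restart=always `
    --health-cmd "curl -f http://localhost/ || exit 1" `
    --health-interval 30s `
    --health-retries 3 `
    myregistry.azurecr.io/thingy:imagetag

# Ask docker what it currently thinks: starting, healthy or unhealthy.
docker inspect --format "{{.State.Health.Status}}" web
```

Note that an unhealthy status is only reported here. Nothing restarts the container because of it, which is what the next question is about.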

How do you ensure it restarts when unhealthy?

Well here you have a couple of options but both rely on using --restart=always when starting the container:

  1. Your `healthcheck` command runs inside the container, so you can have it kill the root process of the container, causing the container to restart. Example: https://github.com/opencb/opencga/pull/1121/files
  2. You can use the `AutoHeal` container, which monitors the docker daemon via its socket and handles any containers which report unhealthy: https://hub.docker.com/r/willfarrell/autoheal/
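For the second option, starting AutoHeal looks roughly like this. This is a sketch based on my reading of the project's README, so check it against the current version; it assumes a Linux docker host, since it mounts the daemon's Unix socket.

```powershell
# Run autoheal alongside your other containers.
# AUTOHEAL_CONTAINER_LABEL=all asks it to watch every container rather than only labelled ones.
docker run -d --name autoheal --restart=always `
    -e AUTOHEAL_CONTAINER_LABEL=all `
    -v /var/run/docker.sock:/var/run/docker.sock `
    willfarrell/autoheal
```

Mounting the socket gives that container control over the docker daemon, so treat it as a privileged component.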

Note: I’m trying a new format for shorter slightly rougher blog posts covering specific topics quickly. They’ll appear under Quick-post tags. Please excuse typos and grammar issues!
