#terraform, Apps, Azure, Coding, How to

Azure Functions Get Key from Terraform without InternalServerError

So you’re trying to use the Terraform azurerm_function_app_host_keys resource to get the keys from an Azure function after deployment. Sadly, as of 03/2021, this can fail intermittently 😢 (See issue 1 and 2).+

[Edit: Hopefully this issue is resolved by this PR once released so worth reviewing once the change is released]

These errors can look something like these below:

Error making Read request on AzureRM Function App Hostkeys “***”: web.AppsClient#ListHostKeys: Failure responding to request: StatusCode=400 — Original Error: autorest/azure: Service returned an error. Status=400 Code=”BadRequest” Message=”Encountered an error (ServiceUnavailable) from host runtime”

Error: Error making Read request on AzureRM Function App Hostkeys “somefunx”: web.AppsClient#ListHostKeys: Failure responding to request: StatusCode=400

You can work around this by using my previous workaround with ARM templates but it’s a bit clunky so I was looking at another way to do it.

There is an AWESOME project by Scott Winkler called Shell Provider, it lets you write a custom Terraform provider using scripts. You can implement data types and full resources with CRUD support.

Looking into the errors returned by the azurerm_function_app_host_keys resource they’re intermittent and look like they’re related to a timing issue. Did you know the curl command support retrying out of the box?

So using the Shell provider we can create a simple script to make the REST request to the Azure API and use curls inbuilt retry support to have the request retried with an exponential back-off until it succeeds or 5mins is up!

Warning: This script uses –retry-all-errors which is only available in v7.71 and above. The version shipped with the distro your using might not be up-to-date user curl --version to check.

Here is a rough example of what you end up with:

terraform {
required_providers {
shell = {
source = "scottwinkler/shell"
version = "1.7.7"
resource "azurerm_function_app" "functions" {
name = "${var.function_name}${var.random_string}-premium"
location = var.resource_group_location
resource_group_name = var.resource_group_name
app_service_plan_id = var.app_service_plan_id
version = "~3"
storage_account_name = var.storage_account_name
storage_account_access_key = var.storage_account_key
identity {
type = "SystemAssigned"
site_config {
# Ensure we use all the mem on the box and not only 3.5GB of it!
use_32_bit_worker_process = false
pre_warmed_instance_count = 1
app_settings = merge({
StorageContainerName = var.test_storage_container_name
https_only = true
HASH = base64encode(filesha256(local.func_zip_path))
WEBSITE_RUN_FROM_PACKAGE = "https://${var.storage_account_name}.blob.core.windows.net/${var.deployment_container_name}/${azurerm_storage_blob.appcode.name}${var.storage_sas}"
# Route outbound requests over VNET see: https://docs.microsoft.com/en-us/azure/azure-functions/functions-networking-options#regional-virtual-network-integration
}, var.app_settings)
data "azurerm_subscription" "current" {
data "shell_script" "functions_key" {
lifecycle_commands {
read = file("${path.module}/readkey.sh")
environment = {
FUNC_NAME = azurerm_function_app.functions.name
RG_NAME = var.resource_group_name
SUB_ID = data.azurerm_subscription.current.subscription_id
depends_on = [azurerm_function_app.functions]
view raw main.tf hosted with ❤ by GitHub
output "function_master_key" {
# Try is used here to ensure destroy works as expected. On destroy the map will be
# empty so try instead returns an empty string
# See: https://www.terraform.io/docs/language/functions/try.html
value = try(data.shell_script.functions_key.output["masterKey"], "")
output "function_hostname" {
value = azurerm_function_app.functions.default_hostname
output "function_name" {
value = azurerm_function_app.functions.name
view raw output.tf hosted with ❤ by GitHub
set -e
# Get a token so we can call the ARM api
TOKEN=$(az account get-access-token -o json | jq -r .accessToken)
# Attempt to list the keys with exponential backoff and do this for 5mins max
# –fail required see https://github.com/curl/curl/issues/6712
curl "https://management.azure.com/subscriptions/$SUB_ID/resourceGroups/$RG_NAME/providers/Microsoft.Web/sites/$FUNC_NAME/host/default/listkeys?api-version=2018-11-01" \
–compressed -H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: Bearer $TOKEN" -d "{}" \
–retry 8 –retry-max-time 360 –retry-all-errors –fail –silent
view raw readkeys.sh hosted with ❤ by GitHub

Coding, How to

K8s Operator with dynamic CRDs using controller runtime (no structs)

Warning: This is a bit of a brain dump.

I’m working on a project at the moment which dynamically creates a set of CRDs in Kubernetes and an operator to manage them based off a schema which is provided by the user at runtime.

When the code is being built it doesn’t know the schema/shape of the CRDs. This means the standard approach used in Kubebuilder with controller-gen isn’t going to work.

Now, for those that haven’t played with Kubebuilder it’s gives you a few super useful things to build a K8s operator in Go:

  1. Controller-gen creates all the structs, templated controllers and keeps all those type files in sync for you. So you change a CRDs Struct and the CRD Yaml spec is updated etc. These are all build time tools so we can’t use em.
  2. A nice abstraction around how to interact with K8s as a controller – The controller-runtime. As the name suggests we can use this one at runtime.

So while we can’t use the build time controller-gen we can still use all the goodness of the controller-runtime. In theory.

This is where the fun came in, there aren’t any docs on interacting with a dynamic/unstructured object type using the controller runtime so I did a bit of playing around.

(Code samples for illustration – if you want end2end running example skip to the bottom).

To get started on this journey we need a helping hand. Kuberentes has an API for working which objects which don’t have a Golang struct defined. This is how we can start: Lets check out the go docs for unstructured..


Ok so this gives us some nice ways to work with a CRD which doesn’t have a struct defined.

To use this meaningfully we’re going to have to tell it the type it represents – In K8s this means telling it it’s Group, Version and Kind. These are all wrapped up nicely in the schema.GroupVersionKind struct. Lets look at the docs:


Great so hooking these two up together we can create an Unstructured instance that represents a CRD, like so!

Cool, so what can we do from here? Well the controller runtime uses the runtime.object interface for all it’s interactions and guess what we have now? Yup a runtime.Object.. wrapper method to make things obvious

Well now we can create an instance of the controller for our unstructured CRD.

Notice that I’m passing the GroupVersionKind into the controller struct – this will be useful when we come to make changes to a CRD we’re handling.

In the same way that you can use the r.Client on the controller in Kubebuilder you can now use it with the unstructured resource. We use the gvk again here to set the type so that the client knows how to work with it.

Now you might be thinking – wow isn’t it going to be painful working without the strongly typed CRD structs?

Yes it’s harder but there are some helper methods in the unstructured api which make things much easier. For example, the following let you easily retrieve or set a string which is nested inside it.

unstructured.NestedString(resource.Object, "status", "_tfoperator", "tfState")

unstructured.SetNestedField(resource.Object, string(serializedState), "status", "_tfoperator", "tfState")

Here is the end result hooking up the controller runtime to a set of dynamically created and managed CRDS. It’s very much a work in progress and I’d love feedback if there are easier ways to tackle this or things that I’ve got wrong.

#terraform, Coding, vscode

Terraform, Docker, Ubuntu 20.04, Go 1.14 and MemLock: Down the rabbit hole

I recently upgrade my machine and and installed the latest Ubuntu 20.04 as part of that.

Very smugly I fired it up the new install and, as I use devcontainers, looked forward to not installing lots of devtools as the Dockerfile in each project had all the tooling needed for VSCode to spin up and get going.

Sadly it wasn’t that smooth. After spinning up a project which uses terraform I found an odd message when running terraform plan

failed to retrieve schema from provider “random”: rpc error: code = Unavailable desc = connection error: desc = “transport: authentication handshake failed: EOF

error from terraform plan

Terraform has a provider model which uses GRPC to talk between the CLI and the individual providers. Random is one of the HashiCorp made providers so it’s a really odd one to see a bug in.

Initially I assumed that the downloaded provider was corrupted. Nope, clearing the download and retrying didn’t help.

So assuming I’d messed something up I:

  1. Tried changing the docker image using by the devcontainer. Nope. Same problem.
  2. Different versions of terraform. Nope. Same problem.
  3. Updated the Docker version I was using. Nope. Same problem.
  4. Restarted the machine. Nope. Same problem.

Now feeling quite frustrated I finally remembered a trick I’d used lots when building my own terraform providers. I enabled debug logging on the terraform CLI.

TF_LOG=DEBUG terraform plan

This is where it gets interesting…

Continue reading