Connect Azure MySQL to Private Endpoint with Terraform

To connect an Azure MySQL Database, or other services in Azure, one of the most secure methods to do this is with a Private Endpoint. Microsoft document the architecture they recommend using an App Service connecting to a MySQL Server, which is good if you are using the Azure Portal, but there are some missing components if you are using Terraform.

Content

Design

Microsoft do give some more simple method to allow access to the MySQL Server, by whitelisting IP Addresses or allowing Azure Services Access. However, you can only whitelist with confidence if you have a fixed IP range and the allowing all access would open it up to all services within all subscriptions. You can see these methods on the Microsoft Documentation: https://docs.microsoft.com/en-us/azure/mysql/howto-connect-webapp

A better method is to lock the server off from the internet and allow access via a Private Endpoint in a Virtual Network. This design is references in the documentation here: https://docs.microsoft.com/en-us/azure/architecture/example-scenario/private-web-app/private-web-app. As you can see it is very simple and have few components, however, when you use the Portal to create these most are dynamically created through simple entries.

App Service 
Web App 
Configuration 
WEBSITE VNET ROUTE ALL = 1 
WEBSITE DNS SERVER 
Internet 
App Service 
Regional 
VNet 
Integration 
AppSvcSubnet 
(10.1.2.0/24) 
Vlrtual Network (10.1.0.0/16) 
Private Endpoint 
(10.1.1.4/32) 
PrivateLinkSubnet 
(10.1.1.0/24) 
SQL 
= 168.63.129.16 
DNS 
Private 
DNS Zones

I have then depicted the connectivity of these in a slightly different view that shows all the components and how we are going to connection them via Terraform.

App Service Plan 
App Service 
Virtual Network 
Application 
Subnet 
10120/24 
•o 
DNS Private zone 
privatelink.mysql.dataöase.azure.com 
Private Endpoint 
Subnet 
Private Endpoint 
MySQL 
Server 
Network Interface

Building with Terraform

In each section I will highlight any particular code, but the full example is at the end.

Virtual Network

This is the centre feature to the design that we will create first. You can also create the subnets at this time, but to break it down I am only going to create the VNet itself. With the VNet we are keeping it simple so we are only entering the name and the DNS IP Address, which we will use some hardcoded values that you can always change.

DNS

address_space       = ["10.1.0.0/16"]
dns_servers         = ["10.0.0.4", "10.0.0.5"]

MySQL Server

You probably can do this in another order, but I followed the same process in creating them in the Portal. Within this I have included dynamically creating the MySQL password. A key part is the ‘mysql_server_sku’ version as this requires to be a General Purpose version or above, which is what I have set the default to. You can also see where I have enforced the firewall to disallow public access with ‘public_network_access_enabled’.

MySQL SKU

variable "mysql_server_sku" {
  type        = string
  description = "MySQL Server SKU"
  default     = "GP_Gen5_2"
}

Dynamic Password

resource "random_password" "password" {
  length      = 20
  min_upper   = 2
  min_lower   = 2
  min_numeric = 2
  min_special = 2
}
 administrator_login_password      = var.mysql_server_password == "" ? random_password.password.result : var.mysql_server_password

MySQL Firewall Settings

variable "mysql_server_settings" {
  type = object({
    auto_grow_enabled                 = bool
    backup_retention_days             = number
    geo_redundant_backup_enabled      = bool
    infrastructure_encryption_enabled = bool
    public_network_access_enabled     = bool
    ssl_enforcement_enabled           = bool
    ssl_minimal_tls_version_enforced  = string
  })
  description = "MySQL Server Configuration"
  default = {
    auto_grow_enabled                 = true
    backup_retention_days             = 7
    geo_redundant_backup_enabled      = false
    infrastructure_encryption_enabled = false
    public_network_access_enabled     = false
    ssl_enforcement_enabled           = true
    ssl_minimal_tls_version_enforced  = "TLS1_2"
  }
}

Subnet

As the subnet will be used more then once, I have made this a custom Module. This can then be called for creating the Private Endpoint subnet and the Applications Subnet. The Private Endpoint Subnet requires there to be no Delegation set, but the Application Subnet does need it. Therefore, I have made that section of the module dynamic. In this example you can also see the subnets I am giving the Private Endpoint and Application.

Delegation

 dynamic "delegation" {
    for_each = var.subnet_delegation_name == "" ? [] : [1]
    content {
      name = var.subnet_delegation_name
      service_delegation {
        name    = var.subnet_delegation_type
        actions = var.subnet_delegation_actions
      }
    }
  }

Subnets

variable "private_endpoint_subnet" {
  type        = string
  description = "Azure Private Endpoint VNet Subnet Address"
  default     = "10.1.1.0/24"
}
variable "app_subnet" {
  type        = string
  description = "Azure Application Subnet Address"
  default     = "10.1.2.0/24"
}

Private Endpoint

As this contains a few components to link everything up, I have also put this into a custom Module. You can see the order of creation below, so you can get an idea of how they are built. I try to keep things dynamic so the connecting of the MySQL Server is part of an array, you can add more resources to this endpoint. Something that caught me out was the DNS Zone name in Terraform. Most every resource that has a ‘name’ can be anything you like within the boundaries of validation as it just gives a title to the resource, but the DNS Zone must be a valid link from the Azure Documentation, which in the Terraform variable below.

Creation Order

  1. Create Private DNS Zone
  2. Create Private Endpoint
  3. Create a Private Endpoint Connection
  4. Link the DNS Zone to the Virtual Network

Dynamic Resources

private_endpoint_service_connections = [
    {
      name                           = "${var.mysql_server_name}.privateEndpoint"
      private_connection_resource_id = azurerm_mysql_server.mysql_server.id
      subresource_names              = ["mysqlServer"]
      is_manual_connection           = false
    }
  ]

DNS Zone name

variable "dnszone_private_link" {
  type        = string
  description = "Validate Private Link URL https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-dns"
}
resource "azurerm_private_dns_zone" "private-endpoint-dns-private-zone" {
  name                = var.dnszone_private_link
  resource_group_name = var.resource_group_name
}
dnszone_private_link = "privatelink.mysql.database.azure.com"

App Service

We can now add the connecting App Service by creating an App Plan with an App Service connected to a subnet within the same VNet. This has all the standard values and settings, which do not have much requirements to this solution. Two factors that do is the App Plan SKU, which just needs to be Standard or above and the other is the App Settings. Again making thing dynamic, I allow a variable to pass in custom App Settings, but we also need the Subnet settings added. These include the ‘WEBSITE_VNET_ROUTE_ALL’ and the ‘WEBSITE_DNS_SERVER which is set to the Azure DNS IP Address unless you have private DNS Server.

App Settings

locals {
  app_settings_subnet = {
    WEBSITE_VNET_ROUTE_ALL = 1
    WEBSITE_DNS_SERVER     = "168.63.129.16"
  }
  app_settings = merge(var.webapp_app_settings, local.app_settings_subnet)
}

End-to-end Code

You can view the full code on my GitHub Repository PureRandom

These are also two sites where I drew a lot of knowledge from, so I thought they deserved a mention.

If you do find any issues with the code, please message me. This was taken from a larger project so I might have missed one or two things 🙂

Reference Sites:

Create User on Azure MySQL Database with Terraform

After you create you Azure MySQL Server, you would like to create a Database, but for security you would like to have a new custom user added with permission. Unfortunately the Azure Providers do not provide this, so we need to use some other skills to get these users in.

First we can go ahead and create the MySQL Server and Database. We use the Random provider to generate a random password for the MySQL password, with the rest of the details as default. There are a few places I have put variables in to make it more flexible, which can be expanded on or reduced.

resource "random_password" "server_pwd" {
  length      = 20
  min_upper   = 2
  min_lower   = 2
  min_numeric = 2
  min_special = 2
}
resource "azurerm_mysql_server" "mysql_server" {
  name                = var.mysql_server_name
  location            = var.location
  resource_group_name = var.resource_group_name
  administrator_login          = "mysqladminun"
  administrator_login_password = random_password.server_pwd.result
  sku_name   = "B_Gen5_2"
  storage_mb = 5120
  version    = "5.7"
  auto_grow_enabled                 = true
  backup_retention_days             = 7
  geo_redundant_backup_enabled      = false
  infrastructure_encryption_enabled = false
  public_network_access_enabled     = true
  ssl_enforcement_enabled           = true
  ssl_minimal_tls_version_enforced  = "TLS1_2"
}
resource "azurerm_mysql_database" "mysql_database" {
  name                = var.database_name
  resource_group_name = var.resource_group_name
  server_name         = var.server_name
  charset             = var.database_charset
  collation           = var.database_collation
}

When creating the user it uses standard MySQL queries, therefore we not only need connections to the Database we also need to allow the running service through the MySQL Firewall. To do this we can update the Firewall with our IP Address. You can pass in the IP manually or through a different method, but here I am using a HTTP request to ‘http://ifconfig.me/ip‘ which returns the current IP of what is running the Terraform. We then update the Firewall settings with this IP Address.

data "http" "myip" {
  url = "http://ifconfig.me/ip"
}
resource "azurerm_mysql_firewall_rule" "mysql_firewall_clientip" {
  name                = "ClientIpAddress"
  resource_group_name = var.resource_group_nameE
  server_name         = var.mysql_server_name
  start_ip_address    = chomp(data.http.myip.body)
  end_ip_address      = chomp(data.http.myip.body)
}

At this point we would have an MySQL Server, MySQL Database and our service IP Address allowed on the Firewall. From here we can then use the MySQL provider in Terraform. https://www.terraform.io/docs/providers/mysql/index.html

First you need to setup the provider with you MySQL details. We assume the suffix of the endpoint URL as it is standard from Azure, but if you have a custom domain and/or port then you will need to update this. When I have used this it is in a different Terraform execution and therefore I get the details from Variables, Data Resources and from the Key Vault for the password. You will need to update these as per your setup.

provider "mysql" {
  endpoint = "${var.mysqlserver_name}.mysql.database.azure.com:3306"
  username = "${var.mysql_server_username}@${var.mysql_server_name}"
  password = var.mysql_server_password
  tls      = true
}

You can either pass in your own Username and Password for the details being added in the MySQL Server, or I use the Random Provider to generate these details.

resource "random_password" "database_pwd" {
  length      = 20
  min_upper   = 2
  min_lower   = 2
  min_numeric = 2
  min_special = 2
}
resource "random_string" "username" {
  length           = 5
  special          = false
  override_special = "/@£$"
}
locals {
  database_username = "dbuser_${random_string.username.result}"
  database_password = random_password.database_pwd.result
}

With them you can then create the User in the MySQL Server.

resource "mysql_user" "dbuser" {
  user               = var.database_user_name == "" ? local.database_username : var.database_user_name
  host               = "%"
  plaintext_password = local.database_password
}

And finally Grant the user permissions to the MySQL Database.

resource "mysql_grant" "useraccess" {
  user       = mysql_user.dbuser.user
  host       = mysql_user.dbuser.host
  database   = var.database_name
  privileges = ["SELECT", "UPDATE", "DELETE", "EXECUTE", "INSERT"]
}

There are a few places where you can increase flexibly by using more variables/logic and there are places you can remove them if it is going to be state. These are some of the variables I have used to configure the resources, which below is the variables.tf content.

Server

variable "resource_group_name" {
  type        = string
  description = "Resource Group Name"
}
variable "location" {
  type        = string
  description = "Resource Location"
}

variable "mysql_server_name" {
  type        = string
  description = "MySQL Server Name"
}
variable "mysql_server_username" {
  type        = string
  description = "MySQL Server Username"
  default     = "mysqladmin"
}
variable "mysql_server_password" {
  type        = string
  description = "MySQL Server Password. (If empty auto generated password will be set)"
  default     = ""
}
variable "mysql_server_sku" {
  type        = string
  description = "MySQL Server SKU"
  default     = "B_Gen5_2"
}
variable "mysql_server_storage" {
  type        = number
  description = "MySQL Server Storage in MB"
  default     = 5120
}
variable "mysql_server_version" {
  type        = string
  description = "MySQL Server Version"
  default     = "8.0"
}
variable "mysql_server_settings" {
  type = object({
    auto_grow_enabled                 = bool
    backup_retention_days             = number
    geo_redundant_backup_enabled      = bool
    infrastructure_encryption_enabled = bool
    public_network_access_enabled     = bool
    ssl_enforcement_enabled           = bool
    ssl_minimal_tls_version_enforced  = string
  })
  description = "MySQL Server Configuration"
  default = {
    auto_grow_enabled                 = true
    backup_retention_days             = 7
    geo_redundant_backup_enabled      = false
    infrastructure_encryption_enabled = false
    public_network_access_enabled     = false
    ssl_enforcement_enabled           = true
    ssl_minimal_tls_version_enforced  = "TLS1_2"
  }
}

Database

variable "resource_group_name" {
  type        = string
  description = "Resource Group Name"
}
variable "mysql_server_name" {
  type        = string
  description = "MySQL server Name"
}
variable "database_name" {
  type        = string
  description = "Database Name"
}
variable "database_user_name" {
  type        = string
  description = "MySQL Database User username"
  default     = ""
}
variable "database_user_permissions" {
  type        = list(string)
  description = "MySQL Database User Privileges"
  default     = ["SELECT", "UPDATE", "DELETE", "EXECUTE", "INSERT"]
}
variable "database_charset" {
  type        = string
  description = "Database Charset"
  default     = "utf8"
}
variable "database_collation" {
  type        = string
  description = "Database Collation"
  default     = "utf8_unicode_ci"
}

Unable to delete Azure Subnet due to Resources

With Azure Subnets there is an order to deleting the resources. You must disconnect the Virtual Network Subnets before deleting the resources, or you can’t delete the Subnet. This caused myself some issue, while using Terraform as I kicked a Destroy command and did it in the wrong order. However, I found a method on how to correct it.

If you have this issue then you will be presented with an error like this:

Failed to delete subnet
Failed to delete subnet ‘app_subnet’. Error: Subnet app_subnet is in use by cp-rg/providers/Microsoft.Network/virtualNetworks/cp-vnet/subnets/app_subnet/serviceAssociationLinks/AppServiceLink’>cp-vnet/app_subnet/AppServiceLink and cannot be deleted. In order to delete the subnet, delete all the resources within the subnet. See aka.ms/deletesubnet.

The standard method to correct this is to recreate the resource. If you can recreate the resource that was in the Subnet before, then you can reattach it to the subnet. This then brings the resources back to the state they were before. From here you can then Disconnect the Resource and then delete the Subnet.

However, in my case I did not remember what the original resources name was, which means I cannot create it like for like to fix the issue. You can still find the name or names out by using the AZ CLI, which you might also be able to use the API as well.

$rgname = "cp-rg"
$subnet = "app_subnet"
$vnet = "cp-vnet"
az network vnet subnet show --resource-group $rgname --name $subnet --vnet $vnet

This then returns all the details of the subnet, including the ‘serviceAssociationLinks’ like below inlcuding the names of the services.

{
 "serviceAssociationLinks": [
    {
      "allowDelete": false,
      "etag": "W/\"00000000-0000-0000-0000-000000000000\"",
      "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/cp-rg/providers/Microsoft.Network/virtualNetworks/cp-vnet/subnets/app_subnet/serviceAssociationLinks/AppServiceLink",
      "link": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/cp-rg/providers/Microsoft.Web/serverfarms/cp-asp-666",
      "linkedResourceType": "Microsoft.Web/serverfarms",
      "locations": [],
      "name": "AppServiceLink",
      "provisioningState": "Succeeded",
      "resourceGroup": "cp-rg",
      "type": "Microsoft.Network/virtualNetworks/subnets/serviceAssociationLinks"
    }
  ]
}

You can now recreate the resource and delete the subnet.

Auto Purge Azure Container Registry Images

When adding images into the Azure Container Registry you might start getting a backlog of images that need to be cleaned down. Azure CLI has some features, but you might want more…

With the Azure CLI you can use the ‘ACR’ commands, which contain the option of ‘purge’. This can be combined with a tag filter to narrow what images to remove, plus an ‘ago’ property to filter how old the images need to be.

This can then be run to clear down the images in the Registry as per the example below and you can get more information from the GitHub Repository.

acr purge --ago 5d --filter 'myRegistry:.*' --untagged 

This can then be merged with the ACR command to set an ACR Task. This can be a schedule task run on the ACR to trigger the Command, routinely cleaning down the repository. You can see from the example below and read more detail from the Microsoft Documentation.

$azureContainerRegistryName=""
$PURGE_CMD="acr purge --ago 5d --filter 'myRegistry:.*' --untagged "
az acr task create --name myRegistry-WeeklyPurgeTask --cmd "$PURGE_CMD" --schedule "0 1 * * Sun" --registry $azureContainerRegistryName --context /dev/null

The Purge method is really good, but unless you have good tag management, of making sure your running images are tagged differently to the older images, then it doesn’t work. This will in my case keep clearing out all images even ones that are in use.

Therefore, I created a PowerShell script to clean the images by the number of them in the registry. With this we can always be certain there are at leave X amount of images in the registries.

To make this more flexibly and reusable, it works on the Azure Container Registry level instead of the Registry level. First we set the ACR name and the maximum images we want left in the registry. With this we can use the Azure CLI to get an output of all the Registries.

$AcrName = "myAcr"
$maxImages = 5
$repositories = (az acr repository list -n $AcrName --output tsv)
foreach ($repository in $repositories) {

}

With this we can loop each registry to check their image count and remove the images. To do this we use the CLI to get all the image tags in the repository in date/time order descending, so our newer images come first. This means when we loop through them we can keep a counter until we reach the maximum images variable set earlier. Once we reach the set number and only then do we start deleting images.

To delete we can call the CLI action ‘az acr repository delete’, which requires the full name of the image, including the repository name.

Below is the full PowerShell example:

$AcrName = "myAcr"
$maxImages = 5
$repositories = (az acr repository list -n $AcrName --output tsv)
foreach ($repository in $repositories) {
        
    Write-Host("repo: $repository")
    $images = az acr repository show-tags -n $AcrName --repository $repository --orderby time_desc  --output tsv
    Write-Host("image: $images")
            
    $imageCount = 0
    foreach ($image in $images) {
        if ($imageCount -gt $maxImages) {
            $imageFullName = "$repository`:$image"
            Write-Host("image: $imageFullName")
            az acr repository delete -n $AcrName --image $imageFullName
        }
        $imageCount++
    }
}

You could then contain this code into a variable like the first example, to then put it into a Scheduled ACR Task, or just create an automated schedule with other technology like Azure DevOps Pipelines where you can add this into source control.

Automate Security for Azure Container Registry

From March 2021 Azure is deprecating the Container Setting in Azure Web Apps, which changes you to use the new Development Center. This look very nice, but there is a change that is going to force you to have weaker security. This change is to have the Admin Credentials enabled, but there is something you can do to be secure.

Before this change, the best practice method was to turn Admin Credentials off in your Azure Container Registry(ACR). This is because the user is a single user, so you can’t tell different peoples interactions while using this one account, and it also has as it says in the name, admin rights meaning it can do anything.

To make this secure, you would disable the Admin Credentials and then anything trying to connect to the repository would have a Service Principle set up or a role added with the correct access. In this post I describe some of these where you can set up the ACR credentials in the App Settings, so the Azure Web App has access to pull the image. This is using Terraform, but it has the basic idea as well.
Use Terraform to connect ACR with Azure Web App

Now when you create a new Azure Web App or go to an existing one you will be presented with an error like this:

Basics Docker Monitoring 
Tags 
Review + create 
Pull container images from Azure Container Registry, Docker Hub or a private Docker repository. App Service will 
deploy the containerized app with your preferred dependencies to production in seconds. 
Options 
Image Source 
Azure container registry options 
Registry * 
Image 
Tag 
Startup Command O 
Single Container 
Azure Container Registry 
Loading... 
Cannot perform credential operations for 
o 
as admin user is 
disabled. Kindly enable admin user as per docs: https://docs.microsoft.com 
/en-us/azure/container-registry/container-registry-authentication*admin- 
account

This is now the message you get telling you to turn on the Admin Credentials, but what makes it confusing is on the documentation they point you to says:

“The admin account is designed for a single user to access the registry, mainly for testing purposes. “

ref: https://docs.microsoft.com/en-us/azure/container-registry/container-registry-authentication#admin-account

However, it seems we need to play by their conflicting rules, so we need to work with this and make it more secure.
Turning on this setting can be insecure, but what we can do is rotate the keys. As you can tell from the UI you don’t need to enter these credentials as the authentication is handled behind the scenes.

Therefore, we can regenerate the passwords without affecting the connection between the ACR and the Resource. This is not perfect, but it does mean if anyone get your password or uses it, then it will be expired very quickly if you want at least.

To do this we can use the ACR and the Azure CLI. With the CLI you can use the ACR commands to trigger a password regeneration.

az acr credential renew -n MyRegistry --password-name password
az acr credential renew -n MyRegistry --password-name password2

ref: https://docs.microsoft.com/en-us/cli/azure/acr/credential?view=azure-cli-latest#az_acr_credential_renew

We can then schedule this and tie it to the ACR by using the ACR Tasks. These can run ACR commands and be put on a repeating timer to trigger when you wish.
ref: https://docs.microsoft.com/en-us/cli/azure/acr/task?view=azure-cli-latest#az_acr_task_create

unfortunalty the ‘acr’ doesn’t contain the ‘Credential’ command and if you run the ‘az’ cli command it says you need to login.

You can put the commands into a Dockerfile and run the commands using the Azure CLI image, but this seems overkill. I would suggest using alternatives to run the commands, like setting up an automated Azure DevOps pipeline to run the commands in the CLI task.