Windows AMIs With Even Fewer Tears

Some recent image building work I was doing required images based on Windows Server. One of my more popular posts, Windows AMIs Without the Tears, detailed the fraught endeavor of making the WinRM management system work for this purpose, but since then Microsoft have substantially improved the situation by committing engineering effort to porting OpenSSH to Windows.

Fortunately I procrastinated for long enough for the OpenSSH port to start releasing fairly complete binaries (albeit with a warning against using them in production environments), so I thought I would post an update on building images for AWS using HashiCorp’s Packer - which has itself grown a variety of new features since I last posted on this subject.

The source code accompanying this post is available on GitHub.

(Update: Packer v1.2.3 onwards contains a fix which makes it possible to build the template in the GitHub repository).

OpenSSH for Windows

Although there have been various distributions of OpenSSH available for Windows, none have integrated as comprehensively with the native security model as the Microsoft port, nor have they matched the ease of installation. In order to install OpenSSH on Windows Server, we must take the following steps:

  1. Obtain the binary release archive from GitHub
  2. Unarchive and move OpenSSH into place on the file system
  3. Run the included installation script to configure services
  4. Configure firewall rules to allow inbound SSH connections

The port supports public key authentication as well as my preferred method of authentication using SSH certificates, although at the moment a reboot of the machine is required to enable these options. Since the standard base image does not contain a tool for doing so, we also need to pull down the public key of the instance key pair using the AWS instance metadata service, and configure a Windows startup task to do this on the first boot of the image.

Unfortunately, the standard path for the authorized_keys file is not easily predictable if an image goes through several build steps with several invocations of sysprep, since the Administrator profile created on subsequent boot appears to append a hostname to the home directory path.

Consequently, we put the authorized_keys file in C:\ProgramData\ssh, and use the AllowUsers directive to restrict which users are allowed to authenticate using the stored keys. Hopefully this aspect will be improved at some point in the future - pull requests welcome!

EC2Launch

For Windows Server 2016 AMIs, the traditional EC2Config utility has been removed in favour of a new tool named EC2Launch.

Fortunately for our purposes, much of the behaviour is unchanged - we can execute a PowerShell script on instance first boot by including it as the “user data” for our instance, and specify that it should run as LocalSystem by including the line <runAsLocalSystem>true</runAsLocalSystem> in the userdata file, ensuring that the actual text of the PowerShell script is a child of a <powershell> element.
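
As a minimal sketch of that layout, the helper below wraps a script body in the two elements just described. New-Ec2UserData is a hypothetical name invented for this example, and is not part of EC2Launch or any AWS tooling:

```powershell
# Hypothetical helper: wraps a PowerShell script body in the user-data
# layout described above. The function name is made up for illustration.
function New-Ec2UserData {
    param(
        [Parameter(Mandatory = $true)]
        [string] $ScriptBody
    )

    # The script text is a child of <powershell>; the runAsLocalSystem
    # element follows it, as in the user-data used in this post.
    $lines = @(
        '<powershell>'
        $ScriptBody
        '</powershell>'
        '<runAsLocalSystem>true</runAsLocalSystem>'
    )
    return ($lines -join "`n")
}

$userData = New-Ec2UserData -ScriptBody 'Write-Output "Hello from first boot"'
```

The resulting string could then be written to a file and referenced as the user data for an instance.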

The mechanism for “one-time” startup jobs is unchanged - create a scheduled task to run as NT Authority\Local System at boot, and then disable the scheduled task as the final action.

User-data Script

Armed with notes from a manual run through of installing OpenSSH, I came up with the following user-data for the source instance:

<powershell>
# Version and download URL
$openSSHVersion = "7.6.1.0p1-Beta"
$openSSHURL = "https://github.com/PowerShell/Win32-OpenSSH/releases/download/v$openSSHVersion/OpenSSH-Win64.zip"

Set-ExecutionPolicy Unrestricted

# GitHub became TLS 1.2 only on Feb 22, 2018
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12;

# Function to unzip an archive to a given destination
Add-Type -AssemblyName System.IO.Compression.FileSystem
Function Unzip
{
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, Position=0)]
        [string] $ZipFile,
        [Parameter(Mandatory=$true, Position=1)]
        [string] $OutPath
    )

    [System.IO.Compression.ZipFile]::ExtractToDirectory($zipFile, $outPath)
}

# Set various known paths
$openSSHZip = Join-Path $env:TEMP 'OpenSSH.zip'
$openSSHInstallDir = Join-Path $env:ProgramFiles 'OpenSSH'
$openSSHInstallScript = Join-Path $openSSHInstallDir 'install-sshd.ps1'
$openSSHDownloadKeyScript = Join-Path $openSSHInstallDir 'download-key-pair.ps1'
$openSSHDaemon = Join-Path $openSSHInstallDir 'sshd.exe'
$openSSHDaemonConfig = [io.path]::combine($env:ProgramData, 'ssh', 'sshd_config')

# Download and unpack the binary distribution of OpenSSH
Invoke-WebRequest -Uri $openSSHURL `
    -OutFile $openSSHZip `
    -ErrorAction Stop

Unzip -ZipFile $openSSHZip `
    -OutPath "$env:TEMP" `
    -ErrorAction Stop

Remove-Item $openSSHZip `
    -ErrorAction SilentlyContinue

# Move into Program Files
Move-Item -Path (Join-Path $env:TEMP 'OpenSSH-Win64') `
    -Destination $openSSHInstallDir `
    -ErrorAction Stop

# Run the install script, terminate if it fails
& Powershell.exe -ExecutionPolicy Bypass -File $openSSHInstallScript
if ($LASTEXITCODE -ne 0) {
	throw("Failed to install OpenSSH Server")
}

# Add a firewall rule to allow inbound SSH connections to sshd.exe
New-NetFirewallRule -Name sshd `
    -DisplayName "OpenSSH Server (sshd)" `
    -Group "Remote Access" `
    -Description "Allow access via TCP port 22 to the OpenSSH Daemon" `
    -Enabled True `
    -Direction Inbound `
    -Protocol TCP `
    -LocalPort 22 `
    -Program "$openSSHDaemon" `
    -Action Allow `
    -ErrorAction Stop

# Ensure sshd automatically starts on boot
Set-Service sshd -StartupType Automatic `
    -ErrorAction Stop

# Set the default login shell for SSH connections to Powershell
New-Item -Path HKLM:\SOFTWARE\OpenSSH -Force
New-ItemProperty -Path HKLM:\SOFTWARE\OpenSSH `
    -Name DefaultShell `
    -Value "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe" `
    -ErrorAction Stop

$keyDownloadScript = @'
# Download the instance key pair and authorize Administrator logins using it
$openSSHAdminUser = 'c:\ProgramData\ssh'
$openSSHAuthorizedKeys = Join-Path $openSSHAdminUser 'authorized_keys'

If (-Not (Test-Path $openSSHAdminUser)) {
    New-Item -Path $openSSHAdminUser -Type Directory
}

$keyUrl = "http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key"
$keyReq = [System.Net.WebRequest]::Create($keyUrl)
$keyResp = $keyReq.GetResponse()
$keyRespStream = $keyResp.GetResponseStream()
$streamReader = New-Object System.IO.StreamReader $keyRespStream
$keyMaterial = $streamReader.ReadToEnd()

$keyMaterial | Out-File -Append -FilePath $openSSHAuthorizedKeys -Encoding ASCII

# Ensure access control on authorized_keys meets the requirements
$acl = Get-ACL -Path $openSSHAuthorizedKeys
$acl.SetAccessRuleProtection($True, $True)
Set-Acl -Path $openSSHAuthorizedKeys -AclObject $acl

$acl = Get-ACL -Path $openSSHAuthorizedKeys
$ar = New-Object System.Security.AccessControl.FileSystemAccessRule( `
	"NT Authority\Authenticated Users", "ReadAndExecute", "Allow")
$acl.RemoveAccessRule($ar)
$ar = New-Object System.Security.AccessControl.FileSystemAccessRule( `
	"BUILTIN\Administrators", "FullControl", "Allow")
$acl.RemoveAccessRule($ar)
$ar = New-Object System.Security.AccessControl.FileSystemAccessRule( `
	"BUILTIN\Users", "FullControl", "Allow")
$acl.RemoveAccessRule($ar)
Set-Acl -Path $openSSHAuthorizedKeys -AclObject $acl

Disable-ScheduledTask -TaskName "Download Key Pair"

$sshdConfigContent = @"
# Modified sshd_config, created by Packer provisioner

PasswordAuthentication yes
PubKeyAuthentication yes
PidFile __PROGRAMDATA__/ssh/logs/sshd.pid
AuthorizedKeysFile __PROGRAMDATA__/ssh/authorized_keys
AllowUsers Administrator

Subsystem       sftp    sftp-server.exe
"@

Set-Content -Path C:\ProgramData\ssh\sshd_config `
    -Value $sshdConfigContent

'@
$keyDownloadScript | Out-File $openSSHDownloadKeyScript

# Create Task - Ensure the name matches the verbatim version above
$taskName = "Download Key Pair"
$principal = New-ScheduledTaskPrincipal `
    -UserID "NT AUTHORITY\SYSTEM" `
    -LogonType ServiceAccount `
    -RunLevel Highest
$action = New-ScheduledTaskAction -Execute 'Powershell.exe' `
  -Argument "-NoProfile -File ""$openSSHDownloadKeyScript"""
$trigger =  New-ScheduledTaskTrigger -AtStartup
Register-ScheduledTask -Action $action `
    -Trigger $trigger `
    -Principal $principal `
    -TaskName $taskName `
    -Description $taskName
Disable-ScheduledTask -TaskName $taskName

# Run the key download script, terminate if it fails
& Powershell.exe -ExecutionPolicy Bypass -File $openSSHDownloadKeyScript
if ($LASTEXITCODE -ne 0) {
	throw("Failed to download key pair")
}

# Restart to ensure public key authentication works and SSH comes up
Restart-Computer
</powershell>
<runAsLocalSystem>true</runAsLocalSystem>

Much of the script is straightforward, and does not try to recover from errors - if the process of installing OpenSSH takes too long, Packer will time out, so graceful recovery is of minimal value here.
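
The "run it, check $LASTEXITCODE, throw" pattern appears several times in the script. One way it could be factored out is sketched below - Invoke-NativeOrThrow is a hypothetical helper name, not something the script in this post defines:

```powershell
# Hypothetical helper: runs a script block that invokes a native command
# and throws if the command leaves a non-zero exit code behind.
function Invoke-NativeOrThrow {
    param(
        [Parameter(Mandatory = $true, Position = 0)]
        [scriptblock] $Command,
        [Parameter(Mandatory = $true, Position = 1)]
        [string] $FailureMessage
    )

    # Reset first so that a stale exit code from an earlier command
    # cannot be mistaken for a failure of this one.
    $global:LASTEXITCODE = 0
    & $Command
    if ($LASTEXITCODE -ne 0) {
        throw "$FailureMessage (exit code $LASTEXITCODE)"
    }
}

# Usage, mirroring the install step in the script above:
# Invoke-NativeOrThrow { & Powershell.exe -ExecutionPolicy Bypass -File $openSSHInstallScript } `
#     "Failed to install OpenSSH Server"
```

This keeps the fail-fast behaviour while including the exit code in the error message, which may help given how little debugging output user-data execution provides.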

The original version of this script used Invoke-WebRequest in preference to constructing a .NET Web Request directly. While this works in interactive mode, it does not work when running under the context of LocalSystem if an interactive logon has not occurred, since the underlying Internet Explorer engine has not been initialized. Unfortunately the paucity of debugging information available for user-data execution meant this gotcha took rather a long time to find!
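
The key download script builds the request by hand. A shorter sketch of the same idea uses System.Net.WebClient, which is plain .NET and does not involve the Internet Explorer engine. Get-UrlText is a hypothetical name, and this variant has not been verified in the LocalSystem user-data context described here:

```powershell
# Sketch: fetch text over HTTP with System.Net.WebClient, which is plain
# .NET and does not go through Invoke-WebRequest's response parsing.
# Get-UrlText is a hypothetical name used only for this example.
function Get-UrlText {
    param(
        [Parameter(Mandatory = $true)]
        [string] $Url
    )

    $client = New-Object System.Net.WebClient
    try {
        return $client.DownloadString($Url)
    }
    finally {
        # Release the underlying resources even if the download throws
        $client.Dispose()
    }
}

# Example: the same metadata URL used by the key download script
# $keyMaterial = Get-UrlText -Url "http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key"
```

Invoke-WebRequest in Windows PowerShell also has a -UseBasicParsing switch which skips the Internet Explorer based parsing; whether that alone is sufficient under LocalSystem is not verified here.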

Between the original version of this script and this post being written, bit rot set in - GitHub started to require TLS 1.2 on Feb 22nd 2018. Fortunately there is a straightforward fix if a modern version of Windows is in use, namely setting [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12 in the same manner as for the .NET Framework.
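
The assignment above replaces the set of enabled protocols with TLS 1.2 alone. A slightly more conservative sketch, relying on SecurityProtocolType being a flags-style enumeration, adds TLS 1.2 to whatever is already enabled for the process:

```powershell
# Add TLS 1.2 to the protocols already enabled for this process,
# rather than replacing them outright.
$current = [System.Net.ServicePointManager]::SecurityProtocol
[System.Net.ServicePointManager]::SecurityProtocol = $current -bor [System.Net.SecurityProtocolType]::Tls12
```

Either form only affects the current PowerShell process, so it needs to run before the first download in the script.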

The script which is written for execution on startup as a scheduled task downloads the public key, sets appropriate ACLs on the file, and ensures that the configuration for sshd is correct. Unfortunately it seems that modifications to sshd_config made in the user-data script do not survive the first start of the service, so modifying on reboot appears to be the only (unsatisfactory) option.

Image Preparation Script

The final provisioner in Packer must remove anything undesirable from the environment (such as temporary SSH keys), reset the scheduled task which will pull down the key pair when the image is provisioned, and call sysprep. The following prepare-image.ps1 script does this:

$keysFile = [io.path]::combine($env:ProgramData, 'ssh', 'authorized_keys')
Remove-Item -Recurse -Force -Path $keysFile

Enable-ScheduledTask "Download Key Pair"

echo "Running InitializeInstance"
& Powershell.exe C:/ProgramData/Amazon/EC2-Windows/Launch/Scripts/InitializeInstance.ps1 -Schedule
if ($LASTEXITCODE -ne 0) {
	throw("Failed to run InitializeInstance")
}

echo "Running Sysprep Instance"
& Powershell.exe C:/ProgramData/Amazon/EC2-Windows/Launch/Scripts/SysprepInstance.ps1 -NoShutdown
if ($LASTEXITCODE -ne 0) {
	throw("Failed to run Sysprep")
}

Packer

Once we have a way to ensure OpenSSH will become available during the boot process of a stock Windows image, it is straightforward to write a Packer template which automates the process of launching an instance, running provisioners (over SSH!), and preparing the running instance to create a new AMI.

{
	"variables": {
		"aws_access_key_id": "{{ env `AWS_ACCESS_KEY_ID` }}",
		"aws_secret_access_key": "{{ env `AWS_SECRET_ACCESS_KEY` }}",
		"region": "{{ env `AWS_REGION` }}",
        
		"image_name": "Windows Image",
		"ami_name_prefix": "windows-base"
	},
	"builders": [{
		"type": "amazon-ebs",

		"access_key": "{{ user `aws_access_key_id` }}",
		"secret_key": "{{ user `aws_secret_access_key` }}",
		"region": "{{user `region`}}",
		"spot_price_auto_product": "Windows (Amazon VPC)",

		"instance_type": "t2.medium",
		"associate_public_ip_address": true,
		"user_data_file": "files/configure-source-ssh.ps1",
		"source_ami_filter": {
			"filters": {
				"name": "Windows_Server-2016-English-Full-Base-*",
				"root-device-type": "ebs",
				"virtualization-type": "hvm"
			},
			"most_recent": true,
			"owners": [
				"801119661308"
			]
		},

		"communicator": "ssh",
		"ssh_timeout": "5m",
		"ssh_username": "Administrator",

		"ami_description": "{{ user `image_name` }}",
		"ami_name": "{{ user `ami_name_prefix` }}-{{isotime \"2006-01-02-1504\"}}",
		"ami_virtualization_type": "hvm",
		"snapshot_tags": {
			"Name": "{{ user `image_name` }}",
			"OS": "Windows-2016"
		},
		"tags": {
			"Name": "{{ user `image_name` }}",
			"OS": "Windows-2016"
		}
	}],
	"provisioners": [
		{
			"type": "powershell",
			"inline": [
				"echo 'Provision Things Here' | Out-File C:/test.txt"
			]
		},
		{
			"type": "powershell",
			"script": "files/prepare-image.ps1"
		}
	]
}

There are a few important notes about the template:

  • We use the new(ish) source_ami_filter to start building from the latest provider image of Windows Server 2016. The OpenSSH installation script needs modification to work with older versions of Windows.
  • We use the Windows (Amazon VPC) spot_price_auto_product to build images using instances on the spot market and automatically select a price.
  • communicator is explicitly set to ssh, as the default for powershell provisioners is to use winrm.
  • ssh_timeout needs to be set to a reasonably long period, since the instance must boot, run our user-data script and reboot within this time window. 5 minutes has proven sufficient on moderately sized instances, but for t2.micro and friends this may need increasing.
  • We call our prepare-image.ps1 script, described above, as the last provisioner before shutting down and imaging the machine.

Running packer build template.json, we get the following output:

$ packer build template.json
amazon-ebs output will be in this color.

==> amazon-ebs: Prevalidating AMI Name: windows-base-2018-04-16-0322
    amazon-ebs: Found Image ID: ami-8e1627eb
==> amazon-ebs: Creating temporary keypair: packer_5ad416f6-a9e9-da81-803f-37658f547ea4
==> amazon-ebs: Creating temporary security group for this instance: packer_5ad41720-af1d-18b8-d256-2f8478008dc7
==> amazon-ebs: Authorizing access to port 22 from 0.0.0.0/0 in the temporary security group...
==> amazon-ebs: Launching a source AWS instance...
==> amazon-ebs: Adding tags to source instance
    amazon-ebs: Adding tag: "Name": "Packer Builder"
    amazon-ebs: Instance ID: i-0b8b853d28c1938ac
==> amazon-ebs: Waiting for instance (i-0b8b853d28c1938ac) to become ready...
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Connected to SSH!
==> amazon-ebs: Provisioning with Powershell...
==> amazon-ebs: Provisioning with powershell script: /var/folders/vx/sw52zhrj6wq9184wgm0_g7pc0000gp/T/packer-powershell-provisioner124158314
==> amazon-ebs: Provisioning with Powershell...
==> amazon-ebs: Provisioning with powershell script: files/prepare-image.ps1
    amazon-ebs:
    amazon-ebs: TaskPath                                       TaskName                          State
    amazon-ebs: --------                                       --------                          -----
    amazon-ebs: \                                              Download Key Pair                 Ready
    amazon-ebs: Running InitializeInstance
    amazon-ebs:
    amazon-ebs: TaskPath                                       TaskName                          State
    amazon-ebs: --------                                       --------                          -----
    amazon-ebs: \                                              Amazon Ec2 Launch - Instance I... Ready
    amazon-ebs:
    amazon-ebs:
    amazon-ebs: Running Sysprep Instance
==> amazon-ebs: Stopping the source instance...
    amazon-ebs: Stopping instance, attempt 1
==> amazon-ebs: Waiting for the instance to stop...
==> amazon-ebs: Creating the AMI: windows-base-2018-04-16-0322
    amazon-ebs: AMI: ami-eb2a198e
==> amazon-ebs: Waiting for AMI to become ready...
==> amazon-ebs: Modifying attributes on AMI (ami-eb2a198e)...
    amazon-ebs: Modifying: description
==> amazon-ebs: Modifying attributes on snapshot (snap-01378eeb604e76ec8)...
==> amazon-ebs: Adding tags to AMI (ami-eb2a198e)...
==> amazon-ebs: Tagging snapshot: snap-01378eeb604e76ec8
==> amazon-ebs: Creating AMI tags
    amazon-ebs: Adding tag: "OS": "Windows-2016"
    amazon-ebs: Adding tag: "Name": "Windows Image"
==> amazon-ebs: Creating snapshot tags
    amazon-ebs: Adding tag: "Name": "Windows Image"
    amazon-ebs: Adding tag: "OS": "Windows-2016"
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' finished.

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
us-east-2: ami-eb2a198e

And looking at the AWS console, we can see that a new Windows AMI has indeed been created! If we now launch a new instance based on this, with a pre-existing keypair, we can SSH in and verify that the file written during image build is in fact in the image:

$ aws ec2 run-instances \
    --image-id ami-eb2a198e \
    --instance-type t2.medium \
    --key-name jen20 \
    --security-group-ids sg-7ed1e315 \
    --region us-east-2
    
# Output elided - machine IP: 13.58.37.39

$ ssh Administrator@13.58.37.39       # key is in agent
The authenticity of host '13.58.37.39 (13.58.37.39)' can't be established.
ECDSA key fingerprint is SHA256:6LWa4VwOh78reinYv9k1G6Hls4xkKxwLD7WAgKgqLYc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '13.58.37.39' (ECDSA) to the list of known hosts.
Windows PowerShell
Copyright (C) 2016 Microsoft Corporation. All rights reserved.

PS C:\Users\administrator>
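
The session above stops at the prompt. A minimal sketch of the check itself, assuming the C:\test.txt path and text used by the inline provisioner in the template (Test-ProvisionMarker is a hypothetical name), might look like this:

```powershell
# Hypothetical check: returns $true when the marker file written by the
# inline provisioner exists and contains the expected text.
function Test-ProvisionMarker {
    param(
        [string] $Path = 'C:\test.txt',
        [string] $Expected = 'Provision Things Here'
    )

    if (-not (Test-Path -Path $Path -PathType Leaf)) {
        return $false
    }

    # -Raw reads the whole file as a single string ($null if it is empty)
    $content = Get-Content -Path $Path -Raw
    return ($null -ne $content -and $content.Contains($Expected))
}

# At the remote prompt:
# Test-ProvisionMarker
```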

There is, however, some further work to do:

  • Ensure that SSH host keys are printed to the console such that they can be verified prior to connection by parsing the system log.
  • Attempt to determine the logic for the Administrator profile such that authorized keys can be added in the standard location instead of in the machine configuration.
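
As a starting point for the first item, the sketch below computes the SHA256 fingerprint that the ssh client displays, from a single public key line. Get-SshKeyFingerprint is a hypothetical name, the host key location in the trailing comment is an assumption, and getting the result into the EC2 system log is not covered:

```powershell
# Sketch: compute the SHA256 fingerprint OpenSSH displays for a public key,
# given one line in "type base64-blob [comment]" format.
# Get-SshKeyFingerprint is a hypothetical name used only for this example.
function Get-SshKeyFingerprint {
    param(
        [Parameter(Mandatory = $true)]
        [string] $PublicKeyLine
    )

    $fields = $PublicKeyLine.Trim() -split '\s+'
    if ($fields.Count -lt 2) {
        throw "Expected 'type base64-blob [comment]'"
    }

    # The fingerprint is the SHA256 digest of the decoded key blob,
    # base64-encoded with the trailing '=' padding removed.
    $blob = [System.Convert]::FromBase64String($fields[1])
    $sha = [System.Security.Cryptography.SHA256]::Create()
    try {
        $digest = $sha.ComputeHash($blob)
    }
    finally {
        $sha.Dispose()
    }
    return 'SHA256:' + [System.Convert]::ToBase64String($digest).TrimEnd('=')
}

# Assumed location of the host public keys; adjust if yours differ.
# Get-ChildItem (Join-Path $env:ProgramData 'ssh\ssh_host_*_key.pub') |
#     ForEach-Object { Get-SshKeyFingerprint (Get-Content $_.FullName -TotalCount 1) }
```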

Conclusion

Overall, Microsoft have done a great job of porting OpenSSH and integrating it with Windows. Their work has made building images for AWS substantially easier and quicker than it has been in the past - something which can only be welcomed! Hopefully steps will be removed as the port nears release, and AWS will adopt it into their standard Windows image, modifying EC2Launch to deal with key pairs. It will also no doubt be useful for building local images for Vagrant as well as AMIs, and can easily be adapted for Windows instances in other clouds.