S3 API Compatibility On Microsoft Azure
October 11, 2020 by patrickd
Note that this article is rather old and the proposed solution may no longer work as described.
Microsoft Azure's Blob Storage is an alternative to AWS S3's object storage, but that doesn't mean it is intended as a drop-in replacement. While the S3 API is now widely considered a quasi-standard for cloud object storage, this was not the case when Azure Blob Storage was first created. Although Blob Storage also has a relatively simple HTTP API, its interface is still very different from S3's.
To establish compatibility between S3 API clients and Azure Blob Storage we'll be using the S3Proxy project. It's a Java application that acts as a proxy between an S3 API client and the Azure Blob Storage service, translating requests accordingly.
In this example setup, the S3Proxy program will be deployed as a container to an AKS cluster. Azure Kubernetes Service provides managed Kubernetes clusters that are easily set up, managed, and monitored via the Azure Portal web interface.
If you do not have an AKS cluster yet, follow Microsoft's instructions on how to set up a cluster with AKS. You might also want to set up an HTTPS ingress, which will allow you to securely expose the S3Proxy service on the Internet with SSL encryption.
Once your AKS cluster is ready, create an Azure Blob Storage account that S3Proxy will connect to in order to store files. After it has been created successfully, navigate to Access Keys and copy the name of your storage account and one of the keys displayed.
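If you prefer the command line over the portal, the storage account can also be created with the Azure CLI. The following is only a sketch: the resource group name `s3proxy-rg`, the account name `mystorageacct`, and the location are placeholders you should replace with your own values.

```shell
# Placeholder names: s3proxy-rg, mystorageacct, westeurope
az group create --name s3proxy-rg --location westeurope

az storage account create \
  --name mystorageacct \
  --resource-group s3proxy-rg \
  --sku Standard_LRS \
  --kind StorageV2

# List the access keys (equivalent to the portal's "Access Keys" page)
az storage account keys list \
  --account-name mystorageacct \
  --resource-group s3proxy-rg \
  --output table
```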
When dealing with access keys from Azure Storage Accounts, make sure that they do NOT contain any special characters such as slashes; otherwise you might run into issues when using them with S3Proxy. Within the Azure Portal you can keep regenerating keys until you get one containing only digits and letters (the == characters at the end are part of the base64 encoding and are fine).
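Regenerating until you get a key without problematic characters can also be scripted. This is a sketch using the Azure CLI's key commands, assuming the placeholder names from above; check your CLI version's accepted values for `--key`.

```shell
# Renew key1 until its value contains neither "/" nor "+"
# (placeholder names: mystorageacct, s3proxy-rg)
while az storage account keys list \
        --account-name mystorageacct \
        --resource-group s3proxy-rg \
        --query "[?keyName=='key1'].value" --output tsv | grep -q '[/+]'; do
  az storage account keys renew \
    --account-name mystorageacct \
    --resource-group s3proxy-rg \
    --key key1 --output none
done
```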
The following is an example manifest file that uses an image from Docker Hub (andrewgaul/s3proxy) which bundles S3Proxy in a way that allows us to configure it entirely through environment variables:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: s3proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: s3proxy
  template:
    metadata:
      labels:
        app: s3proxy
    spec:
      containers:
        - name: s3proxy
          image: andrewgaul/s3proxy
          ports:
            - containerPort: 80
              protocol: TCP
          env:
            - name: JCLOUDS_PROVIDER
              value: azureblob
            # Placeholder values below -- replace with your own
            - name: JCLOUDS_IDENTITY
              value: name-of-your-azure-storage-account
            - name: JCLOUDS_CREDENTIAL
              value: access-key-of-your-storage-account
            - name: S3PROXY_IDENTITY
              value: name-of-your-azure-storage-account
            - name: S3PROXY_CREDENTIAL
              value: access-key-of-your-storage-account
            - name: JCLOUDS_ENDPOINT
              value: https://name-of-your-azure-storage-account.blob.core.windows.net
```
Here you will use the Storage Account name and key that you previously copied from the Access Keys page on your Storage Account's settings.
Note that it's generally recommended to use Secrets when handling credentials in Kubernetes.
After making your adjustments you may now deploy S3Proxy by executing:

```shell
kubectl apply -f s3proxy.yaml
```
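You can then check whether the deployment came up cleanly; these commands assume the Deployment is named `s3proxy` and labeled `app: s3proxy`, so adjust them to your manifest.

```shell
# Wait for the rollout to finish
kubectl rollout status deployment/s3proxy

# Inspect the pod and its logs
kubectl get pods -l app=s3proxy
kubectl logs deployment/s3proxy
```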
You can test your S3Proxy setup using the AWS CLI on your local computer. To do so, first edit the ~/.aws/credentials configuration file and add the following lines:

```ini
[azure]
aws_access_key_id = name-of-your-azure-storage-account
aws_secret_access_key = access-key-of-your-storage-account
```
Then add the following lines to your ~/.aws/config file:

```ini
[profile azure]
region = your-region
output = json
```
With that you should be able to manage your Azure Blob Storage account using the AWS CLI while specifying your S3Proxy ingress as the endpoint URL:
```shell
# Create a bucket (called a "container" on Azure) named "my-files"
aws --endpoint-url https://your-s3proxy-ingress.your-region.cloudapp.azure.com/ \
    --profile azure s3 mb s3://my-files
```
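Other S3 operations work the same way through the proxy. For example, uploading and listing files (endpoint URL and profile as above; `report.pdf` is just a placeholder file):

```shell
ENDPOINT=https://your-s3proxy-ingress.your-region.cloudapp.azure.com/

# Upload a local file into the bucket
aws --endpoint-url "$ENDPOINT" --profile azure s3 cp ./report.pdf s3://my-files/

# List the bucket's contents
aws --endpoint-url "$ENDPOINT" --profile azure s3 ls s3://my-files/

# Remove the file and the bucket again
aws --endpoint-url "$ENDPOINT" --profile azure s3 rm s3://my-files/report.pdf
aws --endpoint-url "$ENDPOINT" --profile azure s3 rb s3://my-files
```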