
Set Up Automatic Conversation Ingestion via Azure Blob Storage


AvonAI supports automatic ingestion of conversations from Azure Blob Storage containers. This guide covers the required environment variables and authentication options for configuring Azure-based ingestion.

Prerequisites

  • An Azure Storage account with a blob container that holds your conversation files.
  • Conversation files must follow the AvonAI JSON format.

Step 1: Configure Required Environment Variables

Add the following environment variables to your AvonAI deployment configuration:

Variable | Description | Example
S3_INGESTION_BUCKET | The blob container from which AvonAI will read conversations. Must follow the format az://<container-name>/<optional-prefix>/. | az://conversations/
AZURE_STORAGE_ACCOUNT_NAME | The name of your Azure Storage account (the account name only, not the full URL). | mystorageaccount

S3_INGESTION_BUCKET="az://<your-container-name>/"
AZURE_STORAGE_ACCOUNT_NAME="<your-storage-account-name>"
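
Before deploying, it can help to check both values locally. The following is a minimal sketch, not part of AvonAI: the function names are hypothetical, the name patterns follow Azure's published naming rules for containers and storage accounts, and it assumes the trailing slash shown in the format above is required.

```python
import re

# Azure container names: 3-63 characters, lowercase letters, digits, and
# hyphens, starting and ending with a letter or digit, no consecutive hyphens.
_CONTAINER_RE = re.compile(r"^[a-z0-9](?!.*--)[a-z0-9-]{1,61}[a-z0-9]$")

# Azure storage account names: 3-24 characters, lowercase letters and digits.
_ACCOUNT_RE = re.compile(r"^[a-z0-9]{3,24}$")


def parse_ingestion_bucket(value):
    """Split 'az://<container-name>/<optional-prefix>/' into (container, prefix).

    Hypothetical helper; AvonAI's own validation may differ.
    """
    if not value.startswith("az://"):
        raise ValueError("S3_INGESTION_BUCKET must start with 'az://'")
    if not value.endswith("/"):
        # Assumption: the trailing slash in the documented format is required.
        raise ValueError("S3_INGESTION_BUCKET must end with '/'")
    container, _, prefix = value[len("az://"):].partition("/")
    if not _CONTAINER_RE.match(container):
        raise ValueError(f"invalid container name: {container!r}")
    return container, prefix


def is_valid_account_name(name):
    """Return True if 'name' looks like a bare storage account name, not a URL."""
    return bool(_ACCOUNT_RE.match(name))
```

For example, parse_ingestion_bucket("az://conversations/") returns the container name with an empty prefix, while a full URL passed to is_valid_account_name is rejected.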

Step 2: Configure Authentication

Choose one of the following authentication methods.

Option 1: Managed Identity

No additional environment variables are required. AvonAI automatically authenticates using the managed identity assigned to the host VM or container.

Requirements:

  • The managed identity must have the Storage Blob Data Reader role (or equivalent) on the target container.
  • The identity must be assigned to the compute resource running AvonAI (e.g., Azure VM, Azure Container Instance, or AKS pod).

Option 2: Service Principal

Use an Entra ID (Azure AD) app registration to authenticate.

Variable | Description
AZURE_CLIENT_ID | The Application (client) ID of your Entra ID app registration.
AZURE_CLIENT_SECRET | The client secret associated with your app registration.
AZURE_TENANT_ID | Your Entra ID (Azure AD) tenant ID.

AZURE_CLIENT_ID="<your-client-id>"
AZURE_CLIENT_SECRET="<your-client-secret>"
AZURE_TENANT_ID="<your-azure-ad-tenant-id>"

Requirements:

  • The service principal must have the Storage Blob Data Reader role on the target container.

Option 3: Storage Account Key

Use a storage account access key for authentication.

Variable | Description
AZURE_STORAGE_ACCOUNT_KEY | A storage account access key.

AZURE_STORAGE_ACCOUNT_KEY="<your-account-key>"

Caution: Storage account keys grant full access to all data in the storage account. For production environments, consider using Managed Identity or a Service Principal with scoped permissions instead.
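
Because the three authentication methods are distinguished only by which environment variables are set, a quick pre-deployment check can catch an incomplete or conflicting configuration. The sketch below is illustrative only: the function name is hypothetical, and treating a partial Service Principal set or a mix of methods as an error is an assumption, since this guide does not state how AvonAI resolves such cases.

```python
import os

SERVICE_PRINCIPAL_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")
ACCOUNT_KEY_VAR = "AZURE_STORAGE_ACCOUNT_KEY"


def detect_auth_method(env=None):
    """Report which authentication method the environment appears to select.

    Hypothetical helper. Returns 'managed_identity', 'service_principal',
    or 'account_key'. Raises ValueError for configurations this sketch
    considers ambiguous (an assumption, not documented AvonAI behavior).
    """
    env = os.environ if env is None else env
    present = [name for name in SERVICE_PRINCIPAL_VARS if env.get(name)]
    has_key = bool(env.get(ACCOUNT_KEY_VAR))

    if present and len(present) < len(SERVICE_PRINCIPAL_VARS):
        missing = [name for name in SERVICE_PRINCIPAL_VARS if name not in present]
        raise ValueError("incomplete Service Principal settings, missing: "
                         + ", ".join(missing))
    if present and has_key:
        raise ValueError("both Service Principal and account key are set; choose one")
    if present:
        return "service_principal"
    if has_key:
        return "account_key"
    # No auth variables set: Option 1 applies.
    return "managed_identity"
```

Calling detect_auth_method() with no argument inspects the current process environment; passing a dictionary makes the check easy to run against a draft configuration.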

Frequently Asked Questions

Do I need to provide a full Azure Storage URL?

No. AvonAI only requires three identifiers to locate your data:

Identifier | Description
Storage account | The name of your Azure Storage account.
Container | The blob container that holds your conversation data.
Path | The path (prefix) within the container where conversation files are stored.

URL schemes such as abfss:// and endpoint suffixes such as .dfs.core.windows.net are internal driver-level details. AvonAI handles the connection protocol internally, so these do not need to be provided.

Example — extracting identifiers from a full URL:

Given the following Azure Storage URL:

abfss://my-container@myaccount.dfs.core.windows.net/data/conversations

The three identifiers are:

Identifier | Value | Extracted from
Storage account | myaccount | myaccount.dfs.core.windows.net
Container | my-container | abfss://my-container@...
Path | /data/conversations | The path after the endpoint
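
The same extraction can be scripted. This is a minimal sketch using Python's standard urllib.parse; the function names are hypothetical, and mapping the path onto the optional prefix of S3_INGESTION_BUCKET is an assumption based on the format described in Step 1.

```python
from urllib.parse import urlsplit


def identifiers_from_abfss(url):
    """Extract storage account, container, and path from an abfs(s):// URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("abfs", "abfss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.username or not parts.hostname:
        raise ValueError("expected abfss://<container>@<account>.<endpoint>/<path>")
    return {
        # The account name is the first label of the endpoint host.
        "account": parts.hostname.split(".", 1)[0],
        # The container sits before the '@' in the URL authority.
        "container": parts.username,
        "path": parts.path,
    }


def to_env(ids):
    """Build the two Step 1 variables from the identifiers (assumed mapping)."""
    prefix = ids["path"].strip("/")
    bucket = f"az://{ids['container']}/{prefix}/" if prefix else f"az://{ids['container']}/"
    return {
        "S3_INGESTION_BUCKET": bucket,
        "AZURE_STORAGE_ACCOUNT_NAME": ids["account"],
    }
```

Applied to the example URL above, identifiers_from_abfss yields the account myaccount, the container my-container, and the path /data/conversations, which to_env turns into az://my-container/data/conversations/ and myaccount.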

Notes

  • Files are automatically discovered and processed on a recurring schedule.
  • Supported formats: JSON, JSONL, ZIP, TAR.GZ.
  • AvonAI does not delete or modify files in your Azure Blob container.
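
To check in advance which files in a listing match the supported formats, a simple extension filter is enough. This sketch assumes formats are identified by file extension and that matching is case-insensitive; neither is confirmed by this guide, and the function names are hypothetical.

```python
# Extensions corresponding to the supported formats listed above.
SUPPORTED_SUFFIXES = (".json", ".jsonl", ".zip", ".tar.gz")


def is_supported(name):
    """Return True if the file name ends with a supported extension."""
    # Assumption: case-insensitive extension matching.
    return name.lower().endswith(SUPPORTED_SUFFIXES)


def split_by_support(names):
    """Partition file names into (supported, skipped) lists, preserving order."""
    supported = [n for n in names if is_supported(n)]
    skipped = [n for n in names if not is_supported(n)]
    return supported, skipped
```

Running split_by_support over a list of blob names before upload shows which files a filter like this would skip, such as .csv exports or bare .gz archives.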