Set Up Automatic Conversation Ingestion via Azure Blob Storage
AvonAI supports automatic ingestion of conversations from Azure Blob Storage containers. This guide covers the required environment variables and authentication options for configuring Azure-based ingestion.
Prerequisites
- An Azure Storage account with a blob container that holds your conversation files.
- Conversation files must follow the AvonAI JSON format.
Step 1: Configure Required Environment Variables
Add the following environment variables to your AvonAI deployment configuration:
| Variable | Description | Example |
|---|---|---|
| S3_INGESTION_BUCKET | The blob container from which AvonAI will read conversations. Must follow the format az://<container-name>/<optional-prefix>/. | az://conversations/ |
| AZURE_STORAGE_ACCOUNT_NAME | The name of your Azure Storage account (the account name only, not the full URL). | mystorageaccount |
S3_INGESTION_BUCKET="az://<your-container-name>/"
AZURE_STORAGE_ACCOUNT_NAME="<your-storage-account-name>"
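The two values above can be sanity-checked before deployment. The following is a minimal sketch, not part of AvonAI: the function name `check_ingestion_config` is hypothetical, and the checks combine the az:// format stated in the table with Azure's general naming rules (account names are 3-24 lowercase letters or digits; container names are 3-63 lowercase letters, digits, and single hyphens). AvonAI's own validation may differ.

```python
import re

# Azure naming rules (general Azure constraints, not AvonAI-specific).
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
CONTAINER_RE = re.compile(r"(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def check_ingestion_config(env):
    """Return a list of problems found in the two required variables.

    `env` is any mapping of variable names to values, e.g. os.environ.
    An empty list means no problems were detected by this sketch.
    """
    problems = []
    bucket = env.get("S3_INGESTION_BUCKET", "")
    account = env.get("AZURE_STORAGE_ACCOUNT_NAME", "")

    if not bucket.startswith("az://"):
        problems.append("S3_INGESTION_BUCKET must start with az://")
    else:
        # The container is the first path segment after the scheme.
        container = bucket[len("az://"):].split("/", 1)[0]
        if not CONTAINER_RE.fullmatch(container):
            problems.append(f"invalid container name: {container!r}")
        if not bucket.endswith("/"):
            problems.append("S3_INGESTION_BUCKET should end with a trailing slash")

    if not ACCOUNT_RE.fullmatch(account):
        problems.append(
            "AZURE_STORAGE_ACCOUNT_NAME must be the bare account name "
            "(3-24 lowercase letters or digits), not a URL"
        )
    return problems
```

Calling `check_ingestion_config(os.environ)` in a deployment script (after `import os`) would surface the most common mistake, which is pasting a full endpoint URL into AZURE_STORAGE_ACCOUNT_NAME.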
Step 2: Configure Authentication
Choose one of the following authentication methods.
Option 1: Managed Identity (Recommended)
No additional environment variables are required. AvonAI automatically authenticates using the managed identity assigned to the host VM or container.
Requirements:
- The managed identity must have the Storage Blob Data Reader role (or equivalent) on the target container.
- The identity must be assigned to the compute resource running AvonAI (e.g., Azure VM, Azure Container Instance, or AKS pod).
Option 2: Service Principal
Use an Entra ID (Azure AD) app registration to authenticate.
| Variable | Description |
|---|---|
| AZURE_CLIENT_ID | The Application (client) ID of your Entra ID app registration. |
| AZURE_CLIENT_SECRET | The client secret associated with your app registration. |
| AZURE_TENANT_ID | Your Entra ID (Azure AD) tenant ID. |
AZURE_CLIENT_ID="<your-client-id>"
AZURE_CLIENT_SECRET="<your-client-secret>"
AZURE_TENANT_ID="<your-azure-ad-tenant-id>"
Requirements:
- The service principal must have the Storage Blob Data Reader role on the target container.
Option 3: Storage Account Key
Use a storage account access key for authentication.
| Variable | Description |
|---|---|
| AZURE_STORAGE_ACCOUNT_KEY | A storage account access key. |
AZURE_STORAGE_ACCOUNT_KEY="<your-account-key>"
Storage account keys grant full access to all data in the storage account. For production environments, consider using Managed Identity or a Service Principal with scoped permissions instead.
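Because the three options are distinguished only by which variables are present, a partially filled configuration is easy to miss. The sketch below reports which option a given environment corresponds to. The function name `describe_auth` and the returned labels are hypothetical, and the decision to reject mixed or partial settings is an assumption of this sketch: this guide does not state how AvonAI resolves such cases.

```python
# Variables required together for Option 2 (Service Principal).
SP_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")


def describe_auth(env):
    """Classify a configuration as one of the three documented options.

    `env` is any mapping of variable names to values, e.g. os.environ.
    Raises ValueError for partial or ambiguous settings (an assumption
    of this sketch, not documented AvonAI behavior).
    """
    sp_set = [name for name in SP_VARS if env.get(name)]
    has_key = bool(env.get("AZURE_STORAGE_ACCOUNT_KEY"))

    if sp_set and len(sp_set) < len(SP_VARS):
        missing = sorted(set(SP_VARS) - set(sp_set))
        raise ValueError(f"incomplete service principal config, missing: {missing}")
    if sp_set and has_key:
        raise ValueError("both service principal and account key are set; choose one")
    if sp_set:
        return "service_principal"
    if has_key:
        return "account_key"
    # No credentials supplied: Option 1 relies on the host's managed identity.
    return "managed_identity"
```

Running this against the deployment environment before startup makes the chosen option explicit, which helps when a leftover AZURE_STORAGE_ACCOUNT_KEY from testing is still present in production.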
Frequently Asked Questions
Do I need to provide a full Azure Storage URL?
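No. AvonAI requires only three identifiers to locate your data. The storage account name goes in AZURE_STORAGE_ACCOUNT_NAME, and the container and path together form S3_INGESTION_BUCKET: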
| Identifier | Description |
|---|---|
| Storage account | The name of your Azure Storage account. |
| Container | The blob container that holds your conversation data. |
| Path | The path (prefix) within the container where conversation files are stored. |
URL schemes such as abfss:// and endpoint suffixes such as .dfs.core.windows.net are driver-level details that AvonAI handles internally, so you do not need to provide them.
Example — extracting identifiers from a full URL:
Given the following Azure Storage URL:
abfss://my-container@myaccount.dfs.core.windows.net/data/conversations
The three identifiers are:
| Identifier | Value | Extracted from |
|---|---|---|
| Storage account | myaccount | myaccount.dfs.core.windows.net |
| Container | my-container | abfss://my-container@... |
| Path | /data/conversations | The path after the endpoint |
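The extraction in the table above can be automated. The following is a minimal sketch using Python's standard `urllib.parse`; the names `BlobLocation`, `extract_identifiers`, and `to_env` are hypothetical helpers, not part of AvonAI. It assumes the URL has the shape `<scheme>://<container>@<account>.<endpoint-suffix>/<path>` shown in the example, and it builds S3_INGESTION_BUCKET in the az://<container-name>/<optional-prefix>/ format from Step 1.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class BlobLocation(NamedTuple):
    account: str
    container: str
    path: str


def extract_identifiers(url: str) -> BlobLocation:
    """Split a full Azure Storage URL into account, container, and path."""
    parts = urlsplit(url)
    # In abfss-style URLs the container sits in the "user" position
    # (before the @) and the account is the first label of the host.
    if not parts.username or not parts.hostname:
        raise ValueError(
            f"expected <scheme>://<container>@<account>.<suffix>/<path>, got {url!r}"
        )
    account = parts.hostname.split(".", 1)[0]
    return BlobLocation(account, parts.username, parts.path)


def to_env(loc: BlobLocation) -> dict:
    """Map the identifiers onto the two required environment variables."""
    prefix = loc.path.strip("/")
    bucket = f"az://{loc.container}/" + (f"{prefix}/" if prefix else "")
    return {
        "S3_INGESTION_BUCKET": bucket,
        "AZURE_STORAGE_ACCOUNT_NAME": loc.account,
    }
```

For the example URL, `to_env(extract_identifiers(...))` yields `az://my-container/data/conversations/` for S3_INGESTION_BUCKET and `myaccount` for AZURE_STORAGE_ACCOUNT_NAME.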
Notes
- Files are automatically discovered and processed on a recurring schedule.
- Supported formats: JSON, JSONL, ZIP, TAR.GZ.
- AvonAI does not delete or modify files in your Azure Blob container.
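Before uploading, it can help to confirm that every file matches one of the supported formats listed above. This is a rough sketch under two assumptions that this guide does not confirm: that the formats correspond to the file extensions .json, .jsonl, .zip, and .tar.gz, and that matching is case-insensitive. The helper name `is_supported` is hypothetical.

```python
# Assumed extension for each supported format (JSON, JSONL, ZIP, TAR.GZ).
SUPPORTED_SUFFIXES = (".json", ".jsonl", ".zip", ".tar.gz")


def is_supported(blob_name: str) -> bool:
    """Return True if the blob name ends with an assumed supported extension."""
    # str.endswith accepts a tuple and matches any of its members.
    return blob_name.lower().endswith(SUPPORTED_SUFFIXES)


def unsupported(blob_names):
    """List the names that would not match any supported format."""
    return [name for name in blob_names if not is_supported(name)]
```

Passing a directory listing to `unsupported` gives a quick list of files to convert or exclude before they land in the container.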