ML Plugin

amsdal_ml is a machine learning plugin for the AMSDAL Framework that adds vector embeddings, semantic search, natural language CRUD operations, AI agents, and MCP server integration.

Features

  • Embeddings & Search — generate vector embeddings from your AMSDAL models and search them semantically
  • Natural Language CRUD — read, create, update, and delete records using plain English queries
  • AI Agents — ReAct and function-calling agents with pluggable tools
  • MCP Server — expose your data to AI assistants (Claude Desktop, Chatbox, etc.) via Model Context Protocol
  • OAuth — authenticated MCP access with full AMSDAL permission checks

Installation

pip install amsdal_ml

Plugin Registration

Add the plugin to AMSDAL_CONTRIBS in your .env file:

AMSDAL_CONTRIBS=amsdal_ml.app.MLPluginAppConfig

Or append it to your existing contribs, separated by a comma:

AMSDAL_CONTRIBS=amsdal.contrib.auth.app.AuthAppConfig,amsdal_ml.app.MLPluginAppConfig
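If you generate the .env value from a script, the comma-separated format above can be assembled as in the following sketch. The helper name add_contrib is hypothetical and not part of amsdal_ml; it only illustrates appending the plugin to an existing list without duplicating it.

```python
def add_contrib(current: str, contrib: str) -> str:
    """Return a comma-separated contrib list that contains `contrib` exactly once."""
    # Split the existing value and drop empty entries and stray whitespace.
    entries = [item.strip() for item in current.split(",") if item.strip()]
    if contrib not in entries:
        entries.append(contrib)
    return ",".join(entries)


existing = "amsdal.contrib.auth.app.AuthAppConfig"
print("AMSDAL_CONTRIBS=" + add_contrib(existing, "amsdal_ml.app.MLPluginAppConfig"))
```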

Configuration

amsdal_ml uses MLConfig (based on pydantic-settings) for configuration. All settings can be set via environment variables.

API Keys

| Env Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key (required for embeddings and LLM) |
| CLAUDE_API_KEY | Anthropic API key (optional, for Claude-based models) |

LLM Settings

| Env Variable | Default | Description |
|---|---|---|
| LLM_MODEL_NAME | gpt-4o | LLM model for NL queries and agents |
| LLM_TEMPERATURE | 0.0 | Temperature for LLM responses |
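To illustrate how environment variables map to typed settings with defaults, here is a minimal standard-library sketch. It is a stand-in for the idea only: the real MLConfig is built on pydantic-settings, and the names LLMSettings and load_llm_settings are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container for the two LLM settings listed above."""
    model_name: str
    temperature: float


def load_llm_settings(env: Optional[Mapping[str, str]] = None) -> LLMSettings:
    """Read LLM settings from an environment mapping, using the documented defaults."""
    env = os.environ if env is None else env
    return LLMSettings(
        model_name=env.get("LLM_MODEL_NAME", "gpt-4o"),
        # Environment values are strings, so the temperature is converted to float.
        temperature=float(env.get("LLM_TEMPERATURE", "0.0")),
    )


# An empty mapping yields the documented defaults.
print(load_llm_settings({}))
```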

Embedding Settings

| Env Variable | Default | Description |
|---|---|---|
| EMBED_MODEL_NAME | text-embedding-3-small | Embedding model name |
| EMBED_DIMENSIONS | 1536 | Embedding vector dimensions |
| EMBED_MAX_DEPTH | 2 | Max depth for recursive model walking |
| EMBED_MAX_CHUNKS | 10 | Max chunks per object |
| EMBED_MAX_TOKENS_PER_CHUNK | 800 | Max tokens per chunk |
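The two chunk limits interact: one caps the size of each chunk, the other caps how many chunks an object can produce. The sketch below shows that interaction under loud assumptions. Whitespace-separated words stand in for model tokens, and text beyond the chunk cap is simply dropped. The plugin's actual tokenizer and overflow behavior are not described here, and chunk_text is a hypothetical helper.

```python
def chunk_text(text: str, max_tokens_per_chunk: int = 800, max_chunks: int = 10) -> list[str]:
    """Split text into at most `max_chunks` chunks of at most `max_tokens_per_chunk` tokens."""
    # Assumption: whitespace splitting approximates tokenization for illustration only.
    tokens = text.split()
    chunks: list[str] = []
    for start in range(0, len(tokens), max_tokens_per_chunk):
        if len(chunks) >= max_chunks:
            break  # Assumption: content past the cap is discarded.
        chunks.append(" ".join(tokens[start:start + max_tokens_per_chunk]))
    return chunks


sample = " ".join(str(i) for i in range(25))
for chunk in chunk_text(sample, max_tokens_per_chunk=10, max_chunks=2):
    print(chunk)
```

With the defaults from the table, a single object would contribute at most 10 × 800 = 8000 tokens under these assumptions.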

Retriever Settings

| Env Variable | Default | Description |
|---|---|---|
| RETRIEVER_DEFAULT_K | 8 | Default number of results for similarity search |
| RETRIEVER_INCLUDE_TAGS_DEFAULT | | Default include tags (comma-separated) |
| RETRIEVER_EXCLUDE_TAGS_DEFAULT | | Default exclude tags (comma-separated) |
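The include and exclude tag defaults are comma-separated strings. The sketch below shows one way such values could be parsed and applied. The filter semantics are an assumption: a chunk passes if it carries at least one include tag (when any are set) and no exclude tag. Both parse_tags and passes_tag_filter are hypothetical helpers, not plugin APIs.

```python
def parse_tags(raw):
    """Split a comma-separated tag string into a clean list of tag names."""
    return [tag.strip() for tag in (raw or "").split(",") if tag.strip()]


def passes_tag_filter(chunk_tags, include, exclude):
    """Assumed semantics: match any include tag (if given) and none of the exclude tags."""
    tags = set(chunk_tags)
    if include and not tags & set(include):
        return False
    return not tags & set(exclude)


include = parse_tags("invoices, payments")
exclude = parse_tags("archived")
print(passes_tag_filter(["payments"], include, exclude))
print(passes_tag_filter(["payments", "archived"], include, exclude))
```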

OAuth Settings

| Env Variable | Default | Description |
|---|---|---|
| OAUTH_ENABLED | true | Enable OAuth for MCP server |
| OAUTH_CLIENT_ID_EXPIRATION_DAYS | 30 | Client ID expiration (days) |
| OAUTH_CODE_EXP_MINUTES | 10 | Authorization code expiration (minutes) |
| OAUTH_ACCESS_TOKEN_EXP_HOURS | 24 | Access token expiration (hours) |
| OAUTH_REFRESH_TOKEN_EXP_DAYS | 90 | Refresh token expiration (days) |
| OAUTH_LOGIN_PATH | /auth/login | Login page path |
| OAUTH_ISSUER | http://127.0.0.1:8000 | OAuth issuer URL |
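The expiration settings use different units (minutes, hours, days). As a rough illustration of what the defaults mean in practice, the following sketch converts them into absolute expiry timestamps. The function name expiry_times and the dictionary keys are hypothetical; how the plugin stores or checks expirations internally is not covered here.

```python
from datetime import datetime, timedelta, timezone

# Default lifetimes taken from the table above.
LIFETIMES = {
    "authorization_code": timedelta(minutes=10),
    "access_token": timedelta(hours=24),
    "client_id": timedelta(days=30),
    "refresh_token": timedelta(days=90),
}


def expiry_times(issued_at: datetime) -> dict:
    """Return the absolute expiry time of each credential issued at `issued_at`."""
    return {name: issued_at + lifetime for name, lifetime in LIFETIMES.items()}


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
for name, expires in expiry_times(issued).items():
    print(name, expires.isoformat())
```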

Class Settings

| Env Variable | Default | Description |
|---|---|---|
| ML_MODEL_CLASS | amsdal_ml.ml_models.openai_model.OpenAIModel | LLM model implementation class |
| MCP_ML_MODEL_CLASS | amsdal_ml.ml_models.openai_model.OpenAIModel | LLM model class for MCP server |
| ML_RETRIEVER_CLASS | amsdal_ml.ml_retrievers.openai_retriever.OpenAIRetriever | Retriever implementation class |
| ML_INGESTING_CLASS | amsdal_ml.ml_ingesting.openai_ingesting.OpenAIIngesting | Ingesting implementation class |
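Each class setting is a dotted path of the form module.path.ClassName. Paths like this are commonly resolved with importlib, as in the sketch below. Whether amsdal_ml resolves them exactly this way is an assumption, and load_class is a hypothetical helper. The example loads a standard-library class so that it runs without the plugin installed.

```python
from importlib import import_module


def load_class(dotted_path: str) -> type:
    """Import the module part of `dotted_path` and return the named class from it."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'module.ClassName', got {dotted_path!r}")
    module = import_module(module_path)
    return getattr(module, class_name)


# A custom implementation would be referenced the same way,
# e.g. ML_RETRIEVER_CLASS=my_project.retrievers.MyRetriever (hypothetical path).
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```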

Other Settings

| Env Variable | Default | Description |
|---|---|---|
| ASYNC_MODE | true | Enable async mode |

Quick Start

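import asyncio

from amsdal_ml.ml_ingesting.openai_ingesting import OpenAIIngesting
from amsdal_ml.ml_retrievers.openai_retriever import OpenAIRetriever


async def index_and_search(my_object):
    # Index a model object: generate embedding records, then save them
    ingesting = OpenAIIngesting(tags=['my-tag'])
    records = await ingesting.agenerate_embeddings(my_object)
    await ingesting.asave(records, my_object)

    # Search the indexed chunks semantically
    retriever = OpenAIRetriever()
    results = await retriever.asimilarity_search('find documents about payments', k=5)

    # Each result exposes the chunk text and its distance from the query
    for chunk in results:
        print(chunk.raw_text, chunk.distance)


# my_object is an instance of one of your AMSDAL models:
# asyncio.run(index_and_search(my_object))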

Architecture

The plugin is organized into these main modules:

| Module | Description |
|---|---|
| ml_models | LLM interfaces (MLModel, OpenAIModel) |
| ml_ingesting | Embedding generation, document ingestion pipeline |
| ml_retrievers | Semantic search, NL query/create/update/delete executors |
| agents | AI agents (DefaultQAAgent, FunctionalCallingAgent) with tools |
| mcp_server | MCP servers (stdio, HTTP/SSE) with OAuth |
| mcp_client | MCP client implementations for connecting to external servers |
| fileio | File attachment handling and OpenAI file uploads |