Maia Test Framework is a pytest-based framework designed for testing multi-agent AI systems. It offers a flexible and extensible platform to create, run, and analyze complex multi-agent simulations.
- Multi-Agent Simulation: Simulate conversations and interactions between multiple AI agents.
- Extensible Provider Model: Easily integrate with various AI model providers.
- Built-in Assertions: A suite of assertions to verify agent behavior, including content analysis and participation checks.
- Orchestration Policies: Control how messages are routed between agents.
- Judge Agent: A specialized agent that evaluates whether a test's outcome is acceptable.
- Tool Integration: Agents can use external tools to perform actions.
Maia integrates with a variety of AI frameworks and libraries, so you can use virtually any model you want. Here are some of the built-in integrations:
| Integration | Provider |
|---|---|
| LiteLLM | LiteLLMBaseProvider |
| LangChain | LangChainProvider |
| CrewAI | CrewAIProvider |
It's easy to create your own provider by extending `BaseProvider`.
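As a rough illustration, a custom provider could look like the sketch below. The import path and the async `generate` method used here are assumptions made for this example, not the framework's documented API; check the `BaseProvider` source for the exact abstract methods your version expects.

```python
# Hypothetical sketch of a custom provider. The import path and the
# generate() signature are assumptions, not the framework's documented API.
from maia_test_framework.providers.base import BaseProvider


class EchoProvider(BaseProvider):
    """Toy provider that echoes the last user message (useful for offline tests)."""

    async def generate(self, messages):
        # Assumed message format: a list of {"role": ..., "content": ...} dicts.
        last_user = next(
            (m["content"] for m in reversed(messages) if m.get("role") == "user"),
            "",
        )
        return f"Echo: {last_user}"
```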
Install the framework using pip:
```bash
pip install maia-test-framework
```

Create a test class that inherits from `MaiaTest` and define your agents in the `setup_agents` method.
```python
from maia_test_framework.testing.base import MaiaTest
from maia_test_framework.providers.generic_lite_llm import GenericLiteLLMProvider

class TestMyAgent(MaiaTest):
    def setup_agents(self):
        # Using a pre-configured provider
        self.create_agent(
            name="coder",
            provider=self.get_provider("ollama"),
            system_message="You are a helpful coding assistant."
        )

        # Using a provider defined on the fly
        self.create_agent(
            name="reviewer",
            provider=GenericLiteLLMProvider(config={
                "model": "ollama/mistral",
                "api_base": "http://localhost:11434"
            }),
            system_message="You are a helpful code reviewer."
        )
```

Use the `create_session` method to start a conversation with one or more agents.
```python
import pytest

@pytest.mark.asyncio
async def test_code_generation(self):
    session = self.create_session(["coder", "reviewer"])
    # ...
```

Use the `Session` object to simulate user and agent interactions.
```python
@pytest.mark.asyncio
async def test_code_generation(self):
    session = self.create_session(["coder"])

    await session.user_says("Write a Python function that returns the factorial of a number.")

    response = await session.agent_responds("coder")
    assert "def factorial" in response.content
```

The framework includes powerful assertions to validate agent behavior.
Check the content of agent messages for specific patterns.
```python
from maia_test_framework.testing.assertions.content_patterns import assert_professional_tone

@pytest.mark.asyncio
async def test_professionalism(self):
    session = self.create_session(["coder"], assertions=[assert_professional_tone])
    await session.user_says("Write a Python function and add a joke to the comments.")

    with pytest.raises(AssertionError):
        await session.agent_responds("coder")
```

Ensure that agents are participating in the conversation as expected.
```python
from maia_test_framework.testing.assertions.agents_participation import assert_agent_participated

@pytest.mark.asyncio
async def test_agent_participation(self):
    session = self.create_session(["coder", "reviewer"])
    await session.user_says("Write a Python function and have it reviewed.")
    await session.agent_responds("coder")
    await session.agent_responds("reviewer")

    assert_agent_participated(session, "coder")
    assert_agent_participated(session, "reviewer")
```

Create a specialized agent to judge whether the result of a conversation is acceptable.
```python
import pytest

from maia_test_framework.core.judge_agent import JudgeAgent
from maia_test_framework.testing.base import MaiaTest


class TestRecipeBot(MaiaTest):  # illustrative class name; any MaiaTest subclass works
    def setup_agents(self):
        self.create_agent(
            name="RecipeBot",
            provider=self.get_provider("ollama"),
            system_message="You are a helpful assistant that provides recipes.",
        )

    @pytest.mark.asyncio
    async def test_judge_successful_conversation(self):
        """Tests that the JudgeAgent correctly identifies a successful conversation."""
        judge_agent = JudgeAgent(self.get_provider("ollama"))
        session = self.create_session(["RecipeBot"], judge_agent=judge_agent)

        await session.user_says("Can you give me a simple recipe for pancakes?")
        await session.agent_responds("RecipeBot")
```

Run your tests using pytest:
```bash
pytest
```

The project includes a Next.js-based dashboard to visualize test reports.
- Test Run Overview: See a list of all test runs, including statistics like pass/fail rates and total duration.
- Detailed Test View: Drill down into individual tests to see detailed information, including participants, messages, and assertions.
- Interaction Timeline: Visualize the conversation flow between agents and tools in a timeline view.
1. Generate Test Reports: Run your `pytest` tests as usual. The framework will automatically generate JSON report files in the `test_reports/` directory.
2. Run the Dashboard:
   - Using Git clone:
     ```bash
     git clone https://github.com/radoslaw-sz/maia.git
     cd dashboard
     yarn install
     yarn dev
     ```
   - Using CLI:
     ```bash
     npx @maiaframework/create-maia-dashboard my-dashboard
     cd my-dashboard
     yarn dev
     ```
3. View the Reports: Open your browser to `http://localhost:3000` to see the dashboard. It will automatically read the generated JSON files from the `test_reports` directory. You can configure a different directory by setting the `TEST_REPORTS_DIR` environment variable before running the dashboard.
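For example, to point the dashboard at reports stored somewhere else, you could set the variable inline when starting the dev server (the path below is just a placeholder):

```bash
TEST_REPORTS_DIR=/path/to/your/test_reports yarn dev
```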
Contributions are welcome! Please open an issue or submit a pull request on GitHub.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.


