What is Mibo?
Mibo is a testing platform that lets you verify your AI agents and systems are working correctly, without writing a single line of code.
The problem Mibo solves
AI systems are unpredictable. The same question can produce different answers every time. They might call the wrong tools, make up information, or simply give unhelpful responses.
Manual testing doesn’t scale. You can’t click through every possible scenario before each update. And when something breaks, you usually find out from frustrated users, not from a dashboard.
Mibo changes that. It automates the entire testing process so you can catch issues before your users do.
How Mibo works
Mibo follows four simple steps:
1. Connect your AI system: tell Mibo where it lives and how to talk to it. Mibo supports Custom APIs, Flowise, and n8n.
2. Describe what to test: use the Test Architect to describe your system’s expected behavior in plain language. The AI generates the test scenarios for you.
3. Run the tests: Mibo sends real inputs to your system, just like a user would, and collects every response.
4. Review the results: see quality scores, detailed evaluations, and the Failure Matrix, a breakdown of exactly where in the pipeline things went wrong (routing, arguments, tool execution, or response quality).
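To make the exchange in steps 1 and 3 concrete, here is a rough sketch of what a Custom API conversation with Mibo could look like. The field names (`input`, `response`, `tool_calls`) are illustrative assumptions, not Mibo’s actual wire format; check your platform’s setup page for the real contract.

```python
# Hypothetical sketch of the exchange Mibo performs with a Custom API
# platform. All payload field names below are assumptions for
# illustration only.

def build_test_request(test_input: str) -> dict:
    """Shape of the request Mibo might POST to your endpoint."""
    return {"input": test_input, "session_id": "test-run-001"}

def parse_agent_response(body: dict) -> dict:
    """Pull out the fields Mibo would evaluate: the reply text and
    any tool calls the agent made along the way."""
    return {
        "text": body.get("response", ""),
        "tool_calls": body.get("tool_calls", []),
    }

# One round trip: Mibo sends an input, your system replies.
request = build_test_request("What are your store hours?")
reply = parse_agent_response({
    "response": "We are open 9am-6pm, Monday to Saturday.",
    "tool_calls": [{"name": "get_store_hours", "arguments": {}}],
})
```

Because Mibo collects both the reply text and the recorded tool calls, it can evaluate routing and arguments separately from response quality, which is what powers the Failure Matrix breakdown.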
Two modes of testing
Mibo gives you two ways to test your AI system:
- Active testing lets you run tests on demand. Mibo sends inputs to your system and evaluates the responses. Use this to verify your setup works, validate test cases, and check for regressions after updates.
- Passive testing is where Mibo really shines. Your production system sends every real user interaction to Mibo as a trace, and Mibo evaluates it automatically against your test cases. This gives you continuous quality monitoring, including edge cases and real-world scenarios that synthetic tests can never cover.
You can use both: active testing to validate during development, and passive testing to keep an eye on things in production.
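For passive testing, your production system forwards each interaction as a trace. The sketch below shows one plausible shape such a trace could take; the field names are assumptions for illustration, not Mibo’s actual trace schema.

```python
# Rough sketch of a production trace a backend might send to Mibo.
# Field names here are illustrative assumptions; consult the platform
# setup docs for the actual trace format.
import json
from datetime import datetime, timezone

def build_trace(user_input: str, agent_output: str, tool_calls: list) -> dict:
    """Bundle one real user interaction into a trace record."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input": user_input,
        "output": agent_output,
        "tool_calls": tool_calls,
    }

trace = build_trace(
    "Do you ship to Canada?",
    "Yes, we ship to Canada within 5-7 business days.",
    [{"name": "get_shipping_info", "arguments": {"country": "CA"}}],
)
payload = json.dumps(trace)  # what your backend would POST to Mibo
```

Because every real interaction arrives this way, evaluation happens continuously against your existing test cases, with no extra test runs on your side.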
Key concepts
Here are a few terms you’ll see throughout Mibo:
Project: Your workspace. Everything in Mibo lives inside a project. Think of it as a folder for one AI system.
Platform: The connection between Mibo and your AI system. It tells Mibo where to send test inputs and how to read the responses. One project can have multiple platforms (e.g., staging and production).
Test Case: A single scenario you want to verify. For example: “If a user asks about store hours, the agent should respond with the correct schedule.” Each test case can include rule-based checks (did the system call the right tool with the right arguments?) and AI-powered checks (is the response helpful and accurate?).
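A rule-based check like the one described above can be pictured as a simple comparison. This is an illustrative sketch only; the function and data structures are assumptions, not Mibo’s API.

```python
# Illustrative sketch of a rule-based check: did the system call the
# expected tool with the expected arguments? Names and structures are
# assumptions for illustration.

def check_tool_call(expected_name: str, expected_args: dict,
                    actual_calls: list) -> bool:
    """Return True if any recorded tool call matches the expectation."""
    return any(
        call["name"] == expected_name and call["arguments"] == expected_args
        for call in actual_calls
    )

calls = [{"name": "get_store_hours", "arguments": {"store_id": "main"}}]
ok = check_tool_call("get_store_hours", {"store_id": "main"}, calls)
```

Rule-based checks like this are deterministic, which is why Mibo pairs them with AI-powered checks: the former catch wrong tools and arguments exactly, while the latter judge fuzzier qualities like helpfulness and accuracy.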
Who is Mibo for?
- Teams building AI agents who want to ship with confidence.
- Product managers who need visibility into AI quality before releasing updates.
- QA teams looking to automate testing instead of clicking through scenarios manually.
The rest of these docs follow the order you’ll actually use Mibo: set up a project, create tests, verify with active testing, then go to production with passive testing.