Passive Testing

Active testing verifies that your setup works. Passive testing is where Mibo delivers its real value: your production system sends every real user interaction as a trace, and Mibo evaluates it automatically against your test cases. You get continuous visibility into AI quality, including edge cases and real-world scenarios that synthetic tests rarely cover.

How it works

1. Your AI system handles a real user interaction.
2. Your system sends the trace data (input, output, and any execution details) to Mibo’s trace ingestion endpoint.
3. Mibo automatically triggers an execution that evaluates the trace against all active test cases for that platform.
4. You get the same quality scores, AI Judge evaluations, and Failure Matrix as with active testing, without Mibo ever calling your system.

Setting it up

Configure your platform

Set up a platform connection in your project. Passive testing works with any platform type: Custom API, Flowise, or n8n.

For Custom API platforms, choose the Push trace mode. This tells Mibo to expect trace data sent separately from your system.

Create an API key

Go to your project settings and create an API key. You’ll use this key to authenticate trace requests.

You can optionally restrict the key to specific platforms. If you do, traces sent with that key will automatically be routed to the right platform.

Create test cases

Add the test cases you want to evaluate against incoming traces. Only active test cases are used; paused or draft test cases are skipped.

Send traces from your system

Have your AI system send trace data to Mibo after each interaction. See the API reference below.

Sending traces

Send a POST request to /public/traces with your trace data.

Request

curl -X POST "https://api.mibo-ai.com/public/traces" \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "data": {
      "text": "We are open Monday to Friday, 9am to 5pm."
    }
  }'

Required fields

x-api-key header: your project API key.
data (object): the trace payload. Must be a non-empty object. The structure inside data depends on your platform type; see the Trace Format Reference for exact schemas per platform.

Optional fields

platformId (UUID): explicitly specify which platform this trace belongs to. Required if your API key has access to multiple platforms and you’re not using metadata for resolution.
externalId (string, max 255 chars): your own identifier for the trace.
metadata (object): additional context stored alongside the trace. Can include platform identifiers for auto-routing (e.g., { "chatflowId": "abc" } for Flowise, { "workflowId": "xyz" } for n8n), environment info, or any other key-value pairs.
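As a minimal sketch of how the required and optional fields fit together, the helper below assembles a request body and applies the constraints listed above (non-empty data, UUID platformId, externalId of at most 255 characters). The function name build_trace_body and the sample externalId and environment values are hypothetical; only the field names come from this page.

```python
import json
import uuid
from typing import Any, Optional


def build_trace_body(
    data: dict[str, Any],
    platform_id: Optional[str] = None,
    external_id: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a /public/traces request body (hypothetical helper)."""
    # data is required and must be a non-empty object.
    if not isinstance(data, dict) or not data:
        raise ValueError("data must be a non-empty object")
    body: dict[str, Any] = {"data": data}

    if platform_id is not None:
        uuid.UUID(platform_id)  # raises ValueError if this is not a valid UUID
        body["platformId"] = platform_id

    if external_id is not None:
        if len(external_id) > 255:
            raise ValueError("externalId is limited to 255 characters")
        body["externalId"] = external_id

    if metadata is not None:
        body["metadata"] = metadata

    return body


# Sample values are illustrative; chatflowId enables auto-routing for Flowise.
body = build_trace_body(
    {"text": "We are open Monday to Friday, 9am to 5pm."},
    external_id="conversation-1234",
    metadata={"chatflowId": "abc", "environment": "production"},
)
print(json.dumps(body, indent=2))
```

Validating on the client side is optional, but it surfaces malformed traces before they leave your system.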

Compression

For large traces, you can send gzip-compressed payloads by setting Content-Encoding: gzip.
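The sketch below shows one way to prepare a gzip-compressed request with Python's standard library. It assumes the compressed body is the same JSON document you would otherwise send uncompressed; the function name build_gzip_request is hypothetical, and the request is only built here, not sent.

```python
import gzip
import json
import urllib.request

TRACES_URL = "https://api.mibo-ai.com/public/traces"


def build_gzip_request(body: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request whose JSON body is gzip-compressed (hypothetical helper)."""
    raw = json.dumps(body).encode("utf-8")
    compressed = gzip.compress(raw)
    return urllib.request.Request(
        TRACES_URL,
        data=compressed,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",  # tells the server the body is compressed
            "x-api-key": api_key,
        },
    )


request = build_gzip_request(
    {"data": {"text": "We are open Monday to Friday, 9am to 5pm."}},
    "YOUR_API_KEY",
)
# Passing `request` to urllib.request.urlopen would send it; that call is
# omitted so the sketch has no network side effects.
print(len(request.data), "compressed bytes")
```

Compression tends to pay off only for large payloads; for a body as small as this example, the gzip framing can make it larger than the original.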

Platform resolution

Mibo determines which platform a trace belongs to using (in order):

1. The platformId field in the request body.
2. If the API key is restricted to a single platform, that platform is used automatically.
3. Matching metadata fields against platform configurations (e.g., chatflowId for Flowise, workflowId for n8n).
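To make the precedence concrete, here is an illustrative model of the resolution order. It is not Mibo's implementation: the function resolve_platform, the shape of platform_configs, and the placeholder platform names are assumptions made for the example, and what Mibo does when nothing matches is not specified on this page (the sketch returns None).

```python
from typing import Any, Optional


def resolve_platform(
    body: dict[str, Any],
    key_platform_ids: list[str],
    platform_configs: dict[str, dict[str, str]],
) -> Optional[str]:
    """Illustrative model of the documented resolution order (not Mibo's code)."""
    # 1. An explicit platformId in the request body takes precedence.
    if body.get("platformId"):
        return body["platformId"]

    # 2. A key restricted to exactly one platform resolves to that platform.
    if len(key_platform_ids) == 1:
        return key_platform_ids[0]

    # 3. Otherwise, match metadata identifiers against platform configurations.
    metadata = body.get("metadata") or {}
    for platform_id, config in platform_configs.items():
        for field in ("chatflowId", "workflowId"):
            if field in config and metadata.get(field) == config[field]:
                return platform_id

    return None  # behaviour for unresolved traces is an assumption


# Placeholder names stand in for real platform UUIDs.
platform_configs = {
    "flowise-platform": {"chatflowId": "abc"},
    "n8n-platform": {"workflowId": "xyz"},
}
trace = {"data": {"text": "hi"}, "metadata": {"workflowId": "xyz"}}
print(resolve_platform(trace, [], platform_configs))  # n8n-platform
```

In practice this means a key restricted to several platforms needs either platformId or a matching metadata identifier on every trace.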

What happens after a trace is received

When Mibo receives a new trace:

1. The trace data is encrypted and stored.
2. Mibo checks if the platform has active test cases.
3. If it does, a passive execution is created and queued automatically.
4. The worker picks up the execution and evaluates the trace against each test case.
5. Results appear in your dashboard, just like active test results.

The evaluation happens asynchronously. The trace endpoint responds immediately with a 201 status, and the evaluation runs in the background, so a successful response means the trace was accepted, not that it has been evaluated.
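Because the endpoint only acknowledges receipt, a client needs to check for the 201 status and nothing more. The following is a minimal sketch using the standard library; the function name send_trace, the 5-second timeout, and the injectable urlopen parameter (there to make the function testable without a network) are assumptions for the example.

```python
import json
import urllib.request
from typing import Any, Callable

TRACES_URL = "https://api.mibo-ai.com/public/traces"


def send_trace(
    body: dict[str, Any],
    api_key: str,
    urlopen: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 5.0,
) -> bool:
    """POST a trace and report whether it was accepted with a 201 (hypothetical helper)."""
    request = urllib.request.Request(
        TRACES_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            # 201 means the trace was received; evaluation results
            # appear later in the dashboard.
            return response.status == 201
    except OSError:
        # urllib's URLError and HTTPError are OSError subclasses; a failed
        # trace upload should not break the user-facing request.
        return False


# Example call (performs a real network request when run):
# accepted = send_trace({"data": {"text": "We are open Monday to Friday."}}, "YOUR_API_KEY")
```

Returning a boolean instead of raising keeps trace delivery from interfering with the interaction being traced; whether to retry rejected traces is left to the caller.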

Use cases

Production monitoring

Connect your system’s logging pipeline to Mibo. Every real user interaction gets evaluated against your test cases, giving you continuous quality visibility without Mibo making any extra calls to your AI system.
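One way to wire this into a production service is to hand traces to a background worker so that sending them never delays a user response. The sketch below assumes a bounded in-memory queue and a send callable that you supply; the class name TraceForwarder and its methods are hypothetical and not part of Mibo.

```python
import queue
import threading
from typing import Any, Callable


class TraceForwarder:
    """Forward traces on a background thread (hypothetical helper)."""

    def __init__(self, send: Callable[[dict[str, Any]], Any], max_pending: int = 1000):
        self._send = send
        self._queue: queue.Queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, body: dict[str, Any]) -> bool:
        """Queue a trace without blocking; returns False if the queue is full."""
        try:
            self._queue.put_nowait(body)
            return True
        except queue.Full:
            return False

    def flush(self) -> None:
        """Block until every queued trace has been handed to send."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            body = self._queue.get()
            try:
                self._send(body)
            except Exception:
                pass  # a monitoring failure must not kill the worker
            finally:
                self._queue.task_done()


# Demo with an in-memory sender; in production, send would POST to /public/traces.
sent: list[dict[str, Any]] = []
forwarder = TraceForwarder(send=sent.append)
forwarder.submit({"data": {"text": "We are open Monday to Friday, 9am to 5pm."}})
forwarder.flush()
print(len(sent))  # 1
```

The bounded queue drops traces under sustained backpressure instead of growing without limit; call flush during shutdown if you want pending traces delivered first.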

Post-mortem analysis

Something went wrong in production? Send the interaction trace to Mibo and get a full quality breakdown, including the Failure Matrix, AI Judge scores, and stage-level analysis.

Shadow testing

Evaluate production traffic against new or updated test cases before deploying changes. Catch regressions early by comparing quality scores over time.

See the Trace Format Reference for the exact data structures for each platform type.