AI Assistant endpoints
/ai/* group — assistant queries with Knowledge Base context (RAG via ChromaDB + DeepSeek R1), automatic test-case generation and KB reindexing.
The feature is only available when enabled by the platform admin (`PUT /platform/admin/ai/config` → `enabled: true`). When disabled, the endpoints return `403` or `AiStatusResponse.enabled=false`.
Endpoint table
| Method | Path | Auth | Description |
|---|---|---|---|
| POST | /ai/stream | Session | Streaming response (SSE). |
| POST | /ai/ask | Session | Synchronous response with sources and token usage. |
| GET | /ai/status | Session | Subsystem status (KB, DeepSeek, config). |
| POST | /ai/generate-cases | Session | Generates test cases from XML. |
| POST | /ai/reindex | Session (platform admin) | Forces a KB reindex. |
Authentication: these endpoints require an authenticated session (`ensure_authenticated`). API keys are not currently accepted here; use a JWT or a session cookie.
POST /ai/stream (SSE)
Streaming response via Server-Sent Events. Each chunk is a `data: …\n\n` line; the stream ends with `data: [DONE]\n\n`.
Request body (AiAskRequest):
{
"query": "How to write a decoder for nginx access logs?",
"context_type": "decoder",
"current_xml": "<decoder name=\"my-nginx\">…</decoder>"
}
- `query` — 2–2000 characters.
- `context_type` — `"rule"`, `"decoder"`, `"general"` or `null`. Filters the KB for relevance.
- `current_xml` — the XML being edited (up to 50,000 characters). Included as context for the model.
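The request constraints can be pre-checked client-side before sending, which avoids burning a rate-limited call on a `400`. A minimal sketch — `validate_ask_request` is a hypothetical helper, not part of the API:

```python
# Allowed values per the AiAskRequest contract documented above.
ALLOWED_CONTEXT_TYPES = {"rule", "decoder", "general", None}

def validate_ask_request(query, context_type=None, current_xml=None):
    """Client-side pre-check mirroring the documented AiAskRequest limits.

    Returns a list of human-readable problems; an empty list means the
    payload should pass server-side validation.
    """
    errors = []
    if not (2 <= len(query) <= 2000):
        errors.append("query must be 2-2000 characters")
    if context_type not in ALLOWED_CONTEXT_TYPES:
        errors.append("context_type must be rule, decoder, general or null")
    if current_xml is not None and len(current_xml) > 50_000:
        errors.append("current_xml exceeds 50,000 characters")
    return errors
```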
Response headers:
Content-Type: text/event-stream
Cache-Control: no-cache
X-Accel-Buffering: no
Stream format:
data: {"type":"content","text":"To write a decoder for nginx"}
data: {"type":"content","text":", use the pattern <regex>..."}
data: {"type":"sources","sources":[{"record_uid":"...","title":"Nginx decoder example","relevance_score":0.91}]}
data: [DONE]
Example (curl):
curl -N -X POST "$RF_BASE/ai/stream" \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"query":"How to write a decoder for nginx?", "context_type":"decoder"}'
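Consuming the stream amounts to splitting on `data:` lines, concatenating `content` chunks, and stopping at `[DONE]`. A minimal client-side sketch, assuming only the chunk format shown above (`parse_sse_line` is an illustrative helper, not part of the API):

```python
import json

def parse_sse_line(line):
    """Parse one 'data: ...' line from the /ai/stream SSE response.

    Returns ("done", None) for the [DONE] sentinel, ("content", text)
    for a text chunk, ("sources", list) for the sources event, and
    None for non-data lines (blank keep-alives, SSE comments).
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return ("done", None)
    event = json.loads(payload)
    if event.get("type") == "content":
        return ("content", event["text"])
    if event.get("type") == "sources":
        return ("sources", event["sources"])
    return None

# Sample stream, taken from the format documented above.
stream = [
    'data: {"type":"content","text":"To write a decoder for nginx"}',
    'data: {"type":"content","text":", use the pattern <regex>..."}',
    'data: [DONE]',
]
answer = "".join(
    text for kind, text in filter(None, map(parse_sse_line, stream))
    if kind == "content"
)
```

In a real client the lines would come from the HTTP response body (e.g. `iter_lines()` in `requests` or `httpx`) rather than a list.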
Limits:
- SSE connections per user are limited; exceeding the limit returns `429 sse_connection_limit_exceeded` with a `Retry-After` header.
- Assistant rate limit: `NATIVE_WAZUH_AI_RATE_LIMIT_PER_MINUTE` (default 20).
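Since the server advertises `Retry-After` on `429`, a client can honor it and only fall back to exponential backoff when the header is absent. A small sketch of that policy (`retry_delay` is a hypothetical helper):

```python
def retry_delay(retry_after, attempt, base=1.0, cap=60.0):
    """Seconds to wait before retrying a 429 response.

    Prefers the server's Retry-After header (seconds) when present;
    otherwise uses capped exponential backoff keyed on the attempt count.
    """
    if retry_after is not None:
        return float(retry_after)
    return min(base * 2 ** attempt, cap)
```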
POST /ai/ask
Synchronous, non-streaming version. Returns the full answer + sources + token usage.
Request body: the same `AiAskRequest` as `/ai/stream`.
Response 200 (AiAskResponse):
{
"answer": "To write a decoder for nginx access logs, use the pattern…",
"sources": [
{
"record_uid": "kb_nginx_decoder_01",
"title": "Nginx access log decoder",
"source_trust": "wazuh-docs",
"relevance_score": 0.91
}
],
"token_usage": {
"prompt_tokens": 1243,
"completion_tokens": 287,
"cached_tokens": 1100
}
}
- `sources[]` — KB documents used as context.
- `token_usage.cached_tokens` — tokens reused from the prompt cache (reduces cost).
Example (curl):
curl -X POST "$RF_BASE/ai/ask" \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"query":"Explain the <prematch> tag","context_type":"decoder"}' | jq
GET /ai/status
Returns the operational status of the assistant.
Response 200 (AiStatusResponse):
{
"enabled": true,
"vector_store_count": 12847,
"deepseek_reachable": true,
"error": null,
"model": "deepseek-reasoner",
"base_url_display": "https://api.***seek.com/v1",
"max_context_docs": 4,
"rate_limit_per_minute": 20,
"knowledge_base_path": "data/base_conhecimento"
}
- `enabled` — whether the subsystem is on and the KB is loaded.
- `vector_store_count` — number of documents indexed in ChromaDB.
- `deepseek_reachable` — live connectivity check against the provider API.
- `base_url_display` — provider URL with the middle masked (does not expose the full domain).
When `enabled=false`, the other fields may be zeroed; `error` is populated when there is an init failure.
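A UI will typically gate assistant features on this status before offering them. A minimal sketch of that check, based on the fields documented above (`assistant_ready` is an illustrative helper):

```python
def assistant_ready(status):
    """True when the assistant can actually answer: the feature is
    enabled, the KB has indexed documents, and DeepSeek is reachable."""
    return bool(
        status.get("enabled")
        and status.get("vector_store_count", 0) > 0
        and status.get("deepseek_reachable")
    )
```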
POST /ai/generate-cases
Automatically generates test cases from the XML content submitted. The assistant inspects rules/decoders and proposes events + expectations.
Request body (AiGenerateCasesRequest):
{
"rules_xml": "<group name=\"custom\"><rule id=\"100010\">…</rule></group>",
"decoders_xml": "<decoder name=\"custom\">…</decoder>",
"max_cases": 5
}
- `rules_xml`, `decoders_xml` — up to 200,000 characters each; at least one must be non-empty.
- `max_cases` — 1–30, default 10.
Response 200 (AiGenerateCasesResponse):
{
"cases": [
{
"name": "SSH failed password — root user",
"description": "Root-user failed login attempt event",
"event_text": "Jan 10 12:00:01 web01 sshd[1234]: Failed password for root from 10.0.0.5 port 44218 ssh2",
"log_format": "syslog",
"expected_decoder": "sshd",
"expected_rule": "5716",
"expected_fields": { "dstuser": "root", "srcip": "10.0.0.5" },
"expected_rule_ids": ["5716"]
}
],
"token_usage": { "prompt_tokens": 890, "completion_tokens": 412, "cached_tokens": 0 }
}
The client typically:
- Calls `/ai/generate-cases` with the workspace XML.
- Shows the generated cases in the UI for review.
- Persists approved cases via `POST /platform/projects/{id}/cases`.
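The review step reduces to filtering the generated cases by reviewer approval and shaping them into persistence payloads. A sketch under stated assumptions — the payload field names are illustrative and must be aligned with the actual `POST /platform/projects/{id}/cases` schema:

```python
def cases_to_payloads(generated, approved_names):
    """Keep only reviewer-approved cases and shape them for persistence.

    `approved_names` is the set of case names the reviewer accepted in
    the UI; everything else is dropped.
    """
    return [
        {
            "name": c["name"],
            "event_text": c["event_text"],
            "log_format": c.get("log_format", "syslog"),
            "expected_rule_ids": c.get("expected_rule_ids", []),
        }
        for c in generated
        if c["name"] in approved_names
    ]

# One generated case, abridged from the response example above.
generated = [{
    "name": "SSH failed password",
    "event_text": "Jan 10 12:00:01 web01 sshd[1234]: Failed password ...",
    "log_format": "syslog",
    "expected_rule_ids": ["5716"],
}]
payloads = cases_to_payloads(generated, approved_names={"SSH failed password"})
```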
Errors:
- `400 bad_request` — empty or invalid XML.
- `500 internal_error` — generation failure (details in `debug_id`).
POST /ai/reindex
Forces a full reindex of the Knowledge Base: clears the existing collection and re-imports all JSONL files from `data/base_conhecimento/`.
Auth: session (platform admin, per the endpoint table). The operation may take anywhere from seconds to minutes depending on KB size.
Response 200:
{ "status": "ok", "indexed_records": 12847 }
or on failure:
{ "status": "error", "error": "…" }
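Because the response shape differs between success and failure, a caller should branch on `status` rather than assume `indexed_records` is present. A small sketch (`summarize_reindex` is a hypothetical helper for logs or UI):

```python
def summarize_reindex(resp):
    """One-line summary of a POST /ai/reindex response body."""
    if resp.get("status") == "ok":
        return "reindexed %d records" % resp.get("indexed_records", 0)
    return "reindex failed: %s" % resp.get("error", "unknown error")
```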
Configuration (administrator)
These endpoints respond according to the configuration in:
- `GET /platform/admin/ai/config` — exposed to admins for reading state.
- `PUT /platform/admin/ai/config` — changes the API key, model, base URL, rate limit.
- `POST /platform/admin/ai/test` — validates connectivity before saving.
These admin endpoints are not documented here — they are internal to platform operation.
Related links
- Analysis endpoints — logtest that can be fed by the generated cases.
- Projects endpoints — case creation.