LocalAGI’s 50% Tool-Call Failure Rate Is an Infrastructure Bug, Not a Model Problem

Published: 2026-06-05 · Last verified: 2026-05-09 · Confidence: empirical

Fact-checked: 2026-05-09 · 0 corrections

What you expect

LocalAGI is a local agent framework backed by LocalAI for inference. When a tool call fails, the natural assumption is model capability: the LLM produced malformed output, or the chosen model (gemma-3-4b, cogito, hermes) is too weak for reliable function calling. Switching to a larger or better-instruction-tuned model should improve reliability.

What actually happens

Six independent infrastructure bugs in LocalAI and LocalAGI compound to produce a ~50% MCP tool-call failure floor. Backend logs show tool execution completing successfully while the HTTP response layer drops or corrupts the result. Switching models does not fix any of them.

Bug 1: Response Serialization Gap (50–70% failure rate)

mudler/LocalAI#7772 — OPEN, reported December 2025.

Backend execution logs confirm tool calls completing and results being retrieved. The LocalAI HTTP response returns "Invalid http method" with no choices and no completion data in 50–70% of cases. Confirmed across gemma-3-4b-it-qat, cogito-v1-preview-qwen-14B, and nousresearch_hermes-4-14b. The defect is in LocalAI’s MCP response path — not in model output.

Bug 2: Zero-Parameter Tool Schema Bug (~99% failure rate)

mudler/LocalAGI#362 — OPEN, reported November 2025.

Tools with no required parameters — list_memory, list_reminders, and similar — fail with most models. The reporter observed successful execution only once across many attempts. Adding a dummy required parameter as a workaround restores functionality, which confirms the bug is in parameter schema encoding at the LocalAGI layer, not in model comprehension of the tool’s purpose.
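The dummy-parameter workaround can be sketched as a small schema patch. This is a minimal sketch under assumptions: it treats the tool as an OpenAI-style function definition, and the placeholder name `_unused` is hypothetical. Where the schema actually lives in your deployment (for example, in an MCP server's tool definition) may differ.

```python
import copy

# Hypothetical placeholder name; not taken from LocalAGI or the issue report.
DUMMY_PARAM = "_unused"


def add_dummy_required_param(tool: dict) -> dict:
    """Return a copy of an OpenAI-style tool definition that has at
    least one required parameter. Tools that already declare a
    required parameter are returned unchanged (as a copy)."""
    patched = copy.deepcopy(tool)
    fn = patched.setdefault("function", {})
    params = fn.setdefault("parameters", {"type": "object", "properties": {}})
    params.setdefault("type", "object")
    props = params.setdefault("properties", {})
    required = params.setdefault("required", [])
    if not required:
        props[DUMMY_PARAM] = {
            "type": "string",
            "description": "Ignored placeholder; pass an empty string.",
        }
        required.append(DUMMY_PARAM)
    return patched


# Example zero-parameter tool, modeled on the list_memory tool named above.
list_memory = {
    "type": "function",
    "function": {
        "name": "list_memory",
        "description": "List stored memories.",
        "parameters": {"type": "object", "properties": {}},
    },
}

patched_list_memory = add_dummy_required_param(list_memory)
```

The tool implementation should ignore the placeholder argument. Remove the patch once the upstream issue is closed.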

Bug 3: Streaming Parser Duplication

mudler/LocalAI#9722 — OPEN, reported May 8, 2026.

When streaming /v1/chat/completions with tool calls, the same function call appears at multiple index values. The root cause: two concurrent parsers operate without coordination — a C++ chat-template autoparser and a Go iterative JSON parser both fire independently on the same accumulated content. No deduplication exists between them. This is the most recently reported evidence, confirming the issues persist in the current codebase.
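If streaming has to stay on, a client can drop repeated calls after the stream has been assembled. The sketch below is a client-side mitigation only, not the upstream fix. It assumes each assembled tool call is a dict with an `index` and an OpenAI-style `function` object, and it treats two calls as duplicates when the function name and the parsed arguments match.

```python
import json


def _canonical_args(arguments) -> str:
    """Normalize a JSON argument string so that whitespace and key
    order do not affect comparison. Unparseable input is compared
    as its plain string form."""
    try:
        return json.dumps(json.loads(arguments), sort_keys=True)
    except (TypeError, ValueError):
        return str(arguments)


def dedupe_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (name, arguments) pair,
    preserving the original order."""
    seen = set()
    unique = []
    for call in tool_calls:
        fn = call.get("function") or {}
        key = (fn.get("name"), _canonical_args(fn.get("arguments", "")))
        if key in seen:
            continue
        seen.add(key)
        unique.append(call)
    return unique
```

One trade-off: a model that intentionally issues the same call twice in one turn will also be collapsed to a single call, so this only suits tools where repeating an identical call is never meaningful.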

Bug 4: Tool-Choice Grammar Silent Failure (partially fixed)

mudler/LocalAI#9508 — PARTIALLY FIXED (PR #9509 covers one of four call sites), reported April 23, 2026.

Specifying tool_choice: {type: "function", function: {name: "X"}} silently fails to enforce grammar constraints. The model receives the tool list but no constraint, and produces free-text output instead of a tool call. Root cause: four code locations in LocalAI use SetFunctionCallString (which sets mode) where SetFunctionCallNameString (which sets the function name) is required:

core/http/middleware/request.go:620
core/http/endpoints/anthropic/messages.go:883
core/http/endpoints/openai/realtime_model.go:171
core/http/endpoints/openresponses/responses.go:776

A separate bug causes string-format tool_choice values (e.g., "required") to be silently dropped due to unmarshaling errors without error propagation. Three of the four setter sites remain unfixed as of the issue date.
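To tell whether a forced tool_choice is being honored, it helps to compare the same request in forced and auto modes. The following is a minimal sketch that assumes the OpenAI-compatible chat completions request and response shapes. The helper names are hypothetical, and sending the request is left to whatever HTTP client you already use.

```python
def build_request(model: str, messages: list, tools: list, forced_tool=None) -> dict:
    """Build a non-streaming chat completions body. With forced_tool
    set, the object form of tool_choice is used; otherwise "auto"."""
    body = {"model": model, "messages": messages, "tools": tools, "stream": False}
    if forced_tool is None:
        body["tool_choice"] = "auto"
    else:
        body["tool_choice"] = {"type": "function", "function": {"name": forced_tool}}
    return body


def forced_tool_honored(response: dict, forced_tool: str) -> bool:
    """True if the first choice contains a call to the forced tool.
    A free-text reply with no tool_calls returns False."""
    choices = response.get("choices") or []
    if not choices:
        return False
    message = choices[0].get("message") or {}
    calls = message.get("tool_calls") or []
    return any((c.get("function") or {}).get("name") == forced_tool for c in calls)
```

If `forced_tool_honored` is frequently false for forced requests while auto-mode requests still produce tool calls, the symptom matches the grammar-enforcement failure described above.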

Bug 5: Empty ToolCalls[] in API Response

mudler/LocalAI#9334 — OPEN, reported April 13, 2026.

With Gemma 4 in LocalAI v4.1.3, tool execution completes and is visible in backend traces, but the API response returns ToolCalls: []. The retry mechanism triggers five times before failing. This independent report shows the same backend-succeeds, API-drops pattern as Bug 1 on a different model, which indicates that the response serialization defect is not model-specific. Whether the two issues share a code path is not yet confirmed.
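The pattern shared by this bug and Bug 1 can be checked mechanically once you know from backend logs that the tool ran. This is a rough sketch assuming an OpenAI-compatible response body. The `backend_ran_tool` flag is something you derive from your own logs, and treating a reply with non-empty content as a success is an assumption, not something the issue reports state.

```python
def looks_like_serialization_gap(response: dict, backend_ran_tool: bool) -> bool:
    """Flag responses where the backend executed a tool but the API
    body carries neither choices nor any usable result."""
    if not backend_ran_tool:
        return False  # nothing was executed, so nothing was dropped
    choices = response.get("choices") or []
    if not choices:
        return True  # Bug 1 symptom: error body with no choices
    message = choices[0].get("message") or {}
    has_calls = bool(message.get("tool_calls"))
    has_content = bool(message.get("content"))
    # Bug 5 symptom: empty tool_calls and no answer text either.
    return not has_calls and not has_content
```

Counting how often this returns true over a batch of requests gives a deployment-specific estimate to compare against the 50–70% range reported in #7772.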

Bug 6: MCP HTTP Transport No-Recovery

mudler/LocalAGI#418 — OPEN, reported February 16, 2026.

After any temporary network disruption, HTTP-based MCP transport connections enter a permanently broken state. Subsequent tool listing fails with "client is closing: standalone SSE stream: failed to connect: Bad Request". Recovery requires restarting the entire LocalAGI agent. There is no automatic reconnection or connection-state monitoring for HTTP MCP transports.
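Because recovery currently needs a full agent restart, a supervisor can watch for the error text and trigger that restart itself. The sketch below is illustrative only: `call` and `restart_agent` are hypothetical callables you would supply (for example, a tool-listing request and a container restart), and matching on the error string is a heuristic based on the message quoted above.

```python
import time
from typing import Callable

# Substrings taken from the error message quoted in the issue.
BROKEN_TRANSPORT_MARKERS = ("client is closing", "standalone SSE stream")


def is_broken_transport(error_message: str) -> bool:
    """Heuristic: does this error look like the stuck MCP HTTP transport?"""
    return any(marker in error_message for marker in BROKEN_TRANSPORT_MARKERS)


def call_with_restart(
    call: Callable[[], object],
    restart_agent: Callable[[], None],
    max_restarts: int = 1,
    backoff_seconds: float = 0.0,
):
    """Run call(); if it fails with the broken-transport error, restart
    the agent and try again, up to max_restarts times. Any other error,
    or a failure after the last restart, is re-raised."""
    restarts = 0
    while True:
        try:
            return call()
        except Exception as exc:
            if not is_broken_transport(str(exc)) or restarts >= max_restarts:
                raise
            restart_agent()
            restarts += 1
            time.sleep(backoff_seconds)
```

Keeping `max_restarts` small avoids a restart loop when the MCP server itself is down rather than the transport being stuck.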

Additional: Plan Re-Evaluation Nil Crash

mudler/LocalAGI#428 — OPEN, reported February 23, 2026.

When a subtask exhausts retries, the plan re-evaluation callback in plan.go:230 receives a nil pointer reference, triggering a runtime error: invalid memory address or nil pointer dereference. The agent crashes rather than attempting alternative strategies.

What this means for you

A builder diagnosing 50% tool-call failures in LocalAGI is likely chasing the wrong root cause. The failure is not in the model layer — it is in the HTTP response serialization, streaming parser coordination, and connection resilience layers of LocalAI and LocalAGI.

The bugs compound: a request that survives the serialization gap (Bug 1) may still fail due to grammar constraint dropping (Bug 4) or duplicate emission (Bug 3). No single fix resolves the aggregate failure floor. Four of the six bugs (Bugs 1, 4, 5, 6) have no application-layer workaround and require upstream fixes in LocalAI itself.
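The compounding can be made concrete with a small calculation. This sketch assumes each stage fails independently, which the issue reports do not establish. Only the 50% figure for Bug 1 comes from the reports (the low end of its range); the other two rates are hypothetical values chosen for illustration.

```python
from math import prod


def aggregate_success(failure_rates: list[float]) -> float:
    """Probability that a request survives every stage, assuming the
    stages fail independently of one another."""
    return prod(1.0 - rate for rate in failure_rates)


# Bug 1 alone at the low end of its reported range.
bug1_only = aggregate_success([0.50])

# Bug 1 plus two HYPOTHETICAL additional stage failure rates.
compounded = aggregate_success([0.50, 0.10, 0.05])

print(f"Bug 1 only: {bug1_only:.4f}, compounded: {compounded:.4f}")
```

Under these assumptions, fixing either of the smaller stages barely moves the total, because the serialization gap dominates. That is why no single fix other than the Bug 1 fix changes the floor much.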

The most recent confirmed evidence is Issue #9722 from May 8, 2026, showing the streaming parser duplication persists in the current codebase. All evidence comes from development and homelab contexts; production reliability at scale is unconfirmed.

What to do

Diagnose in this order to isolate which bug is affecting your deployment:

1. Check for zero-parameter tools: if any of your tools have no required parameters, add a dummy required parameter as a temporary workaround (works around Bug 2).
2. Disable streaming: set streaming to false and check whether the failure rate drops (Bug 3 is streaming-only).
3. Check if tool_choice is forced: test with tool_choice: auto to isolate grammar-enforcement failures. If reliability improves, you are hitting Bug 4.
4. Check backend logs: if logs show successful tool execution but the API returns empty or errored results, you are hitting the response serialization gap (Bugs 1 and 5). These have no application-layer workaround; wait for upstream fixes.
5. For MCP transport failures: any connection drop requires a full agent restart (Bug 6). Design for agent restarts as a normal operational event, not an exception.
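The checks above can be recorded as a small triage helper that maps what you observed to the bugs most likely involved. This is a minimal sketch; the field names and labels are hypothetical and simply mirror the order of the checks.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Results of the manual checks; all field names are hypothetical."""
    has_zero_param_tools: bool = False
    fewer_failures_without_streaming: bool = False
    fewer_failures_with_tool_choice_auto: bool = False
    backend_ok_but_api_empty_or_error: bool = False
    transport_broken_after_disconnect: bool = False


def suspected_bugs(obs: Observations) -> list[str]:
    """Return the suspected bugs in the same order as the checks."""
    checks = [
        (obs.has_zero_param_tools, "Bug 2: zero-parameter tool schema"),
        (obs.fewer_failures_without_streaming, "Bug 3: streaming parser duplication"),
        (obs.fewer_failures_with_tool_choice_auto, "Bug 4: tool_choice grammar not enforced"),
        (obs.backend_ok_but_api_empty_or_error, "Bugs 1 and 5: response serialization gap"),
        (obs.transport_broken_after_disconnect, "Bug 6: MCP HTTP transport needs agent restart"),
    ]
    return [label for hit, label in checks if hit]
```

More than one entry in the result is expected, since the bugs are independent and can affect the same deployment at once.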

If you need reliable tool calling from local models today, consider using Ollama with a model that has strong native function-calling support rather than LocalAI’s MCP layer — Ollama’s tool-call implementation is documented in a separate block (ollama-local-inference.md) and has a different failure surface.

Falsification criterion: This finding would be disproved by evidence that LocalAI’s HTTP response serialization layer correctly returns tool-call results in all cases (i.e., the issue reporters’ backend-confirms-but-HTTP-drops pattern was due to user misconfiguration, not a LocalAI defect), OR by a LocalAI release that resolves Issues #7772, #9334, #9722, and #9508 and shows the aggregate tool-call success rate rising above 80% across multiple models in independent testing.
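The second half of the criterion can be checked with a simple tally over independent test runs. This sketch reads "above 80% across multiple models" as every tested model exceeding the threshold with at least two models tested. That reading, the function names, and the `(model, succeeded)` record shape are assumptions.

```python
from collections import defaultdict


def success_rates(trials: list[tuple[str, bool]]) -> dict[str, float]:
    """Per-model success rate from (model, succeeded) records."""
    totals = defaultdict(lambda: [0, 0])  # model -> [successes, attempts]
    for model, succeeded in trials:
        totals[model][0] += int(succeeded)
        totals[model][1] += 1
    return {model: wins / count for model, (wins, count) in totals.items()}


def clears_threshold(
    trials: list[tuple[str, bool]],
    threshold: float = 0.80,
    min_models: int = 2,
) -> bool:
    """True if at least min_models were tested and each one succeeded
    strictly more often than the threshold."""
    rates = success_rates(trials)
    return len(rates) >= min_models and all(r > threshold for r in rates.values())
```

A meaningful run needs enough trials per model for the rate to be stable; a handful of calls per model will not distinguish 80% from 50%.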

Evidence

Tool	Version	Evidence	Result
LocalAI	v4.1.3 and earlier (#7772)	source-reviewed	Backend executes tool, HTTP API returns error or empty ToolCalls[]; confirmed across 3+ models
LocalAGI	Nov 2025 – Feb 2026 (#362)	source-reviewed	Zero-param tools fail ~99% of the time; dummy param workaround confirms schema encoding bug
LocalAI streaming	Current (#9722, May 8 2026)	source-reviewed	C++ autoparser + Go JSON parser both fire independently, emitting duplicate tool calls
LocalAI grammar enforcement	Current (#9508, Apr 2026)	source-reviewed	`SetFunctionCallString` used at 4 locations where `SetFunctionCallNameString` required; 3 of 4 unfixed
LocalAI	v4.1.3 (#9334, Apr 2026)	independently-confirmed	Empty ToolCalls[] with Gemma 4; independently confirms the Bug 1 serialization-gap pattern on a different model
LocalAGI HTTP transport	Feb 2026 (#418)	source-reviewed	Permanent broken state after connection reset; restart required; no auto-reconnect
LocalAGI plan re-eval	Feb 2026 (#428)	source-reviewed	Nil pointer dereference at plan.go:230 on retry exhaustion; agent crashes

Confidence: empirical — 7 GitHub issues reviewed, 2 independently confirming the same serialization gap on different models. All evidence from mudler/LocalAI and mudler/LocalAGI issue trackers (toolmaker-filed, highest-confidence evidence class for behavioral claims).

Strongest case against: LocalAGI is early-stage open-source software maintained by a small team. The ~50% failure rate comes from reporters working in development and homelab contexts, not from production telemetry. Some of these bugs may have been fixed in releases not yet reflected in the issue tracker. One related bug outside the six listed here (a FastMCP Accept-header mismatch, #283) was already resolved via PR #318 in October 2025. A builder who keeps dependencies current and disables streaming may see materially better reliability than the ~50% floor. The aggregate figure is a worst-case composition of independent bugs, not a measured end-to-end rate from a standardized benchmark.

Open questions: What is the actual failure rate for non-streaming HTTP-only paths after Bug 3 (streaming-specific) is excluded? Have Bugs 1 and 5 been confirmed to share the same code path, or are they two independent serialization defects? Does the failure rate improve significantly on LocalAI versions after v4.1.3?

Seen different? Contribute your evidence — share a repro or counter-example and we’ll review it against this finding. Reader evidence is what keeps these findings accurate.