Three RAG pipeline failures your framework won’t tell you about

Published: 2026-02-27 | Last verified: 2026-02-22 | Confidence: empirical

5 claims 5 tested finding

From Theory Delta | Published 2026-02-27

What you expect

GraphRAG is a graph-based RAG system designed for multi-hop reasoning over large corpora — entity relationships should be faithfully represented. LangGraph’s conditional edge routing via Python dict literals is the documented standard pattern for agent routing logic. Haystack’s max_agent_steps is documented as a graceful safety limit that terminates runaway agents cleanly.

What actually happens

GraphRAG merges entities with the same name regardless of type — permanently corrupting multi-hop reasoning. Issue #1718, marked fatal, documents that entities with identical names but different semantic types — “Python” the programming language and “Python” the snake — are merged into a single graph node during indexing. Multi-hop reasoning that traverses type-differentiated entities produces hallucinated or incorrect answers because the graph has collapsed distinct entities into one. No shipped fix exists as of Feb 2026.

Additional GraphRAG failure modes compound this: the CSV reader destroys newlines in multiline quoted fields (corrupting ingestion), and create_base_entity_graph column mismatch errors recur across versions.
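To find out whether your input is exposed to the multiline-field problem, you can scan the source CSV before ingestion. The sketch below is a minimal check that assumes a header row and uses Python's standard csv module, not GraphRAG's own reader; the function name and sample data are hypothetical.

```python
import csv
import io


def rows_with_embedded_newlines(csv_text):
    """Return 0-based indices of data rows that have a newline inside any field."""
    # newline="" leaves line endings untranslated so the csv module can handle
    # newlines inside quoted fields itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    next(reader)  # skip the header row (assumed to be present)
    return [
        index
        for index, row in enumerate(reader)
        if any("\n" in value or "\r" in value for value in row)
    ]


# Hypothetical sample: the first data row has a quoted field spanning two lines.
sample = 'id,text\n1,"line one\nline two"\n2,single line\n'
affected = rows_with_embedded_newlines(sample)
```

Any index returned marks a row whose text may be altered if the ingestion reader mishandles quoted newlines, so those rows are worth comparing before and after indexing.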

LangGraph conditional edge routing produces a KeyError from a Python dict literal syntax error. Issues #4968, #4891, and #4226 report that inline docstrings placed inside Python dict literals used as conditional edge mappings become part of the dictionary key, producing a KeyError at runtime. The mechanism is implicit string concatenation: Python joins adjacent string literals, so a bare string placed in front of a key is fused into that key. A # comment is discarded by the parser and cannot cause this. Update (Apr 2026): All three issues are now closed; #4968 was explicitly closed by a LangGraph maintainer as a user syntax error, not a library bug. The underlying behavior (docstrings inside dict literals becoming keys) is a Python language behavior, not a LangGraph defect. The failure mode is real; the attribution to LangGraph is corrected. No static analysis tool warns on this pattern regardless of framework.

# BROKEN: the bare string used as an inline "docstring" is concatenated with the key below it
routing = {
    "fetches from vector store"
    "retrieve": retrieve_node,
    "answer": answer_node,
}
# The first key is now "fetches from vector storeretrieve"; routing["retrieve"] raises KeyError

# SAFE: annotate with # comments, kept on their own lines outside the dict
# retrieve: fetches from vector store
routing = {
    "retrieve": retrieve_node,
    "answer": answer_node,
}

Hard step caps return raw tool output to users. When max_agent_steps triggers mid-retrieval in Haystack, the agent returns raw tool output — JSON blobs, API responses, schema dumps — directly to the user instead of a synthesized answer. Haystack Issue #10001 was originally marked “not planned” but has since been closed (Apr 2026) with a final_answer_on_max_steps flag mitigation in progress. This is not Haystack-specific: any agent framework that terminates on a hard step count has this failure mode.

What this means for you

For GraphRAG: If your domain has homonyms — technical documentation with abbreviated terms, biological or taxonomic data, legal entity names — your graph index is already corrupted. Multi-hop queries over these domains will return plausible-sounding but incorrect results with no indication anything is wrong. Issue #1718 has been open without a fix since mid-2024. You cannot route around this by tuning prompts.

For LangGraph routing: The dict literal docstring gotcha is valid Python, so the parser accepts it and static analysis tools do not flag it. It will not fail in your test suite unless you have a test that exercises the specific routing key that contains the corrupted string. In production the defect stays latent until the agent takes that branch, at which point the lookup raises a KeyError in the middle of a run.

For step limits in any framework: A user hitting the agent’s step limit sees raw JSON or API output in their chat interface. This is not a Haystack-specific issue — it is a framework design decision that affects LangGraph, CrewAI, and any custom loop using a hard step cap without an explicit fallback call. The application-layer fix is required regardless of which framework you use.

What to do

For GraphRAG: Patch or avoid GraphRAG on any domain with same-name, different-type entities until Issue #1718 is resolved. Use (name, type) as the deduplication key if patching. For multi-hop reasoning over type-differentiated knowledge, evaluate Graphiti (temporal graph with bi-temporal invalidation) as an alternative.
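The effect of the deduplication key can be illustrated with a toy model. This is a sketch only: it does not reproduce GraphRAG's indexing code, and the record fields (name, type, description) and the dedupe helper are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical extracted entities: two mentions of the language, one of the snake.
entities = [
    {"name": "Python", "type": "programming_language", "description": "Used for scripting."},
    {"name": "Python", "type": "animal", "description": "A large constricting snake."},
    {"name": "python", "type": "programming_language", "description": "Has a package index."},
]


def dedupe(records, key_fn):
    """Group entity descriptions under whatever key key_fn returns."""
    nodes = defaultdict(list)
    for record in records:
        nodes[key_fn(record)].append(record["description"])
    return dict(nodes)


# Name-only key: all three mentions collapse into one node.
by_name = dedupe(entities, lambda e: e["name"].casefold())

# (name, type) key: the language and the snake stay separate, while the two
# mentions of the language still merge.
by_name_and_type = dedupe(entities, lambda e: (e["name"].casefold(), e["type"]))
```

With the name-only key, the snake's description ends up attached to the same node as the language's, which is the collapse described above. The composite key keeps legitimate merging of repeated mentions while separating homonyms.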

For LangGraph conditional edge routing: Never place bare string literals (inline docstrings) inside Python dict literals used as edge mappings. Annotate with # comments instead, ideally on lines outside the dict. Add a unit test that exercises each routing branch explicitly; a corrupted key produces a KeyError that is testable. Always include an explicit "__end__": "__end__" entry in path_map for any conditional router that may return __end__ (bug #6770).
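One way to make the branch test cheap is to compare the values a router can return against the keys of its mapping. The sketch below is plain Python with no LangGraph import; missing_routes, ROUTER_OUTPUTS, and the node names are hypothetical, and the "__end__" sentinel string is assumed from the entry named above.

```python
END = "__end__"  # assumed sentinel string, taken from the path_map entry above


def missing_routes(router_outputs, path_map):
    """Return router outputs that have no matching key in path_map, sorted."""
    return sorted(set(router_outputs) - set(path_map))


# Every value the (hypothetical) router function is allowed to return.
ROUTER_OUTPUTS = {"retrieve", "answer", END}

# A mapping with a bare string literal fused into the "retrieve" key.
broken_map = {
    "fetches from vector store"
    "retrieve": "retrieve_node",
    "answer": "answer_node",
    END: END,
}

# The intended mapping.
good_map = {
    "retrieve": "retrieve_node",
    "answer": "answer_node",
    END: END,
}

broken_missing = missing_routes(ROUTER_OUTPUTS, broken_map)
good_missing = missing_routes(ROUTER_OUTPUTS, good_map)
```

Asserting that the result is empty in a unit test catches a corrupted or absent key at test time instead of at the moment the agent first takes that branch.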

For step-cap raw output: Wrap the agentic loop in an application-layer catch that detects step-limit exit and forces a final synthesis call before returning to the user:

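# Pseudocode: StepLimitExceeded, agent.partial_results, and llm.generate are placeholders.
# Substitute your framework's step-limit signal and your own LLM client.
try:
    result = agent.run(query, max_steps=N)
except StepLimitExceeded:
    # Force one synthesis call instead of returning raw tool output
    result = llm.generate(f"Summarize what you have found so far: {agent.partial_results}")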

This pattern applies to any framework with a hard step cap — Haystack, LangGraph, CrewAI, or custom loops.
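A self-contained version of the same idea, for a custom loop, might look like the sketch below. It assumes the loop signals the step limit with a flag on its result, not with an exception, since frameworks differ on this point. All names (LoopResult, run_agent, answer_with_fallback, and the stub functions) are hypothetical, and the synthesize callable stands in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class LoopResult:
    answer: Optional[str]
    observations: List[str] = field(default_factory=list)
    hit_step_limit: bool = False


def run_agent(step: Callable[[int], Tuple[str, Optional[str]]], max_steps: int) -> LoopResult:
    """Call step until it returns a final answer or max_steps is reached."""
    observations: List[str] = []
    for index in range(max_steps):
        observation, answer = step(index)
        observations.append(observation)
        if answer is not None:
            return LoopResult(answer, observations)
    # Step cap reached: only raw tool output is available at this point.
    return LoopResult(None, observations, hit_step_limit=True)


def answer_with_fallback(step, synthesize: Callable[[List[str]], str], max_steps: int) -> str:
    """Return the agent's answer, or a forced synthesis if the step cap was hit."""
    result = run_agent(step, max_steps)
    if result.hit_step_limit:
        return synthesize(result.observations)
    return result.answer


# Stubs for illustration: a tool loop that never converges and a fake synthesizer.
def never_finishes(index):
    return (f'{{"tool": "search", "page": {index}}}', None)


def stub_synthesize(observations):
    return f"Partial answer based on {len(observations)} tool results."


reply = answer_with_fallback(never_finishes, stub_synthesize, max_steps=3)
```

The user-facing function never returns the observations list directly, so raw JSON cannot reach the chat interface even when the loop fails to converge.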

This claim would be disproved by observing: A GraphRAG release that correctly separates same-name/different-type entities at index time, confirmed by a test with homonymous entities across two types where multi-hop reasoning returns type-correct results.

Evidence

Tool	Version	Result
microsoft/graphrag	Feb 2026	Entity dedup merges same-name/different-type entities — multi-hop reasoning corrupted (Issue #1718, marked fatal, still open Apr 2026)
langchain-ai/langgraph	0.5.x (now v1.1.8)	Dict literal docstring → `KeyError` at runtime, no static warning (#4968, #4891, #4226) — all closed Apr 2026; #4968 closed as user syntax error by maintainer
deepset-ai/haystack	2.x (now v2.27.0)	Step-limit exit returns raw tool output — Issue #10001 closed Apr 2026 with `final_answer_on_max_steps` flag in progress

Confidence: empirical — three independent failure modes each confirmed via open GitHub issues with reproducers, tested in their respective environments as of Feb 2026.

Open questions (Apr 2026 update): GraphRAG Issue #1718 remains open in v3.0.9 — no fix shipped as of Apr 2026. LangGraph issues #4968/#4891/#4226 are closed as user syntax errors, not library bugs — the Python dict literal docstring gotcha is real but not a LangGraph defect. Haystack #10001 closed with a final_answer_on_max_steps flag in progress in v2.27.0.
