Reversal curse
An LLM limitation where a model trained on "A is B" fails to infer the reverse relation "B is A".
What is the Reversal curse?
The reversal curse is an LLM limitation in which a model trained on "A is B" does not reliably infer the reverse relation "B is A." The term comes from research showing that autoregressive models can memorize forward facts without automatically learning the inverse form. (arxiv.org)
Understanding the Reversal curse
In practice, the reversal curse shows up when a model can answer one direction of a factual relation, but fails when the same fact is asked from the opposite angle. For example, if a model learns that one person is the parent of another, it may still miss the child-to-parent query unless it has seen that inverse relation directly in training or context. (arxiv.org)
This matters because LLMs are often used as if they store knowledge symmetrically, but next-token prediction does not guarantee symmetric generalization. The original paper also found that in-context examples can help models infer the reverse relation, which suggests the issue is more about learned generalization than an absolute inability to reason. (arxiv.org)
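A quick way to see this behavior in practice is to compare a bare reverse query with the same query preceded by the forward fact. The sketch below is illustrative only: it assumes the openai Python client, a placeholder model name, and invented people.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"  # placeholder model name, swap in whatever model you use

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's text reply."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content

# Bare reverse query: relies entirely on what the model learned in training.
print(ask("Who is Alice Example's child?"))

# Same reverse query, but with the forward fact supplied in context.
# The paper's finding suggests this version is far more likely to succeed.
print(ask("Fact: Bob Example's mother is Alice Example.\nWho is Alice Example's child?"))
```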
Key aspects of Reversal curse include:
- Directionality: Knowledge learned in one phrasing may not transfer to the reverse phrasing.
- Relation sensitivity: The failure is especially visible with biographies, family ties, roles, and other directional facts.
- Model behavior: Autoregressive LLMs can appear confident while still missing the inverse fact.
- Prompt dependence: In-context examples and careful prompting can sometimes improve performance.
- Evaluation value: It is a useful test for whether a model truly learned a relation or just memorized a surface form (see the evaluation sketch after this list).
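One way to turn that evaluation point into a concrete check is to score every fact in both directions and report the gap. This is a minimal sketch rather than a standard benchmark; the `ask` callable and the product names are hypothetical stand-ins.

```python
def reversal_gap(facts, ask):
    """Score forward vs. reverse recall for a list of paired test cases.

    Each item in `facts` has 'forward_q', 'forward_a', 'reverse_q', 'reverse_a'.
    `ask` is any callable that sends a prompt to the model and returns its answer.
    """
    fwd_hits = rev_hits = 0
    for fact in facts:
        if fact["forward_a"].lower() in ask(fact["forward_q"]).lower():
            fwd_hits += 1
        if fact["reverse_a"].lower() in ask(fact["reverse_q"]).lower():
            rev_hits += 1
    n = len(facts)
    return {"forward_acc": fwd_hits / n,
            "reverse_acc": rev_hits / n,
            "gap": (fwd_hits - rev_hits) / n}

# Example paired test case (hypothetical product names).
facts = [{
    "forward_q": "What product is Widget 2 the successor to?", "forward_a": "Widget 1",
    "reverse_q": "Which product succeeded Widget 1?",          "reverse_a": "Widget 2",
}]
# reversal_gap(facts, ask) would then report forward vs. reverse accuracy.
```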
Advantages of understanding the Reversal curse
- Better diagnostics: It gives teams a concrete way to test relational generalization.
- Safer product design: It reminds builders not to assume symmetric recall from model outputs.
- Sharper evals: It helps separate memorization from genuine understanding.
- Improved prompting: Teams can design prompts and examples that expose weak inverse reasoning.
- Training insight: It highlights where data augmentation or fine-tuning may need reverse examples (a small augmentation sketch follows this list).
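On the training-insight point, a common mitigation idea is to add an explicitly reversed sentence for every directional fact before fine-tuning. The sketch below assumes facts are stored as (subject, relation, object) triples; the relation names and templates are illustrative, not a prescribed format.

```python
# Templates for each direction of a relation; both phrasings are hypothetical
# and would be adapted to the actual fine-tuning format.
TEMPLATES = {
    "successor_of": {
        "forward": "{subject} is the successor to {object}.",
        "reverse": "{object} was replaced by {subject}.",
    },
    "parent_of": {
        "forward": "{subject} is the parent of {object}.",
        "reverse": "{object} is the child of {subject}.",
    },
}

def augment_with_reversals(triples):
    """Emit one forward and one reverse training sentence per (subject, relation, object)."""
    examples = []
    for subject, relation, obj in triples:
        t = TEMPLATES[relation]
        examples.append(t["forward"].format(subject=subject, object=obj))
        examples.append(t["reverse"].format(subject=subject, object=obj))
    return examples

print(augment_with_reversals([("Widget 2", "successor_of", "Widget 1")]))
```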
Challenges posed by the Reversal curse
- Hidden failures: A model may look correct on forward queries and still fail on reverse ones.
- Weak transfer: Learning one relation form does not always generalize to the inverse.
- Evaluation design: You need paired forward and reverse tests to catch it reliably.
- Data coverage: Sparse or one-sided training data makes the issue more likely.
- Product risk: Knowledge assistants can give incomplete answers when users ask from a different direction.
Example of the Reversal curse in action
Scenario: a team fine-tunes a support assistant on product knowledge phrased like "X is the successor to Y." The assistant answers forward questions such as "What product did X replace?" well, but misses reverse questions like "Which product replaced Y?"
In a test set, the model correctly states that Product B replaced Product A, but when asked which product replaced Product A, it answers vaguely or hallucinates. That is a classic reversal curse pattern, and it is exactly the kind of gap that paired evals should surface early.
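One way to build that paired eval is to generate both question directions mechanically from the same successor mapping, so neither direction can be silently skipped. The helper and product names below are hypothetical; its output matches the paired-case format used in the evaluation sketch above.

```python
def paired_questions(successors):
    """Turn a {new_product: old_product} successor map into forward/reverse test pairs."""
    cases = []
    for new, old in successors.items():
        cases.append({
            "forward_q": f"What product did {new} replace?", "forward_a": old,
            "reverse_q": f"What product replaced {old}?",    "reverse_a": new,
        })
    return cases

# Hypothetical catalog: Product B replaced Product A.
print(paired_questions({"Product B": "Product A"}))
```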
How PromptLayer helps with the Reversal curse
PromptLayer helps teams track prompt versions, compare outputs, and run evaluations that include both forward and reverse relation checks. That makes it easier to spot asymmetric behavior, keep examples organized, and improve the reliability of LLM workflows over time.
Ready to try it yourself? Sign up for PromptLayer and start managing your prompts in minutes.