Use case: Understand a paper

You came across an interesting idea on Twitter and want to understand it, not just save the link.

Add the source

Paste a link — a tweet, preprint, blog post — and say what you want. Mana finds the sources, adds them to your vault, and writes an overview you can build on.

https://x.com/dimitrispapail/status/2056368948870811746

What did they do? Find the sources, add them, and get me an overview.

Ask questions

With sources and an overview in your vault, ask follow-ups in plain language. Mana answers from what it ingested and writes the result back as a note you can cite.

Context

The Objective

The paper writes ECHO as:

L_ECHO = L_GRPO(A) + λ L_Env(O′)

where:
- A = assistant action/command token positions
- O = all observation token positions
- O′ = terminal-output tokens selected for environment prediction
- L_Env = length-normalized cross-entropy on O′
- λ = weight on world-modeling vs policy optimization

Give a concrete example of how they formulate the environment objective.

Answer

Environment objective (§3.1)

The paper defines the environment-prediction loss as length-normalized cross-entropy on terminal-output tokens $\mathcal{O}' \subseteq \mathcal{O}$:

$$L_{\text{Env}}(\theta; \mathcal{O}') = -\frac{1}{Z} \sum_{t \in \mathcal{O}'} \log p_\theta(x_t \mid x_{<t})$$

where $Z = |\mathcal{O}|$ normalizes each sequence by its total observation length (the sum runs over $\mathcal{O}'$). The joint objective is:

$$L_{\text{total}} = L_{\text{GRPO}} + \lambda L_{\text{Env}}$$

$\mathcal{O}'$ is terminal-output tokens only, excluding harness warning prefixes.
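To make the normalization concrete, here is a minimal Python sketch of the loss as stated above. It is not the paper's code. The function names `env_loss` and `total_loss`, the boolean-mask representation of $\mathcal{O}$ and $\mathcal{O}'$, the zero return for sequences with no observations, and the toy numbers (uniform token probability 0.5, GRPO loss 1.0, λ = 0.1) are all assumptions made for illustration.

```python
import math


def env_loss(token_logprobs, obs_mask, target_mask):
    """Length-normalized cross-entropy on the selected observation tokens.

    token_logprobs[t] holds log p_theta(x_t | x_<t) for position t.
    obs_mask[t] is True for every observation position (the set O).
    target_mask[t] is True for the selected terminal-output positions (O').
    """
    if not (len(token_logprobs) == len(obs_mask) == len(target_mask)):
        raise ValueError("all inputs must have one entry per token position")
    if any(tgt and not obs for obs, tgt in zip(obs_mask, target_mask)):
        raise ValueError("O' must be a subset of O")
    z = sum(obs_mask)  # Z = |O|: total observation length, not |O'|
    if z == 0:
        return 0.0  # assumption: no observations means no environment loss
    log_likelihood = sum(
        lp for lp, tgt in zip(token_logprobs, target_mask) if tgt
    )
    return -log_likelihood / z


def total_loss(grpo_loss, env, lam):
    """Joint objective: L_total = L_GRPO + lambda * L_Env."""
    return grpo_loss + lam * env


# Toy sequence: 2 action tokens, then 4 observation tokens. The first
# observation token stands in for a harness warning prefix, so it is
# part of O but excluded from O'.
logprobs = [math.log(0.5)] * 6
obs_mask = [False, False, True, True, True, True]
target_mask = [False, False, False, True, True, True]

loss = env_loss(logprobs, obs_mask, target_mask)
print(loss)                        # 0.75 * ln 2, about 0.52
print(total_loss(1.0, loss, 0.1))  # placeholder GRPO loss and lambda
```

Because $Z$ counts all four observation tokens while the sum covers only the three in $\mathcal{O}'$, the result is 3/4 of the per-token cross-entropy rather than the full ln 2.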

Concrete example

Suppose the transcript contains:

  • Action: pytest tests/test_api.py
  • Observation: failing assertion, traceback, failing test name in terminal output

GRPO uses the final task reward to update the command tokens only.

ECHO also trains on the observation tokens: given prior context and the command, the model should assign high likelihood to the traceback pytest actually returned.

The same pattern applies to ls src (predict returned filenames) and cat config.yaml (predict file contents).
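The split between command tokens and observation tokens can be sketched as token masks over a transcript. This is an illustrative sketch, not the paper's data pipeline: the `Segment` class, its `kind` labels, the `harness_prefix_len` field, and the toy tokens are hypothetical, and real tokenization would differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str                    # "action" or "observation" (hypothetical labels)
    tokens: list
    harness_prefix_len: int = 0  # leading harness tokens, assumed known


def build_masks(segments):
    """Return per-token masks for A (actions), O (observations), and O'."""
    action_mask, obs_mask, target_mask = [], [], []
    for seg in segments:
        is_action = seg.kind == "action"
        is_obs = seg.kind == "observation"
        for i in range(len(seg.tokens)):
            action_mask.append(is_action)
            obs_mask.append(is_obs)
            # O' keeps observation tokens that follow the harness prefix.
            target_mask.append(is_obs and i >= seg.harness_prefix_len)
    return action_mask, obs_mask, target_mask


transcript = [
    Segment("action", ["pytest", "tests/test_api.py"]),
    Segment(
        "observation",
        ["[warning]", "FAILED", "test_login", "AssertionError"],
        harness_prefix_len=1,
    ),
]

action_mask, obs_mask, target_mask = build_masks(transcript)
print(action_mask)  # [True, True, False, False, False, False]
print(obs_mask)     # [False, False, True, True, True, True]
print(target_mask)  # [False, False, False, True, True, True]
```

Under these assumptions, the policy term would read the positions in `action_mask`, and the environment-prediction term would read the positions in `target_mask`.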

What accumulates in your wiki

Each source and follow-up becomes a durable note you can cite — not a chat thread that disappears.

  • ✓ original PDFs and extracted markdown in /raw
  • ✓ overview pages — what they did and why it matters
  • ✓ cited Q&A notes for specific questions
  • ✓ concept pages for methods, terms, and mechanisms
  • ✓ backlinks as you add more sources
  • ✓ a git history of every agent edit

Understand your next link

Paste a tweet, preprint, or PDF and say what you want to know. Mana ingests the source, writes an overview, and saves every follow-up as a cited note in your wiki.