nuonuo/doc/p0_llm_integration.md
Fam Zheng d923aa1e31 NuoNuo: Hippocampal memory module prototype
Hopfield + Hebbian hybrid memory system for LLMs.
Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:
- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4ms latency @ 20K memories, ~1GB VRAM

Key findings:
- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0) — RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture
2026-04-07 10:37:24 +01:00
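The two-stage retrieval named above (nearest-neighbour shortlist, then softmax settling) can be sketched roughly as follows. This is an illustrative reconstruction, not the repository's actual code; the function name and the defaults for `top_k`, `beta`, and `iters` are assumptions (only top-20 comes from the commit message):

```python
import numpy as np

def two_stage_hopfield_recall(query, memories, top_k=20, beta=8.0, iters=3):
    """Stage 1: NN shortlist; Stage 2: modern-Hopfield softmax settling.

    `memories` is an (N, d) array of unit-normalised embeddings and
    `query` a unit-normalised (d,) vector. Illustrative only.
    """
    # Stage 1: cosine similarity (dot products of unit vectors), keep top_k
    sims = memories @ query
    idx = np.argsort(sims)[-top_k:]
    shortlist = memories[idx]

    # Stage 2: iterate xi <- softmax(beta * shortlist @ xi) @ shortlist
    xi = query
    for _ in range(iters):
        attn = np.exp(beta * (shortlist @ xi))
        attn /= attn.sum()
        xi = attn @ shortlist
        xi /= np.linalg.norm(xi)

    # Return the settled pattern and the index of the best-matching memory
    return xi, int(idx[np.argmax(shortlist @ xi)])
```

Settling inside a small shortlist rather than over all N memories is what keeps latency low while still denoising the query toward a stored pattern.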


P0: LLM Integration

Status: basic pipeline working; the LLM Gateway is unreachable, so the LLM paths still need verification.

Implementation

  • llm.py: LLMClient plus extract/paraphrase/format functions
  • Supports any OpenAI-compatible API, with fallback to heuristics
  • End-to-end pipeline: conversation → extract → embed → store (with augmentation) → recall → context injection
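A rough sketch of the fallback shape described above. This is not the repository's `llm.py`; everything beyond the `LLMClient` name is an assumption, including the endpoint path, the template used by the heuristic, and the (hypothetical) gateway URL in the comment:

```python
import json
import re
import urllib.request

class LLMClient:
    """Illustrative OpenAI-compatible client with a heuristic fallback."""

    def __init__(self, base_url=None, api_key="", model="gpt-4o-mini"):
        self.base_url = base_url  # e.g. "http://gateway.local/v1" (hypothetical)
        self.api_key = api_key
        self.model = model

    def _chat(self, prompt):
        if not self.base_url:
            raise ConnectionError("no LLM gateway configured")
        req = urllib.request.Request(
            f"{self.base_url}/chat/completions",
            data=json.dumps({
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}],
            }).encode(),
            headers={"Authorization": f"Bearer {self.api_key}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read())["choices"][0]["message"]["content"]

    def paraphrase(self, cue):
        """LLM paraphrase, falling back to a mechanical template on failure."""
        try:
            return self._chat(f"Paraphrase as a short memory cue: {cue}")
        except Exception:
            # Heuristic fallback: strip punctuation, apply an "issue with X" template
            cleaned = re.sub(r"[^\w ]", "", cue).lower()
            return f"issue with {cleaned}"
```

With the gateway down, every call takes the `except` branch, which is why all results below come from the heuristic path.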

End-to-end test results

5 conversation turns stored as 7 memories and 24 cue entries (including paraphrase augmentation).

Query recall results (heuristic paraphrase):

| Query | Correct? | Notes |
|---|---|---|
| "DB performance terrible" | ✓ | correctly recalled missing indexes |
| "How to push a new release?" | ✓ | correctly recalled blue-green deploy |
| "Redis connection info?" | ✓ | correctly recalled port 6379 |
| "Login system has a problem" | ✗ | pointed at database instead of auth |
| "Database backup" | ✓ | correctly recalled cron job |
| "Deployment config?" | ✓ | correctly recalled GitHub Actions |

5/6 correct. The failing case is that the heuristic paraphrase never generates a "login" ↔ "auth" association; an LLM paraphrase should cover it.
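A minimal sketch of how cue augmentation bridges such gaps. All names here are illustrative, and word overlap stands in for embedding similarity; the point is that recall can only match words some paraphrase actually produced:

```python
# Each memory is stored under several paraphrased cues; recall scores the
# query against every cue. Exact word overlap stands in for cosine similarity.
store = {}  # cue -> memory id

def remember(memory_id, cue, paraphrases):
    for c in [cue] + paraphrases:
        store[c.lower()] = memory_id

def recall(query):
    q = set(query.lower().split())
    # pick the cue sharing the most words with the query
    best = max(store, key=lambda c: len(set(c.split()) & q))
    return store[best]

# A mechanical paraphrase ("issue with auth") never introduces the word
# "login", so "Login system has a problem" cannot reach the auth memory;
# an LLM-generated paraphrase like "login problems" bridges the gap.
remember("mem-auth", "auth service broken", ["issue with auth", "login problems"])
remember("mem-db", "database slow", ["issue with database"])
```

Dropping the `"login problems"` paraphrase reproduces the failure in the table: the query falls back to whichever cue happens to share the most words.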

Open issues

  1. LLM Gateway unreachable: cannot verify LLM extraction or paraphrase quality
  2. Duplicate extraction: the heuristic extracts 2 similar memories from the same conversation; needs dedup
  3. Poor heuristic paraphrase quality: mechanical substitution ("issue with X") is worse than LLM generation
  4. Semantic leaps like auth → login: only LLM paraphrase or a stronger embedding model can resolve these
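Issue 2 is commonly handled with a near-duplicate filter over embeddings before storing. A minimal sketch, assuming plain cosine similarity over embedding vectors and a hypothetical 0.9 threshold (both would need tuning against the real extractor):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def dedup(embeddings, threshold=0.9):
    """Keep indices of embeddings that are not near-duplicates of a kept one."""
    kept = []
    for i, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept
```

Greedy first-wins dedup like this is order-dependent; merging duplicate memories (rather than dropping the later one) may fit the extraction pipeline better.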