Fam Zheng d923aa1e31 NuoNuo: Hippocampal memory module prototype

Hopfield + Hebbian hybrid memory system for LLMs.
Two nights of experiments (16 iterations), validated on LongMemEval (ICLR 2025).

Architecture:
- Single-hop: Two-Stage Hopfield (NN top-20 → softmax settle)
- Multi-hop: Hebbian W matrix with WTA pattern separation
- 64% on LongMemEval (500 questions), retrieval-only, no LLM dependency
- 4ms latency @ 20K memories, ~1GB VRAM

Key findings:
- Hopfield attention solved noise tolerance (20% → 100% vs flat Hebbian)
- WTA pattern separation enables 20K+ capacity
- Multi-hop associative chains (6 hops, CosSim=1.0) — RAG can't do this
- MiniLM-L6 is optimal (discrimination gap > absolute similarity)
- Paraphrase cue augmentation: 55% → 100% on synthetic, 36% → 64% on benchmark
- SNN encoder viable (CosSim 0.99) but not needed for current architecture

2026-04-07 10:37:24 +01:00

1.9 KiB

P6: Multi-turn conversation validation

Scenario

Three days of conversation (DB troubleshooting → deployment → monitoring), 12 memories plus heuristic paraphrase augmentation.

Cross-session recall: 12/12 (100%)

Query	Day	Result
DB is slow again	Day 1	✓ "missing index on created_at"
How big is the users table?	Day 1	✓ "2.3 million rows"
Who can access the database?	Day 1	✓ "Alice, Bob, Charlie"
What Postgres version?	Day 1	✓ "PostgreSQL 15.2"
How to deploy?	Day 2	✓ "blue-green via GitHub Actions"
How to rollback?	Day 2	✓ "switch load balancer"
Who approves deploys?	Day 2	✓ "Alice or David"
Monitoring dashboard?	Day 3	✓ "grafana.internal"
What alerts?	Day 3	✓ "PagerDuty"
DB slow, what index?	Cross	✓ "created_at"
Deploy logs?	Cross	✓ "Loki"
Database monitoring exporter	Cross	✓ "pg_exporter"

All 12 queries returned similarity=1.0. Hopfield + augmentation is perfect at small scale (12 memories).

Multi-hop

"database is slow" → hop1: "missing index" → hop2: "missing index" → hop3: "2.3 million rows"

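hop2 looped (it pointed back to itself): in the Hebbian W matrix, the strongest association of "missing index" is still itself, because its own outer product contributes the most. Multi-hop needs deduplication: a memory that has already been visited must not take part in the next hop.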
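The loop and the proposed fix can be reproduced on a toy example. The sketch below is not the NuoNuo implementation: the one-hot vectors, the association weights (1.0 for a memory's own outer product, 0.5 for the link to the next memory), and the names `add_outer`, `cosine`, and `multi_hop` are all assumptions chosen for illustration. It only shows that a dominant self-association pins the chain in place, and that excluding visited memories lets it advance.

```python
import math

def add_outer(W, out_vec, in_vec, scale):
    """Hebbian update: W += scale * out_vec * in_vec^T (in place)."""
    for i, o in enumerate(out_vec):
        for j, x in enumerate(in_vec):
            W[i][j] += scale * o * x

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def multi_hop(W, memories, cue, hops, dedupe=True):
    """Follow associations through W, snapping to the closest stored memory.

    With dedupe=True, memories already visited are removed from the
    candidate set of the next hop.
    """
    visited, path, state = set(), [], cue
    for _ in range(hops):
        recalled = [sum(w * s for w, s in zip(row, state)) for row in W]
        candidates = {
            name: vec for name, vec in memories.items()
            if not (dedupe and name in visited)
        }
        if not candidates:
            break  # every memory has been visited
        best = max(candidates, key=lambda n: cosine(recalled, candidates[n]))
        path.append(best)
        visited.add(best)
        state = memories[best]
    return path

# Toy setup (hypothetical vectors and weights).
cue = [1.0, 0.0, 0.0]  # "database is slow"
memories = {
    "missing index": [0.0, 1.0, 0.0],
    "2.3 million rows": [0.0, 0.0, 1.0],
}
W = [[0.0] * 3 for _ in range(3)]
add_outer(W, memories["missing index"], cue, 1.0)                           # cue -> memory
add_outer(W, memories["missing index"], memories["missing index"], 1.0)     # self term
add_outer(W, memories["2.3 million rows"], memories["missing index"], 0.5)  # weaker link
add_outer(W, memories["2.3 million rows"], memories["2.3 million rows"], 1.0)

looping = multi_hop(W, memories, cue, hops=3, dedupe=False)
deduped = multi_hop(W, memories, cue, hops=3, dedupe=True)
```

In this toy setup the chain without deduplication never leaves "missing index", while the deduplicated chain reaches "2.3 million rows" on the second hop and then stops because no unvisited memory remains.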

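Memory conflicts

Two versions of the PostgreSQL version fact were stored (15.2 and 16.1):

Top-1: "Upgraded to 16.1" (sim=1.0) ← the newer version ranks first
Top-2: "version 15.2" (sim=0.0) ← the old version is returned as well

The current behavior is acceptable (both are returned, with the newer one first). A better approach:

Detect an update for the same cue → automatically replace the old memory
Or mark the old memory as "superseded"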
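The second option (marking instead of deleting) could look roughly like the sketch below. It is a hypothetical illustration, not code from the repository: `Memory`, `MemoryStore`, and the exact-string comparison of cues are assumptions. A real system would more likely decide "same cue" from embedding similarity against a threshold.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    cue: str
    value: str
    superseded: bool = False

class MemoryStore:
    """Toy store that flags older memories sharing a cue as superseded."""

    def __init__(self):
        self._items = []

    def add(self, cue, value):
        # Assumption: two memories conflict when their cue strings are equal.
        for item in self._items:
            if item.cue == cue and not item.superseded:
                item.superseded = True
        self._items.append(Memory(cue, value))

    def recall(self, cue, include_superseded=False):
        # Newest first; superseded entries are hidden unless requested.
        return [
            item.value
            for item in reversed(self._items)
            if item.cue == cue and (include_superseded or not item.superseded)
        ]

store = MemoryStore()
store.add("postgres version", "version 15.2")
store.add("postgres version", "Upgraded to 16.1")

current = store.recall("postgres version")
history = store.recall("postgres version", include_superseded=True)
```

Keeping the old entry with a flag preserves history (useful for questions such as "what version were we on before?"), whereas outright replacement keeps the store smaller.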

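To improve

Multi-hop deduplication: exclude already-visited memories from the candidates for the next hop
Memory update detection: a new value for the same cue automatically overwrites the old value
Large-scale validation: 12 memories is a small test; a cross-session test with 100+ memories is needed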
