Commit Graph

8 Commits

Author SHA1 Message Date
Fam Zheng
ec1bd7cb25 add gen_voice tool, message timestamps, image multimodal, group chat, whisper STT
- gen_voice: IndexTTS2 voice cloning via tools/gen_voice script, ref audio
  cached on server to avoid re-upload
- Message timestamps: created_at column in messages table, prepended to
  content in API calls so LLM sees message times
- Image understanding: photos converted to base64 multimodal content
  for vision-capable models
- Group chat: independent session contexts per chat_id, sendMessageDraft
  disabled in groups (private chat only)
- Voice transcription: whisper service integration, transcribed text
  injected as [语音消息] prefix
- Integration tests marked #[ignore] (require external services)
- Reference voice asset: assets/ref_voice.mp3
- .gitignore: target/, noc.service, config/state/db files
2026-04-09 20:12:15 +01:00
Fam Zheng
9d5dd4eb16 add cc passthrough, diag tools dump, and search guidance in system prompt
- "cc" prefix messages bypass LLM backend and history, directly invoke claude -p
- diag command now dumps all registered tools and sends as .md file
- system prompt instructs LLM to use spawn_agent for search tasks
- spawn_agent tool description updated to mention search/browser capabilities
2026-04-09 17:59:48 +01:00
Fam Zheng
128f2481c0 add tool calling, SQLite persistence, group chat, image vision, voice transcription
Major features:
- OpenAI function calling with tool call loop (streaming SSE parsing)
- Built-in tools: spawn_agent (async claude -p), agent_status, kill_agent,
  update_scratch, send_file
- Script-based tool discovery: tools/ dir with --schema convention
- Feishu todo management script (tools/manage_todo)
- SQLite persistence: conversations, messages, config, scratch_area tables
- Sliding window context (100 msgs, slide 50, auto-summarize)
- Conversation summary generation via LLM on window slide
- Group chat support with independent session contexts
- Image understanding: multimodal vision input (base64 to API)
- Voice transcription via faster-whisper Docker service
- Configurable persona stored in DB
- diag command for session diagnostics
- System prompt restructured: persona + tool instructions separated
- RUST_BACKTRACE=1 in service, clippy in deploy pipeline
- .gitignore for config/state/db files
2026-04-09 16:38:28 +01:00
Fam Zheng
84ba209b3f add OpenAI-compatible backend, markdown rendering, and sendMessageDraft fix
- Configurable backend: claude (CLI) or openai (API), selected in config.yaml
- OpenAI streaming via SSE with conversation history in memory
- Session isolation: config name included in session UUID
- Markdown to Telegram HTML conversion (pulldown-cmark) for final messages
- Fix sendMessageDraft: skip cursor to preserve monotonic text growth,
  skip empty content chunks from SSE stream
- Simplify Makefile: single deploy target
2026-04-09 10:23:50 +01:00
Fam Zheng
eba7d89006 use sendMessageDraft for native streaming output, fallback to editMessageText
Telegram Bot API 9.3+ sendMessageDraft provides smooth streaming text
rendering without the flickering of repeated edits. Falls back to
editMessageText automatically if the API is unavailable (e.g. older
clients or group chats). Also reduces edit interval from 5s to 3s and
uses 1s interval for draft mode.
2026-04-09 09:35:55 +01:00
Fam Zheng
765ff2c51d support audio/voice/video/video_note media types and improve subprocess error diagnostics 2026-04-09 09:15:46 +01:00
Fam Zheng
4d88e80f1c add streaming responses, file transfer, remote deploy
- Streaming: use claude --output-format stream-json, edit TG message
  every 5s with progress, show tool use status during execution,
  ◎ cursor indicator while processing
- File transfer: download user uploads to ~/incoming/, scan
  ~/outgoing/{sid}/ for new files after claude completes
- Error handling: wrap post-auth logic in handle_inner, all errors
  reply to user instead of silently failing
- Remote deploy: make deploy-hera via SSH, generate service from
  template with dynamic PATH/REPO
- Service: binary installed to ~/bin/noc, WorkingDirectory=%h
- Invoke claude directly instead of ms wrapper
- Session state persisted to disk across restarts
2026-04-05 08:20:32 +01:00
Fam Zheng
db8ff94f7c init: telegram bot bridging messages to claude sessions
Async Rust bot (teloxide + tokio) that:
- Authenticates users per chat with a passphrase (resets daily at 5am)
- Generates deterministic UUID v5 session IDs from chat_id + date
- Pipes messages to `claude -p --session-id/--resume <uuid>`
- Persists auth and session state to disk across restarts
- Deploys as systemd --user service via `make deploy`
2026-04-05 06:56:46 +01:00