add gen_voice tool, message timestamps, image multimodal, group chat, whisper STT

euphon/noc

- gen_voice: IndexTTS2 voice cloning via tools/gen_voice script, ref audio
  cached on server to avoid re-upload
- Message timestamps: created_at column in messages table, prepended to
  content in API calls so LLM sees message times
- Image understanding: photos converted to base64 multimodal content
  for vision-capable models
- Group chat: independent session contexts per chat_id, sendMessageDraft
  disabled in groups (private chat only)
- Voice transcription: whisper service integration, transcribed text
  injected with a [语音消息] ("voice message") prefix
- Integration tests marked #[ignore] (require external services)
- Reference voice asset: assets/ref_voice.mp3
- .gitignore: target/, noc.service, config/state/db files
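
A minimal sketch of how the timestamp, voice-prefix, and per-chat session
items above could fit together. It is not the code from this commit: the
names StoredMessage, Sessions, with_timestamp, and voice_text are
hypothetical, and the bracketed timestamp layout and the space after the
[语音消息] prefix are assumptions. Only the created_at column, the chat_id
key, and the prefix text come from the commit message.

```rust
use std::collections::HashMap;

/// One stored message. `created_at` mirrors the column named in the commit
/// message; the struct itself and the string timestamp format are assumptions.
#[derive(Debug, Clone)]
pub struct StoredMessage {
    pub created_at: String,
    pub content: String,
}

/// Prepend the timestamp so the model can see when a message was sent.
/// The "[timestamp] content" layout is an assumption.
pub fn with_timestamp(msg: &StoredMessage) -> String {
    format!("[{}] {}", msg.created_at, msg.content)
}

/// Mark transcribed speech with the `[语音消息]` ("voice message") prefix.
/// The separating space is an assumption.
pub fn voice_text(transcript: &str) -> String {
    format!("[语音消息] {}", transcript.trim())
}

/// Independent conversation history per chat, keyed by `chat_id`.
#[derive(Default)]
pub struct Sessions {
    by_chat: HashMap<i64, Vec<StoredMessage>>,
}

impl Sessions {
    /// Append a message to the history of one chat, creating it if needed.
    pub fn push(&mut self, chat_id: i64, msg: StoredMessage) {
        self.by_chat.entry(chat_id).or_default().push(msg);
    }

    /// Build the message contents sent to the model for one chat only.
    pub fn api_contents(&self, chat_id: i64) -> Vec<String> {
        self.by_chat
            .get(&chat_id)
            .map(|msgs| msgs.iter().map(with_timestamp).collect::<Vec<String>>())
            .unwrap_or_default()
    }
}
```

Keying the history by chat_id keeps a group conversation from leaking into
a private one, and formatting the timestamp at call time leaves the stored
content unchanged.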

Fam Zheng

2026-04-09 20:12:15 +01:00

parent 9d5dd4eb16

commit ec1bd7cb25

6 changed files with 370 additions and 54 deletions

.gitignore vendored

@@ -3,3 +3,6 @@ config.*.yaml
 state.json
 state.*.json
 *.db
 target/
 noc.service
