- gen_voice: IndexTTS2 voice cloning via tools/gen_voice script,
reference audio cached on server to avoid re-upload
- Message timestamps: created_at column in messages table, prepended to
content in API calls so LLM sees message times
- Image understanding: photos converted to base64 multimodal content
for vision-capable models
- Group chat: independent session contexts per chat_id, sendMessageDraft
disabled in groups (private chat only)
- Voice transcription: whisper service integration, transcribed text
injected with a [语音消息] ("voice message") prefix
- Integration tests marked #[ignore] (require external services)
- Reference voice asset: assets/ref_voice.mp3
- .gitignore: target/, noc.service, config/state/db files
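
Illustrative sketch only (not the code in this commit): one way the
timestamp and voice prefixes described above could be composed before a
message is sent to the LLM. The helper names and the exact bracket and
spacing format are assumptions; only the created_at column and the
[语音消息] prefix come from the notes above.

```rust
/// Hypothetical helper: prefix a message body with its created_at value
/// so the model can see when the message was sent.
/// The "[timestamp] content" layout is an assumption.
fn with_timestamp(created_at: &str, content: &str) -> String {
    format!("[{created_at}] {content}")
}

/// Hypothetical helper: mark text produced by the whisper service as a
/// transcribed voice message using the [语音消息] prefix.
fn voice_message(transcript: &str) -> String {
    format!("[语音消息] {transcript}")
}
```

Under these assumptions, a transcribed voice message would pass through
voice_message first and with_timestamp second, so the timestamp stays
the outermost prefix.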