TTS-SiliconFlowTTS (#6)

endtower · web-flow · commit ae96523a80f9 · 2025-08-01T12:31:26.000+08:00
* Update intro.md

* Update tts.md

* Update tts.md

* Update tts.md

* Update intro.md

* Update intro.md

* Update tts.md

* Update tts.md

* Update intro.md

* Update intro.md

* Update intro.md

* Update tts.md

* Update tts.md

* Update intro.md

* Update tts.md
diff --git a/docs/intro.md b/docs/intro.md
@@ -64,7 +64,7 @@ sidebar_position: 1
 - 🧠 **广泛的模型支持**：
   - 🤖 **大语言模型 (LLM)**：Ollama、OpenAI（及兼容 API）、Gemini、Claude、Mistral、DeepSeek、智谱、GGUF、LM Studio、vLLM 等。
   - 🎙️ **语音识别 (ASR)**：sherpa-onnx、FunASR、Faster-Whisper、Whisper.cpp、Whisper、Groq Whisper、Azure ASR 等。
-  - 🔊 **语音合成 (TTS)**：sherpa-onnx、pyttsx3、MeloTTS、Coqui-TTS、GPTSoVITS、Bark、CosyVoice、Edge TTS、Fish Audio、Azure TTS、OpenAI TTS (及兼容 API)、SparkTTS 等。
+  - 🔊 **语音合成 (TTS)**：sherpa-onnx、pyttsx3、MeloTTS、Coqui-TTS、GPTSoVITS、Bark、CosyVoice、Edge TTS、Fish Audio、Azure TTS、OpenAI TTS (及兼容 API)、SparkTTS、SiliconFlowTTS 等。
 
 - 🔧 **高度可定制**：
   - ⚙️ **简单的模块配置**：通过简单的配置文件修改，即可切换各种功能模块，无需深入代码。
@@ -76,4 +76,4 @@ sidebar_position: 1
 ## 👥 用户评价
 > 感谢开发者把女朋友开源分享出来让大家一起使用
 >
-> 该女友使用次数已达 10w+
+> 该女友使用次数已达 10w+
diff --git a/docs/user-guide/backend/tts.md b/docs/user-guide/backend/tts.md
@@ -250,3 +250,26 @@ uv pip install fish-audio-sdk
 :::tip
 `conf.yaml` 中默认使用的是 neuro-sama 同款语音
 :::
+
+## SiliconFlow TTS（在线、需 API 密钥）  
+硅基流动提供的在线文本转语音服务，支持自定义音频模型和音色配置。  
+
+### 配置步骤  
+1. **上传音频**：  
+   硅基流动目前提供 `FunAudioLLM/CosyVoice2-0.5B` 等模型，使用前需要在官网上传参考音频，网址如下：
+   https://docs.siliconflow.cn/cn/api-reference/audio/upload-voice  
+
+2. **填写 `conf.yaml` 配置**：  
+   在配置文件的 `siliconflow_tts` 段落中，按以下格式填写参数（示例）：  
+
+```yaml
+siliconflow_tts:
+  api_url: "https://api.siliconflow.cn/v1/audio/speech"  # 服务端点，固定值
+  api_key: "sk-yourkey"  # 官网获取的API密钥
+  default_model: "FunAudioLLM/CosyVoice2-0.5B"  # 音频模型名称（支持列表见官网）
+  default_voice: "speech:Dreamflowers:aaaaaaabvbbbasdas"  # 音色ID，需在官网上传自定义音色后获取
+  sample_rate: 32000  # 输出采样率，声音异常时可尝试调整（如16000、44100）
+  response_format: "mp3"  # 音频格式（mp3/wav等）
+  stream: true  # 是否启用流式传输
+  speed: 1  # 语速（0.5~2.0，1为默认）
+  gain: 0  # 音量增益（-10~10，0为默认）
diff --git a/i18n/en/docusaurus-plugin-content-docs/current/intro.md b/i18n/en/docusaurus-plugin-content-docs/current/intro.md
@@ -61,7 +61,7 @@ This project is currently in active development, with many exciting features com
 - 🧠 **Extensive model support**:
   - 🤖 **Large Language Models (LLM)**: Ollama, OpenAI (and any OpenAI-compatible API), Gemini, Claude, Mistral, DeepSeek, Zhipu AI, GGUF, LM Studio, vLLM, etc.
   - 🎙️ **Automatic Speech Recognition (ASR)**: sherpa-onnx, FunASR, Faster-Whisper, Whisper.cpp, Whisper, Groq Whisper, Azure ASR, etc.
-  - 🔊 **Text-to-Speech (TTS)**: sherpa-onnx, pyttsx3, MeloTTS, Coqui-TTS, GPTSoVITS, Bark, CosyVoice, Edge TTS, Fish Audio, Azure TTS, OpenAI TTS (and compatible APIs), SparkTTS, etc.
+  - 🔊 **Text-to-Speech (TTS)**: sherpa-onnx, pyttsx3, MeloTTS, Coqui-TTS, GPTSoVITS, Bark, CosyVoice, Edge TTS, Fish Audio, Azure TTS, OpenAI TTS (and compatible APIs), SparkTTS, SiliconFlowTTS, etc.
 
 - 🔧 **Highly customizable**:
   - ⚙️ **Simple module configuration**: Switch various functional modules through simple configuration file modifications, without delving into the code.
@@ -72,4 +72,4 @@ This project is currently in active development, with many exciting features com
 ## 👥 User Reviews
 > Thanks to the developer for open-sourcing and sharing the girlfriend for everyone to use
 >
-> This girlfriend has been used over 100,000 times
+> This girlfriend has been used over 100,000 times
diff --git a/i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/tts.md b/i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/tts.md
@@ -255,4 +255,29 @@ Since version `v0.2.5`, `api_key.py` has been deprecated. Please make sure to se
 :::
 :::tip
 The default voice used in `conf.yaml` is the same as neuro-sama
-:::
+:::
+
+## SiliconFlow TTS (Online, API Key Required)  
+An online text-to-speech service provided by SiliconFlow, supporting custom audio models and voice configuration.  
+
+
+### Configuration Steps  
+1. **Upload Reference Audio**:  
+   SiliconFlow currently offers models like `FunAudioLLM/CosyVoice2-0.5B`. To use them, upload reference audio via their official platform:  
+   [https://docs.siliconflow.cn/cn/api-reference/audio/upload-voice](https://docs.siliconflow.cn/cn/api-reference/audio/upload-voice)  
+
+
+2. **Fill in `conf.yaml`**:  
+   In the `siliconflow_tts` section of the configuration file, configure parameters as follows (example):  
+
+```yaml
+siliconflow_tts:
+  api_url: "https://api.siliconflow.cn/v1/audio/speech"  # Service endpoint (fixed value)
+  api_key: "sk-yourkey"  # API key obtained from SiliconFlow's official website
+  default_model: "FunAudioLLM/CosyVoice2-0.5B"  # Audio model name (check official docs for supported models)
+  default_voice: "speech:Dreamflowers:aaaaaaabvbbbasdas"  # Voice ID (generated after uploading custom voice on the official site)
+  sample_rate: 32000  # Output sample rate; adjust if audio is distorted (e.g., 16000, 44100)
+  response_format: "mp3"  # Audio format (e.g., mp3, wav)
+  stream: true  # Enable streaming mode
+  speed: 1  # Speaking speed (range: 0.5–2.0; 1 = default)
+  gain: 0  # Volume gain (range: -10–10; 0 = default)
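
For reviewers who want to sanity-check the values in the `siliconflow_tts` fragment, the following is a minimal Python sketch of how those settings might map onto an HTTP request to the configured endpoint. It is not the project's implementation: `build_speech_request` is a hypothetical helper, and the JSON field names (`model`, `input`, `voice`, and so on) are assumptions that mirror the config keys and should be checked against SiliconFlow's API reference. The key and voice ID are the placeholders from the example config.

```python
import json
import urllib.request

# Values copied from the conf.yaml example; api_key and default_voice are placeholders.
cfg = {
    "api_url": "https://api.siliconflow.cn/v1/audio/speech",
    "api_key": "sk-yourkey",
    "default_model": "FunAudioLLM/CosyVoice2-0.5B",
    "default_voice": "speech:Dreamflowers:aaaaaaabvbbbasdas",
    "sample_rate": 32000,
    "response_format": "mp3",
    "stream": True,
    "speed": 1,
    "gain": 0,
}


def build_speech_request(cfg, text):
    """Hypothetical helper: turn the config dict into a POST request object.

    The payload field names are assumed, not taken from the project's code.
    """
    payload = {
        "model": cfg["default_model"],
        "input": text,
        "voice": cfg["default_voice"],
        "sample_rate": cfg["sample_rate"],
        "response_format": cfg["response_format"],
        "stream": cfg["stream"],
        "speed": cfg["speed"],
        "gain": cfg["gain"],
    }
    return urllib.request.Request(
        cfg["api_url"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {cfg['api_key']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_speech_request(cfg, "Hello from the TTS config check.")

# Sending is left commented out because it needs a real key and network access:
# with urllib.request.urlopen(req) as resp, open("out.mp3", "wb") as f:
#     f.write(resp.read())
```

Building the request separately from sending it keeps the mapping from config to payload easy to inspect without spending API quota.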