Commit ae96523

TTS-SiliconFlowTTS (#6)
* Update intro.md
* Update tts.md
* Update tts.md
* Update tts.md
* Update intro.md
* Update intro.md
* Update tts.md
* Update tts.md
* Update intro.md
* Update intro.md
* Update intro.md
* Update tts.md
* Update tts.md
* Update intro.md
* Update tts.md
1 parent ba04616 commit ae96523

4 files changed, 53 additions and 5 deletions

docs/intro.md

Lines changed: 2 additions & 2 deletions
@@ -64,7 +64,7 @@ sidebar_position: 1
 - 🧠 **Extensive model support**
 - 🤖 **Large Language Models (LLM)**: Ollama, OpenAI (and compatible APIs), Gemini, Claude, Mistral, DeepSeek, Zhipu AI, GGUF, LM Studio, vLLM, etc.
 - 🎙️ **Automatic Speech Recognition (ASR)**: sherpa-onnx, FunASR, Faster-Whisper, Whisper.cpp, Whisper, Groq Whisper, Azure ASR, etc.
-- 🔊 **Text-to-Speech (TTS)**: sherpa-onnx, pyttsx3, MeloTTS, Coqui-TTS, GPTSoVITS, Bark, CosyVoice, Edge TTS, Fish Audio, Azure TTS, OpenAI TTS (and compatible APIs), SparkTTS
+- 🔊 **Text-to-Speech (TTS)**: sherpa-onnx, pyttsx3, MeloTTS, Coqui-TTS, GPTSoVITS, Bark, CosyVoice, Edge TTS, Fish Audio, Azure TTS, OpenAI TTS (and compatible APIs), SparkTTS, SiliconFlowTTS, etc.
 
 - 🔧 **Highly customizable**
 - ⚙️ **Simple module configuration**: Switch between functional modules by editing a simple configuration file, without digging into the code.
@@ -76,4 +76,4 @@ sidebar_position: 1
 ## 👥 User Reviews
 > Thanks to the developer for open-sourcing and sharing the girlfriend for everyone to use
 >
-> This girlfriend has been used over 100,000 times
+> This girlfriend has been used over 100,000 times

docs/user-guide/backend/tts.md

Lines changed: 23 additions & 0 deletions
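@@ -250,3 +250,26 @@ uv pip install fish-audio-sdk
 :::tip
 The default voice used in `conf.yaml` is the same as neuro-sama
 :::
+
+## SiliconFlow TTS (Online, API Key Required)
+An online text-to-speech service provided by SiliconFlow, with support for custom audio models and voice configuration.
+
+### Configuration Steps
+1. **Upload audio**:
+SiliconFlow currently offers FunAudioLLM/CosyVoice2-0.5B; you need to upload reference audio on the official site at the following address:
+https://docs.siliconflow.cn/cn/api-reference/audio/upload-voice
+
+2. **Fill in `conf.yaml`**:
+In the `siliconflow_tts` section of the configuration file, fill in the parameters in the following format (example):
+
+```yaml
+siliconflow_tts:
+  api_url: "https://api.siliconflow.cn/v1/audio/speech" # Service endpoint, fixed value
+  api_key: "sk-yourkey" # API key obtained from the official site
+  default_model: "FunAudioLLM/CosyVoice2-0.5B" # Audio model name (see the official site for the supported list)
+  default_voice: "speech:Dreamflowers:aaaaaaabvbbbasdas" # Voice ID, obtained after uploading a custom voice on the official site
+  sample_rate: 32000 # Output sample rate; try adjusting it if the audio sounds wrong (e.g. 16000, 44100)
+  response_format: "mp3" # Audio format (mp3, wav, etc.)
+  stream: true # Whether to enable streaming
+  speed: 1 # Speaking speed (0.5 to 2.0; 1 is the default)
+  gain: 0 # Volume gain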

i18n/en/docusaurus-plugin-content-docs/current/intro.md

Lines changed: 2 additions & 2 deletions
@@ -61,7 +61,7 @@ This project is currently in active development, with many exciting features com
 - 🧠 **Extensive model support**:
 - 🤖 **Large Language Models (LLM)**: Ollama, OpenAI (and any OpenAI-compatible API), Gemini, Claude, Mistral, DeepSeek, Zhipu AI, GGUF, LM Studio, vLLM, etc.
 - 🎙️ **Automatic Speech Recognition (ASR)**: sherpa-onnx, FunASR, Faster-Whisper, Whisper.cpp, Whisper, Groq Whisper, Azure ASR, etc.
-- 🔊 **Text-to-Speech (TTS)**: sherpa-onnx, pyttsx3, MeloTTS, Coqui-TTS, GPTSoVITS, Bark, CosyVoice, Edge TTS, Fish Audio, Azure TTS, OpenAI TTS (and compatible APIs), SparkTTS, etc.
+- 🔊 **Text-to-Speech (TTS)**: sherpa-onnx, pyttsx3, MeloTTS, Coqui-TTS, GPTSoVITS, Bark, CosyVoice, Edge TTS, Fish Audio, Azure TTS, OpenAI TTS (and compatible APIs), SparkTTS, SiliconFlowTTS, etc.
 
 - 🔧 **Highly customizable**:
 - ⚙️ **Simple module configuration**: Switch various functional modules through simple configuration file modifications, without delving into the code.
@@ -72,4 +72,4 @@ This project is currently in active development, with many exciting features com
 ## 👥 User Reviews
 > Thanks to the developer for open-sourcing and sharing the girlfriend for everyone to use
 >
-> This girlfriend has been used over 100,000 times
+> This girlfriend has been used over 100,000 times

i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/tts.md

Lines changed: 26 additions & 1 deletion
@@ -255,4 +255,29 @@ Since version `v0.2.5`, `api_key.py` has been deprecated. Please make sure to se
 :::
 :::tip
 The default voice used in `conf.yaml` is the same as neuro-sama
-:::
+:::
+
+## SiliconFlow TTS (Online, API Key Required)
+An online text-to-speech service provided by SiliconFlow, supporting custom audio models and voice configuration.
+
+
+### Configuration Steps
+1. **Upload Reference Audio**:
+SiliconFlow currently offers models like `FunAudioLLM/CosyVoice2-0.5B`. To use them, upload reference audio via their official platform:
+[https://docs.siliconflow.cn/cn/api-reference/audio/upload-voice](https://docs.siliconflow.cn/cn/api-reference/audio/upload-voice)
+
+
+2. **Fill in `conf.yaml`**:
+In the `siliconflow_tts` section of the configuration file, configure parameters as follows (example):
+
+```yaml
+siliconflow_tts:
+  api_url: "https://api.siliconflow.cn/v1/audio/speech" # Service endpoint (fixed value)
+  api_key: "sk-yourkey" # API key obtained from SiliconFlow's official website
+  default_model: "FunAudioLLM/CosyVoice2-0.5B" # Audio model name (check official docs for supported models)
+  default_voice: "speech:Dreamflowers:aaaaaaabvbbbasdas" # Voice ID (generated after uploading custom voice on the official site)
+  sample_rate: 32000 # Output sample rate; adjust if audio is distorted (e.g., 16000, 44100)
+  response_format: "mp3" # Audio format (e.g., mp3, wav)
+  stream: true # Enable streaming mode
+  speed: 1 # Speaking speed (range: 0.5–2.0; 1 = default)
+  gain: 0 # Volume gain (range: -10–10; 0 = default)
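
To make the `siliconflow_tts` settings above more concrete, here is a minimal Python sketch of how they might map onto a single request to the configured endpoint. It is not the project's actual implementation: the helper names `build_request` and `synthesize` are hypothetical, and the JSON field names (`model`, `input`, `voice`, and so on) are an assumption based on the endpoint following the OpenAI-style speech API. Only the config keys, the example values, and the `speed` and `gain` ranges come from the documentation in this commit.

```python
import json
import urllib.request

# Values copied from the example `siliconflow_tts` section in the diff above.
CONFIG = {
    "api_url": "https://api.siliconflow.cn/v1/audio/speech",
    "api_key": "sk-yourkey",
    "default_model": "FunAudioLLM/CosyVoice2-0.5B",
    "default_voice": "speech:Dreamflowers:aaaaaaabvbbbasdas",
    "sample_rate": 32000,
    "response_format": "mp3",
    "stream": True,
    "speed": 1,
    "gain": 0,
}


def build_request(text, config):
    """Build (but do not send) a POST request for one utterance.

    The JSON field names below are assumed, not taken from the project code.
    """
    # Ranges as documented in the config comments.
    if not 0.5 <= config["speed"] <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -10 <= config["gain"] <= 10:
        raise ValueError("gain must be between -10 and 10")

    payload = {
        "model": config["default_model"],
        "voice": config["default_voice"],
        "input": text,
        "response_format": config["response_format"],
        "sample_rate": config["sample_rate"],
        "stream": config["stream"],
        "speed": config["speed"],
        "gain": config["gain"],
    }
    return urllib.request.Request(
        config["api_url"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {config['api_key']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, config, out_path):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_request(text, config)) as resp:
        with open(out_path, "wb") as audio_file:
            audio_file.write(resp.read())
```

Calling `synthesize("Hello", CONFIG, "hello.mp3")` with a real API key and an uploaded voice ID would be the intended usage. Keeping request construction separate from the network call makes the config-to-payload mapping easy to inspect without sending anything.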
