Skip to content

[refactor] support building fst online#230

Merged
xingchensong merged 1 commit intomasterfrom
xcsong-refactor
Jun 5, 2024
Merged

[refactor] support building fst online#230
xingchensong merged 1 commit intomasterfrom
xcsong-refactor

Conversation

@xingchensong
Copy link
Member

@xingchensong xingchensong commented Jun 5, 2024

Now we can building fst online !!!

pip install wetextprocessing
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright [2024-06-05] <sxc19@mails.tsinghua.edu.cn, Xingchen Song>

from itn.chinese.inverse_normalizer import InverseNormalizer
from tn.chinese.normalizer import Normalizer as ZhNormalizer
from tn.english.normalizer import Normalizer as EnNormalizer

zh_tn_text = "你好 WeTextProcessing 1.0,船新版本儿,船新体验儿,简直666,9和10"
zh_itn_text = "你好 WeTextProcessing 一点零,船新版本儿,船新体验儿,简直六六六,九和六"
en_tn_text = "Hello WeTextProcessing 1.0, life is short, just use wetext, 666, 9 and 10"
zh_tn_model = ZhNormalizer(remove_erhua=True, overwrite_cache=True)
zh_itn_model = InverseNormalizer(enable_0_to_9=False, overwrite_cache=True)
en_tn_model = EnNormalizer(overwrite_cache=True)
print("中文 TN (去除儿化音,重新在线构图):\n\t{} => {}".format(zh_tn_text, zh_tn_model.normalize(zh_tn_text)))
print("中文ITN (小于10的单独数字不转换,重新在线构图):\n\t{} => {}".format(zh_itn_text, zh_itn_model.normalize(zh_itn_text)))
print("英文 TN (暂时还没有可控的选项,后面会加...):\n\t{} => {}\n".format(en_tn_text, en_tn_model.normalize(en_tn_text)))

zh_tn_model = ZhNormalizer(overwrite_cache=False)
zh_itn_model = InverseNormalizer(overwrite_cache=False)
en_tn_model = EnNormalizer(overwrite_cache=False)
print("中文 TN (复用之前编译好的图):\n\t{} => {}".format(zh_tn_text, zh_tn_model.normalize(zh_tn_text)))
print("中文ITN (复用之前编译好的图):\n\t{} => {}".format(zh_itn_text, zh_itn_model.normalize(zh_itn_text)))
print("英文 TN (复用之前编译好的图):\n\t{} => {}\n".format(en_tn_text, en_tn_model.normalize(en_tn_text)))

zh_tn_model = ZhNormalizer(remove_erhua=False, overwrite_cache=True)
zh_itn_model = InverseNormalizer(enable_0_to_9=True, overwrite_cache=True)
print("中文 TN (不去除儿化音,重新在线构图):\n\t{} => {}".format(zh_tn_text, zh_tn_model.normalize(zh_tn_text)))
print("中文ITN (小于10的单独数字也进行转换,重新在线构图):\n\t{} => {}\n".format(zh_itn_text, zh_itn_model.normalize(zh_itn_text)))

image

@xingchensong xingchensong merged commit 1c3018d into master Jun 5, 2024
@xingchensong xingchensong deleted the xcsong-refactor branch June 5, 2024 08:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant