Documentation is available at http://bpraneeth.com/docs/deeptranslit
from deeptranslit import DeepTranslit
transliterator = DeepTranslit('hindi')
# Currently hindi (hi) is the only supported language.
transliterator.transliterate('mai mbbs complete karliya')
[('मैं एमबीबीएस कम्प्लीट करलिया', 4), ('मैं एमबीबीएस कंप्लीट करलिया', 4), ('मैं एमबीबी कम्प्लीट करलिया', 4)]
- The dataset is created from various sources (translation corpora (http://www.cfilt.iitb.ac.in/iitb_parallel/), wikipedia parallel titles (extracted locally using some heuristics) etc..).
- Transliteration for low resource languages is not an easy task and will never be perfect. This module is still in beta and not perfect.
- Add local caching for faster transliteration.
- Support for more languages.