mirror of
https://github.com/k2-fsa/sherpa-onnx.git
synced 2026-01-09 07:41:06 +08:00
This PR removes the cppjieba dependency from the sherpa-onnx project by replacing its usage with character-based text processing. The main purpose is to simplify the codebase by eliminating the need for external jieba dictionary files and the cppjieba library.

- Replaces jieba-based word segmentation with UTF-8 character-level tokenization
- Removes all references to dict_dir and dictDir parameters across APIs
- Adds a new CharacterLexicon class to replace JiebaLexicon
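
To illustrate what UTF-8 character-level tokenization means in practice, here is a minimal sketch. It is not taken from the PR: the names `Utf8SequenceLength` and `SplitUtf8Characters` are hypothetical, and the actual CharacterLexicon may handle invalid input, punctuation, or ASCII runs differently. The sketch only shows the general idea of splitting a string into one token per UTF-8 encoded character, with no dictionary lookup.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not from the sherpa-onnx sources.
// Returns the byte length of the UTF-8 sequence that starts with `lead`.
// Falls back to 1 for a stray continuation byte or an invalid lead byte,
// so the caller always makes progress.
static std::size_t Utf8SequenceLength(unsigned char lead) {
  if (lead < 0x80) return 1;          // 0xxxxxxx: ASCII
  if ((lead >> 5) == 0x06) return 2;  // 110xxxxx
  if ((lead >> 4) == 0x0E) return 3;  // 1110xxxx
  if ((lead >> 3) == 0x1E) return 4;  // 11110xxx
  return 1;                           // continuation or invalid byte
}

// Hypothetical function: splits `text` into one token per UTF-8 character.
// No word segmentation is attempted and no dictionary files are required.
std::vector<std::string> SplitUtf8Characters(const std::string &text) {
  std::vector<std::string> tokens;
  std::size_t i = 0;
  while (i < text.size()) {
    std::size_t n = Utf8SequenceLength(static_cast<unsigned char>(text[i]));
    if (i + n > text.size()) {
      n = text.size() - i;  // truncated sequence at the end of the input
    }
    tokens.push_back(text.substr(i, n));
    i += n;
  }
  return tokens;
}
```

With this approach, a mixed string such as an ASCII letter followed by two Chinese characters yields three tokens, each of which can be looked up individually in a lexicon. The trade-off, compared with jieba-style segmentation, is that multi-character words are no longer treated as single units.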