This PR adds support for Wenet non-streaming CTC models to sherpa-onnx by introducing the SherpaOnnxOfflineWenetCtcModelConfig struct and integrating it across all language bindings and APIs. The implementation follows the same pattern as other CTC model types like Zipformer CTC.
- Introduces SherpaOnnxOfflineWenetCtcModelConfig struct with a single model field for the ONNX model path
- Adds the new config to SherpaOnnxOfflineModelConfig and updates all language bindings (C++, Pascal, Kotlin, Java, Go, C#, Swift, JavaScript, etc.)
- Provides comprehensive examples and tests across all supported platforms and languages
Introduction
This directory contains examples for the Java API of sherpa-onnx.
Usage
Non-streaming speaker diarization
./run-offline-speaker-diarization.sh
Streaming speech recognition
./run-streaming-asr-from-mic-transducer.sh
./run-streaming-decode-file-ctc.sh
./run-streaming-decode-file-ctc-hlg.sh
./run-streaming-decode-file-paraformer.sh
./run-streaming-decode-file-transducer.sh
Non-streaming speech recognition
./run-non-streaming-decode-file-dolphin-ctc.sh
./run-non-streaming-decode-file-fire-red-asr.sh
./run-non-streaming-decode-file-moonshine.sh
./run-non-streaming-decode-file-nemo-canary.sh
./run-non-streaming-decode-file-nemo.sh
./run-non-streaming-decode-file-paraformer.sh
./run-non-streaming-decode-file-sense-voice.sh
./run-non-streaming-decode-file-tele-speech-ctc.sh
./run-non-streaming-decode-file-transducer-hotwords.sh
./run-non-streaming-decode-file-transducer.sh
./run-non-streaming-decode-file-whisper-multiple.sh
./run-non-streaming-decode-file-whisper.sh
./run-non-streaming-decode-file-zipformer-ctc.sh
Non-streaming speech recognition with homophone replacer
./run-non-streaming-decode-file-sense-voice-with-hr.sh
Non-streaming text-to-speech
./run-non-streaming-tts-piper-en.sh
./run-non-streaming-tts-coqui-de.sh
./run-non-streaming-tts-vits-zh.sh
Non-streaming text-to-speech (play audio while it is being generated)
./run-non-streaming-tts-piper-en-with-callback.sh
Spoken language identification
./run-spoken-language-identification-whisper.sh
Add punctuation to text
The punctuation model supports both English and Chinese.
./run-add-punctuation-zh-en.sh
Audio tagging
./run-audio-tagging-zipformer-from-file.sh
./run-audio-tagging-ced-from-file.sh
Speaker identification
./run-speaker-identification.sh
VAD with a microphone
./run-vad-from-mic.sh
VAD with a microphone + Non-streaming SenseVoice for speech recognition
./run-vad-from-mic-non-streaming-sense-voice.sh
VAD with a microphone + Non-streaming Paraformer for speech recognition
./run-vad-from-mic-non-streaming-paraformer.sh
VAD with a microphone + Non-streaming Whisper tiny.en for speech recognition
./run-vad-from-mic-non-streaming-whisper.sh
VAD (Remove silence)
./run-vad-remove-slience.sh
./run-ten-vad-remove-slience.sh
VAD + Non-streaming Dolphin CTC for speech recognition
./run-vad-non-streaming-dolphin-ctc.sh
VAD + Non-streaming SenseVoice for speech recognition
./run-vad-non-streaming-sense-voice.sh
VAD + Non-streaming Paraformer for speech recognition
./run-vad-non-streaming-paraformer.sh
Keyword spotter
./run-kws-from-file.sh
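Running an example
Every entry above is a shell script invoked from this directory. The snippet below is a minimal sketch of a wrapper that checks a script before running it. The function name run_example is hypothetical and not part of sherpa-onnx, and the sketch assumes the scripts sit in the current directory and need the executable bit set.

```shell
# Hypothetical helper (not part of sherpa-onnx): run one example script
# from the current directory after checking that it exists and is executable.
run_example() {
  script="${1#./}"   # accept both "name.sh" and "./name.sh"
  if [ ! -f "$script" ]; then
    echo "not found: $script" >&2
    return 1
  fi
  if [ ! -x "$script" ]; then
    echo "not executable: $script (try: chmod +x $script)" >&2
    return 1
  fi
  "./$script"
}

# Example usage (assumes the current directory is this examples directory):
#   run_example run-kws-from-file.sh
```

The check only reports a missing or non-executable script early; it does not verify that Java or any required model files are available.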