25 Commits

Author SHA1 Message Date
Fangjun Kuang
aecc39418d
Fix building wheels (#2619) 2025-09-22 16:52:55 +08:00
Fangjun Kuang
7e42ba2c0c
Add various language bindings for Wenet non-streaming CTC models (#2584)
This PR adds support for Wenet non-streaming CTC models to sherpa-onnx by introducing the SherpaOnnxOfflineWenetCtcModelConfig struct and integrating it across all language bindings and APIs. The implementation follows the same pattern as other CTC model types like Zipformer CTC.

- Introduces SherpaOnnxOfflineWenetCtcModelConfig struct with a single model field for the ONNX model path
- Adds the new config to SherpaOnnxOfflineModelConfig and updates all language bindings (C++, Pascal, Kotlin, Java, Go, C#, Swift, JavaScript, etc.)
- Provides comprehensive examples and tests across all supported platforms and languages
2025-09-10 18:52:18 +08:00
Fangjun Kuang
686b909e2f
Add various language bindings for streaming T-one Russian ASR models (#2576)
This PR adds support for streaming T-one Russian ASR models across various language bindings in the sherpa-onnx library. The changes enable T-one CTC (Connectionist Temporal Classification) model integration by adding new configuration structures and example implementations.

- Introduces OnlineToneCtcModelConfig structures across all language bindings (C, C++, Swift, Java, Kotlin, Go, etc.)
- Adds T-one CTC model support to WASM implementations for both ASR and keyword spotting
- Provides comprehensive example implementations demonstrating T-one model usage in multiple programming languages
2025-09-09 16:51:18 +08:00
Aruxxxi
8fa9f8eac9
feat: add punctuation C++ API (#2510)
Co-authored-by: Aruxxxi <xiangcl@zhisuan.com>
2025-08-19 17:41:35 +08:00
Fangjun Kuang
e6bfab71ad
Add CXX API for KittenTTS (#2469) 2025-08-08 15:45:20 +08:00
Fangjun Kuang
e2b2d5ea57
Add CXX examples for NeMo TDT ASR. (#2363)
# New Features
- Added new example programs demonstrating streaming speech recognition from a microphone using Parakeet-TDT CTC and Zipformer Transducer models with voice activity detection.
- These examples support microphone input via PortAudio and display recognized text incrementally.

# Bug Fixes
- Improved error handling and logic when opening microphone devices in several example programs for more reliable device initialization.

# Chores
- Updated build configuration to include new executable examples when PortAudio support is enabled.
2025-07-09 18:30:42 +08:00
Fangjun Kuang
df4615ca1d
Add C/CXX/JavaScript API for NeMo Canary models (#2357)
This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs 
by adding new Canary configuration structures, updating bindings, extending examples,
and enhancing CI workflows.

- Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS).
- Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime.
- Update examples and CI scripts to demonstrate and test NeMo Canary model usage.
2025-07-07 23:38:04 +08:00
Fangjun Kuang
3bf986d08d
Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across 
multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
fdda292d5a
Add ALSA-based streaming ASR example for SenseVoice. (#2207) 2025-05-13 19:08:09 +08:00
Fangjun Kuang
b269e5cccc
Add C++ example for real-time ASR with nvidia/parakeet-tdt-0.6b-v2. (#2201) 2025-05-11 16:30:38 +08:00
Fangjun Kuang
028b8f2718
Add C++ example for streaming ASR with SenseVoice. (#2199) 2025-05-11 00:23:32 +08:00
Fangjun Kuang
e51c37eb2f
Add C and CXX API for homophone replacer (#2156) 2025-04-27 22:09:13 +08:00
Fangjun Kuang
da4aad1189
Add C and CXX API for Dolphin CTC models (#2088) 2025-04-02 21:54:20 +08:00
Fangjun Kuang
0703bc1b86
Add CXX API for VAD (#2077) 2025-04-01 14:51:43 +08:00
niansa/tuxifan
9d23606ee6
Allow building repository as CMake subdirectory (#2059)
* Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as subdirectory

* Also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in c/cxx api examples

* Only build examples by default when not building as subdirectory

* Do not suggest building binaries either

---------

Co-authored-by: user <user@mail.tld>
2025-03-29 06:27:59 +08:00
Fangjun Kuang
802119db17
Add CXX API for speech enhancement GTCRN models (#1986) 2025-03-11 17:07:52 +08:00
Fangjun Kuang
1d49dd2fb0
Add CXX API for FireRedAsr (#1872) 2025-02-17 11:46:13 +08:00
Fangjun Kuang
d815204774
Add CXX API for Kokoro TTS 1.0 (#1802) 2025-02-07 14:51:49 +08:00
Fangjun Kuang
8b989a851c
Fix keyword spotting. (#1689)
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
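Why resetting the stream matters can be illustrated with a toy sketch. Everything below (`ToyStream`, `Detected`, `CountDetections`) is hypothetical and is not the sherpa-onnx keyword spotting API; it only assumes that a detector keeps reporting a keyword for as long as the keyword remains in the stream's buffered state.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a keyword-spotting stream: it simply buffers
// the characters it has seen so far.
struct ToyStream {
  std::string buffer;
  void Accept(char c) { buffer.push_back(c); }
  void Reset() { buffer.clear(); }
};

// Hypothetical detector: fires whenever the buffered text contains the keyword.
bool Detected(const ToyStream &s, const std::string &keyword) {
  return s.buffer.find(keyword) != std::string::npos;
}

// Feeds the input one character at a time and counts detections.
// When reset_after_detection is true, the stream is cleared right after
// each hit, so one spoken keyword yields exactly one detection.
int CountDetections(const std::string &input, const std::string &keyword,
                    bool reset_after_detection) {
  assert(!keyword.empty());
  ToyStream stream;
  int count = 0;
  for (char c : input) {
    stream.Accept(c);
    if (Detected(stream, keyword)) {
      ++count;
      if (reset_after_detection) {
        stream.Reset();
      }
    }
  }
  return count;
}
```

In this toy model, skipping the reset makes the detector fire again on every later character, because the old keyword is still in the buffer.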
Fangjun Kuang
af671e2b63
Add C API for Kokoro TTS models (#1717) 2025-01-16 15:07:26 +08:00
Fangjun Kuang
648903834b
Add CXX API for MatchaTTS models (#1676) 2025-01-03 14:16:36 +08:00
Fangjun Kuang
c5205f08bf
Add an example for computing the RTF of streaming ASR. (#1501) 2024-11-01 11:40:13 +08:00
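For reference, the real-time factor (RTF) is usually defined as processing time divided by audio duration. The helper below is a minimal sketch of that arithmetic; the function name `ComputeRtf` and its parameters are illustrative and are not taken from the example added in this commit.

```cpp
#include <cassert>
#include <cstdint>

// Real-time factor: seconds spent decoding divided by seconds of audio.
// A value below 1 means decoding runs faster than real time.
double ComputeRtf(double elapsed_seconds, int64_t num_samples,
                  int32_t sample_rate) {
  assert(num_samples > 0 && sample_rate > 0);
  double audio_seconds = static_cast<double>(num_samples) / sample_rate;
  return elapsed_seconds / audio_seconds;
}
```

For instance, decoding one second of 16 kHz audio (16000 samples) in 0.5 seconds gives an RTF of 0.5.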
Fangjun Kuang
2ca2985d04
Add C and C++ API for Moonshine models (#1476) 2024-10-26 23:24:46 +08:00
Fangjun Kuang
ceb69ebd94
Add C++ API for non-streaming ASR (#1456) 2024-10-23 16:40:12 +08:00
Fangjun Kuang
effd5ef2be
Add C++ API for streaming ASR. (#1455)
It is a wrapper around the C API.
2024-10-23 12:07:43 +08:00
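A common way to wrap a C API in C++ is an RAII class that owns the C handle and releases it in its destructor. The sketch below shows that general pattern with a toy C-style API; `ToyRecognizer`, `ToyCreate`, `ToyDestroy`, and the `Recognizer` class are hypothetical and do not reproduce the actual sherpa-onnx declarations.

```cpp
#include <cassert>
#include <utility>

// --- Toy C-style API (hypothetical stand-in, not the sherpa-onnx C API) ---
struct ToyRecognizer {
  int id;
};

static int g_live_handles = 0;  // tracks handles not yet destroyed

const ToyRecognizer *ToyCreate(int id) {
  ++g_live_handles;
  return new ToyRecognizer{id};
}

void ToyDestroy(const ToyRecognizer *p) {
  if (p) {
    --g_live_handles;
    delete p;
  }
}

int LiveHandles() { return g_live_handles; }

// --- C++ wrapper: owns the handle and frees it automatically ---
class Recognizer {
 public:
  static Recognizer Create(int id) { return Recognizer(ToyCreate(id)); }

  ~Recognizer() { ToyDestroy(p_); }

  // Copying would lead to a double free, so only moves are allowed.
  Recognizer(const Recognizer &) = delete;
  Recognizer &operator=(const Recognizer &) = delete;

  Recognizer(Recognizer &&other) noexcept
      : p_(std::exchange(other.p_, nullptr)) {}

  Recognizer &operator=(Recognizer &&other) noexcept {
    if (this != &other) {
      ToyDestroy(p_);
      p_ = std::exchange(other.p_, nullptr);
    }
    return *this;
  }

  const ToyRecognizer *Get() const { return p_; }

 private:
  explicit Recognizer(const ToyRecognizer *p) : p_(p) {}
  const ToyRecognizer *p_ = nullptr;
};
```

With this pattern, callers never call the destroy function themselves; the handle is released when the wrapper goes out of scope.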