k2-fsa_sherpa-onnx

mirror of https://github.com/k2-fsa/sherpa-onnx.git synced 2026-01-09 07:41:06 +08:00

Author	SHA1	Message	Date
Fangjun Kuang	aecc39418d	Fix building wheels (#2619)	2025-09-22 16:52:55 +08:00
Fangjun Kuang	7e42ba2c0c	Add various language bindings for Wenet non-streaming CTC models (#2584) This PR adds support for Wenet non-streaming CTC models to sherpa-onnx by introducing the SherpaOnnxOfflineWenetCtcModelConfig struct and integrating it across all language bindings and APIs. The implementation follows the same pattern as other CTC model types like Zipformer CTC. - Introduces SherpaOnnxOfflineWenetCtcModelConfig struct with a single model field for the ONNX model path - Adds the new config to SherpaOnnxOfflineModelConfig and updates all language bindings (C++, Pascal, Kotlin, Java, Go, C#, Swift, JavaScript, etc.) - Provides comprehensive examples and tests across all supported platforms and languages	2025-09-10 18:52:18 +08:00
Fangjun Kuang	686b909e2f	Add various language bindings for streaming T-one Russian ASR models (#2576) This PR adds support for streaming T-one Russian ASR models across various language bindings in the sherpa-onnx library. The changes enable T-one CTC (Connectionist Temporal Classification) model integration by adding new configuration structures and example implementations. - Introduces OnlineToneCtcModelConfig structures across all language bindings (C, C++, Swift, Java, Kotlin, Go, etc.) - Adds T-one CTC model support to WASM implementations for both ASR and keyword spotting - Provides comprehensive example implementations demonstrating T-one model usage in multiple programming languages	2025-09-09 16:51:18 +08:00
Aruxxxi	8fa9f8eac9	feat: add punctuation C++ API (#2510) Co-authored-by: Aruxxxi <xiangcl@zhisuan.com>	2025-08-19 17:41:35 +08:00
Fangjun Kuang	e6bfab71ad	Add CXX API for KittenTTS (#2469)	2025-08-08 15:45:20 +08:00
Fangjun Kuang	e2b2d5ea57	Add CXX examples for NeMo TDT ASR. (#2363) # New Features - Added new example programs demonstrating streaming speech recognition from a microphone using Parakeet-TDT CTC and Zipformer Transducer models with voice activity detection. - These examples support microphone input via PortAudio and display recognized text incrementally. # Bug Fixes - Improved error handling and logic when opening microphone devices in several example programs for more reliable device initialization. # Chores - Updated build configuration to include new executable examples when PortAudio support is enabled.	2025-07-09 18:30:42 +08:00
Fangjun Kuang	df4615ca1d	Add C/CXX/JavaScript API for NeMo Canary models (#2357) This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows. - Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS). - Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime. - Update examples and CI scripts to demonstrate and test NeMo Canary model usage.	2025-07-07 23:38:04 +08:00
Fangjun Kuang	3bf986d08d	Support non-streaming zipformer CTC ASR models (#2340) This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows. - Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs - Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js - Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models. Model doc is available at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html	2025-07-04 15:57:07 +08:00
Fangjun Kuang	fdda292d5a	Add alsa-based streaming ASR example for sense voice. (#2207)	2025-05-13 19:08:09 +08:00
Fangjun Kuang	b269e5cccc	Add C++ example for real-time ASR with nvidia/parakeet-tdt-0.6b-v2. (#2201)	2025-05-11 16:30:38 +08:00
Fangjun Kuang	028b8f2718	Add C++ example for streaming ASR with SenseVoice. (#2199)	2025-05-11 00:23:32 +08:00
Fangjun Kuang	e51c37eb2f	Add C and CXX API for homophone replacer (#2156)	2025-04-27 22:09:13 +08:00
Fangjun Kuang	da4aad1189	Add C and CXX API for Dolphin CTC models (#2088)	2025-04-02 21:54:20 +08:00
Fangjun Kuang	0703bc1b86	Add CXX API for VAD (#2077)	2025-04-01 14:51:43 +08:00
niansa/tuxifan	9d23606ee6	Allow building repository as CMake subdirectory (#2059) * Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as subdirectory * Also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in c/cxx api examples * Only build examples by default when not building as subdirectory * Do not suggest building binaries either --------- Co-authored-by: user <user@mail.tld>	2025-03-29 06:27:59 +08:00
Fangjun Kuang	802119db17	Add CXX API for speech enhancement GTCRN models (#1986)	2025-03-11 17:07:52 +08:00
Fangjun Kuang	1d49dd2fb0	Add CXX API for FireRedAsr (#1872)	2025-02-17 11:46:13 +08:00
Fangjun Kuang	d815204774	Add CXX API for Kokoro TTS 1.0 (#1802)	2025-02-07 14:51:49 +08:00
Fangjun Kuang	8b989a851c	Fix keyword spotting. (#1689) Reset the stream right after detecting a keyword	2025-01-20 16:41:10 +08:00
Fangjun Kuang	af671e2b63	Add C API for Kokoro TTS models (#1717)	2025-01-16 15:07:26 +08:00
Fangjun Kuang	648903834b	Add CXX API for MatchaTTS models (#1676)	2025-01-03 14:16:36 +08:00
Fangjun Kuang	c5205f08bf	Add an example for computing RTF about streaming ASR. (#1501)	2024-11-01 11:40:13 +08:00
Fangjun Kuang	2ca2985d04	Add C and C++ API for Moonshine models (#1476)	2024-10-26 23:24:46 +08:00
Fangjun Kuang	ceb69ebd94	Add C++ API for non-streaming ASR (#1456)	2024-10-23 16:40:12 +08:00
Fangjun Kuang	effd5ef2be	Add C++ API for streaming ASR. (#1455) It is a wrapper around the C API.	2024-10-23 12:07:43 +08:00

25 Commits