k2-fsa_sherpa-onnx

mirror of https://github.com/k2-fsa/sherpa-onnx.git synced 2026-01-09 07:41:06 +08:00

Author	SHA1	Message	Date
colourmebrad	ef5c23e6c9	exposing online punctuation model support in node-addon-api (#2609) * exposing online punctuation model support in node-addon-api * renaming nodejs-addon-examples/test_punctuation.js to test_offline_punctuation.js * adding test_online_punctuation to nodejs-addon-examples and updating CI to run test_offline_punctuation and test_online_punctuation	2025-09-19 23:29:55 +08:00
Fangjun Kuang	bff2691e8c	Add CI tests for dart spoken language identification example (#2598)	2025-09-15 09:28:34 +08:00
Fangjun Kuang	7e42ba2c0c	Add various language bindings for Wenet non-streaming CTC models (#2584) This PR adds support for Wenet non-streaming CTC models to sherpa-onnx by introducing the SherpaOnnxOfflineWenetCtcModelConfig struct and integrating it across all language bindings and APIs. The implementation follows the same pattern as other CTC model types like Zipformer CTC. - Introduces SherpaOnnxOfflineWenetCtcModelConfig struct with a single model field for the ONNX model path - Adds the new config to SherpaOnnxOfflineModelConfig and updates all language bindings (C++, Pascal, Kotlin, Java, Go, C#, Swift, JavaScript, etc.) - Provides comprehensive examples and tests across all supported platforms and languages	2025-09-10 18:52:18 +08:00
Fangjun Kuang	686b909e2f	Add various language bindings for streaming T-one Russian ASR models (#2576) This PR adds support for streaming T-one Russian ASR models across various language bindings in the sherpa-onnx library. The changes enable T-one CTC (Connectionist Temporal Classification) model integration by adding new configuration structures and example implementations. - Introduces OnlineToneCtcModelConfig structures across all language bindings (C, C++, Swift, Java, Kotlin, Go, etc.) - Adds T-one CTC model support to WASM implementations for both ASR and keyword spotting - Provides comprehensive example implementations demonstrating T-one model usage in multiple programming languages	2025-09-09 16:51:18 +08:00
Fangjun Kuang	858b5052a2	Add C++ and Python support for T-one streaming Russian ASR models (#2575) This PR adds support for T-one streaming Russian ASR models in both C++ and Python APIs. The T-one model is a CTC-based Russian speech recognition model with specific characteristics including float16 state handling, 300ms frame lengths, and 8kHz sampling rate. - Added new OnlineToneCtcModel implementation with specialized processing for T-one models - Integrated T-one support into the existing CTC model pipeline and Python bindings - Added Python example and test scripts for the new functionality	2025-09-09 12:07:34 +08:00
Fangjun Kuang	283d8fed70	Add Swift API for computing speaker embeddings (#2492)	2025-08-14 20:42:33 +08:00
Fangjun Kuang	353658eabb	Add C# API for KittenTTS (#2477)	2025-08-08 20:22:05 +08:00
Fangjun Kuang	c726f9d7ee	Add Swift API for KittenTTS (#2476)	2025-08-08 20:16:33 +08:00
Fangjun Kuang	9f3e70e598	Add Dart API for KittenTTS (#2475)	2025-08-08 20:14:54 +08:00
Fangjun Kuang	26b0e8162d	Add JavaScript API (WebAssembly) for KittenTTS (#2471)	2025-08-08 16:31:28 +08:00
Fangjun Kuang	6a61791066	Add JavaScript API (node-addon) for KittenTTS (#2470)	2025-08-08 16:27:51 +08:00
Fangjun Kuang	9122c9d272	Add Python API for KittenTTS. (#2466)	2025-08-08 15:01:43 +08:00
Fangjun Kuang	8ab5cba598	Add APIs for Online NeMo CTC models (#2454)	2025-08-07 09:28:16 +08:00
Fangjun Kuang	0514aeeb0c	Add Swift API for ten-vad (#2387)	2025-07-12 19:45:42 +08:00
Fangjun Kuang	7f1d71fed3	Add Dart API for ten-vad (#2386)	2025-07-12 19:41:01 +08:00
Fangjun Kuang	71aea2f19c	Add C# API for ten-vad (#2385)	2025-07-12 18:39:18 +08:00
Fangjun Kuang	fd9a687ec2	Add Pascal/Go/C#/Dart API for NeMo Canary ASR models (#2367) Add support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer. - Define Canary model config in Pascal, Go, C#, Dart and update converter functions - Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart) - Extend CI/workflows and example scripts to test non-streaming Canary decoding	2025-07-10 14:53:33 +08:00
Askars Salimbajevs	f0960342ad	Add LODR support to online and offline recognizers (#2026) This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescore. - Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id. - Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths. - Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.	2025-07-09 16:23:46 +08:00
Fangjun Kuang	df4615ca1d	Add C/CXX/JavaScript API for NeMo Canary models (#2357) This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows. - Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS). - Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime. - Update examples and CI scripts to demonstrate and test NeMo Canary model usage.	2025-07-07 23:38:04 +08:00
Fangjun Kuang	0e738c356c	Add C++ runtime and Python API for NeMo Canary models (#2352)	2025-07-07 17:03:49 +08:00
Fangjun Kuang	3bf986d08d	Support non-streaming zipformer CTC ASR models (#2340) This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows. - Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs - Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js - Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models Model doc is available at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html	2025-07-04 15:57:07 +08:00
Fangjun Kuang	bda427f4b2	Add API to get version information (#2309)	2025-06-25 00:22:21 +08:00
Fangjun Kuang	d57e4f84de	Add Python API for source separation (#2283)	2025-06-05 20:44:26 +08:00
Fangjun Kuang	2b2788332e	Add C++ support for UVR models (#2269)	2025-06-01 17:22:08 +08:00
Fangjun Kuang	99defc5b90	Add nodejs example for parakeet-tdt-0.6b-v2. (#2219)	2025-05-15 11:27:22 +08:00
Fangjun Kuang	85df96d528	Add Dart API for homophone replacer (#2167)	2025-04-30 23:15:28 +08:00
Fangjun Kuang	63d01a9534	Add Swift API for homophone replacer. (#2164)	2025-04-29 18:50:41 +08:00
Fangjun Kuang	51f8824219	Add homophone replacer example for Python API. (#2161)	2025-04-29 15:59:34 +08:00
Fangjun Kuang	9d25c90a59	Add JavaScript API (node-addon) for homophone replacer (#2158)	2025-04-28 20:52:42 +08:00
Fangjun Kuang	a0aef1f6cd	Add JavaScript API (WASM) for homophone replacer (#2157)	2025-04-28 20:47:49 +08:00
Fangjun Kuang	f64c58342b	Support replacing homophonic phrases (#2153)	2025-04-27 15:31:11 +08:00
Fangjun Kuang	be0f382a54	Support Giga AM transducer V2 (#2136)	2025-04-20 10:15:20 +08:00
Fangjun Kuang	07a5701af6	Add Dart API for Dolphin CTC models (#2095)	2025-04-03 15:59:38 +08:00
Fangjun Kuang	903e825eba	Add Javascript (node-addon) API for Dolphin CTC models (#2094)	2025-04-03 15:03:33 +08:00
Fangjun Kuang	639ad1744f	Add Javascript (WebAssembly) API for Dolphin CTC models (#2093)	2025-04-03 15:02:06 +08:00
Fangjun Kuang	74f402e490	Add Swift API for Dolphin CTC models (#2091)	2025-04-03 00:03:11 +08:00
Fangjun Kuang	2dc0f91904	Add C# API for Dolphin CTC models (#2089)	2025-04-02 23:36:22 +08:00
Fangjun Kuang	0de7e1b9f0	Add C++ and Python API for Dolphin CTC models (#2085)	2025-04-02 19:09:00 +08:00
Fangjun Kuang	0aacf02dd8	Add C++ runtime for vocos (#2014)	2025-03-17 17:05:15 +08:00
Fangjun Kuang	c972554ad1	Add JavaScript API (wasm) for speech enhancement GTCRN models (#2007)	2025-03-15 17:41:23 +08:00
Fangjun Kuang	6a97f8adcf	Add JavaScript (node-addon) API for speech enhancement GTCRN models (#1996)	2025-03-12 15:52:01 +08:00
Fangjun Kuang	fd78a482df	Add Dart API for speech enhancement GTCRN models (#1993)	2025-03-12 12:39:08 +08:00
Fangjun Kuang	d3e27d5e21	Add C# API for speech enhancement GTCRN models (#1990)	2025-03-11 18:58:17 +08:00
Fangjun Kuang	c12d1d88c0	Add Swift API for speech enhancement GTCRN models (#1989)	2025-03-11 18:03:13 +08:00
Fangjun Kuang	5d2d792b1d	Add Python API for speech enhancement GTCRN models (#1978)	2025-03-10 19:02:17 +08:00
Fangjun Kuang	488a6e687c	Add C++ runtime for speech enhancement GTCRN models (#1977) See also https://github.com/Xiaobin-Rong/gtcrn	2025-03-10 18:11:16 +08:00
Fangjun Kuang	1e2328242d	Test using sherpa-onnx as a cmake subproject (#1961)	2025-03-06 12:12:56 +08:00
Fangjun Kuang	ed922e61b5	Fix publishing pre-built windows libraries (#1905)	2025-02-21 11:59:27 +08:00
Fangjun Kuang	b5d89d7bcb	Add Dart API for FireRedAsr AED Model (#1877)	2025-02-17 15:17:08 +08:00
Fangjun Kuang	b03f6e6e8c	Add Swift API for FireRedAsr AED Model (#1876)	2025-02-17 15:16:23 +08:00

212 Commits