k2-fsa_sherpa-onnx

mirror of https://github.com/k2-fsa/sherpa-onnx.git synced 2026-01-09 07:41:06 +08:00

Author	SHA1	Message	Date
Fangjun Kuang	b1db3eaa8d	Export Whisper to RK NPU (#2983)	2026-01-05 19:46:30 +08:00
Fangjun Kuang	13b8b84a89	Add C and CXX API for Google MedASR model (#2946)	2025-12-29 09:56:08 +08:00
Fangjun Kuang	afa59281c1	Build APKs for MatchaTTS Chinese+English (#2882)	2025-12-10 15:57:56 +08:00
Fangjun Kuang	8fac37f7d1	Load QNN context binary for faster startup (#2877)	2025-12-09 17:55:19 +08:00
alex-spacemit	586cd19e22	Add spacemit ort ep for spacemit riscv cpus (#2837) This pull request significantly extends the project's hardware compatibility by integrating a dedicated SpacemiT Execution Provider for ONNX Runtime. The changes enable efficient model inference on SpacemiT RISC-V CPUs, leveraging their RVV1.0 capabilities. This involves updates to the build system, new CMake modules for toolchain and ONNX Runtime package handling, and modifications to the core provider and session management logic to recognize and configure the SpacemiT EP.	2025-12-02 14:36:31 +08:00
Fangjun Kuang	d1c458b95d	Add C++ QNN support for Zipformer CTC models. (#2809)	2025-11-24 18:14:22 +08:00
Fangjun Kuang	16d62b6a08	Add Android demo with QNN (Qualcomm NPU) for SenseVoice ASR (#2803)	2025-11-20 17:22:07 +08:00
Fangjun Kuang	2fcde7d3c6	Support hotwords with byte level bpe (#2802)	2025-11-19 18:21:16 +08:00
Fangjun Kuang	1832b35070	Add C# API for Omnilingual ASR CTC models (#2775)	2025-11-13 15:12:20 +08:00
Fangjun Kuang	c691318b95	Support RK NPU for SenseVoice non-streaming ASR models (#2589 ) This PR adds RK NPU support for SenseVoice non-streaming ASR models by implementing a new RKNN backend with greedy CTC decoding. - Adds offline RKNN implementation for SenseVoice models including model loading, feature processing, and CTC decoding - Introduces export tools to convert SenseVoice models from PyTorch to ONNX and then to RKNN format - Implements provider-aware validation to prevent mismatched model and provider usage	2025-09-12 10:46:38 +08:00
Fangjun Kuang	7e42ba2c0c	Add various language bindings for Wenet non-streaming CTC models (#2584) This PR adds support for Wenet non-streaming CTC models to sherpa-onnx by introducing the SherpaOnnxOfflineWenetCtcModelConfig struct and integrating it across all language bindings and APIs. The implementation follows the same pattern as other CTC model types like Zipformer CTC. - Introduces SherpaOnnxOfflineWenetCtcModelConfig struct with a single model field for the ONNX model path - Adds the new config to SherpaOnnxOfflineModelConfig and updates all language bindings (C++, Pascal, Kotlin, Java, Go, C#, Swift, JavaScript, etc.) - Provides comprehensive examples and tests across all supported platforms and languages	2025-09-10 18:52:18 +08:00
Fangjun Kuang	686b909e2f	Add various language bindings for streaming T-one Russian ASR models (#2576 ) This PR adds support for streaming T-one Russian ASR models across various language bindings in the sherpa-onnx library. The changes enable T-one CTC (Connectionist Temporal Classification) model integration by adding new configuration structures and example implementations. - Introduces OnlineToneCtcModelConfig structures across all language bindings (C, C++, Swift, Java, Kotlin, Go, etc.) - Adds T-one CTC model support to WASM implementations for both ASR and keyword spotting - Provides comprehensive example implementations demonstrating T-one model usage in multiple programming languages	2025-09-09 16:51:18 +08:00
Fangjun Kuang	858b5052a2	Add C++ and Python support for T-one streaming Russian ASR models (#2575 ) This PR adds support for T-one streaming Russian ASR models in both C++ and Python APIs. The T-one model is a CTC-based Russian speech recognition model with specific characteristics including float16 state handling, 300ms frame lengths, and 8kHz sampling rate. - Added new OnlineToneCtcModel implementation with specialized processing for T-one models - Integrated T-one support into the existing CTC model pipeline and Python bindings - Added Python example and test scripts for the new functionality	2025-09-09 12:07:34 +08:00
Fangjun Kuang	e4f48ce6a6	Export models from https://github.com/voicekit-team/T-one to sherpa-onnx (#2571 ) This PR exports models from the T-one repository (https://github.com/voicekit-team/T-one) to sherpa-onnx format, creating a complete pipeline for Russian speech recognition using streaming CTC models. - Adds scripts to download, process, and test T-one models in sherpa-onnx format - Creates GitHub workflow for automated model export and publishing - Updates kaldi-native-fbank dependency to version 1.22.1	2025-09-08 17:22:23 +08:00
Fangjun Kuang	7c9d071ef7	Simplify the usage of our non-Android Java API (#2533 ) This PR simplifies the usage of the non-Android Java API by providing platform-specific JAR files that include native shared libraries, eliminating the need for users to manually manage native dependencies. - Refactored LibraryUtils.java to support multiple library loading methods including extracting from JAR resources - Added build infrastructure to create platform-specific native library JAR files - Introduced debug capabilities and improved error handling for library loading	2025-08-26 20:13:07 +08:00
Fangjun Kuang	e8dd5cd2a0	Split sherpa-onnx Python package (#2521)	2025-08-25 10:16:58 +08:00
yangjun	6eac1af8ac	Fix ctrl+c may lead to coredump (#2511)	2025-08-19 18:31:34 +08:00
Fangjun Kuang	bfbd603342	Add Kotlin and Java API for KittenTTS (#2461)	2025-08-07 22:19:11 +08:00
Fangjun Kuang	6b16c0b864	Export https://github.com/KittenML/KittenTTS to sherpa-onnx (#2456)	2025-08-07 11:59:40 +08:00
Fangjun Kuang	9d25c90a59	Add JavaScript API (node-addon) for homophone replacer (#2158)	2025-04-28 20:52:42 +08:00
Fangjun Kuang	eee5575836	Add Kotlin and Java API for Dolphin CTC models (#2086)	2025-04-02 21:16:14 +08:00
Fangjun Kuang	3420c89883	Export silero_vad v4 to RKNN (#2067)	2025-03-30 12:00:52 +08:00
cjsdurj	b87fce9a7f	c-api add wave write to buffer. (#1962 ) Co-authored-by: jian.chen03 <jian.chen03@transwarp.io>	2025-03-10 17:21:23 +08:00
ivan provalov	94728bfbee	Fixing Whisper Model Token Normalization (#1904)	2025-02-21 12:58:01 +08:00
Fangjun Kuang	316424b382	Add C++ and Python API for FireRedASR AED models (#1867)	2025-02-16 22:45:24 +08:00
Fangjun Kuang	c84a833863	Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795)	2025-02-06 22:57:13 +08:00
Fangjun Kuang	08cefe8488	Export Kokoro 1.0 to sherpa-onnx (#1788)	2025-02-05 08:24:43 +08:00
Fangjun Kuang	af671e2b63	Add C API for Kokoro TTS models (#1717)	2025-01-16 15:07:26 +08:00
Fangjun Kuang	a00d3b4821	Add Java API for Matcha-TTS models. (#1673)	2025-01-02 15:15:30 +08:00
Fangjun Kuang	3422b9388d	Add Kotlin API for Matcha-TTS models. (#1668)	2024-12-31 19:20:52 +08:00
Fangjun Kuang	314545f938	Add speaker identification APIs for HarmonyOS (#1607 ) * Add speaker embedding extractor API for HarmonyOS * Add ArkTS API for speaker identification	2024-12-09 19:23:18 +08:00
Fangjun Kuang	bd4b223920	Add Kotlin and Java API for Moonshine models (#1474)	2024-10-26 22:30:29 +08:00
Fangjun Kuang	d468527f62	C API for speaker diarization (#1402)	2024-10-09 17:10:03 +08:00
Fangjun Kuang	70165cb42d	Speaker diarization example with onnxruntime Python API (#1395)	2024-10-06 16:37:29 +08:00
Lim Yao Chong	3bffc24d64	Add Python binding for online punctuation models (#1312)	2024-09-09 10:26:53 +08:00
Fangjun Kuang	6b8877f185	Downgrade flutter sdk versions. (#1305)	2024-08-30 11:47:27 +08:00
Fangjun Kuang	65f1c0fab2	Add Pascal API for reading wave files (#1243)	2024-08-11 22:43:42 +08:00
Fangjun Kuang	94e256244d	Add blank penalty for various language bindings. (#1234)	2024-08-08 10:43:31 +08:00
Fangjun Kuang	994c3e7c96	Add VAD + Non-streaming ASR example for JavaScript API. (#1170)	2024-07-26 12:42:08 +08:00
Fangjun Kuang	25f0a10468	Add C++ runtime for SenseVoice models (#1148)	2024-07-18 22:54:18 +08:00
Fangjun Kuang	dd0ff2ca06	Support onnxruntime 1.18.0 (#906)	2024-07-10 17:05:26 +08:00
Fangjun Kuang	1fe12c5107	Support the platform iOS for Flutter (#1079)	2024-07-06 19:43:37 +08:00
Fangjun Kuang	f5e9a162d1	Publish flutter packages for Android (#1074)	2024-07-04 20:07:07 +08:00
Fangjun Kuang	6e09933d99	Inverse text normalization API for other programming languages (#1019)	2024-06-17 17:02:39 +08:00
Fangjun Kuang	fd5a0d1e00	Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970)	2024-06-05 00:26:40 +08:00
Fangjun Kuang	031134b4d4	Add TTS for node-addon-api (#871)	2024-05-13 19:24:09 +08:00
Fangjun Kuang	17cd3a5f01	Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854)	2024-05-10 12:15:39 +08:00
Fangjun Kuang	2f9553d838	Begin to add node-addon-api for sherpa-onnx (#826)	2024-05-03 14:47:40 +08:00
Fangjun Kuang	88202f05bb	Add Java API for audio tagging (#820)	2024-04-28 22:26:04 +08:00
Fangjun Kuang	f2d074aea9	Fix a bug for offline paraformer (#816)	2024-04-26 16:40:42 +08:00

97 Commits