k2-fsa_sherpa-onnx/cxx-api-examples/punctuation-cxx-api.cc
Fangjun Kuang 858b5052a2
Add C++ and Python support for T-one streaming Russian ASR models (#2575)
This PR adds support for T-one streaming Russian ASR models in both C++ and Python APIs. The T-one model is a CTC-based Russian speech recognition model with specific characteristics including float16 state handling, 300ms frame lengths, and 8kHz sampling rate.

- Added new OnlineToneCtcModel implementation with specialized processing for T-one models
- Integrated T-one support into the existing CTC model pipeline and Python bindings
- Added Python example and test scripts for the new functionality
2025-09-09 12:07:34 +08:00

// cxx-api-examples/punctuation-cxx-api.cc
// Copyright (c) 2025 Xiaomi Corporation
// To use punctuation model:
// wget
// https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2
// tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2
// rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2
#include <cstdint>
#include <iostream>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

int32_t main() {
  using namespace sherpa_onnx::cxx;  // NOLINT

  // Point the config at the CT-Transformer model extracted by the commands
  // in the header comment.
  OfflinePunctuationConfig punctuation_config;
  punctuation_config.model.ct_transformer =
      "./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/"
      "model.onnx";
  punctuation_config.model.num_threads = 1;
  punctuation_config.model.debug = false;
  punctuation_config.model.provider = "cpu";

  OfflinePunctuation punct = OfflinePunctuation::Create(punctuation_config);
  if (!punct.Get()) {
    // A null handle means the model could not be created.
    std::cerr
        << "Failed to create punctuation model. Please check your config\n";
    return -1;
  }

  // Unpunctuated, mixed Chinese/English input.
  std::string text = "你好吗how are you Fantasitic 谢谢我很好你怎么样呢";
  std::string text_with_punct = punct.AddPunctuation(text);

  std::cout << "Original text: " << text << std::endl;
  std::cout << "With punctuation: " << text_with_punct << std::endl;

  return 0;
}