alphacep_vosk-api

mirror of https://github.com/alphacep/vosk-api.git synced 2026-02-04 20:45:41 +08:00

Author	SHA1	Message	Date
Nickolay Shmyrev	7b7d814484	Introduce incremental decoder with confidences in partial results	2022-04-07 01:07:47 +02:00
Nickolay Shmyrev	a57a84f90e	Refactor GPU API to hide the ID and keep it closer to CPU recognizer	2022-03-03 21:09:09 +01:00
Nickolay Shmyrev	79b8395be0	Add NLSML output	2022-02-03 23:08:09 +01:00
Nickolay Shmyrev	d2c11a611f	Read list of files from arguments	2022-01-30 22:57:36 +01:00
Nickolay Shmyrev	72bf210164	Put the demo into main folder	2021-12-24 01:07:38 +01:00
Nickolay Shmyrev	cb0f8e6411	Per-stream wait API	2021-12-23 22:34:47 +01:00
Nickolay Shmyrev	848b2dc753	Expose results in Python	2021-12-17 22:57:00 +01:00
Nickolay Shmyrev	60f0396fe0	Reset lattice on endpoint	2021-12-17 01:13:09 +01:00
Nickolay Shmyrev	344e137a61	Decoding works, results are empty yet	2021-12-13 01:21:59 +01:00
Nickolay Shmyrev	6977be7fb7	Batch recognizer draft	2021-12-12 21:37:44 +01:00
Lars Kiesow	2349e66a97	Subtitles require word times (#607). This ports the change from commit 7ccf743, which added `KaldiRecognizer.SetWords(True)` to the other examples dealing with subtitles, to the WebVTT example. Without this, the example will not work with the most recent `vosk` (0.3.30) Python package.	2021-06-27 18:06:11 +03:00
Lars Kiesow	02ef49f67e	Allow Saving WebVTT (#605). This patch is a small extension to the WebVTT example that allows saving the WebVTT output directly to a file, like this: `./test_webvtt.py test.wav out.vtt`	2021-06-24 19:35:30 +03:00
Lars Kiesow	7cdf8f1d03	Add Python WebVTT Example (#601). This patch adds an example for using webvtt-py to generate WebVTT files from Vosk output. This is similar to the SRT example but still very useful for generating an example video subtitle usable in web contexts.	2021-06-23 01:16:33 +03:00
Nickolay Shmyrev	7ccf743bb6	SRT requires word times	2021-06-10 10:37:15 +02:00
Nickolay Shmyrev	75bedfe06d	Add a method to show/hide words and their times	2021-06-07 01:04:37 +02:00
Nickolay Shmyrev	6aa5af7640	Add reset test	2021-05-26 21:22:07 +02:00
Nickolay Shmyrev	499b2f183a	Introduce new API to set speaker model to already initialized recognizer. Introduce a method to reset recognizer results to start from scratch without computation of the result.	2021-05-26 00:46:32 +02:00
Nickolay Shmyrev	f8189685e5	Add max alternatives output	2021-05-19 18:47:25 +02:00
Nickolay Shmyrev	a5a3697b7c	Copy data before queue, original data can be destroyed. Fixes issue #444 Thanks to Alexander Zatvornitsky	2021-03-01 21:25:41 +01:00
Vlad Ki	c6119c4835	test_microphone: AcceptWaveform wants bytes	2021-02-08 00:38:28 +02:00
Nickolay Shmyrev	6f2d6d0d69	Proper microphone recognizer with the queue	2021-01-09 23:16:35 +01:00
Nickolay Shmyrev	08c35e84f3	Update demo with spk vector check	2020-12-23 22:21:12 +01:00
Nickolay Shmyrev	dc3d03d742	Make sure we have result field in json	2020-11-29 19:19:50 +01:00
Nickolay Shmyrev	746ff47757	Split long lines in subtitles	2020-11-03 15:13:16 +01:00
Nickolay Shmyrev	7af3e9a334	Add srt example	2020-10-07 13:43:09 +02:00
Nickolay Shmyrev	57cc474c9f	Build bigram language model from grammars	2020-10-04 23:42:50 +02:00
Nickolay Shmyrev	f97383c17f	Update models location	2020-09-22 00:00:24 +02:00
Nickolay Shmyrev	41035485db	Fix x-vectors, now they actually work. Requires new version spk-model-0.4 with whitening transform matrix	2020-09-21 23:31:30 +02:00
Nickolay Shmyrev	1c7b94757d	Don't attempt to resize to empty matrix	2020-07-31 16:45:05 +02:00
Nickolay Shmyrev	722b09eaa4	Ignore words missing in the vocabulary	2020-07-08 01:35:50 +02:00
Nickolay Shmyrev	b81f69d407	Avoid overflow. See issue #128	2020-06-06 10:43:44 +02:00
Nickolay Shmyrev	b1e775c67b	Better speaker identification without silence frames	2020-05-31 23:46:42 +02:00
Nickolay Shmyrev	37fbe1a52b	OSX build	2020-05-08 09:55:15 +03:00
Nickolay Shmyrev	af11bb2361	No need for external dependencies	2020-05-07 21:48:03 +02:00
Nickolay Shmyrev	8da8697c1e	Add ffmpeg test	2020-05-07 21:07:17 +02:00
Nickolay Shmyrev	80219066e9	Expose verbose level in the API	2020-05-01 19:02:57 +02:00
Nickolay Shmyrev	d75bb36131	Fix model download path	2020-04-30 11:04:35 +02:00
Nickolay Shmyrev	86caf526f5	New model layout and model config file with beams for decoding	2020-04-28 01:06:41 +02:00
André Mendes	14b2c13ed6	Fixes folder name instruction for python model on microphone test	2020-03-25 01:01:23 -03:00
Nickolay Shmyrev	095fac1de4	Small unittest	2020-02-25 21:45:27 +01:00
Nickolay Shmyrev	8913b93703	Fix message	2020-02-23 16:05:34 +00:00
Nickolay Shmyrev	f66f7a9290	Added microphone test	2020-02-21 00:08:50 +01:00
Nickolay Shmyrev	378fa3499a	Downsample by default and also check wav file encoding in examples. Thanks to dtreskunov. See for details https://github.com/alphacep/vosk-api/pull/23	2020-02-18 11:45:50 +01:00
Nickolay Shmyrev	01eeec498d	Fix memory access issues	2020-02-17 09:42:50 +00:00
Nickolay Shmyrev	d422dfbddb	Message update	2020-02-16 17:21:38 +01:00
Nickolay Shmyrev	aa91ccf68b	Added speaker identification	2020-02-16 17:04:37 +01:00
Nickolay Shmyrev	a6f6c0409b	Demonstrate grammar compilation on the fly	2020-01-22 23:31:21 +01:00
Nickolay Shmyrev	380fa9c7c8	Added a warning to download the model	2020-01-11 20:24:34 +01:00
Nickolay Shmyrev	78e66149f8	Imported Python bindings and Node bindings	2020-01-02 20:46:14 +01:00

49 Commits