server : add Voice Activity Detection (VAD) support #3246

danbev · 2025-06-13T08:00:24Z

This commit adds support for Voice Activity Detection (VAD) in the server example.

The motivation for this is to enable VAD processing when using whisper-server.

Resolves: #3089

This commit fixes a short name conflict in whisper-cli between `--vad-min-speech-duration-ms` and `--vad-min-silence-duration-ms`, which currently share the short name `-vsd`. Refs: ggml-org#3246 (review)
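To illustrate the kind of conflict described here, the following is a minimal sketch of a check that reports short option names claimed by more than one long option. The `cli_option` struct and `find_short_name_conflicts` function are hypothetical names introduced for illustration only; they are not part of whisper.cpp, whose examples parse arguments by hand in their own way.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical option descriptor (an assumption, not a whisper.cpp type).
struct cli_option {
    std::string short_name; // e.g. "-vsd"
    std::string long_name;  // e.g. "--vad-min-silence-duration-ms"
};

// Returns every short name that is used by two or more options,
// mapped to the long names that share it.
std::map<std::string, std::vector<std::string>>
find_short_name_conflicts(const std::vector<cli_option> & options) {
    std::map<std::string, std::vector<std::string>> by_short;

    // Group long names by the short name they claim.
    for (const auto & opt : options) {
        by_short[opt.short_name].push_back(opt.long_name);
    }

    // Drop short names that belong to exactly one option.
    for (auto it = by_short.begin(); it != by_short.end();) {
        if (it->second.size() < 2) {
            it = by_short.erase(it);
        } else {
            ++it;
        }
    }
    return by_short;
}
```

Passing the two options named above, both registered under `-vsd`, would yield a single entry keyed by `-vsd` that lists both long names. A check of this shape could run in a unit test so that a duplicated short name is caught before review.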

* ggerganov/master: (335 commits)
  server : add Voice Activity Detection (VAD) support (ggml-org#3246)
  cli : fix short name conflict for vad options [no ci] (ggml-org#3247)
  ruby : add .gitignore entries for ext directory (ggml-org#3245)
  ci : update windows runner to windows-2022 (ggml-org#3242)
  ruby : add cleaning of library names in dependencies (ggml-org#3241)
  ggml : fix weak alias win32 (#0)
  android : fix builds (#0)
  sync : ggml
  files : remove old sources (part 2)
  sync : ggml
  files : remove old sources
  talk-llama : sync llama.cpp
  sync : ggml
  metal : use less stack memory in FA kernel (llama/14088)
  ggml-cpu : split arch-specific implementations (llama/13892)
  cuda : fix device sync on buffer clear (llama/14033)
  CANN: Simplify the environment variable setting(#13104)
  sycl: Add reorder to Q6_K mmvq implementation (llama/13885)
  cuda : fix buffer type check with integrated GPUs (llama/14069)
  SYCL: Implement few same quantized type copy kernels (llama/13739)
  ...

danbev added 2 commits June 13, 2025 09:57

server : add Voice Activity Detection (VAD) support

63fb077

This commit adds support for Voice Activity Detection (VAD) in the server example. The motivation for this is to enable VAD processing when using whisper-server. Resolves: ggml-org#3089

server : add VAD parameters to usage in README.md [no ci]

9c829d2

This commit also adds a few missing parameters.

ggerganov approved these changes Jun 13, 2025

examples/server/server.cpp (outdated, resolved)

server : fix conflicting short options [no ci]

b8658ad

danbev mentioned this pull request Jun 13, 2025

cli : fix short name conflict for vad options [no ci] #3247

Merged

danbev merged commit 0a4d85c into ggml-org:master Jun 13, 2025
