transcribe: add speaker diarization test #12410

sannya-singal · 2025-03-19T10:29:47Z

Motivation

Add a test for LocalStack Transcribe with speaker partitioning (diarization) enabled. Speaker partitioning labels the speech of each individual speaker in the media file.

This is done by enabling ShowSpeakerLabels in the transcription settings and setting MaxSpeakerLabels, the maximum number of speakers to partition in the media file.
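For reference, a minimal sketch of how these two settings could be passed to the StartTranscriptionJob API. The helper name `build_diarization_job_request`, the job name, and the bucket in the usage note are hypothetical and not taken from this PR. The 2 to 30 bound on MaxSpeakerLabels reflects the documented AWS limit as I understand it, so treat it as an assumption.

```python
def build_diarization_job_request(
    job_name: str,
    media_uri: str,
    max_speakers: int = 2,
    language_code: str = "en-US",
    media_format: str = "wav",
) -> dict:
    """Build keyword arguments for start_transcription_job with diarization on.

    Hypothetical helper for illustration; it only assembles the request dict.
    """
    # Assumption: AWS documents MaxSpeakerLabels as an integer between 2 and 30.
    if not 2 <= max_speakers <= 30:
        raise ValueError("max_speakers must be between 2 and 30")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            # Turn on speaker partitioning and cap the number of speakers.
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }
```

With boto3 installed, the result could be used as `boto3.client("transcribe").start_transcription_job(**build_diarization_job_request("demo-job", "s3://example-bucket/multi-speaker.wav"))`, assuming the media file has already been uploaded to that (hypothetical) bucket.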

Changes

This PR:

adds two AWS-validated tests for speaker diarization (currently skipped on LocalStack).
adds a manually generated media file, tests/aws/files/multi-speaker.wav, with 2 speakers.

github-actions · 2025-03-19T11:37:31Z

LocalStack Community integration with Pro

2 files ±0 · 2 suites ±0 · 1h 50m 25s ⏱️ +1m 38s
4 304 tests +11 · 3 983 ✅ +11 · 321 💤 ±0 · 0 ❌ ±0
4 306 runs +11 · 3 983 ✅ +11 · 323 💤 ±0 · 0 ❌ ±0

Results for commit 74b5289. ± Comparison against base commit 5d2e542.

♻️ This comment has been updated with latest results.

viren-nadkarni

I doubt this test will be useful. Without the actual models used by AWS Transcribe, we will not be able to reproduce the same timing and other numerical values. It's the same reason we can't AWS validate transcription tests. That said, it doesn't hurt to have this in as a reference when diarisation is being added.

tests/aws/services/transcribe/test_transcribe.py

tests/aws/services/transcribe/test_transcribe.snapshot.json

sannya-singal · 2025-03-20T04:58:28Z

Thanks for the review 🙌 I have a slightly different viewpoint here. I agree that we won't be able to reproduce the same timing and other numerical values without the actual models used by AWS Transcribe. In my opinion, though, this test could be really helpful in kicking off the speaker diarization implementation in the future. The logged AWS responses show which attributes and functionality are currently missing, and they make it easier to understand and implement the feature, at least to the point of returning speaker_label and the associated transcripts in the response, with the other values mocked.

The test could later be adapted to whatever implementation exists at that point, skipping the snapshot for certain functionality.
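To illustrate what "speaker_label and the associated transcripts" could look like to a consumer, here is a rough sketch that groups transcribed words by speaker. It assumes the transcript JSON follows the documented AWS Transcribe layout (`results.speaker_labels.segments[].items[]` carrying `speaker_label` and `start_time`, and `results.items[]` carrying the words). The function name `words_by_speaker` and the sample payload are hand-written for illustration and are not taken from the snapshot in this PR.

```python
from collections import defaultdict


def words_by_speaker(transcript: dict) -> dict:
    """Group transcribed words by speaker label (illustrative helper)."""
    results = transcript["results"]

    # Map each word's start time to the speaker assigned by diarization.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    grouped = defaultdict(list)
    for item in results["items"]:
        # Punctuation items have no timestamps, so they are skipped here.
        if item["type"] != "pronunciation":
            continue
        speaker = speaker_at.get(item["start_time"])
        if speaker is not None:
            grouped[speaker].append(item["alternatives"][0]["content"])
    return dict(grouped)


# Hand-written sample in the assumed layout, not real AWS output.
sample = {
    "results": {
        "speaker_labels": {
            "speakers": 2,
            "segments": [
                {
                    "speaker_label": "spk_0",
                    "start_time": "0.0",
                    "end_time": "0.9",
                    "items": [
                        {"speaker_label": "spk_0", "start_time": "0.0", "end_time": "0.4"},
                        {"speaker_label": "spk_0", "start_time": "0.5", "end_time": "0.9"},
                    ],
                },
                {
                    "speaker_label": "spk_1",
                    "start_time": "1.2",
                    "end_time": "1.6",
                    "items": [
                        {"speaker_label": "spk_1", "start_time": "1.2", "end_time": "1.6"},
                    ],
                },
            ],
        },
        "items": [
            {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
             "alternatives": [{"content": "Hello"}]},
            {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
             "alternatives": [{"content": "there"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
            {"type": "pronunciation", "start_time": "1.2", "end_time": "1.6",
             "alternatives": [{"content": "Hi"}]},
        ],
    }
}

print(words_by_speaker(sample))
```

A mocked LocalStack response would only need to fill these fields consistently for such a consumer to work, even if the timing values differ from what AWS returns.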

viren-nadkarni

Thanks for addressing the comments. Looks good to merge 👍

sannya-singal added the semver: patch label (Non-breaking changes which can be included in patch releases) Mar 19, 2025

sannya-singal added this to the 4.3 milestone Mar 19, 2025

sannya-singal self-assigned this Mar 19, 2025

transcribe: add speaker diarization test

71eba5b

sannya-singal force-pushed the speaker-diarization branch from 34b15db to 71eba5b Compare March 19, 2025 10:34

sannya-singal requested a review from viren-nadkarni March 19, 2025 10:44

sannya-singal marked this pull request as ready for review March 19, 2025 10:44

viren-nadkarni reviewed Mar 19, 2025

fix reviews

c7b43c8

sannya-singal requested review from dominikschubert and steffyP as code owners March 20, 2025 04:38

raise error if occurred

74b5289

sannya-singal requested a review from viren-nadkarni March 20, 2025 04:58

viren-nadkarni approved these changes Mar 20, 2025

sannya-singal merged commit a9e6262 into master Mar 20, 2025
31 checks passed

sannya-singal deleted the speaker-diarization branch March 20, 2025 09:08
