feat: Support audio_transcribe with partial ordering #1908

shuoweil · 2025-07-15T18:20:34Z

feat: Support audio transcription with partial ordering
This change also fixes a related issue where Block.join would fail on joins with null indexes when operating in this partial ordering mode.

b/430572560


tswast · 2025-07-16T14:27:01Z

bigframes/core/blocks.py

@@ -2488,6 +2488,11 @@ def join(
            )
            if result is not None:
                return result
+
+            # For block identify joins with null indices, perform cross join


This doesn't seem desirable. If df1 is n rows and df2 is m rows, won't this end up with n x m rows?
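
To make the row-count concern concrete, here is a minimal sketch using plain pandas rather than bigframes. The frames `df1` and `df2` are hypothetical stand-ins, and the positional `concat` is only an illustration of a row-preserving alignment; it is not claimed to be what `Block.join` does or should do.

```python
import pandas as pd

# Hypothetical stand-ins for the two join operands (not bigframes objects).
df1 = pd.DataFrame({"a": [1, 2, 3]})        # n = 3 rows
df2 = pd.DataFrame({"b": ["x", "y", "z"]})  # m = 3 rows

# A cross join pairs every row of df1 with every row of df2: n * m rows.
cross = df1.merge(df2, how="cross")

# A row-wise (positional) alignment keeps the row count at n.
aligned = pd.concat([df1, df2], axis=1)

print(len(cross), len(aligned))
```

With equally sized inputs, the cross join grows quadratically while the aligned result keeps the original row count, which is the behavior the comment above is asking about.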

tswast · 2025-07-16T14:40:14Z

tests/system/large/blob/test_function.py

+    result = df.to_pandas(ordered=False)
+
+    assert "transcribed_text" in result.columns
+    assert len(result) > 0


The number of rows in result should be exactly equal to the number of rows in audio_mm_df_partial_ordering.
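
A stricter check along these lines might look like the sketch below. It uses plain pandas frames as stand-ins: `source` plays the role of `audio_mm_df_partial_ordering`, `result` plays the role of the frame returned by `df.to_pandas(ordered=False)`, and the helper name `check_transcription_result` is hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the fixture and the transcription output.
source = pd.DataFrame({"audio": ["a.mp3", "b.mp3"]})
result = source.assign(transcribed_text=["hello", "world"])


def check_transcription_result(result: pd.DataFrame, source: pd.DataFrame) -> None:
    """Assert the output has the new column and exactly one row per input row."""
    assert "transcribed_text" in result.columns
    # Exact equality catches both dropped rows and rows duplicated by a bad join.
    assert len(result) == len(source)


check_transcription_result(result, source)
```

Unlike `len(result) > 0`, the equality check would also fail if a join multiplied the rows.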

shuoweil requested review from a team as code owners July 15, 2025 18:20

shuoweil requested a review from TrevorBergeron July 15, 2025 18:20

shuoweil self-assigned this Jul 15, 2025

product-auto-label bot added the size: m and api: bigquery labels Jul 15, 2025

shuoweil added the kokoro:force-run label Jul 15, 2025

yoshi-kokoro removed the kokoro:force-run label Jul 15, 2025

shuoweil removed the request for review from TrevorBergeron July 15, 2025 18:36

shuoweil marked this pull request as draft July 15, 2025 18:36

shuoweil force-pushed the shuowei-transcribe-partial-order branch from 78dcbf0 to abc6dae July 15, 2025 21:04

shuoweil added 10 commits July 15, 2025 21:16

handle corner case of null ptr

159ddc0

add testcase

d264db0

snapshot update

cddd769

revert change to tests/unit/core/compile/sqlglot/expressions/snapshot…

8a36137

…s/test_binary_compiler/test_add_numeric/out.sql

Restore out.sql to match main branch

7d799f6

Revert "Restore out.sql to match main branch"

5492808

This reverts commit 80e5298.

Revert "revert change to tests/unit/core/compile/sqlglot/expressions/…

1811b47

…snapshots/test_binary_compiler/test_add_numeric/out.sql" This reverts commit abc6dae.

Revert "snapshot update"

1e439af

This reverts commit 123a50e.

check if both operands have null indices before applying the block iden…

37a55b0

…tity join logic

change the line separation

5560902

shuoweil force-pushed the shuowei-transcribe-partial-order branch from 4b2927f to 5560902 July 15, 2025 21:38

shuoweil requested a review from TrevorBergeron July 15, 2025 22:12

shuoweil marked this pull request as ready for review July 15, 2025 22:12

tswast reviewed Jul 16, 2025


shuoweil marked this pull request as draft July 16, 2025 18:12

shuoweil added the do not merge label Jul 16, 2025
