Random transcripts are printed/generated when talking to a voice agent implemented using "VoicePipeline", e.g. "Transcription: Kurs.", even though there is no background noise. #368

rinigarg15 · 2025-03-27T10:07:14Z

Please read this first

Have you read the docs? Agents SDK docs - Yes
Have you searched for related issues? Others may have faced similar issues. - Yes

Describe the bug

A clear and concise description of what the bug is.

Debug information

Agents SDK version: (e.g. v0.0.3)
Python version (e.g. Python 3.10)

Repro steps

custom_stt_settings = STTModelSettings(
    language="English"
)

custom_tts_settings = TTSModelSettings(
    voice="alloy"
)

pipeline_config = VoicePipelineConfig(
    stt_settings=custom_stt_settings,
    tts_settings=custom_tts_settings
)


async def main() -> None:
    # Sets up the voice pipeline, microphone audio reading, and keyboard listener concurrently.

    should_send_audio = asyncio.Event()

    pipeline = VoicePipeline(
        workflow=MyWorkflow(on_start=lambda transcription: print(f"Transcription: {transcription}")),
        config=pipeline_config,
        stt_model="gpt-4o-mini-transcribe",
    )
    audio_input = StreamedAudioInput()


    task_pipeline = asyncio.create_task(run_voice_pipeline(pipeline, audio_input))
    task_mic = asyncio.create_task(send_mic_audio(audio_input, should_send_audio))
    task_keyboard = asyncio.create_task(keyboard_listener(should_send_audio))

    await task_keyboard

    task_pipeline.cancel()
    task_mic.cancel()
    await asyncio.gather(task_pipeline, task_mic, return_exceptions=True)

"""This is the Agent"""
agent = Agent(
    name="Assistant",
    instructions=prompt_with_handoff_instructions(
        "You are a Fleet Assistant, an intelligent assistant that transforms raw data into clear and natural language summaries. Your task is to analyze the provided information and convey the key details in a simple, concise, and professional manner.Take inputs for the particluar tool one after the other .Today's date is :24-03-2025.Take input in english and respond in english. Wait for the user to complete their sentences, don't start generating a response after a pause.",
    ),
    model="gpt-4o-mini",
    # handoffs=[spanish_agent],
    tools=[fetch_alerts, fetch_trip_data, getmaintainencereport, Vehicle_Location_Status_lasttrip, getfleethighlights, getvehicleidlestatus],
)

"""This is the WorkFlow class"""
class MyWorkflow(VoiceWorkflowBase):
    def __init__(self, on_start: Callable[[str], None]):
        """
        Args:
            on_start: A callback that is called when the workflow starts. The transcription
                is passed in as an argument.
        """
        self._input_history: list[TResponseInputItem] = []
        self._current_agent = agent
        self._on_start = on_start

    async def run(self, transcription: str) -> AsyncIterator[str]:
        self._on_start(transcription)

        self._input_history.append(
            {
                "role": "user",
                "content": transcription,
            }
        )

        result = Runner.run_streamed(self._current_agent, self._input_history)

        async for chunk in VoiceWorkflowHelper.stream_text_from(result):
            yield chunk

        self._input_history = result.to_input_list()
        self._current_agent = result.last_agent
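
One way to check the "no background noise" assumption is to measure the level of each microphone chunk before it reaches the pipeline. The sketch below is a diagnostic, not a fix. It assumes the chunks are raw 16-bit PCM bytes in native byte order; `rms_int16`, `is_probably_silence`, and the threshold of 500 are illustrative names and values, not part of the Agents SDK.

```python
import array
import math


def rms_int16(chunk: bytes) -> float:
    """Return the root-mean-square amplitude of a 16-bit PCM chunk.

    Assumption: samples are signed 16-bit integers in native byte order.
    """
    samples = array.array("h")
    # Ignore a trailing odd byte so frombytes() does not raise.
    usable = len(chunk) - (len(chunk) % samples.itemsize)
    samples.frombytes(chunk[:usable])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_probably_silence(chunk: bytes, threshold: float = 500.0) -> bool:
    """Heuristic: treat a chunk as silence when its RMS is below threshold.

    The default threshold is a guess and needs tuning per microphone.
    """
    return rms_int16(chunk) < threshold
```

Logging this value inside `send_mic_audio` (not shown in the report) could reveal whether near-silent chunks coincide with the stray transcriptions. Dropping quiet chunks outright may interfere with turn detection, so that is best treated as an experiment.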

Ideally provide a minimal python script that can be run to reproduce the bug.

Expected behavior

A clear and concise description of what you expected to happen.


rm-openai · 2025-03-27T17:42:34Z

@dkundel-openai any ideas? This looks like a real issue with valid input.

rinigarg15 · 2025-03-28T06:03:29Z

Some more examples of random transcripts that the VoicePipeline generates between conversations:

Transcription: kudich
Transcription: Peder
Transcription: Это!
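
As a stopgap while the root cause is unknown, transcriptions in an unexpected script could be discarded before they reach the agent. The sketch below is a workaround heuristic under the assumption that the user only speaks English; `looks_spurious` is a hypothetical helper, not an SDK function. It flags the Cyrillic example above but does not catch Latin-script nonsense such as "kudich".

```python
import unicodedata


def looks_spurious(transcription: str) -> bool:
    """Heuristic check for transcriptions that are unlikely to be English.

    Returns True when the text has no letters at all, or when any letter
    is outside the Latin script (judged by its Unicode character name).
    """
    letters = [ch for ch in transcription if ch.isalpha()]
    if not letters:
        return True
    return any("LATIN" not in unicodedata.name(ch, "") for ch in letters)
```

In the workflow's `run` method, a call such as `if looks_spurious(transcription): return` placed before the history is updated would skip these turns, at the cost of also dropping any genuine non-English input.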

Bhushan-Shenvi · 2025-04-29T12:19:16Z

@rinigarg15 did you find any solution here?

rinigarg15 · 2025-04-29T12:57:50Z

> @rinigarg15 did you find any solution here?

Nope. The OpenAI team didn't get back with a solution. I haven't tested it in a while now.

Bhushan-Shenvi · 2025-04-29T13:26:55Z

Oh, we are also experiencing the same issue in the latest build. Voice is our app's core functionality. @dkundel-openai @rm-openai, any thoughts?

rinigarg15 added the "bug" (Something isn't working) label on Mar 27, 2025
