Random transcript gets printed/generated when talking to the voice agent implemented using "VoicePipeline", e.g. "Transcription: Kurs." Note that there is no background noise. #368

Open
rinigarg15 opened this issue Mar 27, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@rinigarg15
rinigarg15 commented Mar 27, 2025

Please read this first

  • Have you read the docs? Agents SDK docs - Yes
  • Have you searched for related issues? Others may have faced similar issues. - Yes

Describe the bug

A clear and concise description of what the bug is.

Debug information

  • Agents SDK version: (e.g. v0.0.3)
  • Python version (e.g. Python 3.10)

Repro steps

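import asyncio
from collections.abc import AsyncIterator
from typing import Callable

from agents import Agent, Runner, TResponseInputItem
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions
from agents.voice import (
    STTModelSettings,
    StreamedAudioInput,
    TTSModelSettings,
    VoicePipeline,
    VoicePipelineConfig,
    VoiceWorkflowBase,
    VoiceWorkflowHelper,
)

# Not shown in this report: the tool functions passed to the agent and the
# run_voice_pipeline, send_mic_audio, and keyboard_listener helpers.

custom_stt_settings = STTModelSettings(
    language="English"
)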

custom_tts_settings = TTSModelSettings(
    voice="alloy"
)

pipeline_config = VoicePipelineConfig(
    stt_settings=custom_stt_settings,
    tts_settings=custom_tts_settings
)


async def main() -> None:
    # Set up the voice pipeline, microphone audio reading, and keyboard listener concurrently.

    should_send_audio = asyncio.Event()

    pipeline = VoicePipeline(
        workflow=MyWorkflow(on_start=lambda transcription: print(f"Transcription: {transcription}")),
        config=pipeline_config,
        stt_model="gpt-4o-mini-transcribe",

    )
    audio_input = StreamedAudioInput()


    task_pipeline = asyncio.create_task(run_voice_pipeline(pipeline, audio_input))
    task_mic = asyncio.create_task(send_mic_audio(audio_input, should_send_audio))
    task_keyboard = asyncio.create_task(keyboard_listener(should_send_audio))

    await task_keyboard

    task_pipeline.cancel()
    task_mic.cancel()
    await asyncio.gather(task_pipeline, task_mic, return_exceptions=True)

"""This is the Agent"""
agent = Agent(
    name="Assistant",
    instructions=prompt_with_handoff_instructions(
        "You are a Fleet Assistant, an intelligent assistant that transforms raw data into clear and natural language summaries. Your task is to analyze the provided information and convey the key details in a simple, concise, and professional manner. Take inputs for the particular tool one after the other. Today's date is: 24-03-2025. Take input in English and respond in English. Wait for the user to complete their sentences; don't start generating a response after a pause.",
    ),
    model="gpt-4o-mini",
    # handoffs=[spanish_agent],
    tools=[
        fetch_alerts,
        fetch_trip_data,
        getmaintainencereport,
        Vehicle_Location_Status_lasttrip,
        getfleethighlights,
        getvehicleidlestatus,
    ],
)

"""This is the WorkFlow class"""
class MyWorkflow(VoiceWorkflowBase):
    def __init__(self, on_start: Callable[[str], None]):
        """
        Args:
            on_start: A callback that is called when the workflow starts. The transcription
                is passed in as an argument.
        """
        self._input_history: list[TResponseInputItem] = []
        self._current_agent = agent
        self._on_start = on_start

    async def run(self, transcription: str) -> AsyncIterator[str]:
        self._on_start(transcription)

        self._input_history.append(
            {
                "role": "user",
                "content": transcription,
            }
        )

        result = Runner.run_streamed(self._current_agent, self._input_history)

        async for chunk in VoiceWorkflowHelper.stream_text_from(result):
            yield chunk

        self._input_history = result.to_input_list()
        self._current_agent = result.last_agent 

Ideally provide a minimal python script that can be run to reproduce the bug.

Expected behavior

A clear and concise description of what you expected to happen.

rinigarg15 added the bug label Mar 27, 2025

@rm-openai
Collaborator

@dkundel-openai any ideas? It looks like this is real and valid input.

@rinigarg15
Author

rinigarg15 commented Mar 28, 2025

Some more examples of random transcripts the VoicePipeline generates between conversations:

  1. Transcription: kudich
  2. Transcription: Peder
  3. Transcription: Это!
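
Since the transcripts listed above appear between utterances, one thing to try is dropping near-silent microphone chunks before they reach the pipeline. This is an unconfirmed workaround, not something the maintainers suggested in this thread. A minimal sketch, assuming the microphone delivers 16-bit little-endian mono PCM; `rms_int16`, `is_near_silent`, and the threshold of 500 are hypothetical names and values, not part of the Agents SDK:

```python
import math
import struct

# Hypothetical threshold on a 16-bit scale (max amplitude 32767); tune per microphone.
SILENCE_RMS_THRESHOLD = 500.0


def rms_int16(chunk: bytes) -> float:
    """Return the root-mean-square amplitude of 16-bit little-endian PCM bytes."""
    n = len(chunk) // 2  # number of whole 16-bit samples
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", chunk[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def is_near_silent(chunk: bytes, threshold: float = SILENCE_RMS_THRESHOLD) -> bool:
    """Return True when the chunk's RMS amplitude is below the threshold."""
    return rms_int16(chunk) < threshold
```

In a microphone loop such as the `send_mic_audio` coroutine from the repro, chunks for which `is_near_silent` returns True could be skipped instead of forwarded. A fixed threshold may clip the start of quiet speech, so it would need tuning against real recordings.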

@Bhushan-Shenvi

@rinigarg15 did you find any solution here?

@rinigarg15
Author

> @rinigarg15 did you find any solution here?

Nope. The OpenAI team didn't get back with a solution. I haven't tested it in a while now.

@Bhushan-Shenvi

Oh, we are also experiencing the same in the latest build. Voice is our app's core functionality. @dkundel-openai @rm-openai, any thoughts?

Development

No branches or pull requests

3 participants