feat: #1614 gpt-realtime migration (Realtime API GA) #1646
base: main
examples/realtime/app/server.py
Outdated
# Disable server-side interrupt_response to avoid truncating assistant audio
session_context = await runner.run(
    model_config={
        "initial_model_settings": {
            "turn_detection": {"type": "semantic_vad", "interrupt_response": False}
        }
    }
)
do we need to do this by default? why?
I explored some changes to improve the audio output quality, but they're not related to the gpt-realtime migration, so I've reverted all of them. I'll keep looking into improvements for this example app, but that can be done in a separate pull request.
examples/realtime/app/server.py
Outdated
@@ -93,7 +111,9 @@ async def _serialize_event(self, event: RealtimeSessionEvent) -> dict[str, Any]:
    base_event["tool"] = event.tool.name
    base_event["output"] = str(event.output)
elif event.type == "audio":
    base_event["audio"] = base64.b64encode(event.audio.data).decode("utf-8")
    # Coalesce raw PCM and flush on a steady timer for smoother playback.
is this just a quality improvement? would be nice to make it be a separate PR if so
yeah, same as above (I won't repeat this for the rest)
examples/realtime/app/server.py
Outdated
    "type": event.data.type,
}
# Surface useful raw events to the UI with details.
if getattr(event.data, "type", None) == "transcript_delta":
plz no getattr
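For context, the hunk above already reads `event.data.type` directly two lines earlier, so the `getattr(..., None)` fallback is never needed there. A minimal sketch of the direct-access form follows. `RawData`, `RawEvent`, `serialize_raw`, and the `delta` field are hypothetical stand-ins for illustration, not the SDK's actual types:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class RawData:
    """Hypothetical stand-in for the raw model event payload."""

    type: str
    delta: str = ""  # assumed field, for illustration only


@dataclass
class RawEvent:
    """Hypothetical stand-in for the session event wrapping the payload."""

    data: RawData


def serialize_raw(event: RawEvent) -> dict[str, Any]:
    # `type` is a declared attribute, so plain attribute access is enough
    # and a type checker can verify it.
    raw: dict[str, Any] = {"type": event.data.type}
    # Surface useful raw events to the UI with details.
    if event.data.type == "transcript_delta":
        raw["delta"] = event.data.delta
    return raw


transcript = serialize_raw(RawEvent(RawData(type="transcript_delta", delta="hel")))
other = serialize_raw(RawEvent(RawData(type="other")))
```

Direct access also fails loudly if the attribute is ever missing, instead of silently skipping the branch.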
examples/realtime/app/server.py
Outdated
@@ -142,7 +195,8 @@ async def websocket_endpoint(websocket: WebSocket, session_id: str):
if message["type"] == "audio":
    # Convert int16 array to bytes
    int16_data = message["data"]
    audio_bytes = struct.pack(f"{len(int16_data)}h", *int16_data)
    # Send little-endian PCM16 to the model.
did this change as part of the GA?
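As background for the byte-order question: in Python's `struct` module, a format string with no prefix (as in the existing `struct.pack(f"{len(int16_data)}h", ...)` call) uses the host's native byte order, while a `<` prefix forces little-endian. The sketch below compares the two. The helper names are made up for illustration, and whether the PR's replacement code used this exact form is not visible in the hunk:

```python
import struct
import sys


def pack_pcm16_native(samples: list[int]) -> bytes:
    # No prefix: native byte order, matching the existing line in the diff.
    return struct.pack(f"{len(samples)}h", *samples)


def pack_pcm16_le(samples: list[int]) -> bytes:
    # "<" prefix: always little-endian, regardless of the host CPU.
    return struct.pack(f"<{len(samples)}h", *samples)


samples = [0, 1, -1, 32767, -32768]
le_bytes = pack_pcm16_le(samples)
native_bytes = pack_pcm16_native(samples)

# On little-endian hosts (x86, most ARM) the two encodings are identical.
same_on_this_host = native_bytes == le_bytes
```

On a little-endian server the explicit prefix changes nothing on the wire. It only matters if the example is ever run on a big-endian host.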
this is still in progress but will resolve #1614