-
Notifications
You must be signed in to change notification settings - Fork 922
Migrate to orjson (JSON library) for better performance #2016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
🎉 All Contributor License Agreements have been signed. Ready to merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR migrates the codebase from Python's standard json
library to orjson
for improved JSON parsing and serialization performance. The migration involves replacing json.loads()
and json.dumps()
calls with their orjson
equivalents across multiple schema registry and serialization modules.
- Replaces all
json
imports withorjson
imports - Updates all
json.loads()
calls toorjson.loads()
- Updates all
json.dumps()
calls toorjson.dumps()
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
src/confluent_kafka/schema_registry/common/avro.py |
Updates JSON parsing in Avro schema handling |
src/confluent_kafka/schema_registry/_sync/schema_registry_client.py |
Updates JSON serialization in sync schema registry client |
src/confluent_kafka/schema_registry/_sync/json_schema.py |
Updates JSON operations in sync JSON schema handling |
src/confluent_kafka/schema_registry/_sync/avro.py |
Updates JSON parsing in sync Avro schema handling |
src/confluent_kafka/schema_registry/_async/schema_registry_client.py |
Updates JSON serialization in async schema registry client |
src/confluent_kafka/schema_registry/_async/json_schema.py |
Updates JSON operations in async JSON schema handling |
src/confluent_kafka/schema_registry/_async/avro.py |
Updates JSON parsing in async Avro schema handling |
src/confluent_kafka/kafkatest/verifiable_client.py |
Updates JSON serialization in test client |
src/confluent_kafka/avro/serializer/message_serializer.py |
Updates JSON parsing in Avro message serializer |
requirements/requirements-schemaregistry.txt |
Adds orjson dependency |
self.dbg('SEND: %s' % orjson.dumps(d)) | ||
sys.stdout.write('%s\n' % orjson.dumps(d)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
orjson.dumps() returns bytes, not a string. This will cause a TypeError when used with string formatting. Convert to string using .decode() or use orjson.dumps(d).decode().
self.dbg('SEND: %s' % orjson.dumps(d)) | |
sys.stdout.write('%s\n' % orjson.dumps(d)) | |
self.dbg('SEND: %s' % orjson.dumps(d).decode()) | |
sys.stdout.write('%s\n' % orjson.dumps(d).decode()) |
Copilot uses AI. Check for mistakes.
self.dbg('SEND: %s' % json.dumps(d)) | ||
sys.stdout.write('%s\n' % json.dumps(d)) | ||
self.dbg('SEND: %s' % orjson.dumps(d)) | ||
sys.stdout.write('%s\n' % orjson.dumps(d)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
orjson.dumps() returns bytes, not a string. This will cause a TypeError when writing to stdout. Convert to string using .decode() or use orjson.dumps(d).decode().
sys.stdout.write('%s\n' % orjson.dumps(d)) | |
sys.stdout.write('%s\n' % orjson.dumps(d).decode()) |
Copilot uses AI. Check for mistakes.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fangnx , you need to replace orjson.dumps(x)
with orjson.dumps(x).decode('utf-8')
@@ -2,3 +2,4 @@ attrs>=21.2.0 | |||
cachetools>=5.5.0 | |||
httpx>=0.26 | |||
authlib>=1.0.0 | |||
orjson >= 3.10,<4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
orjson >= 3.10,<4 | |
orjson >= 3.10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rayokota! I updated the PR to fix it. I am surprised none of the unit tests caught it
1082565
to
4e55ba7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @fangnx , LGTM
What
Checklist
References
JIRA: https://confluentinc.atlassian.net/browse/DGS-21747
Issue: #1177
Test & Review
Open questions / Follow-ups