@pull pull bot commented Aug 30, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.3)


kaxil and others added 6 commits August 30, 2025 00:29
Remove Task SDK dependencies from airflow-core deserialization by establishing
a schema-based contract between client and server components. This
change enables independent deployment and upgrades while laying the foundation
for multi-language SDK support.

Key Decoupling Achievements:
- Replace dynamic get_serialized_fields() calls with hardcoded class methods
- Add schema-driven default resolution with get_operator_defaults_from_schema()
- Remove OPERATOR_DEFAULTS import dependency from airflow-core
- Implement SerializedBaseOperator class attributes for all operator defaults
- Update _is_excluded() logic to use schema defaults for efficient serialization
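As a rough illustration of schema-driven default resolution, the sketch below reads `default` values out of a JSON-Schema-like dict and uses them to decide whether a field can be skipped. The schema fragment, the helper names `defaults_from_schema` and `is_excluded`, and the field names are all hypothetical; this is not the actual Airflow schema or the real signature of get_operator_defaults_from_schema().

```python
from typing import Any

# Hypothetical schema fragment; the real serialization schema is not reproduced here.
OPERATOR_SCHEMA: dict[str, Any] = {
    "properties": {
        "retries": {"type": "integer", "default": 0},
        "pool": {"type": "string", "default": "default_pool"},
        "task_id": {"type": "string"},  # no default, so it is always serialized
    }
}


def defaults_from_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Collect every property that declares a default value."""
    return {
        name: spec["default"]
        for name, spec in schema.get("properties", {}).items()
        if "default" in spec
    }


def is_excluded(field: str, value: Any, defaults: dict[str, Any]) -> bool:
    """A field can be skipped when its value equals the schema default."""
    return field in defaults and defaults[field] == value
```

Because the defaults come from a shared schema rather than an imported Python constant, the server side in this sketch needs no import from the client package.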

Serialization Optimizations:
- Unified partial_kwargs optimization supporting both encoded/non-encoded formats
- Intelligent default exclusion reducing storage redundancy
- MappedOperator.operator_class memory optimization (~90-95% reduction)
- Comprehensive client_defaults system with hierarchical resolution
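One way to picture hierarchical resolution together with default exclusion is the sketch below. It assumes a precedence of explicit task value, then client_defaults, then schema defaults; that ordering, the function names, and the sample fields are assumptions made for illustration and are not taken from the commit.

```python
from collections import ChainMap
from typing import Any


def strip_defaults(fields: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Serialization side: drop values that merely repeat a known default."""
    return {k: v for k, v in fields.items() if defaults.get(k, object()) != v}


def resolve_fields(
    task_fields: dict[str, Any],
    client_defaults: dict[str, Any],
    schema_defaults: dict[str, Any],
) -> dict[str, Any]:
    """Deserialization side: the first mapping that has a key wins (assumed order)."""
    return dict(ChainMap(task_fields, client_defaults, schema_defaults))


schema_defaults = {"retries": 0, "pool": "default_pool"}
client_defaults = {"retries": 2, "owner": "team"}

# Only the value that differs from the client default survives serialization.
stored = strip_defaults({"retries": 3, "owner": "team"}, client_defaults)
restored = resolve_fields(stored, client_defaults, schema_defaults)
```

Storing the shared defaults once and omitting them per task is what reduces redundancy in this sketch; the full set of values is rebuilt at read time.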

Compatibility & Performance:
- Significant size reduction for typical DAGs with mapped operators
- Minimal overhead for the client_defaults section
- All existing serialized DAGs continue to work unchanged

Technical Implementation:
- Add generate_client_defaults() with LRU caching for optimal performance
- Implement _deserialize_partial_kwargs() supporting dual formats
- Centralized field deserialization eliminating code duplication
- Consolidated preprocessing logic in _preprocess_encoded_operator()
- Callback field preprocessing for backward compatibility

Testing & Validation:
- Added TestMappedOperatorSerializationAndClientDefaults with 9 comprehensive tests
- Parameterized tests for multiple serialization formats
- End-to-end validation of serialization/deserialization workflows
- Backward compatibility validation for callback field migration

This decoupling enables independent deployment and upgrades, and provides the
foundation for a multi-language SDK ecosystem alongside the Task Execution API.

Part of #45428

The workaround for zmievsa/cadwyn#262 is no longer needed since we now
require `cadwyn>=5.2.1`, which fixed the `JsonValue` compatibility issue
with Python versions below 3.12.

The exception raised here won't always be AirflowException. It could be ImportError or something else entirely. Failing to catch an exception here could cause the dag processor to halt.
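The concern in the last commit message can be illustrated with a minimal sketch. It is not the dag processor's actual code; `load_module_safely` is a hypothetical helper that shows why a broad `except Exception` keeps a long-running loop alive when an import fails for an unexpected reason.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

log = logging.getLogger(__name__)


def load_module_safely(name: str) -> tuple[Optional[ModuleType], Optional[str]]:
    """Import a module, returning (module, None) or (None, error message)."""
    try:
        return importlib.import_module(name), None
    except Exception as exc:  # deliberately broad: ImportError, SyntaxError, etc.
        log.exception("Failed to import %s", name)
        return None, f"{type(exc).__name__}: {exc}"


ok_module, ok_error = load_module_safely("json")
bad_module, bad_error = load_module_safely("no_such_module_for_this_sketch")
```

Catching only one exception class here would let anything else propagate and stop the caller; recording the error and moving on lets the remaining files still be processed.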
@pull pull bot locked and limited conversation to collaborators Aug 30, 2025
@pull pull bot added the ⤵️ pull label Aug 30, 2025
@pull pull bot merged commit 27f60b0 into boost-entropy-python:main Aug 30, 2025