-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Bugfix/eventbridge/transformer issue with nested key #11998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix/eventbridge/transformer issue with nested key #11998
Conversation
LocalStack Community integration with Pro 2 files ± 0 2 suites ±0 1h 8m 41s ⏱️ - 42m 32s Results for commit bf6335f. ± Comparison against base commit 5399278. This pull request removes 1555 and adds 15 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix!
I believe we might still have a few corner cases to cover, we could discover them with some parametrized unit tests for the is_nested_in_string
function, and maybe even the replace_template_placeholders
one.
I think we're not properly replacing dict
values inside JSON templates string fields, and I'm not 100% certain of the is_nested_in_string
check if the matched value is between quotes with nothing else, and it'd be great to validate those.
This documentation part is pretty useful to try to understand edge cases: https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-transform-target-input.html#eb-transform-input-issues
I think we should have all the cases in this table: https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-transform-target-input.html#eb-transform-target-input-template as part of unit tests for replace_template_placeholders
, as they are pretty well defined. Some of those could maybe be tackled in a follow up, but it'd be good to know how far we support.
Thanks!
@@ -248,3 +248,32 @@ def get_trace_header_encoded_region_account( | |||
return json.dumps({"original_id": original_id, "original_account": source_account_id}) | |||
else: | |||
return json.dumps({"original_account": source_account_id}) | |||
|
|||
|
|||
def is_nested_in_string(template, match) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to get unit tests for this function, as it is quite complex and could break in unexpected ways, and just to validate its limits. Do you think that'd be possible to add to the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added unit test for the function test_is_nested_in_string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks! I'll wait for the pipeline to be green and review the changes today. Thank you!
if is_json_template: | ||
if is_nested_in_string(template, match): | ||
return value # Return raw string for nested placeholders | ||
else: | ||
return to_json_str(value) | ||
if isinstance(value, datetime.datetime): | ||
return event_time_to_time_string(value) | ||
if isinstance(value, dict): | ||
return dict_to_simple_string(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the AWS documentation, it seems that if you set a dict
value inside a string in a json_template
, it will also strip the quotes out of it, like it would do in a regular string templates. It would be good to validate this in AWS.
I think the dict_to_simple_string
should also be used in the is_nested_in_string
condition if the value is a string, because now you'll return a dict
out of the replace function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand you correctly this scenario should be covered by your proposed test cast '{"method": "PUT", "path": "users-service/users/<payload>", "bod": "<payload>"},
since payload
is a dictionary and it is set in the middle part in the string "users-service/users/<payload>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will interestingly fail for AWS since it will not be correctly transformed since it would result in a string that is not a json string but the system expects a json string
I added a test for this failure also
[ | ||
'{"method": "PUT", "path": "users-service/users/<userId>", "bod": <payload>}', | ||
'"Payload of <payload> with path users-service/users/<userId>"', | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a few cases here, just to test the limit, and some are given by the docs:
'{"method": "PUT", "path": "users-service/users/<payload>", "bod": "<payload>"}
'{"method": "PUT", "path": "users-service/users/<userId>", "bod": "<userId>"}',
'{"method": "PUT", "path": "users-service/users/<userId>", "bod": [<userId>, "hardcoded"]}',
There's also this big one with multiline text in the documentation that raises questions for me:
https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-transform-target-input.html#eb-transform-input-issues
{
"detail": {
"severity": ["HIGH"],
"status": ["ACTIVE"]
},
"detail-type": ["Inspector2 Finding"],
"source": ["inspector2"]
}
Input Path
{
"account": "$.detail.awsAccountId",
"ami": "$.detail.resources[0].details.awsEc2Instance.imageId",
"arn": "$.detail.findingArn",
"description": "$.detail.description",
"instance": "$.detail.resources[0].id",
"platform": "$.detail.resources[0].details.awsEc2Instance.platform",
"region": "$.detail.resources[0].region",
"severity": "$.detail.severity",
"time": "$.time",
"title": "$.detail.title",
"type": "$.detail.type"
}
Input template
"<severity> severity finding <title>"
"Description: <description>"
"ARN: \"<arn>\""
"Type: <type>"
"AWS Account: <account>"
"Region: <region>"
"EC2 Instance: <instance>"
"Platform: <platform>"
"AMI: <ami>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added something similar to as a test case also with nested lists and target replacement values
): | ||
return False | ||
|
||
return left_quote != -1 and template[left_quote + 1 : right_quote].strip() != match.group(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I fully grasp this condition, does it mean that if we have "<placeholder>"
we want to return False
and add quotes to the value again? This might be worth to test it
@@ -273,3 +273,7 @@ def is_nested_in_string(template, match) -> bool: | |||
return False | |||
|
|||
return left_quote != -1 and template[left_quote + 1 : right_quote].strip() != match.group(0) | |||
|
|||
|
|||
def dict_to_simple_string(d): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed this one too: I think this does not cover all use case if there are nested values here.
It seems what AWS does is stripped all quotes from the JSON string, it might be easier that way.
Because what if you have a nested dict values here? I don't think that works, same if v
is a list with strings inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test case for nested dicts in string transformer '{"method": "PUT", "nested": {"level1": {"level2": {"level3": "users-service/users/<userId>"} } }, "bod": "<userId>"}',
7b32f39
to
1ca23e1
Compare
0b5aec2
to
29cfd7f
Compare
a9edd1a
to
f96f363
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, quite a lot of coverage now! Thank you for addressing the comments, I know it was a lot 😅 I think this is really great, thanks for fixing this issue! This was quite the rabbit hole...
I only have a small nit regarding the log statement dumping JSON.
I've done a run for -ext, and it was all green, so we should be on the safe side! LGTM! And congrats for this quite big addition! 😄
Motivation
InputTransformer map values from the event to keys defined in InputPathMap and these can then be used in InputTemplate to replace placeholders.
These placeholders can also be in a continuous string "string/" this was failing in localstack as raised here #11126
Changes