Skip to content

Modified reserved word field names included in field masks. #227

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
BenRKarl opened this issue Jun 29, 2021 · 7 comments · Fixed by #228
Closed

Modified reserved word field names included in field masks. #227

BenRKarl opened this issue Jun 29, 2021 · 7 comments · Fixed by #228
Assignees
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@BenRKarl
Copy link

I'm not sure whether this is an issue with this library, the generator or the field mask helper, but it seems to be incorrect behavior that's blocking for Ads users, unless there's a workaround or other solution I'm not aware of:

Environment details

  • Programming language: Python
  • OS: gLinux
  • Language runtime version: 3.7.0
  • Package version: 1.18.1

Steps to reproduce

Assuming we have a protobuf definition like the following:

message HasType {
  string type = 1;
}

We'll have a generated Python class similar to:

class HasType(proto.Message):
    type_ = proto.Field(proto.STRING, number=1,)

Where type has been modified to type_.

Normally setting type_ isn't a problem, the issue is that this modified field name gets leaked into field masks, which are revoked by the API because it doesn't match the protobuf definition. This prevents users from executing update requests where the type field on any object has been modified.

Here's a short script to reproduce:

from google.api_core import protobuf_helpers
import proto

class HasType(proto.Message):
    type_ = proto.Field(proto.STRING, number=1,)

ht = HasType()
ht.type_ = "has type"

fm = protobuf_helpers.field_mask(None, ht._pb)

print(fm.paths[0])
# >>> "type_"
@BenRKarl
Copy link
Author

Related bug in the Ads client library: googleads/google-ads-python#454

@software-dov
Copy link
Contributor

Turning type into type_ happens in the generator in wrappers.Field as a name collision resolution with python keywords and builtins. This is necessary because fields are defined as class attributes in the corresponding message. For example, the following is a syntax error:

class BadMessage(proto.Message):
    class = proto.Field(proto.STRING, number=1)

The type moniker is a builtin and not a keyword, which means that we can technically redefine it. There are multiple places where we use this builtin in the generated surface, though, and others are used as well, so we need to generally disambiguate them all.

Possible solutions:

  1. A combination of hacks in proto-plus and awareness in gapic-generator-python where we can define a field with a "proto" name that is used to describe the generated underlying proto descriptor.
    e.g.
class BetterMessage(proto.Message):
    # strawman name
    type_ = proto.Field(proto.STRING, number=1, disambiguated_name="type")

bm = BetterMessage()
bm.type_ = "has type"

fm = protobuf_helpers.field_mask(None, bm._pb)
assert fm.paths[0] == "type"

Pros:

  • It solves the issue at hand
  • If done correctly has no false negative or false positive

Cons:

  • Requires careful coordination between the two codebases
  • Technically a breaking change
  • Mismatch between the proto-plus name of a field and the vanilla proto name of the field is a wart
  1. Hack the field mask helper to remove trailing underscores, either unconditionally or only if the result would be a reserved word

Pros:

  • Quick and easy fix with no coordination required
  • Removes the proto-plus vs vanilla proto field name ambiguity

Cons:

  • Potential for false positives
  • This could easily blow up

What do you think?

@BenRKarl
Copy link
Author

Option 1 feels like the sensible long-term option. How long do you think it would take to implement this? If needed we could explore some interim solutions in our library.

@software-dov
Copy link
Contributor

I'm not 100% sure that option 1 is feasible. I'd want a day or so to run experiments indicating that the proposed change works the way I want it to, then some time to run a formal design through other people. Given those two things, I anticipate implementation to be trivial.
Once the proto-plus changes are merged, updating gapic-generator-python should be straightforward and fast.

@BenRKarl
Copy link
Author

Sounds good - can you let me know how the design/review process goes so that we can decide whether we need a temporary workaround in the library?

@busunkim96
Copy link
Contributor

My guess is that _ is stripped as part of the snake_case -> camelCase transformation during serialization.
https://github.com/googleapis/proto-plus-python/blob/360d24783db123f01703640202459eb922cf4974/proto/message.py#L306-L316

It looks like this is the reason we're able to have different proto-plus and vanilla proto names?

I cannot actually find a docstring for SerializeToString, but I see that _ is only removed with preserving_proto_field_name disabled for the json and dict conversion methods.

https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html

import json

import proto

class HasType(proto.Message):
    type_ = proto.Field(proto.STRING, number=1,)


ht = HasType(type_="has type")

print(HasType.to_json(ht, preserving_proto_field_name=False))  # {"type": "has type"}
print(HasType.to_json(ht, preserving_proto_field_name=True))  # {"type_": "has type"}

Are there rules around proto field naming that we could take advantage of to more safely pursue option 2?

@busunkim96
Copy link
Contributor

Recapping discussion in email thread:

@lukesneeringer pointed out that we can safely strip away trailing _, since it's not possible for fields to have trailing _ at the proto level (Creating a field like this generates an error from the protobuf runtime).

The easiest solution is to modify the field mask logic in google-api-core to strip away trailing _ before doing the comparison. This also doesn't require taking any breaking changes.

@busunkim96 busunkim96 transferred this issue from googleapis/proto-plus-python Jul 14, 2021
@busunkim96 busunkim96 added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Jul 14, 2021
@busunkim96 busunkim96 self-assigned this Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants