-
-
Notifications
You must be signed in to change notification settings - Fork 7k
Implement an allow_null_bytes argument to CharField (default True) #6068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
def test_allow_null_bytes(self): | ||
field = serializers.CharField(allow_null_bytes=True) | ||
for value in ('\0', 'foo\0', '\0foo', 'foo\0foo'): | ||
field.run_validation('\0') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, these tests are using the hardcoded value instead of the one provided in the for loop.
|
||
for value in ('\0', 'foo\0', '\0foo', 'foo\0foo'): | ||
with pytest.raises(serializers.ValidationError) as exc_info: | ||
field.run_validation('\0') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
Hi @jleclanche - thanks for the contribution. My gut reaction is that this is straightforward enough to be implemented as a validator function and doesn't need to exist in core. That said, I'd be interested in what others have to say. |
Absolutely. (Especially vs introducing a new kwarg.) |
@carltongibson @rpkilby Part of the reason I went this route rather than adding a validator was to potentially discuss making this default behaviour for CharField. As most databases refuse null bytes in varchar and text, and most serializers.CharField instances end up in a database one way or another, I suspect most serializers in the wild out there do not correctly validate against null bytes. |
I’d be okay with us defaulting to False for 3.9. Anyone else? |
Also we’re using six.text_type multiple times over - it’d be nice if we could finesse that a little. |
Fixed the tests and rebased on master. |
Django ships with a validator for this: If we’re going to do this we should use that.
My initial response would be to advise doing just that in a subclass. If not, i.e. if there’s a case for us doing this too, then do we really have to have the kwarg? It seems we’d be saying “you really should do this but we we're not going to turn it on for you”. (In this case people needing the old behaviour would need the subclass. But isn’t the point that we should do this ourselves just that you wouldn’t need the old behaviour...?) I guess I suspect most people are never hitting this, or we’d have a million reports over the years of unexpected 500s. |
I'd be +1 for following Django's lead here and just adding the validator by default and leaving off the kwarg option. The breaking change in 3.9 would also be acceptable to me, given how minor it is. What pushes me towards adding this as a default behavior is:
If users need to override, they can pop the validator from the list (note: we should probably provide a terse example of that in the changelog). |
GitHub doesn't allow you to reopen a closed PR after the branch has been rebased. @jleclanche - could you submit a new PR? |
I'll work on the changes and create a new PR in the next few hours. |
Great! Thanks. |
@rpkilby Just for clarification, should I be reusing django's ProhibitNullCharactersValidator or will I be reimplementing it in drf? I'm not too familiar with validators as is. |
Django Validators are compatible with DRF, so I would use ProhibitNullCharactersValidator. We already use a few of them. django-rest-framework/rest_framework/fields.py Lines 16 to 18 in a027791
|
One thing to note is that importing the validator will fail for Django 1.11, as it only exists in Django 2.0 and later. Two ways to handle this:
|
Oh hum. I suppose I'm okay validating this only on Django 2.0+, seeing as it would actually follow Django's behaviour then. Thoughts? |
I would be fine with that, as it matches Django's behavior. It also means less code to cleanup later. |
Sounds good.
Either way would be acceptable. |
Followup in #6073 |
Description
NULL bytes within strings are forbidden in most database engines (tested against Postgres). Serializers should be able to catch this out of the box as it's a very common use case for serializers to be fed straight into a database.
The argument
allow_null_bytes
, if set to False, will reject any string containing\0
NULLs inside of it with its own error.I initially called it
allow_nulls
, but that got confusing as there already is anallow_null
argument.I set the value to
True
by default so as not to break backwards compatibility. I am of the opinion however that the sane behaviour is to set it to False by default.