Skip to content

Implement an allow_null_bytes argument to CharField (default True) #6068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

jleclanche
Copy link
Contributor

Description

NULL bytes within strings are forbidden in most database engines (tested against Postgres). Serializers should be able to catch this out of the box as it's a very common use case for serializers to be fed straight into a database.

The argument allow_null_bytes, if set to False, will reject any string containing \0 NULLs inside of it with its own error.

I initially called it allow_nulls, but that got confusing as there already is an allow_null argument.

I set the value to True by default so as not to break backwards compatibility. I am of the opinion however that the sane behaviour is to set it to False by default.

def test_allow_null_bytes(self):
field = serializers.CharField(allow_null_bytes=True)
for value in ('\0', 'foo\0', '\0foo', 'foo\0foo'):
field.run_validation('\0')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, these tests are using the hardcoded value instead of the one provided in the for loop.


for value in ('\0', 'foo\0', '\0foo', 'foo\0foo'):
with pytest.raises(serializers.ValidationError) as exc_info:
field.run_validation('\0')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@rpkilby
Copy link
Member

rpkilby commented Jul 6, 2018

Hi @jleclanche - thanks for the contribution. My gut reaction is that this is straightforward enough to be implemented as a validator function and doesn't need to exist in core. That said, I'd be interested in what others have to say.

@carltongibson
Copy link
Collaborator

...this is straightforward enough to be implemented as a validator function...

Absolutely. (Especially vs introducing a new kwarg.)

@jleclanche
Copy link
Contributor Author

@carltongibson @rpkilby Part of the reason I went this route rather than adding a validator was to potentially discuss making this default behaviour for CharField.

As most databases refuse null bytes in varchar and text, and most serializers.CharField instances end up in a database one way or another, I suspect most serializers in the wild out there do not correctly validate against null bytes.

@tomchristie
Copy link
Member

I’d be okay with us defaulting to False for 3.9. Anyone else?

@tomchristie
Copy link
Member

Also we’re using six.text_type multiple times over - it’d be nice if we could finesse that a little.

@jleclanche
Copy link
Contributor Author

Fixed the tests and rebased on master.

@carltongibson
Copy link
Collaborator

Django ships with a validator for this:

https://github.com/django/django/blob/38e904e26576abffed3c257a1a741e1641c6f2de/django/core/validators.py#L518

If we’re going to do this we should use that.

forms.Charfield applies this validator by default, without a kwarg option:

https://github.com/django/django/blob/38e904e26576abffed3c257a1a741e1641c6f2de/django/forms/fields.py#L218

My initial response would be to advise doing just that in a subclass.

If not, i.e. if there’s a case for us doing this too, then do we really have to have the kwarg? It seems we’d be saying “you really should do this but we we're not going to turn it on for you”.

(In this case people needing the old behaviour would need the subclass. But isn’t the point that we should do this ourselves just that you wouldn’t need the old behaviour...?)

I guess I suspect most people are never hitting this, or we’d have a million reports over the years of unexpected 500s.

@rpkilby
Copy link
Member

rpkilby commented Jul 8, 2018

I'd be +1 for following Django's lead here and just adding the validator by default and leaving off the kwarg option. The breaking change in 3.9 would also be acceptable to me, given how minor it is.

What pushes me towards adding this as a default behavior is:

  1. Django supports this.
  2. As @jleclanche stated, this is the correct thing to do from a database perspective.

If users need to override, they can pop the validator from the list (note: we should probably provide a terse example of that in the changelog).

@rpkilby
Copy link
Member

rpkilby commented Jul 8, 2018

Fixed the tests and rebased on master.

GitHub doesn't allow you to reopen a closed PR after the branch has been rebased. @jleclanche - could you submit a new PR?

@jleclanche
Copy link
Contributor Author

I'll work on the changes and create a new PR in the next few hours.

@rpkilby
Copy link
Member

rpkilby commented Jul 8, 2018

Great! Thanks.

@jleclanche
Copy link
Contributor Author

@rpkilby Just for clarification, should I be reusing django's ProhibitNullCharactersValidator or will I be reimplementing it in drf? I'm not too familiar with validators as is.

@rpkilby
Copy link
Member

rpkilby commented Jul 8, 2018

Django Validators are compatible with DRF, so I would use ProhibitNullCharactersValidator. We already use a few of them.

from django.core.validators import (
EmailValidator, RegexValidator, URLValidator, ip_address_validators
)

@rpkilby
Copy link
Member

rpkilby commented Jul 8, 2018

One thing to note is that importing the validator will fail for Django 1.11, as it only exists in Django 2.0 and later. Two ways to handle this:

  • You can add the validator to CharField for only Django 2.0 and above.
  • You can attempt to import the validator in compat, then define it if the import fails. eg,
    try:
        from django.core.validators import ProhibitNullCharactersValidator
    except ImportError:  # Django 1.11
        class ProhibitNullCharactersValidator(object):
            # Copied from Django 
            ...

@jleclanche
Copy link
Contributor Author

Oh hum.

I suppose I'm okay validating this only on Django 2.0+, seeing as it would actually follow Django's behaviour then. Thoughts?

@rpkilby
Copy link
Member

rpkilby commented Jul 8, 2018

I would be fine with that, as it matches Django's behavior.

It also means less code to cleanup later.

@tomchristie
Copy link
Member

I'd be +1 for following Django's lead here and just adding the validator by default and leaving off the kwarg option. The breaking change in 3.9 would also be acceptable to me, given how minor it is.

Sounds good.

I suppose I'm okay validating this only on Django 2.0+, seeing as it would actually follow Django's behaviour then. Thoughts?

Either way would be acceptable.

@jleclanche
Copy link
Contributor Author

Followup in #6073

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants