Skip to content

[Form] Disallow transform dates beyond the year 9999 #25781

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

curry684
Copy link
Contributor

@curry684 curry684 commented Jan 13, 2018

Fixes #14727

Q A
Branch? 2.7
Bug fix? yes
New feature? no
BC breaks? not really
Deprecations? no
Tests pass? yes
Fixed tickets #14727
License MIT

Explicitly locked out submission of dates beyond December 31st 9999 in forms as PHP is highly incapable of consistently handling such dates. Before this patch dates were randomly transformed or mangled.

Technically there is a BC break as this will now cause validation to fail on input that was accepted before, but it was mangled. Not my call but I prefer the rejection over data corruption:

// Old behavior
$transformer = new DateTimeToLocalizedStringTransformer('UTC', 'UTC', null, null, \IntlDateFormatter::GREGORIAN, 'yyyy-MM-dd');
$result = $transformer->reverseTransform('20107-03-21');

// $result is now 2007-03-21

Copy link
Contributor

@ostrolucky ostrolucky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Artificial limit of max timestamp with a constant is not a right approach. To keep being compatible with environments where PHP correctly parses even such high values, this fix should instead improve detection when PHP is unable to correctly parse such date and only then throw such exception.

You posted several code snippets in referenced issue which give you pointers how to do that (exception thrown / returned false)

@curry684
Copy link
Contributor Author

curry684 commented Jan 13, 2018

@ostrolucky those code snippets are the result of an hour of research convincing me that this is the only viable implementation. Every other attempt at solving the issue resulted in test cases breaking because, at its core, all of PHP simply cannot support 5 digit years. The DateTimeToStringTransformer being patched here actually turned out to be the only loophole that has not yet been patched to conform to strict 4-digit years by regexp or other means. The reason it was not breaking here and thus causing invalid results is that IntlDateTransformer appears to be the only PHP facility capable of parsing 5-digit years, and the original author of this class never foresaw that, expecting it to crash like all other functions.

The problem in this very specific case is that there's no earlier detection mechanism. The Intl extension is out of our control, and this is the localized transformer class. It may be parsing Klingon Star Dates for all we know. What we do know is that it'll get unreliable once we put it in a \DateTime class, yet we cannot detect it there anymore - it'll just be 2007 instead of 20107.

So no, I do not believe there is a better alternative than the current fix. All others, like regexping, will boil down to slower implementations to the same effect - we have to act on the timestamp coming out of Intl. Integer comparison is just fastest.

@ostrolucky
Copy link
Contributor

Here you go https://3v4l.org/qK9BQ

Support for such dates should be now just a matter of using this snippet here

@curry684
Copy link
Contributor Author

That snippet does not solve the issue, and that is that the DateTimeToLocalizedStringTransformer needs to strip the time in a specific timezone aware fashion. Believe me, it's so trivial it was just about my first attempt, and it breaks 3 unit tests concerning DST-specific overflows:

$dateTime = new \DateTime(gmdate('Y-m-d', $timestamp), new \DateTimeZone($this->outputTimezone));

Does not equal the much cleaner:

$dateTime = \DateTime::createFromFormat('U', $timestamp, new \DateTimeZone($this->outputTimezone));
$dateTime->setTime(0, 0, 0);

Because gmdate inherently corrects for DST.

I'm not contesting there are other, prettier ways to fundamentally fix the behavior, but it would be a far more invasive change than what's being done now, and that's a bad idea imo to just fix a simple bug in 2.7.

Please keep in mind that all other DateTime transformers are already implicitly broken for 5+ digit years, and that the DateValidator and related classes do a regexp check to ensure the year consists of 4 digits. There is no point in doing a better fix here than what I proposed. There is no functional difference between doing a timestamp-compare with a well-defined and -commented constant, or doing a preg_match('/(\d{4})-(\d{2})-(\d{2})/', ...) which we do elsewhere in this very code. The timestamp compare is just much faster and cleaner.

@ostrolucky
Copy link
Contributor

ostrolucky commented Jan 15, 2018

Ok how about this? Passes the unit tests and still preserves 5 digit year

@@ -125,16 +125,16 @@
         }
 
         try {
-            if ($dateOnly) {
-                // we only care about year-month-date, which has been delivered as a timestamp pointing to UTC midnight
-                $dateTime = new \DateTime(gmdate('Y-m-d', $timestamp), new \DateTimeZone($this->outputTimezone));
-            } else {
-                // read timestamp into DateTime object - the formatter delivers a timestamp
-                $dateTime = new \DateTime(sprintf('@%s', $timestamp));
-            }
+            $dateTime = new \DateTime(sprintf('@%s', $timestamp));
+            $timezone = new \DateTimeZone($this->outputTimezone);
             // set timezone separately, as it would be ignored if set via the constructor,
             // see http://php.net/manual/en/datetime.construct.php
-            $dateTime->setTimezone(new \DateTimeZone($this->outputTimezone));
+            $dateTime->setTimezone($timezone);
+
+            if ($dateOnly) {
+                // supplied timestamp in UTC midnight has been shifted by timezone, we need to revert that
+                $dateTime->sub(\DateInterval::createFromDateString($timezone->getOffset($dateTime).' seconds'));
+            }
         } catch (\Exception $e) {
             throw new TransformationFailedException($e->getMessage(), $e->getCode(), $e);
         }

other DateTime transformers are already implicitly broken for 5+ digit years

That's kinda right. That might be battles for another pull requests if someone creates issue for them similarly as has been created for this one.

DateValidator and related classes do a regexp check to ensure the year consists of 4 digits

DateValidator skips the check if it receives DateTimeInterface instance already. That poses another problem, but again, that's a battle for different time

IMO suporting > 4 digit years might be useful in science and we should strive to support it if we can. There is also no limit on year digits for HTML5 date input, Firefox lets me input there whichever year I want, so there shouldn't be such artificial limit in backend either.

There aren't much requests for such support (for now) but giving up right away by throwing exception is not a way. This PR might be good opportunity to start supporting it and lead the way for future support.

edit: found this relevant wiki article https://en.wikipedia.org/wiki/Year_10,000_problem, this problem is interesting

@curry684
Copy link
Contributor Author

That might be battles for another pull requests if someone creates issue for them similarly as has been created for this one.

But it would be a functional change, so master instead of 2.7 😉 The core point of this PR is to fix 2 pretty nasty bugs if you enter 20107-03-12 in a date field:

  • If the validation of the form fails the field is then rerendered with 2007-03-12 filled in, likely not noticed by the user, causing corruption on fixing the other validation errors.
  • If the validation passes the form field contains 2007-03-12, which will then happily be persisted in the database and cause corruption.

I don't think we disagree on any of the core points, I think we definitely should support 5 and even 6 digit years in Symfony. But right now we don't and functionality is broken. This PR fixes the data corruption in 2.7 in the least invasive way possible. The fundamental fixes should go in master.

@curry684
Copy link
Contributor Author

Btw this line from that Wikipedia link is interesting:

ISO 8601 specifies that years be written with four digits, but allows for extension to five or more digits, with prior agreement between the parties exchanging the information.

Limiting to 4 digits is technically adhering to standards right now.

@ostrolucky
Copy link
Contributor

ostrolucky commented Jan 15, 2018

But it would be a functional change, so master instead of 2.7 😉

Not really. If someone raises bug report about date input mangled by transformer (as was the issue this PR fixes), we can fix that by supporting > 4 digits.

The core point of this PR is to fix 2 pretty nasty bugs if you enter 20107-03-12 in a date field:

My patch fixes those

ISO 8601 specifies that years be written with four digits, but allows for extension to five or more digits, with prior agreement between the parties exchanging the information.

My interpretation is that year is written with four digits initially, but you are free to add more of them. Like HTML 5 date input does.

But ok, this should be agreed on in RFC

@xabbuh
Copy link
Member

xabbuh commented Jan 15, 2018

this looks reasonable to me

@ogizanagi @HeahDude What do you think?

@HeahDude
Copy link
Contributor

Sounds reasonable for me too. A proper support for 5 and greater digits should have a dedicated feature PR.

@ogizanagi
Copy link
Contributor

I agree. Let's do this (and create a new issue for handling 5+ digits)

@curry684
Copy link
Contributor Author

Already done at #25790 😄

@@ -22,6 +22,9 @@
*/
class DateTimeToLocalizedStringTransformer extends BaseDateTimeTransformer
{
// Midnight of last day of 9999 to prevent 5 digit years from being transformed
const MAX_TIMESTAMP = 253370674800;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the last day of the year 9998. Shouldn't this be 253402297199 (or at least 253402210800)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm you're right, I must've c/p'ed the wrong line in my tests haha. I did consciously go for UTC midnight of the 31st, as there are +13h timezones and this means it will never be the year 10000 at any (currently known) timezone for the given timestamp. The correct value would be 253402214400 which is 9999-12-31T00:00:00+00:00.

@nicolas-grekas nicolas-grekas added this to the 2.7 milestone Jan 16, 2018
@@ -22,6 +22,9 @@
*/
class DateTimeToLocalizedStringTransformer extends BaseDateTimeTransformer
{
// Midnight of last day of 9999 to prevent 5 digit years from being transformed
const MAX_TIMESTAMP = 253402214400;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a private static variable so that we won't have to maintain a new public const.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just placed inlined as it is used only once (a private static would force the compiler to consider it mutable)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, sucks that we don't have private const back in 2.7.

@fabpot
Copy link
Member

fabpot commented Jan 17, 2018

Thank you @curry684.

fabpot added a commit that referenced this pull request Jan 17, 2018
…y684)

This PR was squashed before being merged into the 2.7 branch (closes #25781).

Discussion
----------

[Form] Disallow transform dates beyond the year 9999

Fixes #14727

| Q             | A
| ------------- | ---
| Branch?       | 2.7
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | not really
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #14727
| License       | MIT

Explicitly locked out submission of dates beyond December 31st 9999 in forms as PHP is highly incapable of consistently handling such dates. Before this patch dates were randomly transformed or mangled.

Technically there is a BC break as this will now cause validation to fail on input that was *accepted* before, but it was mangled. Not my call but I prefer the rejection over data corruption:
```
// Old behavior
$transformer = new DateTimeToLocalizedStringTransformer('UTC', 'UTC', null, null, \IntlDateFormatter::GREGORIAN, 'yyyy-MM-dd');
$result = $transformer->reverseTransform('20107-03-21');

// $result is now 2007-03-21
```

Commits
-------

70cc969 [Form] Disallow transform dates beyond the year 9999
@fabpot fabpot closed this Jan 17, 2018
This was referenced Jan 29, 2018
@curry684 curry684 deleted the issue-14727 branch January 29, 2018 10:53
This was referenced Jan 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants