Skip to content

Remove source expression deduplication. #499

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 24, 2022

Conversation

lgarron
Copy link
Contributor

@lgarron lgarron commented Oct 19, 2022

This PR removes dedup_source_list and replaces it with a simple .uniq call. This resolves #491, which is only the latest in a series of ongoing issues with source expression deduplication.

secure_headers has had this feature since 2015 that deduplicates redundant URL source expressions. For example, if *.github.com is listed as a source expression for a given directive, then the addition of example.github.com would have no effect, and so the latter can be safely removed by secure_headers to save bytes.

Unfortunately, this implementation has had various bugs due to the use of "impedance mismatched" APIs like URI1 and File.fnmatch2. For example, it made incorrect assumptions about source expression schemes, leading to the following series of events:

  • 2017-03: A bug was reported and confirmed
  • 2022-04: The bug was finally fixed by @keithamus (a Hubber) in 2022 due to our use of web sockets.
  • 2022-06: This fix in turn triggered a new bug with source expressions like data:.
  • 2022-06: An external contributor submitted a fix for the bew bug, but this still doesn't address some of the "fast and loose" semantic issues of the underlying implementation.
  • 2022-08: @lgarron drafted a new implementation that semantically parses and compares source expressions based on the specification for source expressions.
    • This implementation already proved to have some value in early testing, as its stricter validation caught an issue in github.com's CSP. However, it would take additional work to make this implementation fully aware of CSP syntax (e.g. not allowing URL source expressions in a source directive when only special keywords are allowed, and vice-versa), and it relies on a new regex-based implementation of source expression parsing that may very well lead to more subtle bugs.

In effect, this is a half feature whose maintenance cost has outweighed its functionality:

  • The relevant code has suffered from continued bugs, described as above.
  • Deduplication is purely a "nice-to-have" — it is not necessary for the security or correct functionality of secure_headers.
  • It was introduced by @oreoshake (the then-maintainer) without explanation in 2015, never "officially" documented. We have no concrete data on whether it has any performance impact on any real apps — for all we know, uncached deduplication calculations might even cost more than the saved header bytes.
  • Further, in response to the first relevant bug, @oreoshake himself said:

I've never been a fan of the deduplication based on * anyways. Maybe we should just rip that out.

Like people trying to save a few bytes can optimize elsewhere.

So this PR completely removes the functionality. If we learn of a use case where this was very important (and the app somehow can't preprocess the list before passing it to secure_headers), we can always resume consideration of one of:

Footnotes

  1. Which allows wildcards in domains but not for ports, as it is not designed to parse URL source expressions.

  2. Which has general glob matching that is not designed for URL source expressions either.

@lgarron lgarron force-pushed the lgarron/remove-source-expression-dedup branch from 8097494 to 70af13b Compare October 24, 2022 18:01
@lgarron lgarron changed the base branch from main to lgarron/update-ci October 24, 2022 18:01
@lgarron lgarron force-pushed the lgarron/remove-source-expression-dedup branch from 70af13b to cd73b02 Compare October 24, 2022 18:01
Base automatically changed from lgarron/update-ci to main October 24, 2022 19:01
lgarron added a commit that referenced this pull request Oct 24, 2022
Ruby 2.5 has been failing on CI since
#499 and is no longer
supported.
@lgarron lgarron merged commit b6ef2ed into main Oct 24, 2022
@lgarron lgarron deleted the lgarron/remove-source-expression-dedup branch October 24, 2022 19:03
lgarron added a commit that referenced this pull request Oct 24, 2022
Release notes:

- CSP: Remove source expression deduplication. (@lgarron) #499
@lgarron lgarron mentioned this pull request Oct 24, 2022
lgarron added a commit that referenced this pull request Oct 24, 2022
Release notes:

- CSP: Remove source expression deduplication. (@lgarron) #499
lgarron added a commit that referenced this pull request Oct 24, 2022
Release notes:

- CSP: Remove source expression deduplication. (@lgarron)
#499
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URI::InvalidURIError: Invalid data URI
3 participants