Remove source expression deduplication. #499
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR removes
dedup_source_list
and replaces it with a simple.uniq
call. This resolves #491, which is only the latest in a series of ongoing issues with source expression deduplication.secure_headers
has had this feature since 2015 that deduplicates redundant URL source expressions. For example, if*.github.com
is listed as a source expression for a given directive, then the addition ofexample.github.com
would have no effect, and so the latter can be safely removed bysecure_headers
to save bytes.Unfortunately, this implementation has had various bugs due to the use of "impedance mismatched" APIs like
URI
1 andFile.fnmatch
2. For example, it made incorrect assumptions about source expression schemes, leading to the following series of events:@keithamus
(a Hubber) in 2022 due to our use of web sockets.data:
.@lgarron
drafted a new implementation that semantically parses and compares source expressions based on the specification for source expressions.github.com
's CSP. However, it would take additional work to make this implementation fully aware of CSP syntax (e.g. not allowing URL source expressions in a source directive when only special keywords are allowed, and vice-versa), and it relies on a new regex-based implementation of source expression parsing that may very well lead to more subtle bugs.In effect, this is a half feature whose maintenance cost has outweighed its functionality:
secure_headers
.@oreoshake
(the then-maintainer) without explanation in 2015, never "officially" documented. We have no concrete data on whether it has any performance impact on any real apps — for all we know, uncached deduplication calculations might even cost more than the saved header bytes.@oreoshake
himself said:So this PR completely removes the functionality. If we learn of a use case where this was very important (and the app somehow can't preprocess the list before passing it to
secure_headers
), we can always resume consideration of one of:Footnotes
Which allows wildcards in domains but not for ports, as it is not designed to parse URL source expressions. ↩
Which has general glob matching that is not designed for URL source expressions either. ↩