Skip to content

[RFC] Add RFC 3986 and WHATWG compliant URL parsing support #14461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

kocsismate
Copy link
Member

@kocsismate kocsismate commented Jun 3, 2024

2nd take after the failed experiment with #11315

RFC: https://wiki.php.net/rfc/url_parsing_api

@@ -3715,7 +3715,6 @@ function uniqid(string $prefix = "", bool $more_entropy = false): string {}

/**
* @return int|string|array<string, int|string>|null|false
* @compile-time-eval
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed for benchmarking purposes


static void cleanup_parser(void)
{
if (++URL_G(urls) % 500 == 0) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach is copy-pasted from lexbor/lexbor#206

@kocsismate kocsismate force-pushed the ext-url2 branch 5 times, most recently from 823e4c6 to e65deb7 Compare June 6, 2024 08:23
@kocsismate kocsismate changed the title Add ext/url based on Lexbor [RFC] Add ext/url based on Lexbor Jun 12, 2024
@kocsismate kocsismate force-pushed the ext-url2 branch 2 times, most recently from d0b9a3d to b9293a2 Compare January 25, 2025 22:42
@kocsismate kocsismate force-pushed the ext-url2 branch 2 times, most recently from 20f2844 to 0d2d2c3 Compare February 6, 2025 08:06
Copy link
Member

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. Did not look into any details, yet.

@@ -141,6 +141,7 @@ static void user_shutdown_function_dtor(zval *zv);
static void user_tick_function_dtor(user_tick_function_entry *tick_function_entry);

static const zend_module_dep standard_deps[] = { /* {{{ */
ZEND_MOD_REQUIRED("uri")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If uri is required, it should not require the --enable-uri flag to build it. It should always be built, like ext/random.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good catch!

@@ -32,13 +32,13 @@ class TestSoapClient extends SoapClient {

$client = new TestSoapClient(__DIR__.'/bug38067.wsdl',
array('encoding' => 'ISO-8859-1'));
$str = 'test: �';
$str = 'test: �';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unintended change? Did you editor treat this file as UTF-8 instead of ISO-8859-1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's surely an editor issue.

@@ -604,7 +604,7 @@ public function __getLastRequestHeaders(): ?string {}
public function __getLastResponseHeaders(): ?string {}

/** @tentative-return-type */
public function __doRequest(string $request, string $location, string $action, int $version, bool $oneWay = false): ?string {}
public function __doRequest(string $request, string $location, string $action, int $version, bool $oneWay = false, ?string $uriParserClass = null): ?string {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the number of tests that you changed, is this a breaking change? If so, it needs to be noted in the RFC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's a good idea

[Enable URI support])])

if test "$PHP_URI" != "no"; then
PHP_LEXBOR_CFLAGS="-I$abs_srcdir/ext/dom/lexbor -DLEXBOR_STATIC -I$abs_srcdir/ext/uri/uriparser/include"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effectively makes ext/dom a dependency of ext/uri. Should Lexbor be moved somewhere else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a WIP behavior from the early days of the PR. Once I manage to make the RFC approved, I'll create a lexbor "extension" (similarly to mysqlnd).

Comment on lines +24 to +25
PHP_ADD_EXTENSION_DEP(uri, dom)
PHP_NEW_EXTENSION(uri, [php_uri.c php_uri_common.c php_lexbor.c php_uriparser.c $URIPARSER_SOURCES], $ext_shared,,$PHP_LEXBOR_CFLAGS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PHP_ADD_EXTENSION_DEP(uri, dom)
PHP_NEW_EXTENSION(uri, [php_uri.c php_uri_common.c php_lexbor.c php_uriparser.c $URIPARSER_SOURCES], $ext_shared,,$PHP_LEXBOR_CFLAGS)
PHP_NEW_EXTENSION(uri, [php_uri.c php_uri_common.c php_lexbor.c php_uriparser.c $URIPARSER_SOURCES], $ext_shared,,$PHP_LEXBOR_CFLAGS)
PHP_ADD_EXTENSION_DEP(uri, dom)

For this macro to be effective, it should be called after the PHP_NEW_EXTENSION.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I'll take care of this once the RFC is accepted :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants