Skip to content

[JsonPath] Add FrameworkBundle integration #60083

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: 7.3
Choose a base branch
from

Conversation

alexandre-daubois
Copy link
Member

@alexandre-daubois alexandre-daubois commented Mar 29, 2025

Q A
Branch? 7.3
Bug fix? no
New feature? no
Deprecations? no
Issues -
License MIT

First, I'd like to rename JsonCrawlerInterface to CrawlerInterface. As stated in the early feedback of the PR, the parsed data explored with the JSONPath syntax could be various (explained in the first lines of #60083 (comment)). It seems better to remove Json from CrawlerInterface, the latter describing an interface capable of crawling arbitrary data structures with JsonPath.

Also, JsonCrawler now takes data from the withRawData() method. What's the idea behind the factory? The JSONPath RFC allows you to create new custom functions to filter your results. I'd like to create a new #[AsJsonPathFunction] attribute allowing to register custom functions. To have the best DX, we'll probably want to inject a JsonCrawler\CrawlerInterface in our services with custom functions already registered thanks to the attribute and a container call to a future withCustomFunctions() method. This is currently not possible this prototype method because the JsonCrawler object is stateful.

I didn't include a CHANGELOG line as the component is not yet released. Does it need one?

@carsonbot carsonbot added this to the 7.3 milestone Mar 29, 2025
@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from 6e4e714 to faddb61 Compare March 29, 2025 17:22
@GromNaN
Copy link
Member

GromNaN commented Mar 29, 2025

On second thought, maybe a factory is what is needed, to split the responsability between reading the source and executing a JSON path query.

I prefer JsonCrawlerInterface as it was, because it doesn't constrain the format of the JSON input. If you want to make an implementation that reads nested arrays or even objects (with PropertyAccessor?), it's possible. When the data is an argument to the find method, that restricts the way it can work.

So I suggest a JsonCrawlerFactory class used as a service, with a method (parse?) that takes the JSON and returns a JsonCrawler instance, injecting the functions.

class JsonCrawlerFactory implements JsonCrawlerFactoryInterface
{
    public function parse(string|resource $data): JsonCrawlerInterface
    {
        $crawler = new JsonCrawler($data, $this->functions);
    }
}

$jsonPathFactory->parse('{"foo":true}')->find('$.foo');

@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Mar 29, 2025

If we want to be consistent and imagine that the input format can be a nested array, maybe the name JsonCrawlerInterface isn't right (since we're no longer always crawling a JSON string).

I like the factory idea, however I'm a bit conflicted. If we just look at the current interface without the implementation, I don't find it particularly clear where the data comes from, and this necessarily implies a statefull implementation (with all the drawbacks that implies). That said, I really like how it looks in the code snippet you shared.

Maybe we shouldn't put the annotation for $data on the interface but only on our implementation? Implementations would be free to check whatever type they want in their logic then. As stated above, this could mean that we should rename the interface to CrawlerInterface, defining a class capable to use JsonPath syntax to query any type of data if someone wants to.

@GromNaN
Copy link
Member

GromNaN commented Mar 29, 2025

SQLite JSONB, PostgresSQL JSONB and MongoDB BSON are other JSON-like binary formats that could be retrieved (raw) from the a database and queried with JSONPath. The same goes for Yaml documents. So abstracting the format of the input document is nice to have.

If we just look at the current interface without the implementation, I don't find it particularly clear where the data comes from

That's exactly the purpose of this interface: show what you can do with an object, no matter the way it's made.

and this necessarily implies a statefull implementation (with all the drawbacks that implies

The factory is a stateless service; it has the Json Path custom functions that will be injected. The crawler is an object, not a service: it's legitimate for it to contain a state.

Maybe we shouldn't put the annotation for $data on the interface but only on our implementation? Implementations would be free to check whatever type they want in their logic then.

This would makes the interface less useful, since not all implementations of the interface accept the same arguments. You can't easily replace one implementation with another.

As stated above, this could mean that we should rename the interface to CrawlerInterface.

This new name you're proposing is perfect. And the JsonCrawler class keeps its name: it's the implementation of the CrawlerInterface for JSON. The interface is in the JsonPath namespace, so there is no doubt of its context without repeating "Json" in its name.

@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from faddb61 to 66e2429 Compare March 30, 2025 15:08
@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Mar 30, 2025

Thank you for this complete feedback! I just implemented what we discussed. The factory is a bit empty for now, but this will be our extension point as soon as I have some time to implement custom functions 🙂 Feedback welcome about the factory method's name, I'm open to better suggestions

@alexandre-daubois alexandre-daubois changed the title [JsonPath] Make JsonCrawler stateless [JsonPath] Introduce JsonCrawlerFactory Mar 30, 2025
@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from 66e2429 to dabee70 Compare April 7, 2025 06:48
GromNaN added a commit that referenced this pull request Apr 10, 2025
… (alexandre-daubois)

This PR was merged into the 7.3 branch.

Discussion
----------

[JsonPath] Add two utils methods to `JsonPath` builder

| Q             | A
| ------------- | ---
| Branch?       | 7.3
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Issues        | -
| License       | MIT

Small DX improvements that goes with #60105 and #60083.

This PR adds two new methods, `first()` and `last()`, added to JsonPath builder. This voluntary reminds methods from the DomCrawler component. The goal is not to add every possible method, but I think `first()` and `last()` are common enough to be added.

I also propose to rename `anyIndex()` to `all()`.

```php
$path = new JsonPath();

// Get the first user of the collection
$path = $path->key('users')->first();
```

```php
$path = new JsonPath();

// Get the last user of the collection
$path = $path->key('users')->last();
```

```php
$path = new JsonPath();

// Get all users of the collection
$path = $path->key('users')->all();
```

Commits
-------

3bc3559 [JsonPath][DX] Add utils methods to `JsonPath` builder
Copy link
Member

@nicolas-grekas nicolas-grekas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the FWB integration in the same go;

Copy link
Member

@nicolas-grekas nicolas-grekas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not happy with the factory vocabulary and design pattern.
What about using the prototype pattern instead? Aka wither methods?
Then we'd just need one class, that could be used to derivate configured crawlers.

@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from dabee70 to b2037ac Compare April 19, 2025 14:09
@alexandre-daubois alexandre-daubois changed the title [JsonPath] Introduce JsonCrawlerFactory [JsonPath] Add FrameworkBundle integration Apr 19, 2025
@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Apr 19, 2025

I reworked the PR to remove the factory and use a prototype approach. I added withRawData(), so CrawlerInterface becomes injectable. Is it better to you? It solves the initial problem, and I'll be able to add a container call to a future withCustomFunctions() method with a tagged iterator in a following PR.

I'm open for suggestions about the naming. I noted the suggestion here, but nor create or crawl seems right here: we don't create the crawler, and we don't crawl (yet) the JSON string/resource at this point.

Also, FWB integration added 👍

@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from b2037ac to f01f693 Compare April 19, 2025 14:18
@OskarStark
Copy link
Contributor

Initialize?

@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Apr 22, 2025

@OskarStark why not, but I'm not sure it's a perfect match: wither methods can be "semantically" called multiple times, where calling "initialize" multiple times could feel odd. Maybe initializeWith() is a compromise?

Copy link
Member

@nicolas-grekas nicolas-grekas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something doesn't work here:

  • if the method isn't in the interface, then there's no point in adding it, since autowiring won't allow using it if proper IoC is used
  • then, if we add it to the interface, the input format has to be a JSON string/stream, which invalidates the reasoning that lead to the renaming of the interface

So to me, if we want to make the interface an autowirable type, it has to be named JsonCrawlerInterface, and the method has to be part of it.

Or did I miss anything?

OR, we need two interfaces; JsonCrawlerInterface extends CrawlerInterface 🤔

@alexandre-daubois
Copy link
Member Author

I think we should have both interface, that's a nice idea. This would solve the point you raise and still allow Jérôme's idea.

@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch 3 times, most recently from b927fe3 to 8c8f92f Compare April 22, 2025 08:07
@stof
Copy link
Member

stof commented Apr 22, 2025

The suggested JsonCrawlerInterface looks weird to me.

Using it means you will have a property in your class holding a JsonCrawlerInterface on which you should never call the find() method (as it would not have any data to actually find something). And you would call $crawler = $this->crawler->fromJson($data) to get a working crawler.
To me, the factory being a separate class was making the code a lot more understandable.

@nicolas-grekas
Copy link
Member

nicolas-grekas commented Apr 22, 2025

Then maybe the too interfaces should be split:

  • JsonCrawlerInterface with only a fromJson(): CrawlerInterface - that's the factory, but named in a more DX-friendly way
  • CrawlerInterface, with the find method (and NOT autowirable)

@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch 3 times, most recently from 06c26af to bc396df Compare April 22, 2025 14:53
@alexandre-daubois
Copy link
Member Author

alexandre-daubois commented Apr 22, 2025

Thanks, I updated the PR to have two interfaces. JsonCrawlerInterface is autowirable in FWB.

Fabbot complains about the TypeError message format. I'd say it is not valid, PHP internal type errors doesn't include type quoting.

@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from bc396df to 84b225d Compare April 22, 2025 15:50
@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from 84b225d to 18761a4 Compare April 23, 2025 08:16
@alexandre-daubois alexandre-daubois force-pushed the json-path-misc-improvement branch from 18761a4 to 929ffa0 Compare April 23, 2025 08:19
@@ -22,10 +19,7 @@
interface JsonCrawlerInterface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this interface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is our factory and the interface used to be used in service injection.

One of the next step is adding something like withCustomFunctions() that gets a tagged iterator of invokable classes to use in JsonPath filters (the RFC allows custom functions in filters).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder though if that doesn't mean that we should rather split the crawler and it's factory. Currently both concerned are mixed in one class which also means that the crawler needs to allow being constructed without data while that shouldn't be needed if the factory was a different class, right?

Copy link
Member Author

@alexandre-daubois alexandre-daubois Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. We discussed a bit about it with @nicolas-grekas. We agreed that we're fine having everything in one class as well as having a no-op crawler when created without data (i.e. returning an empty array for all paths). Is there a limitation that you see with this implementation maybe?

I went for the separate factory class at first, but I actually like the fact that if you need a crawler in your service, you can inject JsonCrawlerInterface directly instead of having a factory that then creates the crawler.

Copy link
Member

@GromNaN GromNaN Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the 2 interfaces implemented by the same class is not a problem when using interfaces for typing.
But I'm not convinced by the name of JsonCrawlerInterface.

I would prefer CrawlerFromJsonInterface which reuses the method name and gives the intention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer JsonCrawlerInterface - it feels less convoluted to me, and is explicit about the intended use case - querying json documents.

I also prefer one class to implement the two interfaces. Optimizing the constructor doesn't look worth the split to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern with mixing responsibilities in one class is that you could have a crawler in an intermediate state while building and then accidentally pass around the wrong crawler instance which would then behave differently than what you did expect. By separating the crawler and its factory (or probably rather a builder) PHP's type system would save you from such a mistake.

/**
* @param resource|string|array $data
*/
public function fromJson(mixed $data): CrawlerInterface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making the method static? And I would question if we really need an interface for it? And if so, if it's name wouldn't better be JsonCrawlerFactoryInterface.

Copy link
Member Author

@alexandre-daubois alexandre-daubois Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making the method static wouldn't allow us to plug properly with DI and allow further steps like this one.

The "factory" name is not used per this comment and this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, we can have static factories in the DI.

But my main question here is about the aim of that method. In fact, such code will look weird to me:

$foo = (new JsonCrawler('"something"'))
    ->fromJson('"otherThing"');

Instead, static method looks more logical to me (and still allows you to use the builder pattern):

$foo = JsonCrawler::fromJson('"otherThing"')
    >withCustomFunctions(...);

Copy link
Member Author

@alexandre-daubois alexandre-daubois Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your last snippet, you call withCustomFunctions() yourself, but the goal is to have everything wired correctly so custom functions are automatically registered with an attribute, and you "never" call the function yourself if you use FWB. I'm not sure we'd be able to achieve this with static factories (or any static behavior being used)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to combine static factories and service configurators maybe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hell no, not configurators.
A static factory would miss the point of configuring transformer functions.

But all this makes me think we're thinking out of the blue. Let's not do anything until we actually do have such transformers.

If we have some lessons to take from this PR to prepare the next steps, let's keep that, but let's remove the autowiring/DI part to the moment where we'll really need it.

use Symfony\Component\JsonPath\JsonCrawler;
use Symfony\Component\JsonPath\JsonCrawlerInterface;

return static function (ContainerConfigurator $container) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can add a functional test to make sure that the services are properly registered and usable? (You can find such tests in Symfony\Bundle\FrameworkBundle\Tests\Functional

/**
* @param resource|string|array $data
*/
public function fromJson(mixed $data): CrawlerInterface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, we can have static factories in the DI.

But my main question here is about the aim of that method. In fact, such code will look weird to me:

$foo = (new JsonCrawler('"something"'))
    ->fromJson('"otherThing"');

Instead, static method looks more logical to me (and still allows you to use the builder pattern):

$foo = JsonCrawler::fromJson('"otherThing"')
    >withCustomFunctions(...);

@alexandre-daubois
Copy link
Member Author

Discussions makes me wonder: is there a design flaw maybe in the current state of the crawler? The number of comments on the PR makes me wonder so.
I took a look at ExpressionLanguage. We could take inspiration from it maybe. ExpressionLanguage takes providers (ExpressionFunctionProviderInterface[]) through the constructor.

This could be the same for JsonCrawler, with an array $functions or something similar provided through the constructor. Then we could also change the current JsonCrawler::find($path) so it accepts the data directly, changing the signature to JsonPath::find($path, mixed $data) (args order can be discussed 🙂). We can keep an interface describing the find() method, this is a nice extension point as pointed by Jérôme.

Status: Needs work

@carsonbot carsonbot changed the title [JsonPath] Add FrameworkBundle integration Add FrameworkBundle integration Apr 24, 2025
@carsonbot carsonbot changed the title Add FrameworkBundle integration [JsonPath] Add FrameworkBundle integration Apr 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants