[DomCrawler] Add support for XPath expression evaluation #19430

Merged
merged 1 commit into from
Aug 2, 2016
30 changes: 30 additions & 0 deletions src/Symfony/Component/DomCrawler/Crawler.php
@@ -592,6 +592,36 @@ public function html()
        return $html;
    }

    /**
     * Evaluates an XPath expression.
     *
     * Since an XPath expression might evaluate to either a simple type or a \DOMNodeList,
     * this method will return either an array of simple types or a new Crawler instance.
     *
     * @param string $xpath An XPath expression
     *
     * @return array|Crawler An array of evaluation results or a new Crawler instance
Member

Returning either an array or a Crawler makes this method difficult to use in a fluent interface. It would be the first method that returns a Crawler or something else.

Contributor Author

Yeah, I thought twice before defining the return type. I don't like it either, but DOMXPath::evaluate() returns either a simple typed value or a DOMNodeList, depending on the type of XPath query. I figured the end user knows which one to expect, since they wrote the query.

Not sure how we could solve it differently (other than defining two methods and verifying the results, or forbidding one type of query).
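
For illustration, here is a minimal sketch of the behavior discussed above, using plain DOMXPath outside the Crawler. The markup and variable names are hypothetical and chosen only for this example. It shows that the return type of DOMXPath::evaluate() follows the kind of expression, and that the optional second argument sets the context node.

```php
<?php
// Hypothetical markup, used only for illustration.
$document = new \DOMDocument();
$document->loadXML('<form><input name="TextName"/><input name="FooName"/></form>');

$xpath = new \DOMXPath($document);

// A location path evaluates to a node set, so a \DOMNodeList is returned.
$nodes = $xpath->evaluate('//form/input');

// XPath functions such as count() or string() evaluate to scalars instead.
$count = $xpath->evaluate('count(//form/input)');          // float
$name = $xpath->evaluate('string(//form/input[1]/@name)'); // string

// The second argument is the context node for relative expressions.
$prefix = $xpath->evaluate('substring-before(@name, "Name")', $nodes->item(0));

var_dump($nodes instanceof \DOMNodeList, $count, $name, $prefix);
```

Because the caller writes the expression, the caller can usually tell in advance which of the two shapes to expect.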

     */
    public function evaluate($xpath)
    {
        if (null === $this->document) {
            throw new \LogicException('Cannot evaluate the expression on an uninitialized crawler.');
        }

        $data = array();
        $domxpath = $this->createDOMXPath($this->document, $this->findNamespacePrefixes($xpath));

        foreach ($this->nodes as $node) {
            $data[] = $domxpath->evaluate($xpath, $node);
        }

        if (isset($data[0]) && $data[0] instanceof \DOMNodeList) {
            return $this->createSubCrawler($data);
        }

        return $data;
    }

    /**
     * Extracts information from the list of nodes.
     *
45 changes: 45 additions & 0 deletions src/Symfony/Component/DomCrawler/Tests/CrawlerTest.php
@@ -1061,6 +1061,51 @@ public function testCountOfNestedElements()
        $this->assertCount(1, $crawler->filter('li:contains("List item 1")'));
    }

    public function testEvaluateReturnsTypedResultOfXPathExpressionOnADocumentSubset()
    {
        $crawler = $this->createTestCrawler();

        $result = $crawler->filterXPath('//form/input')->evaluate('substring-before(@name, "Name")');

        $this->assertSame(array('Text', 'Foo', 'Bar'), $result);
    }

    public function testEvaluateReturnsTypedResultOfNamespacedXPathExpressionOnADocumentSubset()
    {
        $crawler = $this->createTestXmlCrawler();

        $result = $crawler->filterXPath('//yt:accessControl/@action')->evaluate('string(.)');

        $this->assertSame(array('comment', 'videoRespond'), $result);
    }

    public function testEvaluateReturnsTypedResultOfNamespacedXPathExpression()
    {
        $crawler = $this->createTestXmlCrawler();
        $crawler->registerNamespace('youtube', 'http://gdata.youtube.com/schemas/2007');

        $result = $crawler->evaluate('string(//youtube:accessControl/@action)');

        $this->assertSame(array('comment'), $result);
    }

    public function testEvaluateReturnsACrawlerIfXPathExpressionEvaluatesToANode()
    {
        $crawler = $this->createTestCrawler()->evaluate('//form/input[1]');

        $this->assertInstanceOf(Crawler::class, $crawler);
        $this->assertCount(1, $crawler);
        $this->assertSame('input', $crawler->first()->nodeName());
    }

    /**
     * @expectedException \LogicException
     */
    public function testEvaluateThrowsAnExceptionIfDocumentIsEmpty()
    {
        (new Crawler())->evaluate('//form/input[1]');
    }

    public function createTestCrawler($uri = null)
    {
        $dom = new \DOMDocument();