Skip to content

DOMNameSpaceNode parent not unset when namespace gets unlinked. #12850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nielsdos opened this issue Dec 2, 2023 · 5 comments
Closed

DOMNameSpaceNode parent not unset when namespace gets unlinked. #12850

nielsdos opened this issue Dec 2, 2023 · 5 comments

Comments

@nielsdos
Copy link
Member

nielsdos commented Dec 2, 2023

Description

The following code:

<?php

$dom = new DOMDocument;
$dom->loadXML("<container xmlns=\"urn:foo\"/>");
$container = $dom->documentElement;
$ns = $container->getAttributeNode("xmlns");

var_dump($ns);

$container->removeAttributeNs("urn:foo", "");

var_dump($ns);

Resulted in this output:

first dump: DOMNameSpaceNode with parent node non-NULL (okay)
second dump: DOMNameSpaceNode with parent node non-NULL (not okay)

But I expected this output instead:

first dump: DOMNameSpaceNode with parent node non-NULL
second dump: DOMNameSpaceNode with parent node NULL

In versions prior to 8.1 the parent node was always NULL.
Mainly creating the issue so I don't forget about this.

PHP Version

8.1+

Operating System

No response

nielsdos added a commit to nielsdos/php-src that referenced this issue Dec 5, 2023
…s unlinked

The problem is that the namespace nodes are exposed as fake nodes
because of their different underlying data structure, which means that
the parent link isn't actually updated when it gets removed; and we
don't have a simple way of tracking it.

So just perform a check for the node type and check whether the
namespace declaration associated with the fake node is still in
existence.
nielsdos added a commit to nielsdos/php-src that referenced this issue Dec 7, 2023
…s unlinked

The problem is that the namespace nodes are exposed as fake nodes
because of their different underlying data structure, which means that
the parent link isn't actually updated when it gets removed; and we
don't have a simple way of tracking it.

So just perform a check for the node type and check whether the
namespace declaration associated with the fake node is still in
existence.
nielsdos added a commit to nielsdos/php-src that referenced this issue Dec 8, 2023
…s unlinked

The problem is that the namespace nodes are exposed as fake nodes
because of their different underlying data structure, which means that
the parent link isn't actually updated when it gets removed; and we
don't have a simple way of tracking it.

So just perform a check for the node type and check whether the
namespace declaration associated with the fake node is still in
existence.
@SVGAnimate
Copy link
Contributor

Hello,

First of all, thank you nielsdos for your courage. DOM extension is not an easy task.

It's so strange... I would have bet that $container->removeAttributeNs("urn:foo", ""); returns false (but no not at all).
And that it was necessary to write $container->removeAttributeNs("http://www.w3.org/2000/xmlns/", ""); or
$container->removeAttribute("xmlns"); to remove/detach the attribute xmlns (but no)

In any case according to the documentation, removeAttributeNode() expect a DOMAttr object and never a DOMNameSpaceNode


Maybe it's me who doesn't understand anything about DOM

Since XML declarations(xmlns="urn:foo") and XML properties(at="value") are accessible in the same way with DOM getAttribute* functions.

So here it is how it should work for me :

<container xmlns="urn:foo" xmlns:my="urn:my" my:at="value">
  <child my:at="val"></child>
</container>
$dom = new DOMDocument;
$dom->loadXML("<container xmlns=\"urn:foo\" xmlns:my=\"urn:my\"><child my:at=\"val\"></child></container>");
$container = $dom->documentElement;
$child = $dom->documentElement->childNodes->item(0);;

$fooUrn    = $container->getAttribute("xmlns");
$fooUrn    = $container->getAttributeNs("http://www.w3.org/2000/xmlns/", "");
$defaultNs = $container->getAttributeNode("xmlns");
$defaultNs = $container->getAttributeNodeNs("http://www.w3.org/2000/xmlns/", "");

$myUrn = $container->getAttribute("xmlns:my");
$myUrn = $container->getAttributeNs("http://www.w3.org/2000/xmlns/", "my");
$myNs  = $container->getAttributeNode("xmlns:my");
$myNs  = $container->getAttributeNodeNs("http://www.w3.org/2000/xmlns/", "my");

$nodeAt = $child->getAttributeNodeNs("urn:my", "at");

//var_dump(...);

// 3 ways to remove my:at="val"
//$ret = $child->removeAttribute("my:at");
//$ret = $child->removeAttributeNode($nodeAt);
//$ret = $child->removeAttributeNs("urn:my", "my");

// 3 ways to remove xmlns="urn:foo"
//$ret = $container->removeAttribute("xmlns"); 
//$ret = $container->removeAttributeNode($defaultNs); 
//$ret = $container->removeAttributeNs("http://www.w3.org/2000/xmlns/", "");


// 3 ways to remove xmlns:my="urn:my"
//$ret = $container->removeAttribute("xmlns:my"); 
//$ret = $container->removeAttributeNode($myNs); 
//$ret = $container->removeAttributeNs("urn:my", "at");

var_dump($ret);

echo $dom->saveXML();

The xmlNs.context field was added on June 16, 2006.

DOMNameSpaceNode could use this field to store the xml parent node/doc, and keep _private to link to the zend_object

PS : I would go for a new extension, but my wife is calling me ...
https://www.youtube.com/watch?v=AbSehcT19u0

@nielsdos
Copy link
Member Author

Hey

First of all, thank you nielsdos for your courage. DOM extension is not an easy task.

It certainly isn't 😅

It's so strange... I would have bet that $container->removeAttributeNs("urn:foo", ""); returns false (but no not at all).
And that it was necessary to write $container->removeAttributeNs("http://www.w3.org/2000/xmlns/", ""); or
$container->removeAttribute("xmlns"); to remove/detach the attribute xmlns (but no)

Yeah the current behaviour is indeed completely wrong. In fact, the namespace handling is totally whack :/

Thanks for sharing your code sample, might be useful as a reference, it is indeed how it's supposed to work.

I'm tempted to leave this issue as-is. I'm currently researching the possibility of making ext-dom spec compliant in an opt-in way. For background, we currently have new classes DOM\HTMLDocument and DOM\XMLDocument as a result of my RFC. The old DOMDocument class remains as-is. My idea is that by using the new classes the user switches to a "spec-compliant" mode. In spec-compliant mode attributes and namespaces would behave correctly (i.e. there would be no weird DOMNameSpaceNode class). This would allow for a migration path for users while not forcing BC breaks down their throat. I'm currently still in the early stages by listing all the current spec bugs in a document by cross-checking the spec vs our implementation.

The xmlNs.context field was added on June 16, 2006.
DOMNameSpaceNode could use this field to store the xml parent node/doc, and keep _private to link to the zend_object

Perhaps, although we have to maintain this after cloning etc so libxml probably makes it a bit harder...

PS : I would go for a new extension, but my wife is calling me ...
https://www.youtube.com/watch?v=AbSehcT19u0

😂

@medabkari
Copy link

@nielsdos This issue is mentioned as fixed by the opt-in compliance RFC or am i missing something?

@nielsdos
Copy link
Member Author

@nielsdos This issue is mentioned as fixed by the opt-in compliance RFC or am i missing something?

It's no longer an issue in spec compliance mode because it doesn't have the class at fault. For old DOM the issue is still here. I'm uncertain about how to fix this in old DOM, which is why this issue is still open.

@nielsdos
Copy link
Member Author

I looked into this again.
The problem is the design of the class, it serves two purposes really:

  • Acts as a namespace node from the XPath specification
  • Represents an internal namespace in a tree

For the first use-case, we do want to keep the parent node because XPath basically demands that a namespace node is tied to the particular parent that it was scoped upon when the query was executed.

For the second use-case, i.e. the one highlighted in this issue, we would ideally unlink from the parent if the namespace gets unset. However, this can't be implemented in an efficient way as far as I can tell.
Given that the new classes have solved this issue by design, I think this is a wontfix. At least I don't see a way to fix this.

@nielsdos nielsdos closed this as not planned Won't fix, can't repro, duplicate, stale Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants