The Document Object Model (DOM) is a representation — a model — of a document and its content. [DOM3CORE] The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM.
This specification defines the language represented in the DOM by features together called DOM5 HTML. DOM5 HTML consists of DOM Core Document nodes and DOM Core Element nodes, along with text nodes and other content.
Elements in the DOM represent things; that is, they have intrinsic meaning, also known as semantics.
For example, an ol element represents an ordered list.
In addition, documents and elements in the DOM host APIs that extend the DOM Core APIs, providing new features to application developers using DOM5 HTML.
Every XML and HTML document in an HTML UA is represented by a Document object. [DOM3CORE]
Document objects are assumed to be XML documents unless they are flagged as being HTML documents when they are created. Whether a document is an HTML document or an XML document affects the behavior of certain APIs, as well as a few CSS rendering rules. [CSS21]
A Document object created by the createDocument() API on the DOMImplementation object is initially an XML document, but can be made into an HTML document by calling document.open() on it.
All Document objects (in user agents implementing this specification) must also implement the HTMLDocument interface, available using binding-specific methods. (This is the case whether or not the document in question is an HTML document or indeed whether it contains any HTML elements at all.) Document objects must also implement the document-level interface of any other namespaces found in the document that the UA supports. For example, if an HTML implementation also supports SVG, then the Document object must implement HTMLDocument and SVGDocument.
Because the HTMLDocument interface is now obtained using binding-specific casting methods instead of simply being the primary interface of the document object, it is no longer defined as inheriting from Document.
interface HTMLDocument {
// Resource metadata management
[PutForwards=href] readonly attribute Location location;
readonly attribute DOMString URL;
attribute DOMString domain;
readonly attribute DOMString referrer;
attribute DOMString cookie;
readonly attribute DOMString lastModified;
readonly attribute DOMString compatMode;
attribute DOMString charset;
readonly attribute DOMString characterSet;
readonly attribute DOMString defaultCharset;
readonly attribute DOMString readyState;
// DOM tree accessors
attribute DOMString title;
attribute DOMString dir;
attribute HTMLElement body;
readonly attribute HTMLCollection images;
readonly attribute HTMLCollection embeds;
readonly attribute HTMLCollection plugins;
readonly attribute HTMLCollection links;
readonly attribute HTMLCollection forms;
readonly attribute HTMLCollection anchors;
readonly attribute HTMLCollection scripts;
NodeList getElementsByName(in DOMString elementName);
NodeList getElementsByClassName(in DOMString classNames);
// Dynamic markup insertion
attribute DOMString innerHTML;
HTMLDocument open();
HTMLDocument open(in DOMString type);
HTMLDocument open(in DOMString type, in DOMString replace);
Window open(in DOMString url, in DOMString name, in DOMString features);
Window open(in DOMString url, in DOMString name, in DOMString features, in boolean replace);
void close();
void write(in DOMString text);
void writeln(in DOMString text);
// Interaction
readonly attribute Element activeElement;
boolean hasFocus();
// Commands
readonly attribute HTMLCollection commands;
// Editing
attribute boolean designMode;
boolean execCommand(in DOMString commandId);
boolean execCommand(in DOMString commandId, in boolean showUI);
boolean execCommand(in DOMString commandId, in boolean showUI, in DOMString value);
boolean queryCommandEnabled(in DOMString commandId);
boolean queryCommandIndeterm(in DOMString commandId);
boolean queryCommandState(in DOMString commandId);
boolean queryCommandSupported(in DOMString commandId);
DOMString queryCommandValue(in DOMString commandId);
Selection getSelection();
};
Since the HTMLDocument interface holds methods and attributes related to a number of disparate features, the members of this interface are described in different sections.
User agents must raise a security exception whenever any of the members of an HTMLDocument object are accessed by scripts whose effective script origin is not the same as the Document's effective script origin.
The URL attribute must return the document's address.
The referrer attribute must return either the URI of the active document of the source browsing context at the time the navigation was started (that is, the page which navigated the browsing context to the current document), or the empty string if there is no such originating page, or if the UA has been configured not to report referrers in this case, or if the navigation was initiated for a hyperlink with a noreferrer keyword.
In the case of HTTP, the referrer DOM attribute will match the Referer (sic) header that was sent when fetching the current page.
Typically user agents are configured to not report referrers in the case where the referrer uses an encrypted protocol and the current page does not (e.g. when navigating from an https: page to an http: page).
The cookie attribute represents the cookies of the resource.
On getting, if the sandboxed origin browsing context flag is set on the browsing context of the document, the user agent must raise a security exception. Otherwise, it must return the same string as the value of the Cookie HTTP header it would include if fetching the resource indicated by the document's address over HTTP, as per RFC 2109 section 4.3.4. [RFC2109]
On setting, if the sandboxed origin browsing context flag is set on the browsing context of the document, the user agent must raise a security exception. Otherwise, the user agent must act as it would when processing cookies if it had just attempted to fetch the document's address over HTTP, and had received a response with a Set-Cookie header whose value was the specified value, as per RFC 2109 sections 4.3.1, 4.3.2, and 4.3.3. [RFC2109]
Since the cookie attribute is accessible across frames, the path restrictions on cookies are only a tool to help manage which cookies are sent to which parts of the site, and are not in any way a security feature.
The lastModified attribute, on getting, must return the date and time of the Document's source file's last modification, in the user's local timezone, in the following format: month/day/year hours:minutes:seconds, as in the fallback value 01/01/1970 00:00:00.
All the numeric components above, other than the year, must be given as two digits in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE representing the number in base ten, zero-padded if necessary.
The Document's source file's last modification date and time must be derived from relevant features of the networking protocols used, e.g. from the value of the HTTP Last-Modified header of the document, or from metadata in the file system for local files. If the last modification date and time are not known, the attribute must return the string 01/01/1970 00:00:00.
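The serialization described for lastModified can be sketched in Python as follows. This is only an illustrative model, not a DOM binding: the function name format_last_modified is hypothetical, and the month/day/year hours:minutes:seconds order is an assumption taken from the fallback string and from the rule that every component except the year is zero-padded to two digits.

```python
from datetime import datetime
from typing import Optional

# Fallback value the text gives for an unknown modification time.
UNKNOWN_LAST_MODIFIED = "01/01/1970 00:00:00"


def format_last_modified(local_time: Optional[datetime]) -> str:
    """Serialize a local modification time in the lastModified layout.

    Assumption: the layout is month/day/year hours:minutes:seconds.
    """
    if local_time is None:
        # The modification date and time are not known.
        return UNKNOWN_LAST_MODIFIED
    # Every component other than the year is zero-padded to two digits.
    return (
        f"{local_time.month:02d}/{local_time.day:02d}/{local_time.year} "
        f"{local_time.hour:02d}:{local_time.minute:02d}:{local_time.second:02d}"
    )
```

The caller is assumed to have already converted the time to the user's local timezone.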
A Document is always set to one of three modes: no quirks mode, the default; quirks mode, used typically for legacy documents; and limited quirks mode, also known as "almost standards" mode. The mode is only ever changed from the default by the HTML parser, based on the presence, absence, or value of the DOCTYPE string.
The compatMode DOM attribute must return the literal string "CSS1Compat" unless the document has been set to quirks mode by the HTML parser, in which case it must instead return the literal string "BackCompat".
As far as parsing goes, the quirks I know of are:
Documents have an associated character encoding. When a Document object is created, the document's character encoding must be initialized to UTF-16. Various algorithms during page loading affect this value, as does the charset setter. [IANACHARSET]
The charset DOM attribute must, on getting, return the preferred MIME name of the document's character encoding. On setting, if the new value is an IANA-registered alias for a character encoding, the document's character encoding must be set to that character encoding. (Otherwise, nothing happens.)
The characterSet DOM attribute must, on getting, return the preferred MIME name of the document's character encoding.
The defaultCharset DOM attribute must, on getting, return the preferred MIME name of a character encoding, possibly the user's default encoding, or an encoding associated with the user's current geographical location, or any arbitrary encoding name.
Each document has a current document readiness.
When a Document object is created, it must have its current document readiness set to the string "loading". Various algorithms during page loading affect this value. When the value is set, the user agent must fire a simple event called readystatechanged at the Document object.
The readyState DOM attribute must, on getting, return the current document readiness.
The nodes representing HTML elements in the DOM must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification. This includes XHTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform).
The basic interface, from which all the HTML elements' interfaces inherit, and which must be used by elements that have no additional requirements, is the HTMLElement interface.
interface HTMLElement : Element {
// DOM tree accessors
NodeList getElementsByClassName(in DOMString classNames);
// dynamic markup insertion
attribute DOMString innerHTML;
// metadata attributes
attribute DOMString id;
attribute DOMString title;
attribute DOMString lang;
attribute DOMString dir;
attribute DOMString className;
readonly attribute DOMTokenList classList;
readonly attribute DOMStringMap dataset;
// interaction
attribute boolean irrelevant;
attribute long tabIndex;
void click();
void focus();
void blur();
void scrollIntoView();
void scrollIntoView(in boolean top);
// commands
attribute HTMLMenuElement contextMenu;
// editing
attribute boolean draggable;
attribute DOMString contentEditable;
readonly attribute DOMString isContentEditable;
// styling
readonly attribute CSSStyleDeclaration style;
// data templates
attribute DOMString template;
readonly attribute HTMLDataTemplateElement templateElement;
attribute DOMString ref;
readonly attribute Node refNode;
attribute DOMString registrationMark;
readonly attribute DocumentFragment originalContent;
// event handler DOM attributes
attribute EventListener onabort;
attribute EventListener onbeforeunload;
attribute EventListener onblur;
attribute EventListener onchange;
attribute EventListener onclick;
attribute EventListener oncontextmenu;
attribute EventListener ondblclick;
attribute EventListener ondrag;
attribute EventListener ondragend;
attribute EventListener ondragenter;
attribute EventListener ondragleave;
attribute EventListener ondragover;
attribute EventListener ondragstart;
attribute EventListener ondrop;
attribute EventListener onerror;
attribute EventListener onfocus;
attribute EventListener onkeydown;
attribute EventListener onkeypress;
attribute EventListener onkeyup;
attribute EventListener onload;
attribute EventListener onmessage;
attribute EventListener onmousedown;
attribute EventListener onmousemove;
attribute EventListener onmouseout;
attribute EventListener onmouseover;
attribute EventListener onmouseup;
attribute EventListener onmousewheel;
attribute EventListener onresize;
attribute EventListener onscroll;
attribute EventListener onselect;
attribute EventListener onstorage;
attribute EventListener onsubmit;
attribute EventListener onunload;
};
As with the HTMLDocument interface, the HTMLElement interface holds methods and attributes related to a number of disparate features, and the members of this interface are therefore described in various different sections of this specification.
Some DOM attributes are defined to reflect a particular content attribute. This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.
If a reflecting DOM attribute is a DOMString attribute whose content attribute is defined to contain a URI, then on getting, the DOM attribute must return the value of the content attribute, resolved to an absolute URI, and on setting, must set the content attribute to the specified literal value. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string.
If a reflecting DOM attribute is a DOMString attribute whose content attribute is defined to contain one or more URIs, then on getting, the DOM attribute must split the content attribute on spaces and return the concatenation of each token URI, resolved to an absolute URI, with a single U+0020 SPACE character between each URI; if the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string. On setting, the DOM attribute must set the content attribute to the specified literal value.
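The two URI-reflecting cases above can be modelled with a small Python sketch. It is not a DOM implementation: an element is reduced to a plain dictionary of content attributes, the function names and the base parameter are hypothetical, and str.split() is used as an approximation of "split on spaces".

```python
from typing import Dict
from urllib.parse import urljoin


def get_reflected_uri(attrs: Dict[str, str], name: str, base: str,
                      default: str = "") -> str:
    """Getter for a DOMString attribute reflecting a single URI."""
    value = attrs.get(name)
    if value is None:
        # Content attribute absent: default value, or else the empty string.
        return default
    return urljoin(base, value)


def get_reflected_uri_list(attrs: Dict[str, str], name: str, base: str,
                           default: str = "") -> str:
    """Getter for a DOMString attribute reflecting one or more URIs."""
    value = attrs.get(name)
    if value is None:
        return default
    # Resolve each token, then join with exactly one U+0020 SPACE.
    return " ".join(urljoin(base, token) for token in value.split())


def set_reflected(attrs: Dict[str, str], name: str, value: str) -> None:
    """Setter for both cases: store the literal value, unresolved."""
    attrs[name] = value
```

Note the asymmetry the text describes: the getter resolves against a base address, while the setter stores whatever string it is given.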
If a reflecting DOM attribute is a DOMString whose content attribute is an enumerated attribute, and the DOM attribute is limited to only known values, then, on getting, the DOM attribute must return the conforming value associated with the state the attribute is in (in its canonical case), or the empty string if the attribute is in a state that has no associated keyword value; and on setting, if the new value case-insensitively matches one of the keywords given for that attribute, then the content attribute must be set to the conforming value associated with the state that the attribute would be in if set to the given new value, otherwise, if the new value is the empty string, then the content attribute must be removed, otherwise, the setter must raise a SYNTAX_ERR exception.
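The getter and setter for an enumerated attribute limited to only known values might be sketched as below. The class EnumeratedReflector, the exception SyntaxErr (standing in for SYNTAX_ERR), and the keyword table are all hypothetical. The sketch also simplifies the state model: an absent or unrecognised content attribute is treated as a state with no associated keyword, and str.lower() approximates case-insensitive matching.

```python
from typing import Dict


class SyntaxErr(Exception):
    """Stand-in for the DOM SYNTAX_ERR exception."""


class EnumeratedReflector:
    def __init__(self, name: str, keywords: Dict[str, str]) -> None:
        # Maps each accepted keyword (lowercase) to the conforming value,
        # in canonical case, of the state that keyword selects.
        self.name = name
        self.keywords = keywords

    def get(self, attrs: Dict[str, str]) -> str:
        value = attrs.get(self.name)
        if value is None:
            return ""
        # Unknown values map to a state without a keyword: empty string.
        return self.keywords.get(value.lower(), "")

    def set(self, attrs: Dict[str, str], new_value: str) -> None:
        canonical = self.keywords.get(new_value.lower())
        if canonical is not None:
            attrs[self.name] = canonical
        elif new_value == "":
            # The empty string removes the content attribute.
            attrs.pop(self.name, None)
        else:
            raise SyntaxErr(new_value)
```

The order of the checks in set() follows the order of the clauses in the paragraph: keyword match first, then the empty string, then the error.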
If a reflecting DOM attribute is a DOMString but doesn't fall into any of the above categories, then the getting and setting must be done in a transparent, case-preserving manner.
If a reflecting DOM attribute is a boolean attribute, then on getting the DOM attribute must return true if the attribute is set, and false if it is absent. On setting, the content attribute must be removed if the DOM attribute is set to false, and must be set to have the same value as its name if the DOM attribute is set to true. (This corresponds to the rules for boolean content attributes.)
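As a minimal Python sketch of boolean reflection, with an element again reduced to a dictionary of content attributes (the function names are illustrative, not part of any API):

```python
from typing import Dict


def get_boolean(attrs: Dict[str, str], name: str) -> bool:
    # True if the content attribute is present, whatever its value.
    return name in attrs


def set_boolean(attrs: Dict[str, str], name: str, value: bool) -> None:
    if value:
        # Setting to true stores the attribute's own name as its value.
        attrs[name] = name
    else:
        # Setting to false removes the content attribute.
        attrs.pop(name, None)
```

For instance, setting a hypothetical "irrelevant" attribute to true on an empty attribute map leaves it holding "irrelevant" as both name and value.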
If a reflecting DOM attribute is a signed integer type (long) then, on getting, the content attribute must be parsed according to the rules for parsing signed integers, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, then the default value must be returned instead, or 0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid integer in base ten and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type (unsigned long) then, on getting, the content attribute must be parsed according to the rules for parsing unsigned integers, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, the default value must be returned instead, or 0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid non-negative integer in base ten and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type (unsigned long) that is limited to only positive non-zero numbers, then the behavior is similar to the previous case, but zero is not allowed. On getting, the content attribute must first be parsed according to the rules for parsing unsigned integers, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, the default value must be returned instead, or 1 if there is no default value. On setting, if the value is zero, the user agent must raise an INDEX_SIZE_ERR exception. Otherwise, the given value must be converted to the shortest possible string representing the number as a valid non-negative integer in base ten and then that string must be used as the new content attribute value.
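The three integer-reflecting cases can be compared side by side in a Python sketch. The parsing rules themselves are defined elsewhere, so the two regular expressions here are a simplified assumption (optional leading whitespace, an optional minus sign for the signed case, decimal digits, trailing characters ignored). All function names and IndexSizeErr, a stand-in for INDEX_SIZE_ERR, are hypothetical.

```python
import re
from typing import Dict, Optional, Pattern


class IndexSizeErr(Exception):
    """Stand-in for the DOM INDEX_SIZE_ERR exception."""


# Simplified assumptions about the integer parsing rules.
_SIGNED = re.compile(r"[ \t\n\f\r]*(-?[0-9]+)")
_UNSIGNED = re.compile(r"[ \t\n\f\r]*([0-9]+)")


def _parse(pattern: Pattern[str], text: Optional[str]) -> Optional[int]:
    if text is None:
        return None
    match = pattern.match(text)
    return int(match.group(1)) if match else None


def get_long(attrs: Dict[str, str], name: str, default: int = 0) -> int:
    value = _parse(_SIGNED, attrs.get(name))
    return default if value is None else value


def get_unsigned_long(attrs: Dict[str, str], name: str, default: int = 0) -> int:
    value = _parse(_UNSIGNED, attrs.get(name))
    return default if value is None else value


def get_positive_unsigned_long(attrs: Dict[str, str], name: str,
                               default: int = 1) -> int:
    # Same getter as the unsigned case, but the fallback is 1 rather than 0.
    return get_unsigned_long(attrs, name, default)


def set_integer(attrs: Dict[str, str], name: str, value: int) -> None:
    # str() yields the shortest base-ten representation of a Python int.
    attrs[name] = str(value)


def set_positive_unsigned_long(attrs: Dict[str, str], name: str,
                               value: int) -> None:
    if value == 0:
        raise IndexSizeErr(name)
    set_integer(attrs, name, value)
```

The only differences between the cases are the parser used, the fallback when no default is defined, and the zero check on setting.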
If a reflecting DOM attribute is a floating point number type (float) and the content attribute is defined to contain a time offset, then, on getting, the content attribute must be parsed according to the rules for parsing time offsets, and if that is successful, the resulting value, in seconds, must be returned. If that fails, or if the attribute is absent, the default value must be returned, or the not-a-number value (NaN) if there is no default value. On setting, the given value, interpreted as a time offset in seconds, must be converted to a string using the time offset serialization rules, and that string must be used as the new content attribute value.
If a reflecting DOM attribute is a floating point number type (float) and it doesn't fall into one of the earlier categories, then, on getting, the content attribute must be parsed according to the rules for parsing floating point number values, and if that is successful, the resulting value must be returned. If, on the other hand, it fails, or if the attribute is absent, the default value must be returned instead, or 0.0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid floating point number in base ten and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is of the type DOMTokenList, then on getting it must return a DOMTokenList object whose underlying string is the element's corresponding content attribute. When the DOMTokenList object mutates its underlying string, the content attribute must itself be immediately mutated. When the attribute is absent, then the string represented by the DOMTokenList object is the empty string; when the object mutates this empty string, the user agent must first add the corresponding content attribute, and then mutate that attribute instead. DOMTokenList attributes are always read-only. The same DOMTokenList object must be returned every time for each attribute.
If a reflecting DOM attribute has the type HTMLElement, or an interface that descends from HTMLElement, then, on getting, it must run the following algorithm (stopping at the first point where a value is returned):
document.getElementById() method would find if it was passed as its argument the current value of the corresponding content attribute.
On setting, if the given element has an id attribute, then the content attribute must be set to the value of that id attribute. Otherwise, the DOM attribute must be set to the empty string.
The HTMLCollection, HTMLFormControlsCollection, and HTMLOptionsCollection interfaces represent various lists of DOM nodes. Collectively, objects implementing these interfaces are called collections.
When a collection is created, a filter and a root are associated with the collection.
For example, when the HTMLCollection object for the document.images attribute is created, it is associated with a filter that selects only img elements, and rooted at the root of the document.
The collection then represents a live view of the subtree rooted at the collection's root, containing only nodes that match the given filter. The view is linear. In the absence of specific requirements to the contrary, the nodes within the collection must be sorted in tree order.
The rows list is not in tree order.
An attribute that returns a collection must return the same object every time it is retrieved.
The HTMLCollection interface represents a generic collection of elements.
interface HTMLCollection {
readonly attribute unsigned long length;
[IndexGetter] Element item(in unsigned long index);
[NameGetter] Element namedItem(in DOMString name);
};
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The namedItem(key) method must return the first node in the collection that matches the following requirements:
It is an a, applet, area, form, img, or object element with a name attribute equal to key, or,
It is an HTML element with an id attribute equal to key. (Non-HTML elements, even if they have IDs, are not searched for the purposes of namedItem().)
If no such elements are found, then the method must return null.
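The idea of a collection as a live, filtered, tree-ordered view, together with item() and namedItem(), can be illustrated with a small Python model. The Node and Collection classes are hypothetical simplifications (a node is a tag name, an attribute map, and a list of children), and the view is made live simply by recomputing it on every access.

```python
from typing import Callable, Dict, Iterator, List, Optional

NAME_MATCHED_TAGS = {"a", "applet", "area", "form", "img", "object"}


class Node:
    def __init__(self, tag: str, is_html: bool = True, **attrs: str) -> None:
        self.tag = tag
        self.is_html = is_html
        self.attrs: Dict[str, str] = dict(attrs)
        self.children: List["Node"] = []

    def append(self, child: "Node") -> "Node":
        self.children.append(child)
        return child

    def descendants(self) -> Iterator["Node"]:
        # Depth-first, pre-order traversal, i.e. tree order.
        for child in self.children:
            yield child
            yield from child.descendants()


class Collection:
    def __init__(self, root: Node, node_filter: Callable[[Node], bool]) -> None:
        self.root = root
        self.node_filter = node_filter

    def _nodes(self) -> List[Node]:
        # Recomputed on each call so the view always reflects the tree.
        return [n for n in self.root.descendants() if self.node_filter(n)]

    @property
    def length(self) -> int:
        return len(self._nodes())

    def item(self, index: int) -> Optional[Node]:
        nodes = self._nodes()
        return nodes[index] if 0 <= index < len(nodes) else None

    def named_item(self, key: str) -> Optional[Node]:
        for node in self._nodes():
            if node.tag in NAME_MATCHED_TAGS and node.attrs.get("name") == key:
                return node
            if node.is_html and node.attrs.get("id") == key:
                return node
        return None
```

A real user agent would more likely cache the view and invalidate it on mutation; recomputing is just the shortest way to show the "live" property.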
The HTMLFormControlsCollection interface represents a collection of form controls.
interface HTMLFormControlsCollection {
readonly attribute unsigned long length;
[IndexGetter] HTMLElement item(in unsigned long index);
[NameGetter] Object namedItem(in DOMString name);
};
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The namedItem(key) method must act according to the following algorithm:
If exactly one node in the collection has either an id attribute or a name attribute equal to key, then return that node and stop the algorithm.
Otherwise, if no node in the collection has either an id attribute or a name attribute equal to key, then return null and stop the algorithm.
Otherwise, create a NodeList object representing a live view of the HTMLFormControlsCollection object, further filtered so that the only nodes in the NodeList object are those that have either an id attribute or a name attribute equal to key. The nodes in the NodeList object must be sorted in tree order.
Return that NodeList object.
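The three possible outcomes of this namedItem() algorithm (a single node, null, or a list) can be sketched in Python. The sketch assumes the one-match and no-match readings of the first two steps, models each form control as a dictionary of its attributes, and returns a plain list snapshot where the real API returns a live NodeList; the names are hypothetical.

```python
from typing import Dict, List, Union

Control = Dict[str, str]  # a form control modelled as its attribute map


def named_item(controls: List[Control],
               key: str) -> Union[None, Control, List[Control]]:
    """Look up controls by id or name; `controls` is assumed in tree order."""
    matches = [c for c in controls
               if c.get("id") == key or c.get("name") == key]
    if len(matches) == 1:
        # Exactly one match: return the node itself.
        return matches[0]
    if not matches:
        return None
    # Several matches: a snapshot list here, a live NodeList in the real API.
    return matches
```

This is why the IDL return type is Object: callers have to be prepared for either a single element or a list, as with a group of radio buttons sharing one name.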
The HTMLOptionsCollection interface represents a list of option elements.
interface HTMLOptionsCollection {
attribute unsigned long length;
[IndexGetter] HTMLOptionElement item(in unsigned long index);
[NameGetter] Object namedItem(in DOMString name);
};
On getting, the length attribute must return the number of nodes represented by the collection.
On setting, the behavior depends on whether the new value is equal to, greater than, or less than the number of nodes represented by the collection at that time. If the number is the same, then setting the attribute must do nothing. If the new value is greater, then n new option elements with no attributes and no child nodes must be appended to the select element on which the HTMLOptionsCollection is rooted, where n is the difference between the two numbers (new value minus old value). If the new value is lower, then the last n nodes in the collection must be removed from their parent nodes, where n is the difference between the two numbers (old value minus new value).
Setting length never removes or adds any optgroup elements, and never adds new children to existing optgroup elements (though it can remove children from them).
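The effect of setting length can be traced with a small Python model. The El class and both functions are hypothetical, and the sketch assumes the collection contains the option children of the select element plus the option children of its optgroup children, in tree order.

```python
from typing import List, Optional


class El:
    def __init__(self, tag: str) -> None:
        self.tag = tag
        self.parent: Optional["El"] = None
        self.children: List["El"] = []

    def append(self, child: "El") -> "El":
        child.parent = self
        self.children.append(child)
        return child

    def remove(self, child: "El") -> None:
        self.children.remove(child)
        child.parent = None


def options(select: El) -> List[El]:
    """Nodes represented by the collection, in tree order (assumed filter)."""
    found: List[El] = []
    for child in select.children:
        if child.tag == "option":
            found.append(child)
        elif child.tag == "optgroup":
            found.extend(g for g in child.children if g.tag == "option")
    return found


def set_length(select: El, new_length: int) -> None:
    current = options(select)
    if new_length > len(current):
        # New empty options always go on the select, never in an optgroup.
        for _ in range(new_length - len(current)):
            select.append(El("option"))
    elif new_length < len(current):
        # Remove the last n nodes from whichever parent holds them.
        for node in current[new_length:]:
            if node.parent is not None:
                node.parent.remove(node)
```

Growing the list never touches an optgroup, while shrinking it can empty one without removing the optgroup element itself.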
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The namedItem(key) method must act according to the following algorithm:
If exactly one node in the collection has either an id attribute or a name attribute equal to key, then return that node and stop the algorithm.
Otherwise, if no node in the collection has either an id attribute or a name attribute equal to key, then return null and stop the algorithm.
Otherwise, create a NodeList object representing a live view of the HTMLOptionsCollection object, further filtered so that the only nodes in the NodeList object are those that have either an id attribute or a name attribute equal to key. The nodes in the NodeList object must be sorted in tree order.
Return that NodeList object.
We may want to add add() and remove() methods here too because IE implements HTMLSelectElement and HTMLOptionsCollection on the same object, and so people use them almost interchangeably in the wild.
The DOMTokenList interface represents an interface to an underlying string that consists of an unordered set of unique space-separated tokens.
Which string underlies a particular DOMTokenList object is defined when the object is created. It might be a content attribute (e.g. the string that underlies the classList object is the class attribute), or it might be an anonymous string (e.g. when a DOMTokenList object is passed to an author-implemented callback in the datagrid APIs).
[Stringifies] interface DOMTokenList {
readonly attribute unsigned long length;
[IndexGetter] DOMString item(in unsigned long index);
boolean has(in DOMString token);
void add(in DOMString token);
void remove(in DOMString token);
boolean toggle(in DOMString token);
};
The length attribute must return the number of unique tokens that result from splitting the underlying string on spaces.
The item(index) method must split the underlying string on spaces, sort the resulting list of tokens by Unicode codepoint, remove exact duplicates, and then return the indexth item in this list. If index is equal to or greater than the number of tokens, then the method must return null.
The has(token) method must run the following algorithm:
INVALID_CHARACTER_ERR exception and stop the algorithm.
The add(token) method must run the following algorithm:
INVALID_CHARACTER_ERR exception and stop the algorithm.
DOMTokenList object's underlying string then stop the algorithm.
DOMTokenList object's underlying string is not the empty string and the last character of that string is not a space character, then append a U+0020 SPACE character to the end of that string.
DOMTokenList object's underlying string.
The remove(token) method must run the following algorithm:
INVALID_CHARACTER_ERR exception and stop the algorithm.
The toggle(token) method must run the following algorithm:
INVALID_CHARACTER_ERR exception and stop the algorithm.
DOMTokenList object's underlying string then remove the given token from the underlying string, and stop the algorithm, returning false.
DOMTokenList object's underlying string is not the empty string and the last character of that string is not a space character, then append a U+0020 SPACE character to the end of that string.
DOMTokenList object's underlying string.
Objects implementing the DOMTokenList interface must stringify to the object's underlying string representation.
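The behavior of a token list over its underlying string can be approximated with the Python sketch below. Several details are assumptions rather than statements from the text: that the INVALID_CHARACTER_ERR case is a token containing whitespace, that toggle() returns true when it adds a token, and that removal rebuilds the string from the surviving tokens joined by single spaces. The class name TokenList and the exception InvalidCharacterErr are hypothetical.

```python
from typing import List, Optional


class InvalidCharacterErr(Exception):
    """Stand-in for the DOM INVALID_CHARACTER_ERR exception."""


class TokenList:
    def __init__(self, underlying: str = "") -> None:
        self.underlying = underlying

    def _tokens(self) -> List[str]:
        # Split on spaces, drop exact duplicates, sort by code point.
        return sorted(set(self.underlying.split()))

    @staticmethod
    def _check(token: str) -> None:
        # Assumption: whitespace inside a token is what triggers the error.
        if any(ch.isspace() for ch in token):
            raise InvalidCharacterErr(token)

    @property
    def length(self) -> int:
        return len(self._tokens())

    def item(self, index: int) -> Optional[str]:
        tokens = self._tokens()
        return tokens[index] if 0 <= index < len(tokens) else None

    def has(self, token: str) -> bool:
        self._check(token)
        return token in self._tokens()

    def add(self, token: str) -> None:
        self._check(token)
        if token in self._tokens():
            return
        if self.underlying and not self.underlying[-1].isspace():
            self.underlying += " "
        self.underlying += token

    def remove(self, token: str) -> None:
        self._check(token)
        # Assumption: rebuild the string from the remaining tokens.
        self.underlying = " ".join(
            t for t in self.underlying.split() if t != token)

    def toggle(self, token: str) -> bool:
        self._check(token)
        if token in self._tokens():
            self.remove(token)
            return False
        self.add(token)
        return True

    def __str__(self) -> str:
        # Stringifies to the underlying string, untouched.
        return self.underlying
```

Note that add() appends to the raw string rather than normalising it, so existing duplicates and runs of spaces are preserved until something removes a token.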
The DOMStringMap interface represents a set of name-value pairs. When a DOMStringMap object is instantiated, it is associated with three algorithms, one for getting values from names, one for setting names to certain values, and one for deleting names.
The names of the methods on this interface are temporary and will be fixed when the Web IDL / "Language Bindings for DOM Specifications" spec is ready to handle this case.
interface DOMStringMap {
[NameGetter] DOMString XXX1(in DOMString name);
[NameSetter] void XXX2(in DOMString name, in DOMString value);
[XXX] boolean XXX3(in DOMString name);
};
The XXX1(name) method must call the algorithm for getting values from names, passing name as the name, and must return the corresponding value, or null if name has no corresponding value.
The XXX2(name, value) method must call the algorithm for setting names to certain values, passing name as the name and value as the value.
The XXX3(name) method must call the algorithm for deleting names, passing name as the name, and must return true.
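Because the interface only delegates to three algorithms supplied at creation time, it can be sketched in Python as a thin wrapper around three callables. The class StringMap and its method names get, set, and delete are hypothetical placeholders for the temporary XXX1, XXX2, and XXX3 names.

```python
from typing import Callable, Optional


class StringMap:
    def __init__(self,
                 getter: Callable[[str], Optional[str]],
                 setter: Callable[[str, str], None],
                 deleter: Callable[[str], None]) -> None:
        # The three algorithms are fixed when the object is created.
        self._getter = getter
        self._setter = setter
        self._deleter = deleter

    def get(self, name: str) -> Optional[str]:
        # None plays the role of null for a name with no value.
        return self._getter(name)

    def set(self, name: str, value: str) -> None:
        self._setter(name, value)

    def delete(self, name: str) -> bool:
        self._deleter(name)
        # Deleting always reports true, whether or not the name existed.
        return True
```

A dictionary-backed instance is the simplest way to exercise it: pass the dictionary's get method, its __setitem__ method, and a function that pops the name.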
DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings. [DOM3CORE]
A DOM application can use the hasFeature(feature, version) method of the DOMImplementation interface with parameter values "HTML" and "5.0" (respectively) to determine whether or not this module is supported by the implementation. In addition to the feature string "HTML", the feature string "XHTML" (with version string "5.0") can be used to check if the implementation supports XHTML. User agents should respond with a true value when the hasFeature method is queried with these values. Authors are cautioned, however, that UAs returning true might not be perfectly compliant, and that UAs returning false might well have support for features in this specification; in general, therefore, use of this method is discouraged.
The values "HTML" and "XHTML" (both with version "5.0") should also be supported in the context of the getFeature() and isSupported() methods, as defined by DOM3 Core.
The interfaces defined in this specification are not always supersets of the interfaces defined in DOM2 HTML; some features that were formerly deprecated, poorly supported, rarely used or considered unnecessary have been removed. Therefore it is not guaranteed that an implementation that supports "HTML" "5.0" also supports "HTML" "2.0".
The html element of a document is the document's root element, if there is one and it's an html element, or null otherwise.
The head element of a document is the first head element that is a child of the html element, if there is one, or null otherwise.
The title element of a document is the first title element in the document (in tree order), if there is one, or null otherwise.
The title attribute must, on getting, run the following algorithm:
If the root element is an svg element in the "http://www.w3.org/2000/svg" namespace, and the user agent supports SVG, then the getter must return the value that would have been returned by the DOM attribute of the same name on the SVGDocument interface.
Otherwise, it must return a concatenation of the data of all the child text nodes of the title element, in tree order, or the empty string if the title element is null.
On setting, the following algorithm must be run:
If the root element is an svg element in the "http://www.w3.org/2000/svg" namespace, and the user agent supports SVG, then the setter must defer to the setter for the DOM attribute of the same name on the SVGDocument interface. Stop the algorithm here.
If the title element is null and the head element is null, then the attribute must do nothing. Stop the algorithm here.
If the title element is null, then a new title element must be created and appended to the head element.
The children of the title element (if any) must all be removed.
A Text node whose data is the new value being assigned must be appended to the title element.
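The title getter and setter can be walked through with a small Python tree model. This sketch deliberately leaves out the SVG branch, assumes the document has at most one child (its root element), and uses a hypothetical Node class in which the tag "#text" marks a text node.

```python
from typing import Iterator, List, Optional


class Node:
    def __init__(self, tag: str, data: str = "") -> None:
        self.tag = tag  # "#text" marks a text node
        self.data = data
        self.children: List["Node"] = []

    def append(self, child: "Node") -> "Node":
        self.children.append(child)
        return child

    def descendants(self) -> Iterator["Node"]:
        # Depth-first, pre-order traversal, i.e. tree order.
        for child in self.children:
            yield child
            yield from child.descendants()


def title_element(document: Node) -> Optional[Node]:
    # First title element in the document, in tree order.
    return next((n for n in document.descendants() if n.tag == "title"), None)


def head_element(document: Node) -> Optional[Node]:
    root = document.children[0] if document.children else None
    if root is None or root.tag != "html":
        return None
    return next((c for c in root.children if c.tag == "head"), None)


def get_title(document: Node) -> str:
    title = title_element(document)
    if title is None:
        return ""
    # Concatenate only the child text nodes of the title element.
    return "".join(c.data for c in title.children if c.tag == "#text")


def set_title(document: Node, value: str) -> None:
    title = title_element(document)
    head = head_element(document)
    if title is None:
        if head is None:
            return  # nowhere to put a title: do nothing
        title = head.append(Node("title"))
    title.children.clear()
    title.append(Node("#text", value))
```

Setting the title twice therefore leaves a single title element holding a single text node, rather than accumulating children.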
The title attribute on the HTMLDocument interface should shadow the attribute of the same name on the SVGDocument interface when the user agent supports both HTML and SVG.
The body element of a document is the first child of the html element that is either a body element or a frameset element. If there is no such element, it is null. If the body element is null, then when the specification requires that events be fired at "the body element", they must instead be fired at the Document object.
The body attribute, on getting, must return the body element of the document (either a body element, a frameset element, or null). On setting, the following algorithm must be run:
body or frameset element, then raise a HIERARCHY_REQUEST_ERR exception and abort these steps.
replaceChild() method had been called with the new value and the incumbent body element as its two arguments respectively, then abort these steps.
The images attribute must return an HTMLCollection rooted at the Document node, whose filter matches only img elements.
The embeds attribute must return an HTMLCollection rooted at the Document node, whose filter matches only embed elements.
The plugins attribute must return the same object as that returned by the embeds attribute.
The links attribute must return an HTMLCollection rooted at the Document node, whose filter matches only a elements with href attributes and area elements with href attributes.
The forms attribute must return an HTMLCollection rooted at the Document node, whose filter matches only form elements.
The anchors attribute must return an HTMLCollection rooted at the Document node, whose filter matches only a elements with name attributes.
The scripts attribute must return an HTMLCollection rooted at the Document node, whose filter matches only script elements.
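The filters above can be written as simple predicates. The sketch below assumes a simplified tree of `{ name, attrs, children }` objects and returns a snapshot array in tree order; it does not model the liveness of a real HTMLCollection. `makeElement`, `collectionFilters`, and `collect` are hypothetical names.

```javascript
// Helper for building the simplified tree used in this sketch.
function makeElement(name, attrs = {}, children = []) {
  return { name, attrs: new Map(Object.entries(attrs)), children };
}

// One predicate per collection described above.
const collectionFilters = {
  images: (e) => e.name === 'img',
  embeds: (e) => e.name === 'embed',
  links: (e) => (e.name === 'a' || e.name === 'area') && e.attrs.has('href'),
  forms: (e) => e.name === 'form',
  anchors: (e) => e.name === 'a' && e.attrs.has('name'),
  scripts: (e) => e.name === 'script',
};

// Depth-first walk from the root, collecting matching elements.
// Assumption: results are wanted in tree order; the array is a snapshot.
function collect(root, filter, out = []) {
  for (const child of root.children) {
    if (filter(child)) out.push(child);
    collect(child, filter, out);
  }
  return out;
}
```

Note that an a element carrying both href and name attributes appears in both the links and the anchors results.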
The getElementsByName(name) method takes a string name, and must return a live NodeList containing all the a, applet, button, form, iframe, img, input, map, meta, object, select, and textarea elements in that document that have a name attribute whose value is equal to the name argument.
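As a rough sketch, the matching rule amounts to a membership test plus an exact string comparison. The element model (`{ name, attrs }` with `attrs` as a Map) and the names `NAMEABLE_ELEMENTS` and `matchesName` are assumptions made for illustration.

```javascript
// The element types listed above.
const NAMEABLE_ELEMENTS = new Set([
  'a', 'applet', 'button', 'form', 'iframe', 'img',
  'input', 'map', 'meta', 'object', 'select', 'textarea',
]);

// True when the element is one of the listed types and its name attribute
// is exactly equal to the requested name.
function matchesName(element, name) {
  return NAMEABLE_ELEMENTS.has(element.name) && element.attrs.get('name') === name;
}
```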
The getElementsByClassName(classNames) method takes a string that contains an unordered set of unique space-separated tokens representing classes. When called, the method must return a live NodeList object containing all the elements in the document that have all the classes specified in that argument, having obtained the classes by splitting a string on spaces. If there are no tokens specified in the argument, then the method must return an empty NodeList.
The getElementsByClassName() method on the HTMLElement interface must return a live NodeList with the nodes that the HTMLDocument getElementsByClassName() method would return when passed the same argument(s), excluding any elements that are not descendants of the HTMLElement object on which the method was invoked.
HTML, SVG, and MathML elements define which classes they are in by having an attribute in the per-element partition with the name class containing a space-separated list of classes to which the element belongs. Other specifications may also allow elements in their namespaces to be labeled as being in specific classes. UAs must not assume that all attributes of the name class for elements in any namespace work in this way, however, and must not assume that such attributes, when used as global attributes, label other elements as being in specific classes.
Given the following XHTML fragment:
<div id="example">
 <p id="p1" class="aaa bbb"/>
 <p id="p2" class="aaa ccc"/>
 <p id="p3" class="bbb ccc"/>
</div>
A call to document.getElementById('example').getElementsByClassName('aaa') would return a NodeList with the two paragraphs p1 and p2 in it.
A call to getElementsByClassName('ccc bbb') would only return one node, however, namely p3. A call to document.getElementById('example').getElementsByClassName('bbb ccc ') would return the same thing.
A call to getElementsByClassName('aaa,bbb') would return no nodes; none of the elements above are in the "aaa,bbb" class.
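The matching rule in this example can be reproduced with a few lines of code. The sketch below assumes that "splitting a string on spaces" means splitting on runs of space, tab, line feed, form feed, and carriage return; the helper names and the array standing in for the three paragraphs are hypothetical.

```javascript
// Assumption: tokens are separated by runs of space, tab, LF, FF, or CR.
function splitOnSpaces(value) {
  return value.split(/[ \t\n\f\r]+/).filter((token) => token !== '');
}

// True when every requested class token is present in the class attribute.
// An argument with no tokens matches nothing.
function hasAllClasses(classAttribute, classNames) {
  const wanted = splitOnSpaces(classNames);
  if (wanted.length === 0) return false;
  const present = new Set(splitOnSpaces(classAttribute));
  return wanted.every((token) => present.has(token));
}

// The three paragraphs from the fragment above.
const exampleParagraphs = [
  { id: 'p1', className: 'aaa bbb' },
  { id: 'p2', className: 'aaa ccc' },
  { id: 'p3', className: 'bbb ccc' },
];

function idsMatching(classNames) {
  return exampleParagraphs
    .filter((p) => hasAllClasses(p.className, classNames))
    .map((p) => p.id);
}

console.log(idsMatching('aaa'));      // [ 'p1', 'p2' ]
console.log(idsMatching('ccc bbb'));  // [ 'p3' ]
console.log(idsMatching('bbb ccc ')); // [ 'p3' ]
console.log(idsMatching('aaa,bbb'));  // []
```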
The dir attribute on the HTMLDocument interface is defined along with the dir content attribute.
The document.write() family of methods and the innerHTML family of DOM attributes enable script authors to dynamically insert markup into the document.
bz argues that innerHTML should be called something else on XML documents and XML elements. Is the sanity worth the migration pain?
Because these APIs interact with the parser, their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser). The following table cross-references the various versions of these APIs.
| | document.write() | innerHTML |
|---|---|---|
| For documents that are HTML documents | document.write() in HTML | innerHTML in HTML |
| For documents that are XML documents | document.write() in XML | innerHTML in XML |
Regardless of the parsing mode, the document.writeln(...) method must call the document.write() method with the same argument(s), and then call the document.write() method with, as its argument, a string consisting of a single line feed character (U+000A).
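In other words, writeln() is a thin wrapper around write(). A minimal sketch, in which the `write` callback is a stand-in for document.write() and the function name is hypothetical:

```javascript
// Calls write with the original arguments, then with a single U+000A.
function writelnSketch(write, ...args) {
  write(...args);
  write('\n');
}
```

Because the line feed goes through a second write() call, any behavior attached to write(), including raising an exception in an XML context, applies to writeln() as well.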
The open() method comes in several variants with different numbers of arguments.
When called with two or fewer arguments, the method must act as follows:
Let type be the value of the first argument, if there is one, or "text/html" otherwise.
Let replace be true if there is a second argument and it has the value "replace", and false otherwise.
If the document has an active parser that isn't a script-created parser, and the insertion point associated with that parser's input stream is not undefined (that is, it does point to somewhere in the input stream), then the method does nothing. Abort these steps and return the Document object on which the method was invoked.
This basically causes document.open() to be ignored when it's called in an inline script found during the parsing of data sent over the network, while still letting it have an effect when called asynchronously or on a document that is itself being spoon-fed using these APIs.
onbeforeunload, onunload, reset timers, empty event queue, kill any pending transactions, XMLHttpRequests, etc
If the document has an active parser, then stop that parser, and throw away any pending content in the input stream. What about if it doesn't, because it's either like a text/plain, or Atom, or PDF, or XHTML, or image document, or something?
Remove all child nodes of the document.
Change the document's character encoding to UTF-16.
Create a new HTML parser and associate it with the document. This is a script-created parser (meaning that it can be closed by the document.open() and document.close() methods, and that the tokeniser will wait for an explicit call to document.close() before emitting an end-of-file token).
If type does not have the value "text/html", then act as if the tokeniser had emitted a pre element start tag, then set the HTML parser's tokenisation stage's content model flag to PLAINTEXT.
If replace is false, then:
Document's History object
Document
Document object, as well as the state of the document at the start of these steps. (This allows the user to step backwards in the session history to see the page before it was blown away by the document.open() call.)
Finally, set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Return the Document on which the method was invoked.
We shouldn't hard-code text/plain there. We should do it some other way, e.g. hand off to the section on content-sniffing and handling of incoming data streams, the part that defines how this all works when stuff comes over the network.
When called with three or more arguments, the open() method on the HTMLDocument object must call the open() method on the Window interface of the object returned by the defaultView attribute of the DocumentView interface of the HTMLDocument object, with the same arguments as the original call to the open() method, and return whatever that method returned. If the defaultView attribute of the DocumentView interface of the HTMLDocument object is null, then the method must raise an INVALID_ACCESS_ERR exception.
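The argument handling of the two variants of open() can be summarized in a small dispatch function. This sketch only decides what should happen; it performs no parsing or navigation. The function name, the shape of the returned object, and the `hasDefaultView` flag are hypothetical, and exact, case-sensitive string comparison is an assumption.

```javascript
// args: the arguments passed to open(); hasDefaultView: whether the
// document's defaultView is non-null.
function planOpenCall(args, hasDefaultView) {
  if (args.length >= 3) {
    if (!hasDefaultView) {
      // A plain Error stands in for the DOM exception here.
      throw new Error('INVALID_ACCESS_ERR');
    }
    return { action: 'delegate-to-window-open', args };
  }
  const type = args.length >= 1 ? args[0] : 'text/html';
  const replace = args.length >= 2 && args[1] === 'replace';
  return {
    action: 'open-document',
    type,
    replace,
    // Any type other than "text/html" switches the tokeniser to PLAINTEXT.
    plaintext: type !== 'text/html',
  };
}
```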
The close() method must do nothing if there is no script-created parser associated with the document. If there is such a parser, then, when the method is called, the user agent must insert an explicit "EOF" character at the insertion point of the parser's input stream.
In HTML, the document.write(...) method must act as follows:
If the insertion point is undefined, the open() method must be called (with no arguments) on the document object. The insertion point will point at just before the end of the (empty) input stream.
The string consisting of the concatenation of all the arguments to the method must be inserted into the input stream just before the insertion point.
If there is a script that will execute as soon as the parser resumes, then the method must now return without further processing of the input stream.
Otherwise, the tokeniser must process the characters that were inserted, one at a time, processing resulting tokens as they are emitted, and stopping when the tokeniser reaches the insertion point or when the processing of the tokeniser is aborted by the tree construction stage (this can happen if a script start tag token is emitted by the tokeniser).
If the document.write() method was called from script executing inline (i.e. executing because the parser parsed a set of script tags), then this is a reentrant invocation of the parser.
Finally, the method must return.
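The first two steps, concerning the insertion point, can be modelled with a string and an index. This is a sketch only: it covers where written text lands in the input stream and leaves out tokenisation, pending scripts, and reentrancy. The class name and fields are hypothetical.

```javascript
// Models the parser's input stream as a string plus an insertion point.
class InputStreamSketch {
  constructor() {
    this.chars = '';
    this.insertionPoint = undefined; // undefined until open() is called
  }

  // Stand-in for document.open(): empty stream, insertion point at its end.
  open() {
    this.chars = '';
    this.insertionPoint = 0;
  }

  // Stand-in for document.write(): concatenate the arguments and insert
  // them just before the insertion point.
  write(...args) {
    if (this.insertionPoint === undefined) this.open();
    const text = args.map(String).join('');
    this.chars =
      this.chars.slice(0, this.insertionPoint) +
      text +
      this.chars.slice(this.insertionPoint);
    // The insertion point stays just after the text that was inserted.
    this.insertionPoint += text.length;
  }
}
```

When a stream already holds unparsed characters after the insertion point, as happens during an inline script, written text is placed ahead of them, which is why markup written by a script is parsed before the rest of the page.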
In HTML, the innerHTML DOM attribute of all HTMLElement and HTMLDocument nodes returns a serialization of the node's children using the HTML syntax. On setting, it replaces the node's children with new nodes that result from parsing the given value. The formal definitions follow.
On getting, the innerHTML DOM attribute must return the result of running the HTML fragment serialization algorithm on the node.
On setting, if the node is a document, the innerHTML DOM attribute must run the following algorithm:
If the document has an active parser, then stop that parser, and throw away any pending content in the input stream. What about if it doesn't, because it's either like a text/plain, or Atom, or PDF, or XHTML, or image document, or something?
Remove the children nodes of the Document whose innerHTML attribute is being set.
Create a new HTML parser, in its initial state, and associate it with the Document node.
Place into the input stream for the HTML parser just created the string being assigned into the innerHTML attribute.
Start the parser and let it run until it has consumed all the characters just inserted into the input stream. (The Document node will have been populated with elements and a load event will have fired on its body element.)
Otherwise, if the node is an element, then setting the innerHTML DOM attribute must cause the following algorithm to run instead:
Invoke the HTML fragment parsing algorithm, with the element whose innerHTML attribute is being set as the context element, and the string being assigned into the innerHTML attribute as the input. Let new children be the result of this algorithm.
Remove the children of the element whose innerHTML attribute is being set.
Let target document be the ownerDocument of the Element node whose innerHTML attribute is being set.
Set the ownerDocument of all the nodes in new children to the target document.
Append all the new children nodes to the node whose innerHTML attribute is being set, preserving their order.
script elements inserted using innerHTML do not execute when they are inserted.
In an XML context, the document.write() method must raise an INVALID_ACCESS_ERR exception.
The innerHTML attribute, on the other hand, is usable in an XML context.
In an XML context, the innerHTML DOM attribute on HTMLElements must return a string in the form of an internal general parsed entity, and on HTMLDocuments must return a string in the form of a document entity. The string returned must be XML namespace-well-formed and must be an isomorphic serialization of all of that node's child nodes, in document order. User agents may adjust prefixes and namespace declarations in the serialization (and indeed might be forced to do so in some cases to obtain namespace-well-formed XML). If any of the elements in the serialization are in no namespace, the default namespace in scope for those elements must be explicitly declared as the empty string. [XML] [XMLNS]
If any of the following cases are found in the DOM being serialized, the user agent must raise an INVALID_STATE_ERR exception:
A Document node with no child element nodes.
A DocumentType node that has an external subset public identifier or an external subset system identifier that contains both a U+0022 QUOTATION MARK ('"') and a U+0027 APOSTROPHE ("'").
An Attr node, Text node, CDATASection node, Comment node, or ProcessingInstruction node whose data contains characters that are not matched by the XML Char production. [XML]
A CDATASection node whose data contains the string "]]>".
A Comment node whose data contains two adjacent U+002D HYPHEN-MINUS (-) characters or ends with such a character.
A ProcessingInstruction node whose target name is the string "xml" (case insensitively).
A ProcessingInstruction node whose target name contains a U+003A COLON (":").
A ProcessingInstruction node whose data contains the string "?>".
These are the only ways to make a DOM unserializable. The DOM enforces all the other XML constraints; for example, trying to set an attribute with a name that contains an equals sign (=) will raise an INVALID_CHARACTER_ERR exception.
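The cases listed above translate fairly directly into a per-node check. The sketch below assumes a simplified node model (`type`, `data`, `target`, `publicId`, `systemId`, `children`) and a hypothetical function name; it returns a description of the first problem found, or null, and a caller would raise INVALID_STATE_ERR on any non-null result.

```javascript
// The XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] |
// [#xE000-#xFFFD] | [#x10000-#x10FFFF].
const XML_CHARS = /^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]*$/u;

// Node model assumed by this sketch: type is one of 'document', 'doctype',
// 'element', 'attr', 'text', 'cdata', 'comment', 'pi'.
function unserializableReason(node) {
  const hasData = ['attr', 'text', 'cdata', 'comment', 'pi'].includes(node.type);
  if (hasData && !XML_CHARS.test(node.data)) {
    return 'data contains a character outside the XML Char production';
  }
  if (node.type === 'document' && !node.children.some((c) => c.type === 'element')) {
    return 'document has no child element';
  }
  if (node.type === 'doctype') {
    for (const id of [node.publicId, node.systemId]) {
      if (id.includes('"') && id.includes("'")) {
        return 'identifier contains both quote characters';
      }
    }
  }
  if (node.type === 'cdata' && node.data.includes(']]>')) {
    return 'CDATA section contains "]]>"';
  }
  if (node.type === 'comment' && (node.data.includes('--') || node.data.endsWith('-'))) {
    return 'comment contains "--" or ends with "-"';
  }
  if (node.type === 'pi') {
    if (node.target.toLowerCase() === 'xml') return 'processing instruction target is "xml"';
    if (node.target.includes(':')) return 'processing instruction target contains ":"';
    if (node.data.includes('?>')) return 'processing instruction data contains "?>"';
  }
  return null;
}
```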
On setting, in an XML context, the innerHTML DOM attribute on HTMLElements and HTMLDocuments must run the following algorithm:
The user agent must create a new XML parser.
If the innerHTML attribute is being set on an element, the user agent must feed the parser just created the string corresponding to the start tag of that element, declaring all the namespace prefixes that are in scope on that element in the DOM, as well as declaring the default namespace (if any) that is in scope on that element in the DOM.
The user agent must feed the parser just created the string being assigned into the innerHTML attribute.
If the innerHTML attribute is being set on an element, the user agent must feed the parser the string corresponding to the end tag of that element.
If the parser found a well-formedness error, the attribute's setter must raise a SYNTAX_ERR exception and abort these steps.
The user agent must remove the children nodes of the node whose innerHTML attribute is being set.
If the attribute is being set on a Document node, let new children be the children of the document, preserving their order. Otherwise, the attribute is being set on an Element node; let new children be the children of the document's root element, preserving their order.
If the attribute is being set on a Document node, let target document be that Document node. Otherwise, the attribute is being set on an Element node; let target document be the ownerDocument of that Element.
Set the ownerDocument of all the nodes in new children to the target document.
Append all the new children nodes to the node whose innerHTML attribute is being set, preserving their order.
script elements inserted using innerHTML do not execute when they are inserted.
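For the element case, the string fed to the XML parser is the assigned value wrapped in a synthesized start tag and end tag. A minimal sketch of that wrapping, assuming the in-scope namespaces are supplied as a Map from prefix to URI with the empty string standing for the default namespace; both function names are hypothetical.

```javascript
// Escapes a namespace URI for use inside a double-quoted attribute value.
function escapeAttributeValue(value) {
  return value.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/"/g, '&quot;');
}

// Builds the full parser input for setting innerHTML on an element.
// namespacesInScope: Map of prefix -> URI; '' is the default namespace.
function wrapForXmlParser(qualifiedName, namespacesInScope, value) {
  let startTag = '<' + qualifiedName;
  for (const [prefix, uri] of namespacesInScope) {
    const attributeName = prefix === '' ? 'xmlns' : 'xmlns:' + prefix;
    startTag += ' ' + attributeName + '="' + escapeAttributeValue(uri) + '"';
  }
  startTag += '>';
  return startTag + value + '</' + qualifiedName + '>';
}
```

Declaring the in-scope namespaces on the synthesized start tag is what lets prefixed names in the assigned string resolve the same way they would at that position in the DOM.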
For HTML documents, and for HTML elements in HTML documents, certain APIs defined in DOM3 Core become case-insensitive or case-changing, as sometimes defined in DOM3 Core, and as summarized or required below. [DOM3CORE]
This does not apply to XML documents or to elements that are not in the HTML namespace despite being in HTML documents.
Element.tagName, Node.nodeName, and Node.localName
These attributes return tag names in all uppercase and attribute names in all lowercase, regardless of the case with which they were created.
Document.createElement()
The canonical form of HTML markup is all-lowercase; thus, this method will lowercase the argument before creating the requisite element. Also, the element created must be in the HTML namespace.
This doesn't apply to Document.createElementNS(). Thus, it is possible, by passing this last method a tag name in the wrong case, to create an element that claims to have the tag name of an element defined in this specification, but doesn't support its interfaces, because it really has another tag name not accessible from the DOM APIs.
Element.setAttributeNode()
When an Attr node is set on an HTML element, it must have its name lowercased before the element is affected.
This doesn't apply to Element.setAttributeNodeNS().
Element.setAttribute()
When an attribute is set on an HTML element, the name argument must be lowercased before the element is affected.
This doesn't apply to Element.setAttributeNS().
Document.getElementsByTagName() and Element.getElementsByTagName()
These methods (but not their namespaced counterparts) must compare the given argument case-insensitively when looking at HTML elements, and case-sensitively otherwise.
Thus, in an HTML document with nodes in multiple namespaces, these methods will be both case-sensitive and case-insensitive at the same time.
Document.renameNode()
If the new namespace is the HTML namespace, then the new qualified name must be lowercased before the rename takes place.
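The case rules for element creation, tag name reporting, and tag name lookup can be illustrated with a small model, assuming an HTML document. Elements are plain `{ namespace, localName }` objects and every function name is hypothetical; using JavaScript's toLowerCase() and toUpperCase() for the case conversion is an assumption of this sketch.

```javascript
const HTML_NAMESPACE = 'http://www.w3.org/1999/xhtml';

// Stand-in for Document.createElement(): lowercases, HTML namespace.
function createElementSketch(name) {
  return { namespace: HTML_NAMESPACE, localName: name.toLowerCase() };
}

// Stand-in for Document.createElementNS(): the name is kept as given.
function createElementNSSketch(namespace, name) {
  return { namespace, localName: name };
}

// Stand-in for Element.tagName: uppercase for HTML elements only.
function tagNameSketch(element) {
  return element.namespace === HTML_NAMESPACE
    ? element.localName.toUpperCase()
    : element.localName;
}

// The comparison used by getElementsByTagName(): case-insensitive for
// HTML elements, case-sensitive for everything else.
function tagNameMatches(element, query) {
  return element.namespace === HTML_NAMESPACE
    ? element.localName.toLowerCase() === query.toLowerCase()
    : element.localName === query;
}
```

With this model, an SVG textPath element in an HTML document is found by a query for "textPath" but not "textpath", while a p element is found by either "p" or "P".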