28.06.2022
HTML tree view. WDH: DHTML - Document Object Model
The DOM API is not particularly complex, but before we discuss programming with the DOM, there are some DOM architecture issues to be understood.
Representing documents as trees
HTML documents have a hierarchical structure, represented in the DOM as a tree. Tree nodes represent different types of document content. First of all, the tree view of an HTML document contains nodes representing elements or tags, such as <body> and <p>, and nodes representing strings of text. An HTML document can also contain nodes representing HTML comments. Consider the following simple HTML document.
<html>
  <head>
    <title>Sample Document</title>
  </head>
  <body>
    <h1>An HTML Document</h1>
    <p>This is a simple document.</p>
  </body>
</html>
For those not yet familiar with tree structures in computer programming, it is helpful to know that they borrow terminology from family trees. The node located directly above a given node is called the parent of that node. Nodes that are one level below another node are children of that node.
Nodes that are at the same level and have the same parent are called siblings. Nodes located any number of levels below another node are its descendants. Parent, grandparent, and any other nodes located above a given node are its ancestors.
Nodes
The DOM tree structure is a tree of Node objects of various types. The Node interface defines properties and methods for navigating and manipulating the tree. The childNodes property of a Node object returns a list of child nodes; the firstChild, lastChild, nextSibling, previousSibling, and parentNode properties provide a way to traverse the tree. Methods such as appendChild(), removeChild(), replaceChild(), and insertBefore() allow you to add nodes to the document tree and remove them.
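As a minimal sketch of how these traversal properties fit together, the following function counts a node and all of its descendants using only firstChild and nextSibling. The name countNodes is made up for illustration and is not part of the DOM.

```javascript
// Hypothetical helper (not part of the DOM): count a node and all of its
// descendants using only the firstChild and nextSibling properties.
function countNodes(node) {
    var total = 1;  // Count the node itself
    // Visit each child in document order and add the size of its subtree
    for (var child = node.firstChild; child != null; child = child.nextSibling) {
        total += countNodes(child);
    }
    return total;
}
```

In a browser, calling countNodes(document) would count every node in the current document, including text and comment nodes.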
Node types
Node types in the document tree are represented by special subinterfaces of the Node interface. Any Node object has a nodeType property that determines the type of the node. If the nodeType property of a node is equal to, for example, the constant Node.ELEMENT_NODE, then the Node object is also an Element object, and you can use all the methods and properties defined by the Element interface with it.
The root of the DOM tree is a Document object. Its documentElement property refers to an Element object representing the document's root element. For HTML documents, this is the <html> tag, which is explicitly or implicitly present in the document. (In addition to the root element, a Document node can have other children, such as Comment objects.)
The bulk of the DOM tree consists of Element objects representing tags such as <body> and <p>, and Text objects representing strings of text. If the document parser preserves comments, those comments are represented in the tree as DOM Comment objects.
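The role of nodeType can be sketched with a small recursive function that gathers the contents of every Text node below a given node. The name getText is an assumption made for this illustration. The integer literal 3 stands for Node.TEXT_NODE, and the data property is the character data of a Text node.

```javascript
// Hypothetical helper: concatenate the contents of all Text nodes at or below n.
// The integer literal 3 is the value of Node.TEXT_NODE.
function getText(n) {
    if (n.nodeType == 3) return n.data;   // Text node: return its character data
    var text = "";
    var kids = n.childNodes;              // List of children (may be empty)
    for (var i = 0; i < kids.length; i++) {
        text += getText(kids[i]);         // Recurse into elements; comments add nothing
    }
    return text;
}
```

Comment nodes have a different nodeType and no children, so they contribute nothing to the result.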
Attributes
Element attributes (for example, the src and width attributes of an <img> tag) can be read, set, and removed using the getAttribute(), setAttribute(), and removeAttribute() methods of the Element interface.
Another, less convenient way to work with attributes is the getAttributeNode() method, which returns an Attr object representing the attribute and its value. (One reason for choosing this less convenient technique is that the Attr interface has a specified property, which allows you to determine whether a given attribute is explicitly specified in the document or has a default value.) Note, however, that Attr objects are not present in the element's childNodes array and are not directly part of the document tree in the way that Element and Text nodes are.
The DOM specification allows Attr nodes to be accessed through the attributes array of the Node interface, but Microsoft Internet Explorer defines a different, incompatible attributes array, making it impossible to use this array in a portable manner.
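The three attribute methods can be combined as in the following sketch. The helper name swapAttribute is hypothetical; it assumes only that the element passed in implements getAttribute(), setAttribute(), and removeAttribute() as described above.

```javascript
// Hypothetical helper: set attribute `name` on element `el` to `value`,
// or remove the attribute when `value` is null. Returns whatever
// getAttribute() reported for the attribute before the change.
function swapAttribute(el, name, value) {
    var old = el.getAttribute(name);
    if (value === null) {
        el.removeAttribute(name);
    } else {
        el.setAttribute(name, value);
    }
    return old;
}
```

For an image element img, swapAttribute(img, "width", "100") would set the width and hand back the previous value so that it can be restored later.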
DOM HTML API
The DOM standard is designed to work with both XML and HTML. The basic DOM API—the Node, Element, Document, and other interfaces—is relatively universal and applies to both types of documents. The DOM standard also includes interfaces specific to HTML documents. HTMLDocument is an HTML-specific subinterface of the Document interface, and HTMLElement is an HTML-specific subinterface of the Element interface. In addition, the DOM defines tag-specific interfaces for many HTML elements. These interfaces, such as HTMLBodyElement and HTMLTitleElement, typically define a set of properties that reflect the attributes of a given HTML tag. The HTMLDocument interface defines various document properties and methods that were supported by browsers before the W3C standard. These include the location property, the forms array, and the write() method.
The HTMLElement interface defines the id, style, title, lang, dir, and className properties, which provide convenient access to the values of the id, style, title, lang, dir, and class attributes that can be used with all HTML tags.
Some HTML tags do not accept any attributes other than the six just listed, and therefore are fully represented by the HTMLElement interface. For all other HTML tags, the HTML-specific part of the DOM specification defines special interfaces. For many HTML tags, these interfaces do nothing other than provide a set of properties corresponding to HTML attributes. For example, the <ul> tag has a corresponding HTMLUListElement interface, and the <body> tag has a corresponding HTMLBodyElement interface. Because these interfaces simply define properties that are standardized in HTML, they are not documented in detail in this book.
You can safely assume that the HTMLElement object representing a particular HTML tag has properties for each of that tag's standard attributes (see the naming conventions in the next section). Note that the DOM standard defines properties for HTML attributes for the convenience of script writers. A more general (and perhaps preferred) way to read and set attribute values is provided by the getAttribute() and setAttribute() methods of the Element object.
Some of the interfaces described in the HTML DOM define additional properties or methods beyond those corresponding to HTML attribute values. For example, the HTMLInputElement interface defines the focus() and blur() methods, and the HTMLFormElement interface defines the submit() and reset() methods and the length property. Such methods and properties generally predate DOM standardization and were made part of the standard for backward compatibility with existing programming practice. Such interfaces are documented in the W3C DOM Reference (Part V). Information about the "existing practice" portions of these interfaces can also be found in Part IV, the Client-side JavaScript Reference, although it is often listed under the name used before DOM standardization; for example, HTMLFormElement and HTMLInputElement are described there in the "Form" and "Input" sections.
Naming conventions for HTML
When working with HTML-specific parts of the DOM standard, there are some simple naming conventions to keep in mind. Property names in the HTML-specific interfaces begin with lowercase letters. If the property name consists of more than one word, the first letters of the second and subsequent words are capitalized. Thus, the maxlength attribute of the <input> tag is translated into the maxLength property of the HTMLInputElement interface.
When an HTML attribute name conflicts with a JavaScript keyword, the name is prefixed with "html" to resolve the conflict. For example, the for attribute of a <label> tag is translated into the htmlFor property of the HTMLLabelElement interface. The exception to this rule is the class attribute (which can be specified for any HTML element): it is translated into the className property of the HTMLElement interface.
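These conventions can be summarized in a small lookup, sketched here under the assumption that only the three attributes named above need special handling. The table SPECIAL_NAMES and the function propertyNameFor are hypothetical and deliberately incomplete: word boundaries cannot be derived from the attribute name alone, so every multi-word or reserved name needs its own entry.

```javascript
// Illustrative lookup covering only the attributes named in the text.
// A real table would need an entry for every multi-word or reserved name.
var SPECIAL_NAMES = {
    "maxlength": "maxLength",   // Multi-word name: second word capitalized
    "for": "htmlFor",           // JavaScript keyword: prefixed with "html"
    "class": "className"        // The exception to the "html" prefix rule
};

// Hypothetical helper: return the DOM property name for an HTML attribute name.
function propertyNameFor(attributeName) {
    var key = attributeName.toLowerCase();   // HTML attribute names are case-insensitive
    if (Object.prototype.hasOwnProperty.call(SPECIAL_NAMES, key)) {
        return SPECIAL_NAMES[key];
    }
    return key;                              // Single-word names map unchanged
}
```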
DOM levels and capabilities
There are two versions, or two "levels", of the DOM standard. DOM Level 1 was standardized in October 1998. It defines basic DOM interfaces such as Node, Element, Attr, and Document, as well as various HTML-specific interfaces. DOM Level 2 was standardized in November 2000. In addition to some changes to the core interfaces, this version of the DOM was greatly expanded by defining standard APIs for working with document events and cascading style sheets (CSS), as well as providing additional tools for working with contiguous ranges of documents. At the time of this writing, the W3C's DOM working group is standardizing DOM Level 3. Additionally, you may sometimes see references to DOM Level 0. This term does not refer to any formal standard, but serves as an informal reference to the common document object model facilities implemented in Netscape and Internet Explorer before standardization by the W3C.
With DOM Level 2, the standard became modular. The Core module, which defines the basic tree structure of a document using (among others) the Document, Node, Element, and Text interfaces, is the only module that is required. All other modules are optional and may or may not be supported, depending on the implementation. A web browser's DOM implementation obviously needs to support the HTML module, since web documents are written in HTML. Browsers that support CSS style sheets typically support the StyleSheets and CSS modules, since CSS styles play a key role in Dynamic HTML programming. Likewise, since most interesting JavaScript programs require event handling, we can expect web browsers to support the Events module of the DOM specification.
Unfortunately, the Events module was only recently defined by the DOM Level 2 specification and was not widely supported at the time of this writing.
DOM compliance
At the time of this writing, there is no browser that fully complies with the DOM standard. Recent Mozilla releases have come the closest to this, and full DOM Level 2 compatibility is a goal of the Mozilla project. The Netscape 6.1 browser complies with most of the important Level 2 modules, while Netscape 6.0 has fairly good compatibility, but with some gaps. Internet Explorer 6 is mostly compatible (with at least one unfortunate exception) with DOM Level 1, but does not support many Level 2 modules, in particular the Events module. Internet Explorer 5 and 5.5 have significant compatibility gaps, but support key DOM Level 1 methods well enough to run most of the examples in this chapter. The Macintosh version of IE has significantly more comprehensive DOM support than IE 5 for Windows.
Besides Mozilla, Netscape, and Internet Explorer, several other browsers offer at least partial DOM support. The number of browsers available has become too large, and changes in standards support are happening too quickly, for this book to attempt to state definitively what DOM features a particular browser supports. Therefore, you will have to rely on other sources of information to determine the compatibility of the DOM implementation in any given browser.
One source of compatibility information is the implementation itself. In a compliant implementation, the implementation property of the Document object refers to a DOMImplementation object that defines a method called hasFeature(). This method (if it exists) can be used to find out whether a specific DOM module (or feature) is supported. For example, you can determine whether a web browser's DOM implementation supports the basic DOM Level 1 interfaces for working with HTML documents using the following code:
if (document.implementation &&
    document.implementation.hasFeature &&
    document.implementation.hasFeature("html", "1.0")) {
    // Browser claims to support basic Level 1 interfaces
    // and HTML interfaces
}
The hasFeature() method takes two arguments: the first is the name of the module being checked, and the second is the version number as a string. It returns true if the specified version of the specified module is supported.
Support for one module can imply support for others. For example, if hasFeature() indicates that the MouseEvents module is supported, this implies that the UIEvents module is also supported, which in turn implies support for the Events, Views, and Core modules. In Internet Explorer 6 (on Windows), hasFeature() returns true only for the "HTML" module and version "1.0". It does not report compatibility with any other modules.
In Netscape 6.1, hasFeature() returns true for most module names and version numbers, with the exceptions of the Traversal and MutationEvents modules. The method returns false for version 2.0 of the Core and CSS2 modules, indicating incomplete compatibility (even though support for these modules is very good).
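To survey several modules at once, the hasFeature() test can be wrapped in a loop, as in the following sketch. The function name reportFeatures is an assumption made for illustration; a missing implementation object or hasFeature() method is treated here as "not supported".

```javascript
// Hypothetical helper: ask a document's DOMImplementation which of the
// given modules it claims to support at the given version.
// Returns an object mapping each module name to true or false.
function reportFeatures(doc, modules, version) {
    var report = {};
    var impl = doc.implementation;
    for (var i = 0; i < modules.length; i++) {
        // A missing implementation object or method counts as "not supported"
        report[modules[i]] = !!(impl && impl.hasFeature &&
                                impl.hasFeature(modules[i], version));
    }
    return report;
}
```

A call such as reportFeatures(document, ["Core", "HTML", "Events", "CSS2"], "2.0") returns what the browser claims, which is not necessarily what it actually implements.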
This book documents the interfaces that make up all DOM modules. The Core, HTML, Traversal, and Range modules are covered in this chapter. The StyleSheets, CSS, and CSS2 modules are covered in Chapter 18, and the various event-related modules (except MutationEvents) are covered in Chapter 19. Part V, the W3C DOM Reference, contains a full description of all modules.
The hasFeature() method is not completely reliable. As noted above, IE 6 reports Level 1 compatibility with the HTML interfaces, even though there are some compatibility issues. On the other hand, Netscape 6.1 reports incompatibility with Level 2 Core, even though this browser is almost fully compatible with this module. In both cases, more detailed information is needed about what exactly is compatible and what is not. But the volume of this information is too large and too variable to be included in a printed publication.
Those who are active in web development no doubt already know or will soon learn about many browser-specific compatibility details. There are also resources on the Internet that may be helpful. Most importantly, the W3C (in collaboration with the US National Institute of Standards and Technology) is working to create an open source toolkit for testing DOM implementations. As of this writing, development of the test suite is just beginning, but it should provide an invaluable means of fine-grained testing of DOM implementation compatibility. Details can be found at http://www.w3c.org/DOM/Test/.
The Mozilla organization has several test suites for various standards, including DOM Level 1 (available at http://www.mozilla.org/qualitybrowser_sc.html). Netscape has published a test suite that includes some tests for DOM Level 2 (available at http://developer.netscape.com/evangelism/tools/testsuites/). Netscape also published a biased (and outdated) DOM compatibility comparison of early Mozilla releases and IE 5.5 (available at http://home.netscape.com/browsers/future/standards.html). Finally, you can also find compatibility and compliance information on independent sites on the Internet. One site worth mentioning is published by Peter-Paul Koch. A link to the DOM compatibility table can be found on his main JavaScript page (http://www.xs4all.nl/~ppk/js/).
Internet Explorer DOM Compatibility
Since IE is the most widely used web browser, a few special notes about its compatibility with the DOM specifications are in order here. IE 5 and later support basic Level 1 and HTML features well enough to run the examples in this chapter, and support key Level 2 CSS features well enough to run most of the examples. Unfortunately, IE 5, 5.5, and 6 do not support the Events module from DOM Level 2, even though Microsoft was involved in defining this module and had plenty of time to implement it in IE 6. As we will see in Chapter 19, event handling plays a key role in client-side JavaScript, and IE's lack of support for a standard event-handling model makes it difficult to develop rich client-side web applications.
Although IE 6 claims (through its hasFeature() method) to support the DOM Level 1 Core and HTML interfaces, in fact this support is not complete. The most glaring issue you're likely to encounter is a small but annoying one: IE doesn't support the node type constants defined in the Node interface. Recall that every node in a document has a nodeType property that specifies the type of that node. The DOM specification also states that the Node interface defines constants that represent each of the node types it defines. For example, the constant Node.ELEMENT_NODE represents an Element node. In IE (at least up to and including version 6) these constants simply do not exist.
The examples in this chapter have been modified to work around this obstacle and contain integer literals instead of the corresponding symbolic constants.
For example:
if (n.nodeType == 1 /*Node.ELEMENT_NODE*/)
// Check that n is an Element object
Good programming style calls for symbolic constants in the code rather than hard-coded integer literals, and those who want to keep the code portable can include the following code in the program to define the constants if they are missing:
if (!window.Node) {
    var Node = {                    // If there is no Node object, define
        ELEMENT_NODE: 1,            // it with the following properties and values.
        ATTRIBUTE_NODE: 2,          // Note that these are HTML node types only.
        TEXT_NODE: 3,               // For XML nodes you need to define
        COMMENT_NODE: 8,            // other constants here.
        DOCUMENT_NODE: 9,
        DOCUMENT_FRAGMENT_NODE: 11
    };
}
Language-independent DOM interfaces
Although the DOM standard was born out of a desire to have a common API for programming dynamic HTML, the DOM is not just of interest to web programmers. In fact, the standard is now most heavily used by Java and C++ server programs for parsing and manipulating XML documents. Because of its many use cases, the DOM standard was defined to be language independent. This book covers only the JavaScript binding of the DOM API, but there are a few other things to keep in mind.
First, note that object properties in the JavaScript binding typically correspond to a pair of get/set methods in other language bindings. Therefore, when a Java programmer asks you about the getFirstChild() method of the Node interface, you need to understand that the JavaScript binding of the Node API does not define a getFirstChild() method. Instead, it simply defines the firstChild property, and reading this property in JavaScript is equivalent to calling the getFirstChild() method in Java.
Another important feature of the JavaScript binding of the DOM API is that some DOM objects behave like JavaScript arrays. If an interface defines a method named item(), objects implementing that interface behave like read-only arrays with a numeric index. Suppose that reading the childNodes property of a node yields a NodeList object. Individual Node objects in the list can be obtained either by passing the position of the desired node to the item() method or by treating the NodeList object as an array and indexing it directly. The following code illustrates these two possibilities:
var n = document.documentElement;  // This is a Node object.
var children = n.childNodes;       // This is a NodeList object.
var head = children.item(0);       // This is one way
                                   // to use a NodeList.
var body = children[1];            // But there is an easier way!
Likewise, if a DOM object has a namedItem() method, passing a string to that method is the same as using the string as an array index. For example, the following lines of code represent equivalent ways to access a form element:
var f = document.forms.namedItem("myform");
var g = document.forms["myform"];
var h = document.forms.myform;
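Because array-like DOM lists expose a length property and an item() method, they can be copied into a true JavaScript array when an independent snapshot is needed. The following is a minimal sketch; the name listToArray is hypothetical.

```javascript
// Hypothetical helper: copy an array-like DOM list (anything with a length
// property and an item() method) into a true JavaScript array.
function listToArray(list) {
    var result = [];
    for (var i = 0; i < list.length; i++) {
        result.push(list.item(i));   // For a DOM list, list[i] is equivalent
    }
    return result;
}
```

The resulting array supports the usual Array methods, which the read-only DOM list does not.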
The DOM standard can be used in a variety of ways, so the standard's developers carefully defined the DOM API in a way that does not limit the ability of other developers to implement the API. In particular, the DOM standard defines interfaces instead of classes. In object-oriented programming, a class is a fixed data type that must be implemented exactly as it is defined. On the other hand, an interface is a collection of methods and properties that must be implemented together. Therefore, a DOM implementation can define any classes it sees fit, but those classes must define the methods and properties of the various DOM interfaces.
This architecture has a couple of important consequences. First, the class names used in the implementation may not directly correspond to the interface names used in the DOM standard (and in this book). Second, one class can implement more than one interface. Consider, for example, a Document object. This object is an instance of some class defined by the web browser implementation. We don't know what class it is, but we do know that it implements the Document interface; that is, all methods and properties defined by the Document interface are available to us through the Document object. Since web browsers work with HTML documents, we also know that the Document object implements the HTMLDocument interface and that we also have access to all the methods and properties defined by this interface. Additionally, if the web browser supports CSS style sheets and implements the CSS DOM module, the Document object also implements the DocumentStyle and DocumentCSS DOM interfaces. And if the web browser supports the Events and Views modules, Document also implements the DocumentEvent and DocumentView interfaces.
The DOM is broken into independent modules, so it defines several additional minor interfaces, such as DocumentStyle, DocumentEvent, and DocumentView, each of which defines only one or two methods. Such interfaces are never implemented independently of the underlying Document interface, and for this reason I do not describe them separately. If you read the description of the Document interface in the W3C DOM Reference, you will find that it also lists the methods and properties of the various additional interfaces. Likewise, if you look up one of the additional interfaces, you will simply find a cross-reference to the base interface it is associated with. Exceptions to this rule are cases where the additional interface is complex. For example, the HTMLDocument interface is always implemented by the same object that implements the Document interface, but since HTMLDocument adds a significant amount of new functionality, I've given it its own reference page.
It is also important to understand that because the DOM standard defines interfaces and not classes, it does not define any constructor methods. If, for example, you want to create a new Text object to insert into a document, you can't just write:
var t = new Text("this is a new text node"); // There is no such constructor!
The DOM standard cannot define constructors, but it does define several useful factory methods in the Document interface for creating objects. Therefore, to create a new Text node in the document, you need to write:
var t = document.createTextNode("this is a new text node");
Factory methods defined in the DOM have names that begin with the word "create". In addition to the factory methods defined by the Document interface, several such factory methods are defined by the DOMImplementation interface and are accessible through document.implementation.
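A slightly larger sketch shows two factory methods used together to build a small subtree before it is inserted into a document. The function name makeParagraph and its doc parameter are assumptions made for illustration; in a browser, doc would normally be the global document object.

```javascript
// Hypothetical helper: build a <p> element containing the given text.
// `doc` is any object implementing the Document factory methods used here.
function makeParagraph(doc, text) {
    var p = doc.createElement("p");      // Factory method for Element nodes
    var t = doc.createTextNode(text);    // Factory method for Text nodes
    p.appendChild(t);                    // Attach the text to the paragraph
    return p;                            // The caller inserts p into the document
}
```

The returned element is not yet part of the document tree; something like document.body.appendChild(makeParagraph(document, "Hello")) would be needed to make it visible.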
All tree elements are descendants of the root, and the root is their ancestor. All elements, and the text that forms their content, are nodes of the document tree.
Each element in this tree corresponds to an HTML element and therefore has tags, content, and a set of attributes. To move to the document object model, there is only one step left to take: call all tree elements objects, and make their attributes readable and changeable from scripts and applets. As a result, the tree of HTML document elements becomes dynamically manageable; moreover, we can now easily add new properties to each element, in addition to the standard HTML attributes.
It was this approach that formed the basis of the dynamic HTML model of Microsoft browsers and was then adopted as the basis for the W3C standards called the Document Object Model (DOM). At the same time, the W3C extended the concept of the DOM to any XML document, treating the HTML DOM as a special case with additional features. Thus, the DOM is a platform- and language-independent model of HTML and XML documents that defines:
- interfaces and objects that are used to represent and manipulate a document;
- the semantics of these interfaces and objects, including their attributes and reactions to events;
- relationships between these interfaces and objects.
To date, the W3C has standardized DOM levels one and two (DOM 1 and DOM 2); DOM 3 is in working draft stage. These acronyms respectively stand for the following:
- DOM 1 describes the basic representation of XML and HTML documents as trees of objects;
- DOM 2 extends the core DOM 1 interfaces and adds support for events and styles;
- DOM 3 describes loading and parsing documents, as well as their display and formatting.
Given the current state of things, we're only considering DOM 2 (and the DOM 1 it contains) here.
DOM 2 consists of the following groups of interrelated interfaces:
- Core: basic interfaces that define the tree view of any XML document;
- View: interfaces describing possible document displays;
- Event: interfaces that determine the order of generation and processing of events;
- Style: interfaces that define the application of style sheets to documents;
- Traversal & Range: interfaces that define the traversal of the document tree and the manipulation of areas of its content;
- HTML: interfaces that define a tree view of an HTML document.
DOM 2 Core represents XML documents as trees consisting of nodes, which in turn are also objects and implement more specialized interfaces. Some types of nodes can have children, that is, they themselves are subtrees, while others are leaves, that is, they do not have children. The following table summarizes all possible abstract document node types:
Interface | Description | Children
Document | Document | Element (max. one), ProcessingInstruction, Comment, DocumentType (max. one)
DocumentFragment | Document fragment | Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
DocumentType | Document type | has no children
EntityReference | Entity reference | Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
Element | Element | Element, Text, Comment, ProcessingInstruction, CDATASection, EntityReference
Attr | Attribute | Text, EntityReference
ProcessingInstruction | Processing instruction | has no children
Comment | Comment | has no children
Text | Text | has no children
CDATASection | CDATA section | has no children
Entity | Entity | Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
Notation | Notation | has no children
In addition, DOM 2 Core contains the specification of the NodeList interface (ordered lists of nodes accessible by position in the list) and the NamedNodeMap interface (unordered collections of nodes accessible by name). These objects are live, i.e. any change in a document is automatically reflected in all lists associated with it.
It should be emphasized that DOM 2 Core contains two sets of interfaces, each of which provides full access to all document elements. The first set represents an object-oriented approach with the following inheritance hierarchy: the document, its constituent elements, and their attributes and text content. When considering the document tree in this way, we are talking about an object hierarchy. The second approach is based on the principle "everything is a node". Here, all components of the document are considered as equal nodes of its tree.
All DOM 2 Core interfaces are divided into basic (fundamental) and additional (extended).
The main interfaces are DOMException, DOMImplementation, DocumentFragment, Document, Node, NodeList, NamedNodeMap, CharacterData, Attr, Element, Text, and Comment. These interfaces must be supported by all DOM implementations, for both XML and HTML documents. Additional interfaces target XML documents, so HTML DOM implementations may not support them. These include CDATASection, DocumentType, Notation, Entity, EntityReference, and ProcessingInstruction.
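Because NodeList objects reflect changes to the document as they happen, code that modifies the tree while walking a list has to account for the list changing underneath it. A minimal sketch, with the hypothetical name removeAllChildren, re-reads firstChild on every pass instead of stepping through a stored index:

```javascript
// Hypothetical helper: remove every child of a node and return how many
// were removed. Because child lists are live, they shrink as children are
// removed, so the loop re-reads firstChild each time instead of counting
// through a stored index.
function removeAllChildren(node) {
    var removed = 0;
    while (node.firstChild != null) {
        node.removeChild(node.firstChild);
        removed++;
    }
    return removed;
}
```

A loop of the form "for i from 0 to childNodes.length" that removes childNodes[i] on each pass would skip every other child, since the remaining children shift down after each removal.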
To be language and platform independent, the DOM defines the following types:
DOMString: a text string consisting of Unicode characters in UTF-16 format. In JavaScript and Java it is implemented by the String type.
What follows is a short description of all DOM interfaces, indicating the model level (DOM 1 or DOM 2) in which each interface property is defined. The W3C specifications are written in the platform-independent language IDL. We present them using the syntax of JavaScript, which is the main scripting language today.
Along with a description of the standard, we provide a brief description of its implementation in the Microsoft and Gecko object models. It should be noted that Microsoft's implementations for XML and HTML are completely independent (they are implemented by the XMLDOM and MSHTML software components, respectively), while in Gecko the object model is the same for HTML and XML documents. The following discussion focuses on the DOM for HTML; the XML DOM will be discussed in detail in Part VIII.
Table 4.2. Standard DOM Exceptions
Name | Value | Description | Model
INDEX_SIZE_ERR | 1 | The index is out of range. | DOM 1
DOMSTRING_SIZE_ERR | 2 | The given text cannot be cast to type DOMString. | DOM 1
HIERARCHY_REQUEST_ERR | 3 | An attempt was made to insert a node into the wrong place in the tree. | DOM 1
WRONG_DOCUMENT_ERR | 4 | Invalid document type. | DOM 1
INVALID_CHARACTER_ERR | 5 | An invalid character was encountered. | DOM 1
NO_DATA_ALLOWED_ERR | 6 | Data was specified for a node that does not support data. | DOM 1
NO_MODIFICATION_ALLOWED_ERR | 7 | An attempt was made to modify an object in an illegal manner. | DOM 1
NOT_FOUND_ERR | 8 | Accessing a non-existent node. | DOM 1
NOT_SUPPORTED_ERR | 9 | The parameter or operation is not implemented. | DOM 1
INUSE_ATTRIBUTE_ERR | 10 | An attempt was made to add an attribute that already exists. | DOM 1
INVALID_STATE_ERR | 11 | Referring to a non-existent object. | DOM 2
SYNTAX_ERR | 12 | Syntax error. | DOM 2
INVALID_MODIFICATION_ERR | 13 | An attempt was made to change the type of an object. | DOM 2
NAMESPACE_ERR | 14 | An attempt was made to create or modify an object that does not conform to the XML namespace. | DOM 2
INVALID_ACCESS_ERR | 15 | The parameter or operation is not supported by the object. | DOM 2
Some error codes are supported.
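In JavaScript these exceptions arrive as thrown objects whose code property holds one of the numeric values listed for the standard DOM exceptions. The following sketch, with the hypothetical name tryAppend, handles one specific code and lets every other failure propagate. It assumes the implementation sets the code property, which, as noted above, is not guaranteed for every error code in every browser.

```javascript
// Hypothetical helper: try to append child to parent and report whether the
// DOM rejected the operation because of where the node would end up.
// 3 is the DOMException code for HIERARCHY_REQUEST_ERR.
function tryAppend(parent, child) {
    try {
        parent.appendChild(child);
        return true;
    } catch (e) {
        if (e && e.code === 3) return false;   // HIERARCHY_REQUEST_ERR
        throw e;                               // Any other failure is passed on
    }
}
```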
4.2.4. Implementation description: DOMImplementation interface
Support: For XML documents only (XMLDOMImplementation). Complies with DOM 1.
The DOMImplementation interface contains methods whose execution does not depend on a specific document object model. It is accessible through the implementation property of the Document object.
createCSSStyleSheet method
Syntax: object.createCSSStyleSheet(title, media)
Arguments: title, media: DOMString expressions
Result: new CSSStyleSheet object
Exceptions: SYNTAX_ERR
The createCSSStyleSheet method creates a new CSSStyleSheet object and returns a pointer to it. This method should only be supported by DOM implementations that support CSS. The object is created outside the document context; DOM 2 does not allow you to include a newly created style sheet in a document. The title argument specifies the title of the style sheet, and media specifies a comma-separated list of display devices.
createDocumentType method
The createDocumentType method creates an empty DocumentType node and returns a pointer to it. It is intended for XML documents and may not be supported for HTML documents. The qualifiedName argument specifies the qualified name of the document type to be created, publicId is the public identifier of the external section, and systemId is the system identifier of the external section.
hasFeature method
Syntax: object.hasFeature(feature, version)
Arguments: feature, version: DOMString expressions
Result: boolean
Exceptions: none
The hasFeature method returns true if the DOM implementation supports the specified feature, and false otherwise. The feature name (case-insensitive) is specified by the feature argument; it must follow XML naming conventions. The version argument specifies the version of the feature being checked. If it is not specified, true is returned if any version of the feature is supported.
In Gecko, the feature values can be the strings "XML" and "HTML", and the version value can be the strings "1.0" and "2.0". Example:
alert(document.implementation.hasFeature("HTML", "1.0"));
alert(document.implementation.hasFeature("HTML", "2.0"));
alert(document.implementation.hasFeature("HTML", "3.0"));
The first two alert statements will output the string true , and the third will output false .
In Microsoft XMLDOM, the feature values can be the strings "XML", "DOM", and "MS-DOM", and the version value can be the string "1.0". Example:
var objDoc = new ActiveXObject("Microsoft.XMLDOM");
alert(objDoc.implementation.hasFeature("XML", "1.0"));
alert(objDoc.implementation.hasFeature("XML", "2.0"));
The first alert statement will output the string true , and the second one will output false .
4.2.5. Document fragment: DocumentFragment interface
Support: for XML documents only (XMLDOMDocumentFragment); complies with DOM 1.
The DocumentFragment interface is a descendant of the Node interface and inherits all its properties and methods. As its name suggests, it is designed for operations on fragments of documents (extracting part of the document tree, creating a new document fragment, inserting a fragment as a child of a node, etc.). Note that when an object of type DocumentFragment is inserted into a Node that can have children, all children of that object are inserted, but not the object itself. For examples, see the Node interface.
The Document interface corresponds to an XML or HTML document. It is the basis for accessing the content of the document and for creating its components.
Method | DOM level | Description
createAttribute | DOM 1 | Creates an attribute.
createAttributeNS | DOM 2 | Creates an attribute given a namespace.
createCDATASection | DOM 1 | Creates a CDATA section.
createComment | DOM 1 | Creates a comment.
createDocumentFragment | DOM 1 | Creates a new document fragment.
createElement | DOM 1 | Creates a new element.
createElementNS | DOM 2 | Creates an element with a given namespace.
createEntityReference | DOM 1 | Creates an entity reference.
createEvent | DOM 2 | Creates a new Event object.
createProcessingInstruction | DOM 1 | Creates a processing instruction.
createTextNode | DOM 1 | Creates a new text node.
getElementById | DOM 2 | Returns the element with the given ID.
getElementsByTagName | DOM 1 | Returns a collection of all elements that have the given tag.
getElementsByTagNameNS | DOM 2 | Returns a collection of all elements that have the given tag, given the namespace.
importNode | DOM 2 | Imports a node from another document.
doctype property
The doctype property returns the type of this document (type DocumentType). For HTML documents and for XML documents that do not have a document type declaration, null is returned.
documentElement property
Syntax: document.documentElement
Editable: no
Support: standard compliant.
The documentElement property returns the root element of this document (of type Element). For HTML documents, the HTML element is returned.
Example: the statement
alert(document.documentElement.tagName);
displays the string HTML.
implementation property
Syntax: document.implementation
Editable: no
Support: for XML documents only; complies with DOM 1.
The implementation property returns an object of type DOMImplementation describing this DOM implementation.
styleSheets property
Syntax: document.styleSheets
Editable: no
Support: HTML documents only.
Example of creating an attribute for an HTML element:
var myDiv = document.getElementById("idDiv");
var attr = document.createAttribute("temp");
attr.value = "temporary";
myDiv.setAttributeNode(attr);
alert(myDiv.getAttribute("temp"));
The alert statement will display the string temporary.
An example of creating an attribute in Microsoft XMLDOM:
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;
xmlDoc.load("c:\\My Documents\\books.xml");
var root = xmlDoc.documentElement;
var newAttr = xmlDoc.createAttribute("temp");
newAttr.value = "temporary";
root.setAttributeNode(newAttr);
alert(root.getAttribute("temp"));
Here the alert statement will also display the string temporary.
createAttributeNS method
Syntax: document.createAttributeNS(namespaceURI, qualifiedName)
Arguments: namespaceURI, qualifiedName
Result: new Attr object
Exceptions: INVALID_CHARACTER_ERR, NAMESPACE_ERR
Support: not supported.
The createAttributeNS method creates a new Attr object and returns a reference to it. It is intended for XML documents and may not be supported for HTML documents.
The namespaceURI argument specifies the URI of the namespace, and qualifiedName is the qualified name of the attribute to be created in that namespace. Subsequently, the created attribute can be assigned to any element using the Element.setAttributeNode method.
createCDATASection method
Syntax: document.createCDATASection(data)
Arguments: data
Result: new CDATASection object
Exceptions: NOT_SUPPORTED_ERR
Support: standard compliant.
The createCDATASection method creates a new CDATASection object and returns a reference to it. It is intended for XML documents only; attempting to call it in the HTML DOM throws a NOT_SUPPORTED_ERR exception. The data argument specifies the contents of the section being created. An example of creating a CDATA section in Microsoft XMLDOM:
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;
xmlDoc.load("c:\\My Documents\\books.xml");
var root = xmlDoc.documentElement;
var newSection = xmlDoc.createCDATASection("Hello World!");
root.appendChild(newSection);
An example of creating a comment:
var root = document.documentElement.firstChild;
var comm = document.createComment("This is a comment.");
root.appendChild(comm);
createDocumentFragment method
Syntax: document.createDocumentFragment()
Result: new DocumentFragment object
Exceptions: none
Support: for XML documents only; standard compliant.
The createDocumentFragment method creates a new empty object of type DocumentFragment and returns a reference to it. An example of creating a document fragment in Gecko:
var elem = document.documentElement.firstChild;
var o = document.createDocumentFragment();
elem.appendChild(o);
createElement method
Syntax: document.createElement(tagName)
Arguments: tagName
Result: new Element object
Exceptions: INVALID_CHARACTER_ERR
Support: standard compliant (see note 2).
Attempting to create a FRAME or IFRAME element in Internet Explorer results in either a fatal browser error or, at a minimum, complete destruction of the document object tree.
Behavior of the importNode method by node type:
Node Type | Behavior |
ATTRIBUTE_NODE | The ownerElement attribute is null and specified is true. All children of the original node are recursively copied to the new Attr node, regardless of the deep value. |
DOCUMENT_FRAGMENT_NODE | If deep is true, then the specified document fragment is imported; otherwise, an empty node is created. |
DOCUMENT_NODE, DOCUMENT_TYPE_NODE | Cannot be imported. |
ELEMENT_NODE | All attributes of the original node, except those specified by default in the source document, are copied to the new Element node. The default attributes that are accepted in this document for elements with that name are then created. If deep is true, then the entire subtree of the original element is imported. |
ENTITY_NODE, NOTATION_NODE | Entity and Notation nodes can be imported, but in DOM 2 the DocumentType is read-only, so importing such nodes doesn't make sense. |
ENTITY_REFERENCE_NODE | Only the EntityReference node itself is copied, regardless of the deep value. |
PROCESSING_INSTRUCTION_NODE | The values of the target and data attributes of the source node are copied. |
TEXT_NODE, CDATA_SECTION_NODE, COMMENT_NODE | The values of the data and length attributes of the source node are copied. |
Prepared by: Evgeny Ryzhkov Publication date: 11/15/2010
A document tree is a diagram for constructing an HTML document that shows the relationships between various page elements: the order and nesting of elements. This diagram helps to navigate this seemingly chaotic mess of HTML tags.
The document tree helps a web developer when writing CSS rules and Javascript scripts.
Note: Don't confuse the document tree with the document object model (DOM). The DOM is a more complex concept (it will be covered a little later).
In order not to go into long and tedious explanations of why a document tree was called a tree, let's look at an example - take a simple HTML code:
Page title Main title
paragraph of text.
- paragraph 1
- point 2
This is how the HTML code looks to the uninitiated who accidentally opened the page source. But the trained eye of a web developer will take it apart and see all the levels of nesting and interconnection. Out of the chaos it will build a clear hierarchical structure in the form of a tree (because the diagram resembles the outline of a tree):
Family ties
There are certain relationships between the elements of the document tree. Let's look at them.
Ancestors and descendants
From the schematic image of the tree, and from the HTML code itself, it is clear that some elements are nested within others. Elements that contain other elements are ancestors of everything nested inside them. The nested elements, in turn, are their descendants.
For clarity, consider one branch of our tree:
Each ancestor can have an unlimited number of descendants. Each descendant will have a number of ancestors depending on the structure of the tree and on which branch it will be located, but in any case there will be at least one ancestor.
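The ancestor and descendant relationships can be sketched in JavaScript. This is only an illustration: ancestorsOf and isDescendantOf are hypothetical helper names, and the three stand-in objects (body, ul, li) are an assumed branch, not the article's exact markup. The helpers rely on nothing but a parentNode property, so they should also work on real DOM nodes in a browser.

```javascript
// Hypothetical helper: collect the ancestors of a node, nearest first.
// Works on any object that exposes a parentNode property, such as a DOM node.
function ancestorsOf(node) {
  const result = [];
  for (let current = node.parentNode; current; current = current.parentNode) {
    result.push(current);
  }
  return result;
}

// Hypothetical helper: true if candidate is nested anywhere inside ancestor.
function isDescendantOf(candidate, ancestor) {
  return ancestorsOf(candidate).includes(ancestor);
}

// Stand-in objects for one assumed branch of a tree: body > ul > li.
const bodyNode = { name: "body", parentNode: null };
const listNode = { name: "ul", parentNode: bodyNode };
const itemNode = { name: "li", parentNode: listNode };

console.log(ancestorsOf(itemNode).map((n) => n.name).join(", ")); // prints "ul, body"
console.log(isDescendantOf(itemNode, bodyNode)); // prints true
```

The list item has two ancestors here, while the top element has none, which matches the rule that the number of ancestors depends on the branch.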
Parents and children
The parent is the immediate ancestor (first-level ancestor) of an element. Conversely, an immediate descendant (first-level descendant) is called a child.
Each parent can have an unlimited number of children. A child element has only one parent.
A parent element is also called a direct ancestor, and a child element a direct descendant. These are effectively synonyms.
Sibling elements
Siblings are a group of two or more elements that share a common parent. The elements don't have to be of the same type; they just have to have a common parent.
Adjacent elements
Adjacent elements are sibling elements that are located next to each other.
Previous sibling and next sibling
The names of these terms speak for themselves. The preceding sibling is the sibling element that comes immediately before a given element in the code. In our example branch, the preceding sibling of the list is the paragraph, the preceding sibling of the paragraph is the main heading, and the main heading has no preceding sibling.
Similarly for the following sibling: for the main heading it is the paragraph, for the paragraph it is the list, and the list has none.
Previous and next
The preceding element is likewise the previous element in the code, but without the restriction to sibling relationships. In our branch, the preceding element of the list is the paragraph, that of the paragraph is the main heading, and that of the main heading is the element that contains it.
Unlike earlier versions of HTML and XHTML, which were defined in terms of their syntax, HTML5 is defined in terms of the Document Object Model (DOM), the tree representation that browsers use internally to represent the document. For example, imagine a very small document consisting of a page title, a heading, and a paragraph. The DOM tree might look like this:
The DOM tree contains a title element in the head block and h1 and p elements in the body.
The advantage of defining HTML5 in terms of the document object model is that the language can be defined independently of the syntax. There are two main syntaxes for representing a document: the HTML serialization (HTML5) and the XML serialization (XHTML5).
The HTML serialization is a syntax derived from the SGML-based syntax of earlier versions of HTML, but defined to be more compatible with the way browsers actually handle HTML in practice.
An HTML Document Example
As in previous versions of HTML, some tags are optional and are automatically implied.
The XML serialization derives from XML 1.0 syntax and namespaces, the same as XHTML 1.0.
An HTML Document Example
This is an example HTML document.
Apart from the presence or absence of the xmlns attribute, these two examples are equivalent.
Browsers use the MIME type to decide which syntax to apply. Any document declared as text/html must satisfy the requirements of the HTML specification, and any document declared with an XML MIME type (such as application/xhtml+xml) must conform to the XML specification.
Authors should make a conscious choice about which serialization to use, and that choice can be based on a number of different reasons. Developers should not choose one or the other without reason; each is optimized for different situations.
Advantages of using HTML
- Backward compatibility with existing browsers.
- Familiar syntax.
- Forgiving syntax (there will be no "Yellow Screen of Death" if an error is made).
- Syntax that allows certain tags and attributes to be omitted.
Advantages of using XHTML
- Strict XML syntax that some authors may find easier to maintain.
- Direct integration with other XML vocabularies (such as SVG and MathML).
- Use of XML processing.
Work on HTML5 is progressing rapidly, but it is expected to continue for several more years. Because test cases must be produced and interoperable, conforming implementations achieved, current estimates are that the work will take 10 to 15 years. Throughout the development phase, feedback from a wide range of users, including web designers, developers, makers of CMSs and authoring tools, and browser vendors, is very important to achieving success. Contributions to the development of HTML5 are not only welcomed, but also actively encouraged.
In addition to the specification, there are a number of other resources designed to help people better understand the process.
- Differences from HTML 4 outlines the changes compared with the previous version of HTML.
- HTML Design Principles discusses the principles that guide decision making. They will help you understand the rationale behind many of the current design decisions.
- The Web Developer's Guide to HTML5 is a recently launched resource designed to help web designers and developers understand everything they need to know to write conformant HTML5 documents. It provides guidelines and describes best practices.
There are many ways to contribute. You can join the W3C's HTML WG and subscribe or contribute to the HTML WG mailing lists or wiki. You can also participate in the WHATWG forum, post comments, or write articles on the WHATWG blog.
Working with the DOM model
Every Window object has a document property that refers to the Document object. This Document object is not a standalone object. It is the central object of an extensive API known as the Document Object Model (DOM), which defines how document content can be accessed.
DOM Overview
The Document Object Model (DOM) is the fundamental application programming interface for working with the content of HTML and XML documents. The DOM API is not particularly complex, but it has a number of architectural features that you should be aware of.
First, understand that nested elements in HTML or XML documents are represented as a tree of DOM objects. The tree view of an HTML document contains nodes representing elements or tags, such as <body> and <p>, and nodes representing strings of text. An HTML document can also contain nodes representing HTML comments. Consider the following simple HTML document:
<html>
<head><title>Sample Document</title></head>
<body>
<h1>This is an HTML document</h1>
<p>Example simple text.</p>
</body>
</html>
The DOM representation of this document is shown in the following diagram:
For those unfamiliar with tree structures in computer programming, it is helpful to know that the terminology for describing them was borrowed from family trees. The node located directly above a given node is called its parent. Nodes located one level directly below another node are its children. Nodes that are at the same level and have the same parent are called siblings. Nodes located any number of levels below another node are its descendants. The parent, grandparent, and any other nodes above a given node are its ancestors.
Each rectangle in this diagram is a document node, which is represented by a Node object. Note that the figure shows three different types of nodes. The root of the tree is the Document node, which represents the entire document. Nodes representing HTML elements are nodes of type Element, and nodes representing text are nodes of type Text. Document, Element and Text are subclasses of the Node class. Document and Element are the two most important classes in the DOM.
The Node type and its subtypes form the type hierarchy shown in the diagram below. Note the formal differences between the generic types Document and Element and the types HTMLDocument and HTMLElement. The Document type represents an HTML or XML document, and the Element class represents an element of that document. The HTMLDocument and HTMLElement subclasses represent specifically an HTML document and its elements:
Another thing to note in this diagram is that there are a large number of subtypes of the HTMLElement class that represent specific types of HTML elements. Each of them defines JavaScript properties that reflect the HTML attributes of a particular element or group of elements. Some of these specific classes define additional properties or methods that do not reflect the HTML markup language syntax.
Selecting document elements
Most client-side JavaScript programs work by manipulating document elements in some way. At runtime, these programs can use the global variable document, which refers to a Document object. However, to perform any manipulation on document elements, the program must somehow obtain, or select, the Element objects that refer to those document elements. The DOM defines several ways to select elements. You can select an element or elements of a document:
by the value of the id attribute;
by the value of the name attribute;
by tag name;
by the name of a CSS class or classes;
by matching a specific CSS selector.
All of these element selection techniques are described in the following subsections.
Selecting elements by id attribute value
Any HTML element can have an id attribute. The value of this attribute must be unique within a document: no two elements in the same document may have the same id value. You can select an element by its unique id attribute value using the getElementById() method of the Document object:
var section1 = document.getElementById("section1");
This is the simplest and most common way to select elements. If your script needs to be able to manipulate a specific set of document elements, assign values to the id attributes of those elements and use the ability to search for them using those values.
In versions of Internet Explorer earlier than IE8, the getElementById() method searches for id attribute values in a case-insensitive manner and also returns elements that match the name attribute value.
Selecting elements by name attribute value
The name HTML attribute was originally intended to name form elements, and its value was used when form data was submitted to the server. Like the id attribute, the name attribute assigns a name to an element. However, unlike id, the value of the name attribute does not have to be unique: several elements can have the same name, which is quite common for radio buttons and checkboxes in forms. Additionally, unlike id, the name attribute is only allowed on certain HTML elements, including forms, form elements, <iframe>, and <img>.
You can select HTML elements based on the values of their name attributes using the getElementsByName() method of the Document object:
var radiobuttons = document.getElementsByName("favorite_color");
The getElementsByName() method is not defined by the Document class, but by the HTMLDocument class, so it is only available in HTML documents and not available in XML documents. It returns a NodeList object, which behaves like a read-only array of Element objects.
In IE, the getElementsByName() method also returns elements whose id attribute value matches the specified value. To ensure cross-browser compatibility, you must be careful when choosing attribute values and do not use the same strings as the values for the name and id attributes.
Selecting elements by type
The getElementsByTagName() method of the Document object allows you to select all HTML or XML elements of a specified type (that is, by tag name). For example, you could get a read-only array-like object containing the Element objects of all the <span> elements in the document like this:
var spans = document.getElementsByTagName("span");
Similar to the getElementsByName() method, getElementsByTagName() returns a NodeList object. Document elements are included in the NodeList in the same order in which they appear in the document, so the first <p> element in the document can be selected like this:
var firstParagraph = document.getElementsByTagName("p")[0];
HTML tag names are not case sensitive, and when getElementsByTagName() is applied to an HTML document, it compares tag names case-insensitively. The spans variable created above, for example, will also include all elements that are written as <SPAN>.
You can get a NodeList containing all the elements of a document by passing the wildcard character "*" to the getElementsByTagName() method.
In addition, the Element class also defines the getElementsByTagName() method. It acts exactly like the Document version of the method, but selects only elements that are descendants of the element on which it is called. That is, to find all the <span> elements inside the first <p> element, you can do the following:
var firstParagraph = document.getElementsByTagName("p")[0];
var firstParagraphSpans = firstParagraph.getElementsByTagName("span");
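To make the scoping, document order, case-insensitive matching, and "*" wildcard concrete, here is a rough sketch of the search that getElementsByTagName() performs. It is not the browser's implementation: collectByTagName and the el builder are hypothetical names, the node objects are plain stand-ins so that the code runs outside a browser, and the result is a static array rather than a NodeList.

```javascript
const ELEMENT_NODE = 1; // nodeType value for Element nodes

// Hypothetical helper: collect descendant elements of root whose tag name
// matches tagName (case-insensitively), in document order. "*" matches all.
function collectByTagName(root, tagName) {
  const wanted = tagName.toUpperCase();
  const found = [];
  (function walk(node) {
    for (const child of node.childNodes || []) {
      if (child.nodeType !== ELEMENT_NODE) continue; // skip text and comments
      if (wanted === "*" || child.nodeName.toUpperCase() === wanted) {
        found.push(child);
      }
      walk(child); // descend after recording, which keeps document order
    }
  })(root);
  return found;
}

// Stand-in builder for element-like objects (an assumption for this sketch).
const el = (nodeName, ...childNodes) => ({ nodeType: ELEMENT_NODE, nodeName, childNodes });

const firstP = el("P", el("SPAN"), el("B", el("span")));
const secondP = el("P", el("SPAN"));
const bodyEl = el("BODY", firstP, secondP);

console.log(collectByTagName(bodyEl, "span").length); // prints 3
console.log(collectByTagName(firstP, "span").length); // prints 2
console.log(collectByTagName(bodyEl, "*").length);    // prints 6
```

Calling the helper on the first paragraph finds only the two spans nested inside it, which mirrors the difference between the Document and Element versions of the method.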
For historical reasons, the HTMLDocument class defines special properties for accessing certain types of nodes. The images, forms, and links properties, for example, refer to objects that behave like read-only arrays containing <img>, <form>, and <a> elements (but only those <a> tags that have an href attribute). These properties refer to HTMLCollection objects, which are much like NodeList objects, but can additionally be indexed by the values of the id and name attributes.
The HTMLDocument object also defines the synonymous properties embeds and plugins, which are HTMLCollection collections of <embed> elements. The anchors property is non-standard, but it can be used to access <a> elements that have a name attribute but no href attribute. The scripts property is defined by the HTML5 standard and is an HTMLCollection collection of <script> elements.
Additionally, the HTMLDocument object defines two properties, each of which refers not to a collection, but to a single element. The document.body property represents the <body> element of an HTML document, and the document.head property represents the <head> element. These properties are always defined in the document: even if the source document does not contain explicit <head> and <body> elements, the browser will create them implicitly. The documentElement property of a Document object refers to the document's root element. In HTML documents it always represents the <html> element.
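The ability to index an HTMLCollection by id or name can be pictured with a small sketch. lookupNamedItem is a hypothetical helper, and it assumes each item exposes id and name properties (true of form, image, and link elements, and of the plain stand-in objects used here); it approximates the lookup rather than reproducing the browser's code.

```javascript
// Hypothetical helper: return the first item whose id or name equals key,
// or null if there is none. Works on arrays and array-like collections.
function lookupNamedItem(collection, key) {
  if (key === "") return null; // an empty key never matches
  for (let i = 0; i < collection.length; i++) {
    const item = collection[i];
    if (item.id === key || item.name === key) return item;
  }
  return null;
}

// Stand-in for document.forms (assumed data, in document order).
const formsLike = [
  { id: "search", name: "" },
  { id: "", name: "login" },
  { id: "login", name: "other" },
];

console.log(lookupNamedItem(formsLike, "search") === formsLike[0]); // prints true
console.log(lookupNamedItem(formsLike, "login") === formsLike[1]);  // prints true
console.log(lookupNamedItem(formsLike, "missing"));                 // prints null
```

Because the first match in document order wins, reusing the same string as an id on one element and a name on another makes such lookups ambiguous.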
Selecting elements by CSS class
The value of the HTML class attribute is a list of zero or more identifiers, separated by spaces. It allows you to define sets of related document elements: any elements that have the same identifier in the class attribute are part of the same set. The word class is reserved in JavaScript, so client-side JavaScript uses the className property to store the value of the HTML class attribute.
Typically the class attribute is used in conjunction with CSS cascading style sheets to apply a common rendering style to all members of a set. However, in addition, the HTML5 standard defines the getElementsByClassName() method, which allows you to select multiple document elements based on the identifiers in their class attributes.
Like the getElementsByTagName() method, the getElementsByClassName() method can be called on both HTML documents and HTML elements, and returns a live NodeList object containing all descendants of the document or element that match the search criteria.
The getElementsByClassName() method takes a single string argument, but the string itself can contain multiple identifiers, separated by spaces. All elements whose class attributes contain all of the specified identifiers will be considered matched. The order of the identifiers does not matter. Note that in both the class attribute and the argument to the getElementsByClassName() method, class identifiers are separated by spaces rather than commas.
Below are some examples of using the getElementsByClassName() method:
// Find all elements with class "warning"
var warnings = document.getElementsByClassName("warning");
// Find all descendants of the element with id "log"
// that have the classes "error" and "fatal"
var log = document.getElementById("log");
var fatal = log.getElementsByClassName("fatal error");
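The matching rule described above (every requested identifier must be present, order does not matter, identifiers are separated by spaces) can be sketched as a plain function. matchesClasses is a hypothetical name and works on strings only, so it illustrates the rule rather than the method itself.

```javascript
// Hypothetical helper: does a class attribute value satisfy a query string
// in the way getElementsByClassName() matching is described?
function matchesClasses(classAttr, query) {
  const have = new Set(classAttr.split(/\s+/).filter(Boolean));
  const want = query.split(/\s+/).filter(Boolean);
  // An empty query matches nothing; otherwise every identifier must be present.
  return want.length > 0 && want.every((name) => have.has(name));
}

console.log(matchesClasses("error fatal big", "fatal error")); // prints true
console.log(matchesClasses("error", "fatal error"));           // prints false
console.log(matchesClasses("fatal error", "fatal,error"));     // prints false
```

The last call fails because a comma is not a separator: "fatal,error" is treated as one identifier that the element does not have.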
Selecting elements using CSS selectors
CSS style sheets have very powerful syntactic constructs known as selectors that allow you to describe elements or sets of elements in a document. Along with standardizing CSS3 selectors, another W3C standard known as the Selectors API defines JavaScript methods for retrieving elements that match a specified selector.
The key to this API is the querySelectorAll() method of the Document object. It takes a single string argument with a CSS selector and returns a NodeList object representing all document elements that match the selector.
In addition to the querySelectorAll() method, the document object also defines a querySelector() method, which is similar to querySelectorAll() except that it returns only the first (in document order) matching element, or null if there are no matching elements.
These two methods are also defined by the Element class. When they are called on an element, the entire document is searched for a match for the given selector, and then the result is filtered to only include descendants of the element used. This approach may seem counterintuitive because it means that the selector string may include ancestors of the element being matched.
Document structure and document navigation
After selecting a document element, it is sometimes necessary to find structurally related parts of the document (its parent, siblings, or children). A Document object can be thought of as a tree of Node objects, and the Node type defines properties that allow you to navigate such a tree. There is also another API for document navigation that treats the document as a tree of Element objects.
Documents as node trees
The Document object, its Element objects, and the Text objects that represent text fragments in the document are all Node objects. The Node class defines the following important properties:
parentNode
The parent node of this node, or null for nodes that have no parent, such as Document.
childNodes
A read-only array-like object (NodeList) that provides a representation of child nodes.
firstChild, lastChild
The first and last child nodes, or null if the given node has no child nodes.
nextSibling, previousSibling
The next and previous sibling nodes. Sibling nodes are two nodes that have the same parent. The order in which they appear corresponds to the order in the document. These properties link nodes into a doubly linked list.
nodeType
The type of this node. Document nodes have the value 9 in this property, Element nodes the value 1, Text nodes the value 3, Comment nodes the value 8, and DocumentFragment nodes the value 11.
nodeValue
The text content of Text and Comment nodes.
nodeName
The tag name of an Element, with all characters converted to uppercase.
Using these properties of the Node class, you can reference the second child node of the first child node of the Document object, as shown below:
document.childNodes[0].childNodes[1] == document.firstChild.firstChild.nextSibling
Let us assume that the document in question looks as follows:
<html><head><title>Test</title></head><body>Hello World!</body></html>
Then the second child node of the first child node will be the <body> element. Its nodeType property contains the value 1 and its nodeName property contains the value "BODY".
However, note that this API is extremely sensitive to changes in the document text. For example, if you add a single line feed between the <html> and <head> tags in this document, that line feed character becomes the first child node (a Text node) of the first child node, and the second child node becomes the <head> element, not <body>.
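The sensitivity to whitespace can be illustrated with a short sketch that walks children through firstChild and nextSibling. describeChildren and makeNode are hypothetical helpers, and the two trees are hand-built stand-ins for the root element of the example with and without a line feed after the opening tag; in a browser the same walk could be applied to document.documentElement.

```javascript
// nodeType values mentioned in the text, mapped to readable names.
const NODE_TYPE_NAMES = { 1: "Element", 3: "Text", 8: "Comment", 9: "Document", 11: "DocumentFragment" };

// Hypothetical helper: describe each child of a node, in order.
function describeChildren(node) {
  const lines = [];
  for (let child = node.firstChild; child; child = child.nextSibling) {
    lines.push(NODE_TYPE_NAMES[child.nodeType] + ":" + child.nodeName);
  }
  return lines;
}

// Stand-in builder (an assumption for running outside a browser):
// links the given children together through firstChild and nextSibling.
function makeNode(nodeType, nodeName, children = []) {
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return { nodeType, nodeName, firstChild: children[0] || null, nextSibling: null };
}

const compactRoot = makeNode(1, "HTML", [makeNode(1, "HEAD"), makeNode(1, "BODY")]);
const spacedRoot = makeNode(1, "HTML", [makeNode(3, "#text"), makeNode(1, "HEAD"), makeNode(1, "BODY")]);

console.log(describeChildren(compactRoot)[1]); // prints "Element:BODY"
console.log(describeChildren(spacedRoot)[1]);  // prints "Element:HEAD"
```

The same index yields a different element as soon as a Text node appears in front of the head element, which is why code that addresses children by position is fragile.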
Documents as element trees
When the primary interest is in the document elements themselves, rather than in the text within them (and the white space between them), it is much more convenient to use an API that allows you to interpret the document as a tree of Element objects, ignoring the Text and Comment nodes that are also part of the document.
The first part of this API is the children property of Element objects. Like the childNodes property, its value is an array-like object (an HTMLCollection in current browsers). However, unlike the childNodes property, the children list contains only Element objects.
Note that the Text and Comment nodes do not have child nodes. This means that the Node.parentNode property described above never returns nodes of type Text or Comment. The value of the parentNode property of any Element object will always be another Element object or the root of the tree - a Document or DocumentFragment object.
The second part of the application interface for navigating document elements is the properties of the Element object, similar to the properties for accessing child and sibling nodes of the Node object:
firstElementChild, lastElementChild
Similar to the firstChild and lastChild properties, but return only child elements.
nextElementSibling, previousElementSibling
Similar to the nextSibling and previousSibling properties, but return only sibling elements.
childElementCount
The number of child elements. Returns the same value as the children.length property.
These child and sibling element access properties are standardized and implemented in all current browsers except IE.
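Where these properties are unavailable, similar behavior can be approximated with the Node properties described earlier. The following is a minimal sketch with hypothetical function names (firstElementChildOf, nextElementSiblingOf, childElementCountOf); the buildNode helper and the sample tree are stand-ins that let the code run outside a browser.

```javascript
const ELEMENT = 1; // nodeType value for Element nodes

// Hypothetical fallback: first child that is an Element, or null.
function firstElementChildOf(node) {
  let child = node.firstChild;
  while (child && child.nodeType !== ELEMENT) child = child.nextSibling;
  return child || null;
}

// Hypothetical fallback: next sibling that is an Element, or null.
function nextElementSiblingOf(node) {
  let sibling = node.nextSibling;
  while (sibling && sibling.nodeType !== ELEMENT) sibling = sibling.nextSibling;
  return sibling || null;
}

// Hypothetical fallback: number of child Elements.
function childElementCountOf(node) {
  let count = 0;
  for (let c = firstElementChildOf(node); c; c = nextElementSiblingOf(c)) count++;
  return count;
}

// Stand-in builder: links children through firstChild and nextSibling.
function buildNode(nodeType, nodeName, children = []) {
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return { nodeType, nodeName, firstChild: children[0] || null, nextSibling: null };
}

const headEl = buildNode(1, "HEAD");
const bodyElem = buildNode(1, "BODY");
const rootEl = buildNode(1, "HTML", [
  buildNode(3, "#text"), headEl, buildNode(3, "#text"), bodyElem, buildNode(8, "#comment"),
]);

console.log(firstElementChildOf(rootEl).nodeName);  // prints "HEAD"
console.log(nextElementSiblingOf(headEl).nodeName); // prints "BODY"
console.log(childElementCountOf(rootEl));           // prints 2
```

The helpers simply skip Text and Comment nodes while following the sibling links, so whitespace between tags no longer affects the result.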