<?xml version="1.0" encoding="UTF-8"?>
<!-- $Id: introduction.xml 225913 2001-06-01 11:15:37Z dims $ -->
<!--
*************************************************************************
* BEGINNING OF DOM INTRODUCTION *
*************************************************************************
-->
<div1 id="Introduction">
<head>What is the Document Object Model?</head>
<orglist role="editors">
<member>
<name>Philippe Le H&eacute;garet</name>
<affiliation>W3C</affiliation>
</member>
<member>
<name>Lauren Wood</name>
<affiliation>SoftQuad Software Inc., WG Chair</affiliation>
</member>
<member>
<name>Jonathan Robie</name>
<affiliation>Texcel (for DOM Level 1)</affiliation>
</member>
</orglist>
<div2 id="ID-E7C3082">
<head>Introduction</head>
<p>The Document Object Model (DOM) is an application programming interface
(<termref def="dt-API">API</termref>) for valid <termref def="dt-HTML">HTML</termref> and
well-formed <termref def="dt-XML">XML</termref> documents. It defines the logical structure of documents and
the way a document is accessed and manipulated. In the DOM specification,
the term "document" is used in a broad sense: XML is increasingly used to
represent many different kinds of information that may
be stored in diverse systems, and much of this information would traditionally
be seen as data rather than as documents. Nevertheless, XML presents
this data as documents, and the DOM may be used to manage it.</p>
<p>With the Document
Object Model, programmers can build documents, navigate
their structure, and add, modify, or delete elements and content.
Anything found in an HTML or XML document can be accessed,
changed, deleted, or added using the Document Object Model,
with a few exceptions - in particular, the DOM <termref
def="dt-interface">interfaces</termref> for
the XML internal and external subsets have not yet been specified.</p>
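<p>As a hedged illustration of the operations listed above, the following
sketch uses the Java binding. It obtains an empty <code>Document</code>
through the JAXP <code>DocumentBuilderFactory</code> class, which is one common
entry point in Java but is not defined by this specification. The class name
<code>DomEditSketch</code> and the element names <code>songs</code> and
<code>song</code> are invented for the example.</p>
<eg role="code"><![CDATA[
```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class DomEditSketch {
    // Builds a small document with two song elements, then edits it.
    static Document buildAndEdit() {
        Document doc;
        try {
            // Assumption: JAXP is used to obtain an empty Document.
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }

        // Build: create nodes through the Document and attach them.
        Element root = doc.createElement("songs");
        doc.appendChild(root);
        Element first = doc.createElement("song");
        first.appendChild(doc.createTextNode("Shady Grove"));
        root.appendChild(first);
        Element second = doc.createElement("song");
        second.appendChild(doc.createTextNode("Aeolian"));
        root.appendChild(second);

        // Navigate and modify: change the text of the first song.
        Node text = root.getFirstChild().getFirstChild();
        text.setNodeValue("Over the River, Charlie");

        // Delete: remove the second song from its parent.
        root.removeChild(second);
        return doc;
    }

    public static void main(String[] args) {
        Element root = buildAndEdit().getDocumentElement();
        System.out.println(root.getChildNodes().getLength());
        System.out.println(root.getFirstChild().getFirstChild().getNodeValue());
    }
}
```
]]></eg>
<p>After the edits, the root element has a single child whose text is
"Over the River, Charlie".</p>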
<p>As a W3C specification, one important objective for the Document
Object Model is to provide a standard programming interface that
can be used in a wide variety of environments and <termref
def="dt-application">applications</termref>.
The DOM is designed to be used with any programming
language. In order to provide a precise, language-independent
specification of the DOM interfaces, we have chosen to define
the specifications in Object Management Group (OMG) IDL <bibref
ref="OMGIDL"/>, as defined in the CORBA 2.3.1 specification <bibref
ref="CORBA"/>. In addition to the OMG IDL specification, we provide
<termref def="dt-lang-binding">language bindings</termref> for Java <bibref ref="Java"/> and ECMAScript <bibref
ref="ECMAScript"/> (an industry-standard scripting
language based on JavaScript <bibref ref="JavaScript"/> and JScript
<bibref ref='JScript'/>).</p>
<note><p>OMG IDL is used only as a language-independent and
implementation-neutral way to specify <termref def="dt-interface">interfaces</termref>. Various other
IDLs could have been used (<bibref ref="COM"/>, <bibref
ref="JavaIDL"/>, <bibref ref="MSIDL"/>, ...). In general, IDLs
are designed for specific computing environments. The Document Object
Model can be implemented in any computing environment, and does not
require the object binding runtimes generally associated with
such IDLs.
</p></note>
</div2>
<div2 id="ID-E7C30821">
<head>What the Document Object Model is</head>
<p>The DOM is a programming <termref def="dt-API">API</termref> for documents.
It is based on an object structure that closely resembles the structure of the
documents it <termref def="dt-model">models</termref>. For instance, consider this table, taken
from an HTML document: </p>
<eg role="code">
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;Shady Grove&lt;/TD&gt;
&lt;TD&gt;Aeolian&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Over the River, Charlie&lt;/TD&gt;
&lt;TD&gt;Dorian&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
</eg>
<p>A graphical representation of the DOM of the example table is:
<graphic source="./images/table.gif" alt="graphical representation of the
DOM of the example table"/>
</p>
<p>In the DOM, documents have a logical
structure that is very much like a tree; to be more precise, it is
like a &quot;forest&quot; or &quot;grove&quot;,
which can contain more than one tree. Each document contains zero or one
doctype nodes, one root element node, and zero or more comments
or processing instructions; the root element serves as the root
of the element tree for the document. However, the DOM
does not specify that documents must be <emph>implemented</emph> as a
tree or a grove<!--but not the same as an sgml grove-->, nor
does it specify how the relationships among objects are to be
implemented. The DOM is a logical model that may be implemented in any
convenient manner. In this
specification, we use the term <emph>structure model</emph> to
describe the tree-like representation of a document.
We also use the term "tree" when referring to the arrangement of
those information items which can be reached by using "tree-walking"
methods (this does not include attributes).
One important property of DOM structure models
is <emph>structural isomorphism</emph>: if any two Document
Object Model implementations are used to create a representation
of the same document, they will create the same structure model,
in accordance with the XML Information Set <bibref ref="InfoSet"/>.</p>
<note><p>There may be some variations depending on the parser being
used to build the DOM. For instance, the DOM may not contain
whitespace in element content if the parser discards it.</p>
</note>
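<p>The tree-like structure model described above can be explored with a short
tree walk. The following is a minimal sketch in the Java binding, assuming the
example table is written as well-formed XML without whitespace between tags and
parsed with the JAXP <code>DocumentBuilder</code> (an entry point outside this
specification). The class name <code>TableTreeSketch</code> and its methods are
invented for the example.</p>
<eg role="code"><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class TableTreeSketch {
    // The example table, written without whitespace between the tags.
    static final String TABLE =
        "<TABLE><TBODY>"
        + "<TR><TD>Shady Grove</TD><TD>Aeolian</TD></TR>"
        + "<TR><TD>Over the River, Charlie</TD><TD>Dorian</TD></TR>"
        + "</TBODY></TABLE>";

    // Assumption: JAXP is used to turn the text into a Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    // Appends one line per node, indented by depth, following only
    // first-child and next-sibling links.
    static void outline(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append('"').append(node.getNodeValue()).append('"');
        } else {
            out.append(node.getNodeName());
        }
        out.append('\n');
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            outline(c, depth + 1, out);
        }
    }

    static String outline(String xml) {
        StringBuilder out = new StringBuilder();
        outline(parse(xml).getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(outline(TABLE));
    }
}
```
]]></eg>
<p>If the same table is indented as in the listing above, a parser that keeps
whitespace in element content will typically produce additional text nodes
between the elements, which is the kind of variation the preceding note
describes.</p>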
<p>The name &quot;Document Object Model&quot; was chosen because
it is an &quot;<termref def="dt-object-model">object model</termref>&quot; in the traditional
object oriented design sense: documents are modeled using
objects, and the model encompasses not only the structure of a
document, but also the behavior of a document and the objects
of which it is composed. In other words, the nodes in the
above diagram do not represent a data structure; they
represent objects, which have functions and identity. As an
object model, the DOM identifies:</p>
<ulist>
<item><p>the interfaces and objects used to represent and manipulate
a document</p></item>
<item><p>the semantics of these interfaces and objects - including
both behavior and attributes</p></item>
<item><p>the relationships and collaborations among these interfaces
and objects</p></item>
</ulist>
<p>The structure of SGML documents has traditionally been
represented by an abstract <termref def="dt-datamodel">data model</termref>, not by an object model.
In an abstract <termref def="dt-datamodel">data model</termref>, the model is centered around the
data. In object oriented programming languages, the data itself
is encapsulated in objects that hide the data, protecting it
from direct external manipulation. The functions associated with
these objects determine how the objects may be manipulated, and
they are part of the object model.</p>
</div2>
<div2 id="ID-E7C30822">
<head>What the Document Object Model is not</head>
<p>This section is designed to give a more precise understanding
of the DOM by distinguishing it from other
systems that may seem to be like it.</p>
<ulist>
<item><p>The Document Object Model is not a binary specification.
DOM programs written in the same language binding will be
source code compatible across platforms, but the DOM
does not define any form of binary interoperability.</p></item>
<item><p>The Document Object Model is not a way of persisting objects
to XML or HTML. Instead of specifying how objects may be
represented in XML, the DOM specifies how
XML and HTML documents are represented as objects, so that
they may be used in object oriented programs.</p></item>
<item><p>The Document Object Model is not a set of data structures;
it is an <termref def="dt-object-model">object model</termref> that specifies interfaces. Although this
document contains diagrams showing parent/child relationships,
these are logical relationships defined by the programming
interfaces, not representations of any particular internal
data structures.</p></item>
<item><p>The Document Object Model does not define what information in a
document is relevant or how information in a document is
structured. For XML, this is specified by the W3C XML Information Set
<bibref ref="InfoSet"/>. The DOM is simply an <termref def="dt-API">API</termref> to this information
set.</p></item>
<item><p>The Document Object Model, despite its name, is not a
competitor to the Component Object Model (COM). COM, like
CORBA, is a language-independent way to specify interfaces and
objects; the DOM is a set of interfaces and
objects designed for managing HTML and XML documents. The DOM
may be implemented using language-independent systems like COM
or CORBA; it may also be implemented using language-specific
bindings like the Java or ECMAScript bindings specified in
this document.</p></item>
</ulist>
</div2>
<div2 id="ID-E7C30823">
<head>Where the Document Object Model came from</head>
<p>The DOM originated as a specification to
allow JavaScript scripts and Java programs to be portable among
Web browsers. "Dynamic HTML" was the immediate ancestor of the
Document Object Model, and it was originally thought of largely
in terms of browsers. However, when the DOM
Working Group was formed at W3C, it was also joined by vendors in other
domains, including HTML or XML editors and document
repositories. Several of these vendors had worked with SGML
before XML was developed; as a result, the DOM
has been influenced by SGML Groves and the HyTime standard. Some
of these vendors had also developed their own object models for
documents in order to provide an API for SGML/XML
editors or document repositories, and these object models have
also influenced the DOM.</p>
</div2>
<div2 id="ID-E7C30824"><head>Entities and the DOM Core</head>
<p>In the fundamental DOM interfaces, there are no objects representing
entities. Numeric character references, and references to the
pre-defined entities in HTML and XML, are replaced by the
single character that makes up the entity's replacement.
For example, in:
</p>
<eg role="code">
&lt;p&gt;This is a dog &amp;amp; a cat&lt;/p&gt;
</eg>
<p>
the "&amp;amp;" will be replaced by the character "&amp;", and the text
in the P element will form a single continuous sequence of
characters. Since numeric character references and pre-defined entities
are not recognized as such in CDATA sections, or in the SCRIPT and STYLE
elements in HTML, they are not replaced by the single character they
appear to refer to. If the example above were enclosed in a CDATA
section, the "&amp;amp;" would not be replaced by "&amp;"; neither would
the &lt;p&gt; be recognized as a start tag. The representation of general
entities, both internal and external, is defined within the
extended (XML) interfaces of DOM Level 1 <bibref ref='DOM-Level-1'/>.</p>
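<p>The behavior described above can be observed with a small sketch in the
Java binding. It assumes the JAXP <code>DocumentBuilder</code> with its default
settings (in particular, CDATA sections are not coalesced into text nodes); the
class name <code>EntityTextSketch</code> and its members are invented for the
example.</p>
<eg role="code"><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class EntityTextSketch {
    static final String PLAIN = "<p>This is a dog &amp; a cat</p>";

    // The CDATA end delimiter is assembled from two pieces so that this
    // listing can itself be embedded in an XML CDATA section.
    static final String IN_CDATA =
        "<p><![CDATA[This is a dog &amp; a cat]]" + "></p>";

    // Parses the input and returns the first child of the root element.
    static Node firstChild(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            // Merge adjacent Text nodes in case the parser produced several.
            root.normalize();
            return root.getFirstChild();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // The reference is replaced by the single character it stands for.
        System.out.println(firstChild(PLAIN).getNodeValue());
        // Inside a CDATA section the same characters are kept literally.
        System.out.println(firstChild(IN_CDATA).getNodeValue());
    }
}
```
]]></eg>
<p>In the first case the text node holds "This is a dog", an ampersand
character, and "a cat" as one continuous string; in the second case the
five-character sequence that looks like an entity reference is preserved
unchanged.</p>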
<note><p>When a DOM representation of a document is serialized
as XML or HTML text, applications will need to check each
character in text data to see if it needs to be escaped
using a numeric character reference or a pre-defined entity. Failing to do so
could result in invalid HTML or XML. Also, <termref
def="dt-implementation">implementations</termref> should be
aware that serialization into a character encoding
("charset") that does not fully cover ISO 10646 may fail if there are
characters in markup or CDATA sections that are not present in the
encoding.</p>
</note>
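<p>The per-character check described above might look like the following
sketch, written in Java. It is not part of the DOM API: the class
<code>TextEscapeSketch</code> and the method <code>escapeText</code> are
hypothetical helpers, and the sketch covers text data only. Characters in
markup or CDATA sections cannot be written as character references, so an
unencodable character there would still cause serialization to fail.</p>
<eg role="code"><![CDATA[
```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class TextEscapeSketch {
    // Escapes text data for serialization into the given character encoding.
    static String escapeText(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                // Not always required, but harmless and simple.
                out.append("&gt;");
            } else if (encoder.canEncode(new String(Character.toChars(cp)))) {
                out.appendCodePoint(cp);
            } else {
                // Not representable in the encoding: use a numeric reference.
                out.append("&#").append(cp).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String names = "Philippe Le H\u00e9garet & Lauren Wood";
        // prints: Philippe Le H&#233;garet &amp; Lauren Wood
        System.out.println(escapeText(names, StandardCharsets.US_ASCII));
    }
}
```
]]></eg>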
</div2>
<div2 id="ID-Conformance">
<head>Conformance</head>
<p>
This section explains the different levels of conformance to DOM Level 2.
DOM Level 2 consists of 14 modules. It is possible to conform to DOM
Level 2, or to a DOM Level 2 module.
</p>
<p>
An implementation is DOM Level 2 conformant if it supports the Core
module defined in this document (see <specref ref="ID-BBACDC08"/>). An
implementation conforms to a DOM Level 2 module if it supports all the
interfaces for that module and the associated semantics.
</p>
<p>
Here is the complete list of DOM Level 2.0 modules and the features used
by them. Feature names are case-insensitive.
</p>
<glist>
<gitem>
<label>Core module</label>
<def>
<p>
defines the feature <xspecref
href="core.html#ID-BBACDC08">"Core"</xspecref>.
</p>
</def>
</gitem>
<gitem>
<label>XML module</label>
<def>
<p>
defines the feature <xspecref
href="core.html#ID-E067D597">"XML"</xspecref>.
</p>
</def>
</gitem>
<gitem>
<label>HTML module</label>
<def>
<p>defines the feature "HTML" (see <bibref ref="DOMHTML-inf"/>).</p>
<note>
<p>At time of publication, this DOM Level 2 module is not yet a W3C Recommendation.</p>
</note>
</def>
</gitem>
<gitem>
<label>Views module</label>
<def>
<p>defines the feature <xspecref
href='&views.latest.url;/views.html'>"Views"</xspecref> in <bibref ref="DOMViews-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>Style Sheets module</label>
<def>
<p>defines the feature <xspecref
href='&style.latest.url;/stylesheets.html'>"StyleSheets"</xspecref> in <bibref ref="DOMStyleSheets-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>CSS module</label>
<def>
<p>defines the feature <xspecref
href='&style.latest.url;/css.html'>"CSS"</xspecref> in <bibref ref="DOMCSS-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>CSS2 module</label>
<def>
<p>defines the feature <xspecref
href='&style.latest.url;/css.html'>"CSS2"</xspecref> in <bibref ref="DOMCSS-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>Events module</label>
<def>
<p>defines the feature <xspecref
href='&events.latest.url;/events.html'>"Events"</xspecref> in <bibref ref="DOMEvents-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>User Interface Events module</label>
<def>
<p>defines the feature <xspecref
href='&events.latest.url;/events.html'>"UIEvents"</xspecref> in <bibref ref="DOMEvents-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>Mouse Events module</label>
<def>
<p>defines the feature <xspecref
href='&events.latest.url;/events.html'>"MouseEvents"</xspecref> in <bibref ref="DOMEvents-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>Mutation Events module</label>
<def>
<p>defines the feature <xspecref
href='&events.latest.url;/events.html'>"MutationEvents"</xspecref> in <bibref ref="DOMEvents-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>HTML Events module</label>
<def>
<p>defines the feature <xspecref
href='&events.latest.url;/events.html'>"HTMLEvents"</xspecref> in <bibref ref="DOMEvents-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>Range module</label>
<def>
<p>defines the feature <xspecref
href='&traversal-range.latest.url;/ranges.html'>"Range"</xspecref> in <bibref ref="DOMRange-inf"/>.</p>
</def>
</gitem>
<gitem>
<label>Traversal module</label>
<def>
<p>defines the feature <xspecref
href='&traversal-range.latest.url;/traversal.html'>"Traversal"</xspecref> in <bibref ref="DOMTraversal-inf"/>.</p>
</def>
</gitem>
</glist>
<p>
A DOM implementation must not return <code>true</code> from the
<code>hasFeature(feature, version)</code> <termref
def="dt-method">method</termref> of the <code>DOMImplementation</code>
interface for a given feature unless the implementation conforms to the
corresponding module. The <code>version</code> number for all features used in DOM
Level 2.0 is "2.0".
</p>
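<p>An application can query these features at run time. The following sketch
uses the Java binding and assumes that a <code>DOMImplementation</code> is
obtained through the JAXP <code>DocumentBuilder</code>, which is outside this
specification. The class name <code>FeatureCheckSketch</code> is invented for
the example, and the answers for modules other than Core depend on the
implementation in use.</p>
<eg role="code"><![CDATA[
```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

final class FeatureCheckSketch {
    // Feature names taken from the module list above.
    static final String[] FEATURES = {
        "Core", "XML", "HTML", "Views", "StyleSheets", "CSS", "CSS2",
        "Events", "UIEvents", "MouseEvents", "MutationEvents",
        "HTMLEvents", "Range", "Traversal"
    };

    // Assumption: JAXP is used to reach the DOMImplementation object.
    static DOMImplementation implementation() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    // Asks whether the implementation claims the given DOM Level 2 feature.
    static boolean supports(String feature) {
        return implementation().hasFeature(feature, "2.0");
    }

    public static void main(String[] args) {
        for (String feature : FEATURES) {
            System.out.println(feature + ": " + supports(feature));
        }
    }
}
```
]]></eg>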
</div2>
<div2 id="ID-E7C30826"><head>DOM Interfaces and DOM Implementations</head>
<p>The DOM specifies interfaces which may be used to manage XML or
HTML documents. It is important to realize that these interfaces
are an abstraction - much like "abstract base classes" in C++,
they are a means of specifying a way to access and manipulate an
application's internal representation of a document. Interfaces
do not imply a particular concrete
implementation. Each DOM application is free to maintain
documents in any convenient representation, as long as the
interfaces shown in this specification are supported. Some
DOM implementations will be existing programs that use the
DOM interfaces to access software written long before the
DOM specification existed. Therefore, the DOM is designed
to avoid implementation dependencies; in particular,</p>
<olist>
<item><p>Attributes defined in the IDL do not imply concrete
objects which must have specific data members - in the
language bindings, they are translated to a pair of
get()/set() functions, not to a data member. Read-only
attributes have only a get() function in the language
bindings. </p>
</item>
<item><p>DOM applications may provide additional interfaces
and objects not found in this specification and still be
considered DOM conformant.</p></item>
<item><p>Because we specify interfaces and not the actual
objects that are to be created, the DOM cannot know what
constructors to call for an implementation. In general,
DOM users call the createX() methods on the Document
class to create document structures, and DOM
implementations create their own internal representations
of these structures in their implementations of the
createX() functions.
</p></item>
</olist>
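<p>The first and third points can be seen in the Java binding, sketched below
under the assumption that an empty <code>Document</code> is obtained through
JAXP. The class name <code>AccessorSketch</code> and the helper
<code>buildCell</code> are invented for the example.</p>
<eg role="code"><![CDATA[
```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

final class AccessorSketch {
    // Assumption: JAXP is used to obtain an empty Document.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    // Nodes are never constructed directly; the createX() methods of the
    // Document let the implementation choose the concrete classes.
    static Element buildCell(Document doc, String content) {
        Element cell = doc.createElement("TD");
        Text text = doc.createTextNode(content);
        cell.appendChild(text);
        return cell;
    }

    public static void main(String[] args) {
        Element cell = buildCell(newDocument(), "Dorian");
        Node text = cell.getFirstChild();

        // The IDL attribute nodeValue maps to a get/set pair.
        text.setNodeValue("Aeolian");
        System.out.println(text.getNodeValue());

        // The read-only IDL attribute nodeName maps to a getter only.
        System.out.println(cell.getNodeName());

        // The concrete class is implementation-specific.
        System.out.println(cell.getClass().getName());
    }
}
```
]]></eg>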
<p>
The Level 1 interfaces were extended to provide both Level 1 and Level 2
functionality.
</p>
<p>
DOM implementations in languages other than Java or ECMAScript may
choose bindings that are appropriate and natural for their language and
runtime environment. For example, some systems may need to create a
Document2 class which inherits from Document and contains the new methods
and attributes.
</p>
<p>DOM Level 2 does not specify multithreading mechanisms.</p>
</div2>
</div1>
<!--
*************************************************************************
* END OF DOM INTRODUCTION *
*************************************************************************
-->