<!-- $Id$ -->
<html>
<head>
<title>Xerces 2 | Crimson</title>
<link rel='stylesheet' type='text/css' href='css/site.css'>
</head>
<body>
<span class='netscape'>
<a name='TOP'></a>
<h1>Evaluation of Crimson Code</h1>
<a name='Overview'></a>
<h2>Overview</h2>
<p>
The Crimson code donated to the <a href='http://xml.apache.org/'>XML
Apache Project</a> by <a href='http://www.sun.com/'>Sun Microsystems</a>
is a relatively clean and straightforward implementation of a
conforming <a href='http://www.w3.org/XML/'>XML</a> parser. However,
there are some serious drawbacks to its design that hamper its use
in the Xerces2 effort. This page will highlight some of the problems
that I see with the Crimson code. This doesn't mean, however, that
there aren't good ideas in Crimson! I'll highlight some of the things
that I like about Crimson as well.
</p>
<a name='TheGood'></a>
<h2>The Good</h2>
<p>
<table border='0'>
<tr>
<th>Size:</th>
<td>Crimson has a small code footprint.</td>
</tr>
<tr>
<th>Simplicity:</th>
<td>
The code is very straightforward and easy to grok. I especially
like the simple approach to reading the input streams. The advanced
reader code in Xerces has been a continual source of bugs and
developer confusion. See my <a href='xerces.html'>evaluation of
Xerces</a> for more detail.
</td>
</tr>
</table>
</p>
<a name='TheBad'></a>
<h2>The Bad</h2>
<p>
<table border='0'>
<tr>
<th>Standards:</th>
<td>
Crimson lacks implementations of important standards, such as
DOM Level 2 and XML Schema.
</td>
</tr>
<tr>
<th>Modularity:</th>
<td>
The design of Crimson is not modular enough for use in a wide
variety of applications. For example, the document and DTD
scanning code is hard-coded into the parser, so it cannot be
replaced or reused on its own. Also, many of the classes used by
the parser rely on package visibility of members. (Yuck!)
</td>
</tr>
<tr>
<th>Validation:</th>
<td>
The validation engine is rather simplistic and not very fast.
Plus, it doesn't seem to be able to handle the advanced
validation requirements of XML Schema.
</td>
</tr>
<tr>
<th>Performance:</th>
<td>
The general performance of the Crimson code is good, but there
are some areas where it can (and should) be tuned. First,
validation is not as fast as it could be, although comments in
the code suggest "compiling" the content model into a DFA for
faster validation. Second, the DOM implementation wastes a lot
of memory when traversing the document.
</td>
</tr>
</table>
</p>
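<p>
To make the DFA idea a little more concrete, here is a rough sketch of
what a "compiled" content model could look like. This is not Crimson
code. The class name <code>ContentModelDfa</code>, the content model
<code>(head, item*, foot)</code>, and the transition table are all made
up for illustration, and the table is written out by hand rather than
generated from a DTD. The point is only that, once such a table exists,
checking a list of child elements costs one array lookup per child.
</p>

```java
/** Hypothetical, hand-compiled DFA for the content model (head, item*, foot). */
public class ContentModelDfa {
    // Element names allowed by the model; the array index is the symbol id.
    private static final String[] SYMBOLS = { "head", "item", "foot" };

    private static final int REJECT = -1;

    // TRANSITIONS[state][symbol] gives the next state, or REJECT.
    // State 0: expecting head. State 1: expecting item or foot. State 2: done.
    private static final int[][] TRANSITIONS = {
        { 1, REJECT, REJECT },
        { REJECT, 1, 2 },
        { REJECT, REJECT, REJECT },
    };

    private static final boolean[] ACCEPTING = { false, false, true };

    // Maps an element name to its symbol id, or REJECT if the name is unknown.
    private static int symbolOf(String name) {
        for (int i = 0; i != SYMBOLS.length; i++) {
            if (SYMBOLS[i].equals(name)) {
                return i;
            }
        }
        return REJECT;
    }

    /** Returns true if the child element names satisfy the content model. */
    public static boolean matches(String... children) {
        int state = 0;
        for (String child : children) {
            int symbol = symbolOf(child);
            if (symbol == REJECT) {
                return false;
            }
            state = TRANSITIONS[state][symbol];
            if (state == REJECT) {
                return false;
            }
        }
        return ACCEPTING[state];
    }

    public static void main(String[] args) {
        System.out.println(matches("head", "item", "item", "foot")); // true
        System.out.println(matches("head", "foot"));                 // true
        System.out.println(matches("item", "foot"));                 // false
    }
}
```

<p>
A real validator would have to build the table from each element
declaration in the DTD, which is the hard part and is left out here.
</p>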
</span>
<a name='BOTTOM'></a>
<hr>
<span class='netscape'>
Author: Andy Clark <br>
Last modified: $Date$
</span>
</body>
</html>