<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "../../dtd/document-v10.dtd">
<document>
<header>
<title>Lexical Transformer</title>
<version>0.1</version>
<type>Technical document</type>
<authors>
<person id="SMS" name="Stephan Michels" email="stephan@apache.org"/>
</authors>
<abstract>This document describes the lexical transformer of Cocoon.</abstract>
</header>
<body>
<s1 title="Lexical Transformer">
<p>The lexical transformer tokenizes the content of specially marked
elements in a SAX stream, using a lexicon file.</p>
<ul>
<li>Name: lexer</li>
<li>Class: org.apache.cocoon.transformation.LexicalTransformer</li>
<li>Cacheable: yes - uses the last modification date of the lexicon
document for validation.</li>
</ul>
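<p>As a rough illustration of how the transformer might be wired into a
sitemap, the following sketch declares it under the name and class listed
above and applies it in a pipeline. The file names
<code>example-input.xml</code> and <code>example.xlex</code> are
hypothetical, and passing the lexicon through the <code>src</code>
attribute is an assumption based on the usual sitemap conventions, so
check it against your Cocoon version.</p>
<source><![CDATA[
<!-- Component declaration: name and class as listed above -->
<map:components>
 <map:transformers>
  <map:transformer name="lexer"
                   src="org.apache.cocoon.transformation.LexicalTransformer"/>
 </map:transformers>
</map:components>

<!-- Pipeline usage: file names are hypothetical placeholders -->
<map:pipelines>
 <map:pipeline>
  <map:match pattern="example.xml">
   <!-- Input document containing the marked text elements -->
   <map:generate src="example-input.xml"/>
   <!-- Assumption: src points to the lexicon file -->
   <map:transform type="lexer" src="example.xlex"/>
   <map:serialize type="xml"/>
  </map:match>
 </map:pipeline>
</map:pipelines>
]]></source>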
<p>The lexer parses the following elements in the SAX stream and
replaces them with generated documents.
</p>
<source><![CDATA[
<text xmlns="http://chaperon.sourceforge.net/schema/text/1.0">
[Text, which should be parsed]
</text>
]]></source>
<p>The lexical transformer replaces each of these elements with a list of
lexemes (tokens).
</p>
<source><![CDATA[
<lexemes xmlns="http://chaperon.sourceforge.net/schema/lexemes/1.0">
<lexeme symbol="..." text="..."/>
<lexeme symbol="..." text="..."/>
<lexeme symbol="..." text="..."/>
</lexemes>
]]></source>
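<p>To make the mapping concrete, here is a hypothetical before-and-after
pair. It assumes a lexicon that defines the symbols <code>number</code>
and <code>plus</code> and skips whitespace. These symbol names are
illustrative only, and the actual output depends entirely on the lexicon
in use.</p>
<source><![CDATA[
<!-- Input (hypothetical) -->
<text xmlns="http://chaperon.sourceforge.net/schema/text/1.0">3 + 4</text>

<!-- Possible output, assuming the lexicon described above -->
<lexemes xmlns="http://chaperon.sourceforge.net/schema/lexemes/1.0">
 <lexeme symbol="number" text="3"/>
 <lexeme symbol="plus" text="+"/>
 <lexeme symbol="number" text="4"/>
</lexemes>
]]></source>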
<p>A detailed explanation of how the lexer works and of the lexicon format can be found on the <link
href="http://chaperon.sourceforge.net/">Chaperon</link> website.</p>
</s1>
</body>
</document>