<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE faqs SYSTEM 'dtd/faqs.dtd'>
<faqs title='General FAQs'>
<faq title="Jar file changes">
<q>What happened to xerces.jar?</q>
<a>
<p>This parser is very often used in conjunction with other XML
technologies, such as XSLT processors, which also rely on standard
APIs like DOM and SAX. For that reason, xerces.jar was split into
two jarfiles:
</p>
<ul>
<li><code>xmlParserAPIs.jar</code> contains the DOM level 2,
SAX 2.0 and JAXP 1.1 APIs;</li>
<li><code>xercesImpl.jar</code> contains the implementation of
these APIs as well as the XNI API.
</li>
</ul>
<p>For backwards compatibility, we have retained the ability
to generate old-style jarfiles. For instructions, see <link
idref="install">the installation documentation</link>.
</p>
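<p>To check which jar actually supplies a given class at run time, a small
program along the following lines may help. It is only a sketch: the class
name <code>WhichJar</code> and the helper <code>locate</code> are hypothetical
names chosen for illustration, and when a class comes from the Java runtime
itself rather than from a jar, the JVM may report no location at all.
</p>
<source><![CDATA[
```java
import java.security.CodeSource;

public class WhichJar {
    /**
     * Returns the location (typically a jar URL) that the named class was
     * loaded from, or a fallback label when the JVM reports no location.
     */
    public static String locate(String className) throws ClassNotFoundException {
        CodeSource source = Class.forName(className).getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(Java runtime, no jar reported)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        String[] names = {
            "org.w3c.dom.Document",                // DOM API
            "org.xml.sax.XMLReader",               // SAX API
            "javax.xml.parsers.SAXParserFactory",  // JAXP API
            "org.apache.xerces.parsers.SAXParser"  // implementation class named in this FAQ
        };
        for (String name : names) {
            try {
                System.out.println(name + " loaded from " + locate(name));
            } catch (ClassNotFoundException e) {
                System.out.println(name + " is not on the class path");
            }
        }
    }
}
```
]]></source>
<p>With both jarfiles on the class path, the API classes would be expected to
report <code>xmlParserAPIs.jar</code> and the parser class
<code>xercesImpl.jar</code>, unless the Java runtime supplies its own copy of
the APIs first.
</p>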
</a>
</faq>
<faq title='Validation against DTD'>
<q>How do I turn on DTD validation?</q>
<a>
<p>
Validation is turned on and off with the <code>setFeature</code>
method defined by the SAX2 <code>XMLReader</code> interface. Although
only the <code>SAXParser</code> implements the <code>XMLReader</code>
interface, both parser classes, DOM and SAX, provide the methods
required for turning on validation.
<br/>
The code snippet below shows how to turn validation on; it assumes
that <ref>parser</ref> is an instance of either
<code>org.apache.xerces.parsers.SAXParser</code> or
<code>org.apache.xerces.parsers.DOMParser</code>.
<br/><br/>
<code>parser.setFeature("http://xml.org/sax/features/validation", true);</code>
</p>
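<p>A slightly fuller sketch of the same idea follows. It makes two assumptions
that are not part of the text above: the <code>XMLReader</code> is obtained
through JAXP rather than by constructing a Xerces parser class directly, and
validity errors are simply counted by a hypothetical helper,
<code>countValidityErrors</code>. Validity errors are reported to the
registered error handler; if no handler is set they may pass unnoticed.
</p>
<source><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationDemo {
    /** Parses the document with DTD validation on and returns the number of validity errors. */
    public static int countValidityErrors(String xml) throws Exception {
        // Assumption: the reader comes from JAXP. With Xerces on the class path,
        // an org.apache.xerces.parsers.SAXParser instance could be used instead.
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        parser.setFeature("http://xml.org/sax/features/validation", true);

        final int[] errors = {0};
        parser.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity errors are recoverable, so count them and keep parsing.
                errors[0]++;
            }
        });
        parser.parse(new InputSource(new StringReader(xml)));
        return errors[0];
    }

    public static void main(String[] args) throws Exception {
        // Internal DTD subset: a note must contain exactly one "to" element.
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to)> <!ELEMENT to (#PCDATA)>]>";
        System.out.println("valid document:   "
                + countValidityErrors(dtd + "<note><to>Ann</to></note>") + " error(s)");
        System.out.println("invalid document: "
                + countValidityErrors(dtd + "<note><from>Bob</from></note>") + " error(s)");
    }
}
```
]]></source>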
</a>
</faq>
<faq title='IDs and XML Schemas'>
<q>Why does getElementById() not always work for documents validated against XML Schemas?</q>
<a>
<p>According to the XML Schema specification, an instance document might have
more than one <jump href="http://www.w3.org/TR/xmlschema-1/#key-vr">validation root</jump> and
<jump href="http://www.w3.org/TR/xmlschema-1/#cvc-id">ID/IDREFS</jump> must be
unique only within the context of a particular validation root. A document
may therefore contain multiple identical IDs, in which case the result
of <code>getElementById()</code> is unspecified. If, on the other hand, the document
root is a validation root of the document, <code>getElementById()</code> should work as expected.
</p>
</a>
</faq>
<faq title='PSVI'>
<q>How do I get access to the PSVI?</q>
<a>
<p>Xerces provides a sample component, <code>PSVIWriter</code>, that intercepts
document handler events and collects PSVI information. For more information, see the <link
idref="samples-xni">samples documentation</link>, which explains how to use
<code>xni.parser.PSVIParser</code> and <code>xni.parser.PSVIConfiguration</code>.
</p>
<note>Xerces only produces light-weight PSVI.</note>
</a>
</faq>
<faq title='International Encodings'>
<q>What international encodings are supported by &ParserName;?</q>
<a>
<ul>
<li>UTF-8</li>
<li>UTF-16 Big Endian, UTF-16 Little Endian</li>
<li>IBM-1208</li>
<li>ISO Latin-1 (ISO-8859-1)</li>
<li>
ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech,
Hungarian, Polish, Romanian, Serbian (in Latin transcription),
Serbocroatian, Slovak, Slovenian, Upper and Lower Sorbian]
</li>
<li>ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]</li>
<li>ISO Latin-4 (ISO-8859-4)</li>
<li>ISO Latin Cyrillic (ISO-8859-5)</li>
<li>ISO Latin Arabic (ISO-8859-6)</li>
<li>ISO Latin Greek (ISO-8859-7)</li>
<li>ISO Latin Hebrew (ISO-8859-8)</li>
<li>ISO Latin-5 (ISO-8859-9) [Turkish]</li>
<li>Extended Unix Code, packed for Japanese (euc-jp, eucjis)</li>
<li>Japanese Shift JIS (shift-jis)</li>
<li>Chinese (big5)</li>
<li>Chinese for PRC (mixed 1/2 byte) (gb2312)</li>
<li>Japanese ISO-2022-JP (iso-2022-jp)</li>
<li>Cyrillic (koi8-r)</li>
<li>Extended Unix Code, packed for Korean (euc-kr)</li>
<li>Russian Unix, Cyrillic (koi8-r)</li>
<li>Windows Thai (cp874)</li>
<li>Latin 1 Windows (cp1252) (and all other cp125? encodings recognized by IANA)</li>
<li>cp858</li>
<li>EBCDIC encodings:</li>
<ul>
<li>EBCDIC US (ebcdic-cp-us)</li>
<li>EBCDIC Canada (ebcdic-cp-ca)</li>
<li>EBCDIC Netherlands (ebcdic-cp-nl)</li>
<li>EBCDIC Denmark (ebcdic-cp-dk)</li>
<li>EBCDIC Norway (ebcdic-cp-no)</li>
<li>EBCDIC Finland (ebcdic-cp-fi)</li>
<li>EBCDIC Sweden (ebcdic-cp-se)</li>
<li>EBCDIC Italy (ebcdic-cp-it)</li>
<li>EBCDIC Spain, Latin America (ebcdic-cp-es)</li>
<li>EBCDIC Great Britain (ebcdic-cp-gb)</li>
<li>EBCDIC France (ebcdic-cp-fr)</li>
<li>EBCDIC Hebrew (ebcdic-cp-he)</li>
<li>EBCDIC Switzerland (ebcdic-cp-ch)</li>
<li>EBCDIC Roece (ebcdic-cp-roece)</li>
<li>EBCDIC Yugoslavia (ebcdic-cp-yu)</li>
<li>EBCDIC Iceland (ebcdic-cp-is)</li>
<li>EBCDIC Urdu (ebcdic-cp-ar2)</li>
<li>Latin 0 EBCDIC</li>
<li>EBCDIC Arabic (ebcdic-cp-ar1)</li>
</ul>
</ul>
<note>UCS-4 is not yet supported, but it is hoped that support will be available soon.</note>
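<p>The parser picks the decoder from the encoding declaration of the document
(or from a byte order mark), so supporting an encoding mostly means accepting
its name there. The sketch below illustrates this with ISO-8859-1. It assumes
a parser obtained through JAXP, and the class and method names
(<code>EncodingDemo</code>, <code>rootText</code>) are hypothetical.
</p>
<source><![CDATA[
```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingDemo {
    /**
     * Serializes the text as ISO-8859-1 bytes and parses those bytes, leaving it
     * to the parser to choose the decoder from the XML declaration.
     */
    public static String rootText(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // \u00e9 is "e with acute accent": one byte (0xE9) in ISO-8859-1.
        String xml = "<?xml version='1.0' encoding='ISO-8859-1'?><word>caf\u00e9</word>";
        System.out.println(rootText(xml).equals("caf\u00e9")); // prints true
    }
}
```
]]></source>
<p>If the declaration named an encoding that does not match the bytes, the parse
would be expected to fail or to yield different characters, which is why the
declared name and the actual byte encoding must agree.
</p>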
</a>
</faq>
</faqs>