 <?xml version='1.0' encoding='UTF-8'?>
 <!DOCTYPE faqs SYSTEM 'dtd/faqs.dtd'>
 <faqs title='General FAQs'>
     <faq title="Jar file changes">
     <q>What happened to xerces.jar?</q>
     <a>
         <p>This parser is very often used together with other XML
         technologies, such as XSLT processors, that also rely on
         standard APIs like DOM and SAX. To make those APIs easier to
         share, xerces.jar was split into two
         jarfiles:
         </p>
         <ul>
         <li><code>xmlParserAPIs.jar</code> contains the DOM Level 2,
         SAX 2.0 and JAXP 1.1 APIs;</li>
         <li><code>xercesImpl.jar</code> contains the implementation of
         these APIs as well as the XNI API.
         </li>
         </ul>
         <p>For backwards compatibility, we have retained the ability
         to generate old-style jarfiles.  For instructions, see <link
         idref="install">the installation documentation</link>.
         </p>
     </a>
  </faq>
  <faq title='Validation against DTD'>
   <q>How do I turn on DTD validation?</q>
   <a>
    <p>
     You can turn validation on and off via methods available
     on the SAX2 <code>XMLReader</code> interface. While only the
     <code>SAXParser</code> implements the <code>XMLReader</code>
     interface, the methods required for turning on validation
     are available to both parser classes, DOM and SAX.
     <br/>
     The code snippet below shows how to turn validation on; assume
     that <ref>parser</ref> is an instance of either
     <code>org.apache.xerces.parsers.SAXParser</code> or
     <code>org.apache.xerces.parsers.DOMParser</code>.
     <br/><br/>
     <code>parser.setFeature("http://xml.org/sax/features/validation", true);</code>
    </p>
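    <p>
     As a rough, self-contained sketch of the same call in context, the
     example below obtains an <code>XMLReader</code> through JAXP rather than
     instantiating <code>org.apache.xerces.parsers.SAXParser</code> directly
     (an assumption, made so that no concrete parser class has to be named),
     turns on the validation feature, and collects validation errors through
     an error handler. The class name <code>DtdValidationSketch</code>, the
     <code>validate</code> helper and the sample documents are hypothetical.
    </p>
    <source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class DtdValidationSketch {

    /** Parses the given XML with DTD validation on and returns the error messages. */
    public static List<String> validate(String xml) throws Exception {
        // Assumption: the reader comes from JAXP instead of a Xerces parser class.
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // The feature named in the FAQ answer above.
        parser.setFeature("http://xml.org/sax/features/validation", true);

        // Validity errors are recoverable, so they are only seen via an error handler.
        final List<String> errors = new ArrayList<String>();
        parser.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });

        parser.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: an internal DTD subset that allows only text in <note>.
        String dtd = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>";
        System.out.println("valid document errors:   " + validate(dtd + "<note>ok</note>"));
        System.out.println("invalid document errors: " + validate(dtd + "<note><b/></note>"));
    }
}
```
]]></source>
    <p>
     Validity errors are reported as recoverable errors, so without a
     registered error handler they may go unnoticed even though validation
     is turned on.
    </p>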
   </a>
  </faq>
 <faq title='IDs and XML Schemas'>
   <q>Why does getElementById() not always work for documents validated against XML Schemas?</q>
   <a>
    <p>According to the XML Schema specification, an instance document might have
 more than one <jump href="http://www.w3.org/TR/xmlschema-1/#key-vr">validation root</jump>, and
 <jump href="http://www.w3.org/TR/xmlschema-1/#cvc-id">ID/IDREFS</jump> must be
 unique only within the context of a particular validation root. A
 document may therefore contain multiple identical IDs, in which case the result
 of getElementById() is unspecified. On the other hand, if the document root is
 a validation root of the document, getElementById() should work as expected.
     </p>
   </a>
  </faq>

 <faq title='PSVI'>
   <q>How do I get access to the PSVI?</q>
   <a>
    <p>Xerces provides a sample component, PSVIWriter, that intercepts document
 handler events and collects PSVI information. For more information, see the <link
 idref="samples-xni">samples documentation</link> on how to use xni.parser.PSVIParser
 and xni.parser.PSVIConfiguration.
     </p>
 <note>Xerces only produces light-weight PSVI.</note>
   </a>
  </faq>


  <faq title='International Encodings'>
   <q>What international encodings are supported by &ParserName;?</q>
   <a>
    <ul>
     <li>UTF-8</li>
     <li>UTF-16 Big Endian, UTF-16 Little Endian</li>
     <li>IBM-1208</li>
     <li>ISO Latin-1 (ISO-8859-1)</li>
     <li>
      ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech,
      Hungarian, Polish, Romanian, Serbian (in Latin transcription),
      Serbocroatian, Slovak, Slovenian, Upper and Lower Sorbian]
     </li>
     <li>ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]</li>
     <li>ISO Latin-4 (ISO-8859-4)</li>
     <li>ISO Latin Cyrillic (ISO-8859-5)</li>
     <li>ISO Latin Arabic (ISO-8859-6)</li>
     <li>ISO Latin Greek (ISO-8859-7)</li>
     <li>ISO Latin Hebrew (ISO-8859-8)</li>
     <li>ISO Latin-5 (ISO-8859-9) [Turkish]</li>
     <li>Extended Unix Code, packed for Japanese (euc-jp, eucjis)</li>
     <li>Japanese Shift JIS (shift-jis)</li>
     <li>Chinese (big5)</li>
     <li>Chinese for PRC (mixed 1/2 byte) (gb2312)</li>
     <li>Japanese ISO-2022-JP (iso-2022-jp)</li>
     <li>Cyrillic (koi8-r)</li>
     <li>Extended Unix Code, packed for Korean (euc-kr)</li>
     <li>Russian Unix, Cyrillic (koi8-r)</li>
     <li>Windows Thai (cp874)</li>
     <li>Latin 1 Windows (cp1252) (and all other cp125? encodings recognized by IANA)</li>
     <li>cp858</li>
     <li>EBCDIC encodings:</li>
      <ul>
       <li>EBCDIC US (ebcdic-cp-us)</li>
       <li>EBCDIC Canada (ebcdic-cp-ca)</li>
       <li>EBCDIC Netherlands (ebcdic-cp-nl)</li>
       <li>EBCDIC Denmark (ebcdic-cp-dk)</li>
       <li>EBCDIC Norway (ebcdic-cp-no)</li>
       <li>EBCDIC Finland (ebcdic-cp-fi)</li>
       <li>EBCDIC Sweden (ebcdic-cp-se)</li>
       <li>EBCDIC Italy (ebcdic-cp-it)</li>
       <li>EBCDIC Spain, Latin America (ebcdic-cp-es)</li>
       <li>EBCDIC Great Britain (ebcdic-cp-gb)</li>
       <li>EBCDIC France (ebcdic-cp-fr)</li>
       <li>EBCDIC Hebrew (ebcdic-cp-he)</li>
       <li>EBCDIC Switzerland (ebcdic-cp-ch)</li>
       <li>EBCDIC Roece (ebcdic-cp-roece)</li>
       <li>EBCDIC Yugoslavia (ebcdic-cp-yu)</li>
       <li>EBCDIC Iceland (ebcdic-cp-is)</li>
       <li>EBCDIC Urdu (ebcdic-cp-ar2)</li>
       <li>Latin 0 EBCDIC</li>
       <li>EBCDIC Arabic (ebcdic-cp-ar1)</li>
      </ul>
    </ul>
    <note>UCS-4 is not yet supported, but it is hoped that support will be available soon.</note>
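    <p>
     To illustrate how one of the encodings listed above is typically
     selected, the sketch below parses a byte stream whose XML declaration
     names ISO-8859-1 and reads the decoded text back. It uses the generic
     JAXP <code>DocumentBuilderFactory</code> entry point rather than a
     Xerces parser class, which is an assumption made for brevity;
     <code>EncodingSketch</code> and <code>rootText</code> are hypothetical
     names.
    </p>
    <source><![CDATA[
```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingSketch {

    /** Parses raw bytes and returns the text content of the root element. */
    public static String rootText(byte[] xmlBytes) throws Exception {
        // The parser picks the decoder from the encoding named in the XML declaration.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; \u00e9 is a single byte (0xE9) in ISO-8859-1.
        String xml = "<?xml version='1.0' encoding='ISO-8859-1'?><word>caf\u00e9</word>";
        byte[] bytes = xml.getBytes("ISO-8859-1");
        System.out.println(rootText(bytes).equals("caf\u00e9"));
    }
}
```
]]></source>
    <p>
     The parser is handed raw bytes, not a <code>Reader</code>, so that the
     parser rather than the caller decodes the input according to the
     declared encoding.
    </p>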
   </a>
  </faq>
 </faqs>