 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
                       "http://www.w3.org/TR/WD-html-in-xml/DTD/xhtml1-strict.dtd">
 <HTML>
  <HEAD>
   <TITLE>Xerces 2 | Issues</TITLE>
   <LINK href="css/site.css" rel="stylesheet" type="text/css">
  </HEAD>
  <BODY>
   <SPAN class="netscape">
    <A name="TOP"></A><H1>Implementation Issues</H1>
    <A name></A>
    <H2>Open Issues</H2>
    <A name="I1"></A>
    <H3>Entity Management &amp; Readers    (I1)   </H3>
    <P>
     <TABLE border="0" cellspacing="5">
      <TR>
       <TH>Originator:</TH>
       <TD><A href="mailto:andyc@apache.org">Andy Clark</A></TD>
      </TR>
      <TR>
       <TH>Details:</TH>
       <TD>The heart of parsing XML documents is the interaction
        between the various scanners and the document input stream.
        Because this is the parser's critical path, performance is
        an important consideration. At the same time, the
        interaction between scanner and reader should remain simple and
        straightforward.</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>Entity stack management. The scanners can push a reader onto
        the stack when an entity reference is seen, but who pops the
        reader?</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>Who scans the XMLDecl and TextDecl lines: the scanner(s)
        or the entity reader?</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>Changing the input stream reader based on the encoding
        specified in the XMLDecl or TextDecl lines.</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>How do we handle the pushback buffer needed at certain
        times during parsing while keeping delegation to a
        minimum?</TD>
      </TR>
     </TABLE>
    </P>
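    <P>
     The entity stack question above can be made concrete with a small
     sketch. It shows only one possible answer, under the assumption that
     the stack itself pops a reader when that reader reaches end of
     stream, so the scanners never pop explicitly. The class
     EntityReaderStack and its methods are hypothetical names chosen for
     illustration; they are not taken from the Xerces code base, and the
     sketch ignores encodings, pushback, and XMLDecl or TextDecl handling.
    </P>
    <PRE>
```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical sketch: a reader stack that pops itself at end of entity. */
final class EntityReaderStack {
    // A fixed maximum nesting depth keeps the sketch short (an assumption).
    private final Reader[] stack = new Reader[16];
    private int depth = 0;

    /** Called by a scanner when it sees an entity reference. */
    void push(Reader reader) {
        if (depth == stack.length) {
            throw new IllegalStateException("entity nesting too deep");
        }
        stack[depth++] = reader;
    }

    /** Returns the next character, or -1 once every entity is exhausted. */
    int read() {
        try {
            while (depth != 0) {
                int c = stack[depth - 1].read();
                if (c != -1) {
                    return c;
                }
                // End of the current entity: the stack pops the reader,
                // so the scanner resumes in the enclosing entity.
                stack[--depth].close();
                stack[depth] = null;
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        EntityReaderStack in = new EntityReaderStack();
        in.push(new StringReader("ab"));     // document entity
        System.out.print((char) in.read());  // scanner consumes 'a'
        in.push(new StringReader("XY"));     // entity reference seen here
        StringBuilder rest = new StringBuilder();
        for (int c = in.read(); c != -1; c = in.read()) {
            rest.append((char) c);
        }
        System.out.println(rest);            // prints "aXYb" in total
    }
}
```
    </PRE>
    <P>
     Popping inside read() keeps the scanner loop free of stack
     bookkeeping, but it also hides entity boundaries from the scanner,
     which may matter wherever a construct must begin and end in the same
     entity. Whether that trade-off is acceptable is part of the open
     issue.
    </P>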
    <A name="I2"></A>
    <H3>Parser Pipeline Construction    (I2)   </H3>
    <P>
     <TABLE border="0" cellspacing="5">
      <TR>
       <TH>Originator:</TH>
       <TD><A href="mailto:estaub@mediaone.net">Ed Staub</A></TD>
      </TR>
      <TR>
       <TH>Details:</TH>
       <TD>The parser is designed as a pipeline, and its components
        need to be connected and configured.</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>Is there a generic way to assemble components into a
        pipeline? Users who want to construct a new parser object
        with a different pipeline configuration should be able to
        do so easily.</TD>
      </TR>
     </TABLE>
    </P>
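    <P>
     As an illustration of what generic pipeline wiring might look like,
     the following sketch links stages through one shared interface and a
     single connect helper. All names here (PipelineStage, NamedStage,
     Pipeline.connect) are hypothetical and exist only for this example;
     events are plain strings so that the sketch stays self-contained.
    </P>
    <PRE>
```java
/** Hypothetical stage contract; not an actual Xerces interface. */
interface PipelineStage {
    void setNext(PipelineStage next);  // downstream stage, or null if last
    void handle(String event);         // receive an event and forward it
}

/** A trivial stage that records what it saw and passes the event on. */
final class NamedStage implements PipelineStage {
    private final String name;
    private final StringBuilder log;
    private PipelineStage next;

    NamedStage(String name, StringBuilder log) {
        this.name = name;
        this.log = log;
    }

    public void setNext(PipelineStage next) {
        this.next = next;
    }

    public void handle(String event) {
        log.append(name).append(':').append(event).append(' ');
        if (next != null) {
            next.handle(event);
        }
    }
}

final class Pipeline {
    /** Links the stages in the order given and returns the head stage. */
    static PipelineStage connect(PipelineStage... stages) {
        if (stages.length == 0) {
            throw new IllegalArgumentException("no stages");
        }
        for (int i = 0; i + 1 != stages.length; i++) {
            stages[i].setNext(stages[i + 1]);
        }
        stages[stages.length - 1].setNext(null);
        return stages[0];
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        PipelineStage head = connect(
            new NamedStage("scanner", log),
            new NamedStage("validator", log),
            new NamedStage("builder", log));
        head.handle("startElement");
        // prints: scanner:startElement validator:startElement builder:startElement
        System.out.println(log.toString().trim());
    }
}
```
    </PRE>
    <P>
     With this shape, a different configuration is just a different
     argument list to connect, for example one that omits the validator
     stage. Per-component configuration is not addressed by the sketch.
    </P>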
    <A name="I3"></A>
    <H3>Operations on XML Infoset    (I3)   </H3>
    <P>
     <TABLE border="0" cellspacing="5">
      <TR>
       <TH>Originator:</TH>
       <TD><A href="mailto:estaub@mediaone.net">Ed Staub</A></TD>
      </TR>
      <TR>
       <TH>Details:</TH>
       <TD>Many emerging XML standards involve operations on the
        XML Infoset. XInclude is one such standard; it defines
        how XML inclusion operates on the
        Infoset.</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>When should these operations take place in the parser
        pipeline: before or after validation? If they take place after
        validation, what does it mean for the parser to say that a
        document is &quot;valid&quot;? The inclusion could cause the
        document to become invalid. This is more than an
        implementation issue; it is a standards issue.<BR>
        <STRONG>Comment: </STRONG>If XInclude processing ran
        after validation, the validator would see the
        unexpanded XInclude. In other words, the schema or DTD would
        describe the original, unexpanded xinclude:href attributes, and so
        forth.</TD>
      </TR>
      <TR>
       <TH>Problem:</TH>
       <TD>Should operations on the XML Infoset even be in the
        parser pipeline?</TD>
      </TR>
     </TABLE>
    </P>
   </SPAN>
   <A name="BOTTOM"></A>
   <HR>
   <SPAN class="netscape">      Last updated on $Date: 2000/09/15
    23:53:06 $</SPAN>
  </BODY>
 </HTML>