Part of fix for XALANJ-2294 and XALANJ-2295.

Each KeyIndex for a particular key or for the id function was set up with a
Hashtable mapping Strings to nodes.  However, the set of nodes returned is
supposed to contain only those in the same input document as the context node.
The code was accepting nodes as node IDs and putting them all into the same
table, so node IDs from different documents were being mixed together.  Fixed
this by adding another Hashtable that maps the root node of a document to the
Hashtable that maps Strings to node handles in that document.  This affects
insertion of nodes into the KeyIndex (in add) and look-up of nodes for patterns
(in containsID, containsKey, getDOMNodeById).
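
The two-level table described above can be sketched roughly as follows.  This
is only an illustration, not the KeyIndex code: the class name
PerDocumentIndexSketch, its method names, and the use of HashMap and plain int
node handles are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Xalan.
public class PerDocumentIndexSketch {
    // Outer map: root node handle of a document -> that document's index.
    // Inner map: key/id string value -> node handles with that value.
    private final Map<Integer, Map<String, List<Integer>>> rootToIndex =
        new HashMap<>();

    // Record a node under a value, in the table of its own document only.
    public void add(int root, String value, int node) {
        rootToIndex
            .computeIfAbsent(root, r -> new HashMap<>())
            .computeIfAbsent(value, v -> new ArrayList<>())
            .add(node);
    }

    // Return the nodes for a value, restricted to the document with this root.
    public List<Integer> lookup(int root, String value) {
        Map<String, List<Integer>> index = rootToIndex.get(root);
        if (index == null) {
            return Collections.emptyList();
        }
        List<Integer> nodes = index.get(value);
        return nodes == null ? Collections.<Integer>emptyList() : nodes;
    }

    // Pattern-style check: is this node indexed under the value in the
    // document identified by root?
    public boolean contains(int root, String value, int node) {
        return lookup(root, value).contains(node);
    }
}
```

With this shape, the same string value registered in two documents yields two
separate node lists, so a look-up from one document never sees the other's
nodes.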

Previously, generated byte code handled a reference to the key or id function
by cloning an IntegerArray containing the first set of nodes and merging in
each subsequent set of nodes retrieved.  The generated code also contained any
looping code required to iterate over the nodes in a node set that appeared in
a call to key or id.  The effect of all this was that every node in the
resulting node set was processed at least once, regardless of whether all the
nodes returned were actually used; they might not be if a positional predicate
is used, for instance.

The old KeyIndex.lookupId and KeyIndex.lookupKey are now deprecated, but
preserved for any previously compiled translets.  Instead, new code will use
the KeyIndex.getKeyIndexIterator methods to get an iterator that will return the
nodes for a particular reference to the key or id function.

The iterator returned by getKeyIndexIterator is an instance of an inner class -
KeyIndex.KeyIndexIterator - which extends the new MultiValuedNodeHeapIterator.
Each node in the heap refers to an IntegerArray that contains the nodes for
one of the key values or id values that were looked up.  The iterator is
sensitive to the context node (or more importantly, the root of the context
node) and retrieves node handles for the function reference lazily to avoid
unnecessarily greedy and potentially duplicate processing of the nodes.
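
As a rough illustration of the lazy heap-based merge, the sketch below pulls
one node at a time from several per-value node lists.  It is not the
MultiValuedNodeHeapIterator code: the class LazyMergeSketch, the END marker,
and the assumptions that each list is sorted in ascending order and holds
non-negative handles are all made up for the example.

```java
import java.util.PriorityQueue;

// Hypothetical sketch; none of these names come from Xalan.
public class LazyMergeSketch {
    // Returned when no nodes remain (assumes real handles are non-negative).
    public static final int END = -1;

    // One heap entry: a sorted node list plus the current position in it.
    private static final class Cursor {
        final int[] nodes;
        int pos;
        Cursor(int[] nodes) { this.nodes = nodes; }
        int current() { return nodes[pos]; }
    }

    // Min-heap ordered by the node each cursor currently points at.
    private final PriorityQueue<Cursor> heap =
        new PriorityQueue<>((a, b) -> Integer.compare(a.current(), b.current()));
    private int last = END;

    public LazyMergeSketch(int[][] sortedLists) {
        for (int[] list : sortedLists) {
            if (list.length > 0) {
                heap.add(new Cursor(list));
            }
        }
    }

    // Return the next distinct node in ascending order, or END.  Only as much
    // of each list is consumed as the caller actually asks for.
    public int next() {
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            int node = c.current();
            c.pos++;
            if (c.pos < c.nodes.length) {
                heap.add(c);          // re-insert keyed on its new current node
            }
            if (node != last) {       // skip a node already returned
                last = node;
                return node;
            }
        }
        return END;
    }
}
```

A caller that needs only the first node, as with a positional predicate, calls
next() once and never touches the remaining entries.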

Also includes a fix for XALANJ-2292.  The byte code generation assumed that if
the second argument to a reference to the key function was not a node set or a
string, it had to be converted to a string.  However, if the argument is
a parameter whose value is a node set, all the nodes in the node set should
play a role in computing the result of the function, not just the first.

The KeyIndex.KeyIndexIterator is responsible for processing the argument to the
key or id function, iterating over nodes in a node set if required, rather than
leaving that responsibility to generated byte code, because we don't generally
know whether the argument will be an iterator.
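
The run-time decision described above might look something like the following
sketch.  It is an assumption-laden illustration rather than the
KeyIndexIterator code: KeyArgumentSketch and lookupValues are hypothetical
names, an Iterable stands in for a node set, and String.valueOf stands in for
taking a node's string value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Xalan.
public class KeyArgumentSketch {
    // Decide at run time how to turn the key/id argument into look-up values.
    public static List<String> lookupValues(Object arg) {
        List<String> values = new ArrayList<>();
        if (arg instanceof Iterable) {
            // Node-set-like argument: every member contributes a value,
            // not just the first.
            for (Object node : (Iterable<?>) arg) {
                values.add(String.valueOf(node));
            }
        } else {
            // Any other argument: convert it to a single string value.
            values.add(String.valueOf(arg));
        }
        return values;
    }
}
```

Each value returned would then be looked up in the index, and the resulting
node lists combined by the iterator.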

Reviewed by Christine Li (jycli () ca ! ibm ! com)

1 file changed