
LUCENE-2380: FieldCache.getStrings/getStringIndex --> FieldCache.getTerms/getTermsIndex

* The field values returned when sorting by SortField.STRING are now
  BytesRef. You can call value.utf8ToString() to convert back to a
  String, if necessary.

* In FieldCache, getStrings (returning String[]) has been replaced
  with getTerms (returning a FieldCache.DocTerms instance).
  DocTerms provides a getTerm method that takes a docID and a
  BytesRef to fill (which must not be null), and fills it in with a
  reference to the bytes for that term.

  If you had code like this before:

    String[] values = FieldCache.DEFAULT.getStrings(reader, field);
    ...
    String aValue = values[docID];

  you can do this instead:

    DocTerms values = FieldCache.DEFAULT.getTerms(reader, field);
    ...
    BytesRef term = new BytesRef();
    String aValue = values.getTerm(docID, term).utf8ToString();

  Note however that it can be costly to convert to String, so it's
  better to work directly with the BytesRef.

* Similarly, in FieldCache, getStringIndex (returning a StringIndex
  instance, with direct arrays int[] order and String[] lookup) has
  been replaced with getTermsIndex (returning a
  FieldCache.DocTermsIndex instance). DocTermsIndex provides the
  getOrd(int docID) method to look up the int order for a document,
  lookup(int ord, BytesRef reuse) to look up the term for a given
  order, and the sugar method getTerm(int docID, BytesRef reuse),
  which internally calls getOrd and then lookup.

  If you had code like this before:

    StringIndex idx = FieldCache.DEFAULT.getStringIndex(reader, field);
    ...
    int ord = idx.order[docID];
    String aValue = idx.lookup[ord];

  you can do this instead:

    DocTermsIndex idx = FieldCache.DEFAULT.getTermsIndex(reader, field);
    ...
    int ord = idx.getOrd(docID);
    BytesRef term = new BytesRef();
    String aValue = idx.lookup(ord, term).utf8ToString();

  Note however that it can be costly to convert to String, so it's
  better to work directly with the BytesRef.

  DocTermsIndex also has a getTermsEnum() method, which returns an
  iterator (TermsEnum) over the term values in the index (ie, it
  iterates ord = 0..numOrd()-1).
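
  For example, here is a minimal sketch of stepping through every
  unique term via that enum (this assumes the returned enum supports
  ord(), which the ord-based index implies):

    TermsEnum te = idx.getTermsEnum();
    BytesRef termText;
    while((termText = te.next()) != null) {
      // ord is this term's position in the index's sorted term list
      System.out.println("ord=" + te.ord() + " term=" + termText.utf8ToString());
    }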

* StringComparatorLocale is now even more CPU costly than before,
  since it converts BytesRef -> String on the fly (it was already
  very CPU costly because it does not compare using indexed
  collation keys; use CollationKeyFilter for better performance).
  Also, the field values returned when sorting by SortField.STRING
  are now BytesRef.

* FieldComparator.StringOrdValComparator has been renamed to
  TermOrdValComparator, and now uses BytesRef for its values.
  Likewise for StringValComparator, renamed to TermValComparator.
  This means that when sorting by SortField.STRING or
  SortField.STRING_VAL (or directly invoking these comparators), the
  values returned in the FieldDoc.fields array will be BytesRef, not
  String. You can call the .utf8ToString() method on the BytesRef
  instances, if necessary.
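
  For example, a sketch of reading such a sort value back (topDocs
  is assumed to come from a search sorted by SortField.STRING; the
  value may be null for documents missing the field):

    FieldDoc fd = (FieldDoc) topDocs.scoreDocs[0];
    BytesRef sortValue = (BytesRef) fd.fields[0];
    String s = sortValue == null ? null : sortValue.utf8ToString();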



LUCENE-1458, LUCENE-2111: Flexible Indexing

Flexible indexing changed the low-level fields/terms/docs/positions
enumeration APIs. Here are the major changes:

* Terms are now binary in nature (arbitrary byte[]), represented
  by the BytesRef class (which provides an offset + length "slice"
  into an existing byte[]).
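
  For example (a minimal sketch; the String-taking constructor is a
  convenience that allocates a new UTF-8 byte[]):

    BytesRef whole = new BytesRef("hello world");     // new UTF-8 byte[] copy
    BytesRef slice = new BytesRef(whole.bytes, 6, 5); // a view onto "world"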

* Fields are separately enumerated (FieldsEnum) from the terms
  within each field (TermsEnum). So instead of this:

    TermEnum termsEnum = ...;
    while(termsEnum.next()) {
      Term t = termsEnum.term();
      System.out.println("field=" + t.field() + "; text=" + t.text());
    }

  do this:

    FieldsEnum fieldsEnum = ...;
    String field;
    while((field = fieldsEnum.next()) != null) {
      TermsEnum termsEnum = fieldsEnum.terms();
      BytesRef text;
      while((text = termsEnum.next()) != null) {
        System.out.println("field=" + field + "; text=" + text.utf8ToString());
      }
    }

* TermDocs is renamed to DocsEnum. Instead of this:

    while(td.next()) {
      int doc = td.doc();
      ...
    }

  do this:

    int doc;
    while((doc = td.nextDoc()) != DocsEnum.NO_MORE_DOCS) {
      ...
    }

  Instead of this:

    if (td.skipTo(target)) {
      int doc = td.doc();
      ...
    }

  do this:

    if ((doc = td.advance(target)) != DocsEnum.NO_MORE_DOCS) {
      ...
    }

  The bulk read API has also changed. Instead of this:

    int[] docs = new int[256];
    int[] freqs = new int[256];

    while(true) {
      int count = td.read(docs, freqs);
      if (count == 0) {
        break;
      }
      // use docs[i], freqs[i]
    }

  do this:

    DocsEnum.BulkReadResult bulk = td.getBulkResult();
    while(true) {
      int count = td.read();
      if (count == 0) {
        break;
      }
      // use bulk.docs.ints[i] and bulk.freqs.ints[i]
    }

* TermPositions is renamed to DocsAndPositionsEnum, and no longer
  extends the docs-only enumerator (DocsEnum).

* Deleted docs are no longer implicitly filtered from
  docs/positions enums. Instead, you pass a Bits
  skipDocs (set bits are skipped) when obtaining the enums. Also,
  you can now ask a reader for its deleted docs.

* The docs/positions enums cannot seek to a term. Instead,
  TermsEnum is able to seek, and then you request the
  docs/positions enum from that TermsEnum.
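
  For example, a sketch (skipDocs, obtained as shown further below,
  may be null when the reader has no deletions):

    if (termsEnum.seek(new BytesRef("lucene")) == TermsEnum.SeekStatus.FOUND) {
      DocsEnum docsEnum = termsEnum.docs(skipDocs, null);
      ...
    }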

* TermsEnum's seek method returns more information. So instead of
  this:

    Term t;
    TermEnum termEnum = reader.terms(t);
    if (t.equals(termEnum.term())) {
      ...
    }

  do this:

    TermsEnum termsEnum = ...;
    BytesRef text;
    if (termsEnum.seek(text) == TermsEnum.SeekStatus.FOUND) {
      ...
    }

  SeekStatus also contains END (enumerator is done) and NOT_FOUND
  (term was not found but the enumerator is now positioned to the
  next term).

* TermsEnum has an ord() method, returning the long numeric
  ordinal (ie, first term is 0, next is 1, and so on) for the term
  it's currently positioned to. There is also a corresponding
  seek(long ord) method. Note that these methods are optional; in
  particular the MultiFields TermsEnum does not implement them.
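
  For example, a sketch (this assumes the underlying implementation
  supports ords; the optional methods may throw
  UnsupportedOperationException otherwise):

    termsEnum.seek(42L);              // jump straight to the 43rd term
    long ord = termsEnum.ord();       // 42
    BytesRef term = termsEnum.term(); // the term's bytes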


How you obtain the enums has changed. The primary entry point is
the Fields class. If you know your reader is a single segment
reader, do this:

    Fields fields = reader.fields();
    if (fields != null) {
      ...
    }

If the reader might be multi-segment, you must do this:

    Fields fields = MultiFields.getFields(reader);
    if (fields != null) {
      ...
    }

The fields may be null (eg if the reader has no fields).

Note that the MultiFields approach entails a performance hit on
MultiReaders, as it must merge terms/docs/positions on the fly. It's
generally better to instead get the sequential readers (use
oal.util.ReaderUtil) and then step through those readers yourself,
if you can (this is how Lucene drives searches).

If you pass a SegmentReader to MultiFields.getFields it will simply
return reader.fields(), so there is no performance hit in that
case.

Once you have a non-null Fields you can do this:

    Terms terms = fields.terms("field");
    if (terms != null) {
      ...
    }

The terms may be null (eg if the field does not exist).

Once you have a non-null Terms you can get an enum like this:

    TermsEnum termsEnum = terms.iterator();

The returned TermsEnum will not be null.

You can then .next() through the TermsEnum, or seek. If you want a
DocsEnum, do this:

    Bits skipDocs = MultiFields.getDeletedDocs(reader);
    DocsEnum docsEnum = null;

    docsEnum = termsEnum.docs(skipDocs, docsEnum);

You can pass in a prior DocsEnum and it will be reused if possible.

Likewise for DocsAndPositionsEnum.
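
For example, a sketch for positions (this assumes the field was
indexed with positions, and that DocsAndPositionsEnum iterates docs
with the same nextDoc()/freq() style as DocsEnum):

    DocsAndPositionsEnum postings = termsEnum.docsAndPositions(skipDocs, null);
    if (postings != null) {
      while(postings.nextDoc() != DocsEnum.NO_MORE_DOCS) {
        for(int i = 0; i < postings.freq(); i++) {
          int position = postings.nextPosition();
          ...
        }
      }
    }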

IndexReader has several sugar methods (which just go through the
above steps, under the hood). Instead of:

    Term t;
    TermDocs termDocs = reader.termDocs();
    termDocs.seek(t);

do this:

    String field;
    BytesRef text;
    DocsEnum docsEnum = reader.termDocsEnum(reader.getDeletedDocs(), field, text);

Likewise for DocsAndPositionsEnum.

* LUCENE-2600: remove IndexReader.isDeleted

  Instead of IndexReader.isDeleted, do this:

    import org.apache.lucene.util.Bits;
    import org.apache.lucene.index.MultiFields;

    Bits delDocs = MultiFields.getDeletedDocs(indexReader);
    // delDocs is null when the reader has no deletions:
    if (delDocs != null && delDocs.get(docID)) {
      // document is deleted...
    }

* LUCENE-2674: A new idfExplain method was added to Similarity that
  accepts an incoming docFreq. If you subclass Similarity, make sure
  you also override this method on upgrade; otherwise your
  customizations won't run for certain MultiTermQuerys.

* LUCENE-2413: Lucene's core and contrib analyzers, along with Solr's analyzers,
  were consolidated into modules/analysis. During the refactoring some
  package names have changed:
  - o.a.l.analysis.KeywordAnalyzer -> o.a.l.analysis.core.KeywordAnalyzer
  - o.a.l.analysis.KeywordTokenizer -> o.a.l.analysis.core.KeywordTokenizer
  - o.a.l.analysis.LetterTokenizer -> o.a.l.analysis.core.LetterTokenizer
  - o.a.l.analysis.LowerCaseFilter -> o.a.l.analysis.core.LowerCaseFilter
  - o.a.l.analysis.LowerCaseTokenizer -> o.a.l.analysis.core.LowerCaseTokenizer
  - o.a.l.analysis.SimpleAnalyzer -> o.a.l.analysis.core.SimpleAnalyzer
  - o.a.l.analysis.StopAnalyzer -> o.a.l.analysis.core.StopAnalyzer
  - o.a.l.analysis.StopFilter -> o.a.l.analysis.core.StopFilter
  - o.a.l.analysis.WhitespaceAnalyzer -> o.a.l.analysis.core.WhitespaceAnalyzer
  - o.a.l.analysis.WhitespaceTokenizer -> o.a.l.analysis.core.WhitespaceTokenizer
  - o.a.l.analysis.PorterStemFilter -> o.a.l.analysis.en.PorterStemFilter
  - o.a.l.analysis.ASCIIFoldingFilter -> o.a.l.analysis.miscellaneous.ASCIIFoldingFilter
  - o.a.l.analysis.ISOLatin1AccentFilter -> o.a.l.analysis.miscellaneous.ISOLatin1AccentFilter
  - o.a.l.analysis.KeywordMarkerFilter -> o.a.l.analysis.miscellaneous.KeywordMarkerFilter
  - o.a.l.analysis.LengthFilter -> o.a.l.analysis.miscellaneous.LengthFilter
  - o.a.l.analysis.PerFieldAnalyzerWrapper -> o.a.l.analysis.miscellaneous.PerFieldAnalyzerWrapper
  - o.a.l.analysis.TeeSinkTokenFilter -> o.a.l.analysis.sinks.TeeSinkTokenFilter
  - o.a.l.analysis.CharFilter -> o.a.l.analysis.charfilter.CharFilter
  - o.a.l.analysis.BaseCharFilter -> o.a.l.analysis.charfilter.BaseCharFilter
  - o.a.l.analysis.MappingCharFilter -> o.a.l.analysis.charfilter.MappingCharFilter
  - o.a.l.analysis.NormalizeCharMap -> o.a.l.analysis.charfilter.NormalizeCharMap
  - o.a.l.analysis.CharArraySet -> o.a.l.analysis.util.CharArraySet
  - o.a.l.analysis.CharArrayMap -> o.a.l.analysis.util.CharArrayMap
  - o.a.l.analysis.ReusableAnalyzerBase -> o.a.l.analysis.util.ReusableAnalyzerBase
  - o.a.l.analysis.StopwordAnalyzerBase -> o.a.l.analysis.util.StopwordAnalyzerBase
  - o.a.l.analysis.WordListLoader -> o.a.l.analysis.util.WordListLoader
  - o.a.l.analysis.CharTokenizer -> o.a.l.analysis.util.CharTokenizer
  - o.a.l.util.CharacterUtils -> o.a.l.analysis.util.CharacterUtils

* LUCENE-2514: The option to use a Collator's order (instead of binary order) for
  sorting and range queries has been moved to contrib/queries.

  The collated TermRangeQuery/Filter has been moved to SlowCollatedTermRangeQuery/Filter,
  and the collated sorting has been moved to SlowCollatedStringComparator.

  Note: this functionality isn't very scalable; if you are using it, consider
  indexing collation keys with the collation support in the analysis module instead.

  To perform collated range queries, use a suitable collating analyzer (CollationKeyAnalyzer
  or ICUCollationKeyAnalyzer), and set qp.setAnalyzeRangeTerms(true).

  TermRangeQuery and TermRangeFilter now work purely on bytes. Both have helper factory methods
  (newStringRange), similar to the NumericRange API, to easily perform range queries on Strings.
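
  For example, a sketch of the new factory method (binary,
  non-collated order):

    TermRangeQuery q = TermRangeQuery.newStringRange(
        "field", "apple", "banana", true, true);  // inclusive bounds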

* LUCENE-2691: The near-real-time API has moved from IndexWriter to
  IndexReader. Instead of IndexWriter.getReader(), call
  IndexReader.open(IndexWriter) or IndexReader.reopen(IndexWriter).
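
  For example, a sketch of the usual open/reopen cycle (standard
  reopen semantics assumed: a new reader is returned only if
  something changed, and the old one must then be closed):

    IndexReader nrtReader = IndexReader.open(writer);
    // ... index more documents with writer ...
    IndexReader newReader = nrtReader.reopen(writer);
    if (newReader != nrtReader) {
      nrtReader.close();
      nrtReader = newReader;
    }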

* LUCENE-2690: MultiTermQuery boolean rewrites per segment.
  Also, MultiTermQuery.getTermsEnum() now takes an AttributeSource. FuzzyTermsEnum
  is both a consumer and a producer of attributes: MTQ.BoostAttribute is
  added to the FuzzyTermsEnum, and MTQ's rewrite mode consumes it.
  In the other direction, MTQ.TopTermsBooleanQueryRewrite supplies a
  global AttributeSource to each segment's TermsEnum. The TermsEnum is the
  consumer and gets the current minimum competitive boost
  (MTQ.MaxNonCompetitiveBoostAttribute).

* LUCENE-2374: The backwards layer in AttributeImpl was removed. To support correct
  reflection of AttributeImpl instances, where the reflection was previously done by
  parsing the deprecated toString() output, you now have to override reflectWith()
  to customize the output. toString() is no longer implemented by AttributeImpl, so
  if you have overridden toString(), port your customization over to reflectWith().
  reflectAsString() then returns what toString() did before.
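
  For example, a sketch of a custom attribute implementation (the
  attribute interface and field names here are illustrative):

    public final class MyAttributeImpl extends AttributeImpl implements MyAttribute {
      private int value;
      @Override
      public void reflectWith(AttributeReflector reflector) {
        // replaces the old toString()-based reflection
        reflector.reflect(MyAttribute.class, "value", value);
      }
      ...
    }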

* LUCENE-2236, LUCENE-2912: DefaultSimilarity can no longer be set statically
  (and dangerously) for the entire JVM.
  Instead, IndexWriterConfig and IndexSearcher now take a SimilarityProvider.
  Similarity can now be configured on a per-field basis.
  Similarity retains only the field-specific relevance methods such as tf() and idf().
  Previously some (but not all) of these methods, such as computeNorm and scorePayload,
  took the field as a parameter; this is removed because the entire Similarity (all
  methods) can now be configured per-field.
  Methods that apply to the entire query, such as coord() and queryNorm(), exist in
  SimilarityProvider.
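
  For example, a sketch of a per-field provider (this assumes
  SimilarityProvider exposes coord()/queryNorm() plus a per-field
  lookup, whose name is assumed here to be get(String)):

    SimilarityProvider provider = new SimilarityProvider() {
      private final Similarity defaultSim = new DefaultSimilarity();
      public float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
      }
      public float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
      }
      public Similarity get(String field) {
        return defaultSim; // switch on the field name for per-field behavior
      }
    };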

* LUCENE-1076: TieredMergePolicy is now the default merge policy.
  It's able to merge non-contiguous segments; this may cause problems
  for applications that rely on Lucene's internal document ID
  assignment. If so, you should instead use LogByteSize/DocMergePolicy
  during indexing.
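
  For example, a sketch of keeping the old contiguous-merge behavior
  (the Version constant is illustrative):

    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
    iwc.setMergePolicy(new LogByteSizeMergePolicy());
    IndexWriter writer = new IndexWriter(dir, iwc);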

* LUCENE-2883: Lucene's o.a.l.search.function ValueSource based functionality was consolidated
  into module/queries along with Solr's similar functionality. The following classes were moved:
  - o.a.l.search.function.CustomScoreQuery -> o.a.l.queries.CustomScoreQuery
  - o.a.l.search.function.CustomScoreProvider -> o.a.l.queries.CustomScoreProvider
  - o.a.l.search.function.NumericIndexDocValueSource -> o.a.l.queries.function.valuesource.NumericIndexDocValueSource
  The following lists the replacement classes for those removed:
  - o.a.l.search.function.ByteFieldSource -> o.a.l.queries.function.valuesource.ByteFieldSource
  - o.a.l.search.function.DocValues -> o.a.l.queries.function.DocValues
  - o.a.l.search.function.FieldCacheSource -> o.a.l.queries.function.valuesource.FieldCacheSource
  - o.a.l.search.function.FieldScoreQuery -> o.a.l.queries.function.FunctionQuery
  - o.a.l.search.function.FloatFieldSource -> o.a.l.queries.function.valuesource.FloatFieldSource
  - o.a.l.search.function.IntFieldSource -> o.a.l.queries.function.valuesource.IntFieldSource
  - o.a.l.search.function.OrdFieldSource -> o.a.l.queries.function.valuesource.OrdFieldSource
  - o.a.l.search.function.ReverseOrdFieldSource -> o.a.l.queries.function.valuesource.ReverseOrdFieldSource
  - o.a.l.search.function.ShortFieldSource -> o.a.l.queries.function.valuesource.ShortFieldSource
  - o.a.l.search.function.ValueSource -> o.a.l.queries.function.ValueSource
  - o.a.l.search.function.ValueSourceQuery -> o.a.l.queries.function.FunctionQuery

* LUCENE-2392: Enable flexible scoring:

  The existing "Similarity" API is now TFIDFSimilarity; if you were extending
  Similarity before, you should likely extend this instead.

  Weight.normalize no longer takes a norm value that incorporates the top-level
  boost from outer queries such as BooleanQuery; instead, it takes two parameters,
  the outer boost (topLevelBoost) and the norm. Weight.sumOfSquaredWeights has
  been renamed to Weight.getValueForNormalization().

  The scorePayload method now takes a BytesRef. It is never null.
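
  For example, a sketch of a custom TF function (DefaultSimilarity
  extends TFIDFSimilarity, so small tweaks can start from there):

    public class MySimilarity extends DefaultSimilarity {
      @Override
      public float tf(float freq) {
        // sublinear tf instead of the default sqrt(freq)
        return freq > 0 ? 1 + (float) Math.log(freq) : 0;
      }
    }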

* LUCENE-3283: Lucene's core o.a.l.queryParser QueryParsers have been consolidated into module/queryparser,
  where other QueryParsers from the codebase will also be placed. The following classes were moved:
  - o.a.l.queryParser.CharStream -> o.a.l.queryparser.classic.CharStream
  - o.a.l.queryParser.FastCharStream -> o.a.l.queryparser.classic.FastCharStream
  - o.a.l.queryParser.MultiFieldQueryParser -> o.a.l.queryparser.classic.MultiFieldQueryParser
  - o.a.l.queryParser.ParseException -> o.a.l.queryparser.classic.ParseException
  - o.a.l.queryParser.QueryParser -> o.a.l.queryparser.classic.QueryParser
  - o.a.l.queryParser.QueryParserBase -> o.a.l.queryparser.classic.QueryParserBase
  - o.a.l.queryParser.QueryParserConstants -> o.a.l.queryparser.classic.QueryParserConstants
  - o.a.l.queryParser.QueryParserTokenManager -> o.a.l.queryparser.classic.QueryParserTokenManager
  - o.a.l.queryParser.QueryParserToken -> o.a.l.queryparser.classic.Token
  - o.a.l.queryParser.QueryParserTokenMgrError -> o.a.l.queryparser.classic.TokenMgrError



* LUCENE-2308: Separate IndexableFieldType from Field instances

  With this change, the indexing details (indexed, tokenized, norms,
  indexOptions, stored, etc.) are moved into a separate FieldType
  instance (rather than being stored directly on the Field).

  This means you can create the IndexableFieldType instance once, up front,
  for a given field, and then re-use that instance whenever you instantiate
  the Field.

  Certain field types are pre-defined since they are common cases:

  * StringField: indexes a String value as a single token (ie, does
    not tokenize). This field turns off norms and indexes only doc
    IDs (it does not index term frequency or positions). This field
    does not store its value, but exposes TYPE_STORED as well.

  * BinaryField: a byte[] value that's only stored.

  * TextField: indexes and tokenizes a String, Reader or TokenStream
    value, without term vectors. This field does not store its value,
    but exposes TYPE_STORED as well.

  If your usage fits one of those common cases you can simply
  instantiate the above class. To use the TYPE_STORED variant, do this
  instead:

    Field f = new Field("field", StringField.TYPE_STORED, "value");

  Alternatively, if an existing type is close to what you want but you
  need to make a few changes, you can copy that type and make changes:

    FieldType bodyType = new FieldType(TextField.TYPE_STORED);
    bodyType.setStoreTermVectors(true);

  You can of course also create your own FieldType from scratch:

    FieldType t = new FieldType();
    t.setIndexed(true);
    t.setStored(true);
    t.setOmitNorms(true);
    t.setIndexOptions(IndexOptions.DOCS_AND_FREQS);

  FieldType has a freeze() method to prevent further changes.
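
  For example, a sketch (the exception thrown on post-freeze changes
  is assumed to be a runtime exception):

    FieldType bodyType = new FieldType(TextField.TYPE_STORED);
    bodyType.setStoreTermVectors(true);
    bodyType.freeze();
    // bodyType.setStored(false); // would now throw at runtime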

  When migrating from the 3.x API, if you did this before:

    new Field("field", "value", Field.Store.NO, Field.Index.NOT_ANALYZED_NO_NORMS)

  you can now do this:

    new StringField("field", "value")

  (though note that StringField indexes DOCS_ONLY).

  If instead the value was stored:

    new Field("field", "value", Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS)

  you can now do this:

    new Field("field", StringField.TYPE_STORED, "value")

  If you didn't omit norms:

    new Field("field", "value", Field.Store.YES, Field.Index.NOT_ANALYZED)

  you can now do this:

    FieldType ft = new FieldType(StringField.TYPE_STORED);
    ft.setOmitNorms(false);
    new Field("field", ft, "value")

  If you did this before (value can be String or Reader):

    new Field("field", value, Field.Store.NO, Field.Index.ANALYZED)

  you can now do this:

    new TextField("field", value)

  If instead the value was stored:

    new Field("field", value, Field.Store.YES, Field.Index.ANALYZED)

  you can now do this:

    new Field("field", TextField.TYPE_STORED, value)

  If in addition you omit norms:

    new Field("field", value, Field.Store.YES, Field.Index.ANALYZED_NO_NORMS)

  you can now do this:

    FieldType ft = new FieldType(TextField.TYPE_STORED);
    ft.setOmitNorms(true);
    new Field("field", ft, value)

  If you did this before (bytes is a byte[]):

    new Field("field", bytes)

  you can now do this:

    new BinaryField("field", bytes)