LUCENE-9300: Fix field infos update on doc values update (#1394)

Today a doc values update creates a new field infos file that contains the original field infos updated for the new generation as well as the new fields created by the doc values update.

However, existing fields are cloned through the global fields (shared in the index writer) instead of the local ones (present in the segment).
In practice this is not an issue, since field numbers are shared between segments created by the same index writer.
But this assumption doesn't hold for segments created by different writers and added through IndexWriter#addIndexes(Directory).
In this case, the field number of the same field can differ between segments, so any doc values update can corrupt the index
by assigning the wrong field number to an existing field in the next generation.

When this happens, queries and merges can access the wrong fields without throwing any error, silently corrupting the index.

This change ensures that we preserve local field numbers when creating
a new field infos generation.
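
The mismatch described above can be illustrated with a small, self-contained sketch. It does not use any Lucene classes: the class `FieldNumberSketch`, the helper `nextGeneration`, and the field names `id` and `price` are all hypothetical, and plain maps stand in for the segment-local and writer-global field numbers. It only models the idea that renumbering an existing field from the global mapping can point it at another field's data.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldNumberSketch {

    /**
     * Hypothetical stand-in for building the next field infos generation:
     * every field already present in the segment is re-registered, taking
     * its number from the supplied numberSource mapping.
     */
    static Map<String, Integer> nextGeneration(Map<String, Integer> segmentFields,
                                               Map<String, Integer> numberSource) {
        Map<String, Integer> next = new HashMap<>();
        for (String name : segmentFields.keySet()) {
            next.put(name, numberSource.get(name));
        }
        return next;
    }

    public static void main(String[] args) {
        // Segment created by another writer and added via addIndexes:
        // its local numbering differs from the current writer's global one.
        Map<String, Integer> local = Map.of("id", 0, "price", 1);
        Map<String, Integer> global = Map.of("price", 0, "id", 1);

        // Data inside the segment is addressed by the local field number.
        Map<Integer, String> dataByNumber = Map.of(0, "id-data", 1, "price-data");

        // Taking numbers from the global mapping silently resolves "price" to the wrong data.
        Map<String, Integer> buggy = nextGeneration(local, global);
        System.out.println(dataByNumber.get(buggy.get("price"))); // prints "id-data"

        // Preserving the local numbers keeps "price" pointing at its own data.
        Map<String, Integer> fixed = nextGeneration(local, local);
        System.out.println(dataByNumber.get(fixed.get("price"))); // prints "price-data"
    }
}
```

No exception is raised in the first case, which mirrors why the corruption goes unnoticed: the lookup succeeds, it just returns another field's data.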
3 files changed

Apache Lucene and Solr

Apache Lucene is a high-performance, full-featured text search engine library written in Java.

Apache Solr is an enterprise search platform written using Apache Lucene. Major features include full-text search, index replication and sharding, and result faceting and highlighting.

Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building Lucene/Solr

(You do not need to do this if you downloaded a pre-built package.)

Lucene and Solr are built using Apache Ant. To build Lucene and Solr, run:

ant compile

If you see an error about Ivy missing while invoking Ant (e.g., .ant/lib does not exist), run ant ivy-bootstrap and retry.

Sometimes you may face issues with Ivy (e.g., an incompletely downloaded artifact). Cleaning up the Ivy cache and retrying is a workaround for most such issues:

rm -rf ~/.ivy2/cache

The Solr server can then be packaged and prepared for startup by running the following command from the solr/ directory:

ant server

Running Solr

After building Solr, the server can be started using the bin/solr control scripts. Solr can be run in either standalone or distributed (SolrCloud) mode.

To run Solr in standalone mode, run the following command from the solr/ directory:

bin/solr start

To run Solr in SolrCloud mode, run the following command from the solr/ directory:

bin/solr start -c

The bin/solr control script allows heavy modification of the started Solr. Common options are described in some detail in solr/README.txt. For an exhaustive treatment of options, run bin/solr start -h from the solr/ directory.

Development/IDEs

Ant can be used to generate project files compatible with most common IDEs. Run the ant command corresponding to your IDE of choice before attempting to import Lucene/Solr.

  • Eclipse - ant eclipse
  • IntelliJ - ant idea
  • Netbeans - ant netbeans

Running Tests

The standard test suite can be run with the command:

ant test

Like Solr itself, test running can be customized or tailored in a number of ways. For an exhaustive discussion of the options available, run:

ant test-help

Contributing

Please review the Contributing to Solr Guide for information on contributing.

Discussion and Support