# Apache Lucene Migration Guide
## Lucene 3.x index format no longer supported
Lucene 5 no longer supports the Lucene 3.x index format. Opening such
indexes will result in `IndexFormatTooOldException`. It is recommended
to either reindex all your data, or upgrade the old indexes with
the `IndexUpgrader` tool of the latest Lucene 4 version (4.10.x).
Those indexes can then be read (see next section) with Lucene 5.
## Support for previous Lucene 4.x index formats moved to new module
Lucene 5 will by default only read indexes created with Lucene 5.
To read and upgrade Lucene 4.x indexes, you must add the
`lucene-backward-codecs.jar` to the classpath. It is recommended
to upgrade the old indexes with the `IndexUpgrader` tool,
so you can remove the backward-codecs module from the classpath.
This will also improve performance.
## All file handling APIs changed to Java 7 NIO.2 (LUCENE-5945)
All APIs around Directory and other file-based resources were changed to make
use of the new Java 7 NIO.2 API. It is no longer possible to pass
java.io.File instances to FSDirectory classes. FSDirectory classes now require
java.nio.file.Path instances. This also makes it possible to place index
directories on "virtual file systems" like ZIP or TAR files. To migrate
existing code, use java.io.File#toPath().
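As a minimal sketch of this migration, assuming legacy code that still holds a java.io.File or a plain path string, the conversion needs only the JDK. The class and method names below (PathMigrationSketch, fromLegacyFile, fromString) are hypothetical and exist only for illustration; the resulting Path is what would then be handed to an FSDirectory class.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of Lucene. */
final class PathMigrationSketch {

  private PathMigrationSketch() {}

  /** Converts a legacy File reference into the Path the NIO.2-based APIs expect. */
  static Path fromLegacyFile(File legacy) {
    return legacy.toPath();
  }

  /** Builds a Path directly from path name components, skipping java.io.File entirely. */
  static Path fromString(String first, String... more) {
    return Paths.get(first, more);
  }
}
```

New code can usually build the Path directly, so the File-based conversion is only needed where a File is handed in from elsewhere.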
In addition, make sure that custom directory implementations throw the new
IOException types (like java.nio.file.NoSuchFileException), because Lucene
no longer recognizes the legacy ones (like java.io.FileNotFoundException).
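A custom implementation that still opens files through legacy java.io streams could translate the exception at the boundary. The following is only a sketch under that assumption: LegacyOpenSketch and its open method are hypothetical names, and it handles only the "file is missing" case.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of Lucene. */
final class LegacyOpenSketch {

  private LegacyOpenSketch() {}

  /** Opens a file with the legacy stream API but reports a missing file the NIO.2 way. */
  static InputStream open(Path path) throws IOException {
    try {
      return new FileInputStream(path.toFile());
    } catch (FileNotFoundException e) {
      // FileNotFoundException is also thrown for other failures (for example a
      // directory or a permission problem), so translate only when the file is
      // really absent and rethrow the original exception otherwise.
      if (Files.notExists(path)) {
        NoSuchFileException translated = new NoSuchFileException(path.toString());
        translated.initCause(e);
        throw translated;
      }
      throw e;
    }
  }
}
```

Code that opens files through java.nio.file.Files receives the new exception types directly, so a translation like this is only needed around legacy java.io calls.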
## Directory and LockFactory APIs restructured (LUCENE-5953)
Locking is now under the responsibility of the Directory implementation.
LockFactory is only used by subclasses of BaseDirectory to delegate locking
to an implementation class. LockFactories are responsible for creating a Lock
on behalf of a BaseDirectory subclass.
Existing code needs the following changes:
- LockFactory implementations are singletons now and have no state. They only
need to implement one method: makeLock(Directory dir, String name).
The passed directory can be used to determine the correct file system path
for the lock file or similar, so the factory knows where to create the lock.
In addition, the factory may use instanceof to check whether it
can be used with the given type of directory at all.
- It was never really supported to place lock files outside of the index
directory and this functionality was removed. If you still rely on this,
you can use the following trick: Use FileSwitchDirectory and delegate
the file extension ".lock" to another Directory instance pointing to
another path. FileSwitchDirectory also delegates lock files based on
the extension.
- If you wrap another directory using FilterDirectory, you cannot make use
of LockFactories anymore, because only BaseDirectory knows about them.
To wrap locking, you must hook into FilterDirectory.makeLock(String name)
and wrap the Lock instance returned, as needed. See MockDirectoryWrapper
in lucene-test-framework for an example.
- It is no longer allowed to pass "null" as LockFactory to FSDirectory
implementations. You have to explicitly pass the platform default to the
directory (currently always NativeFSLockFactory.INSTANCE, but subject to
change!). To get the platform default, call FSLockFactory.getDefault().
## Removed Reader from Tokenizer constructor (LUCENE-5388)
The constructor of Tokenizer no longer takes Reader, as this was a leftover
from before it was reusable. See the org.apache.lucene.analysis package
documentation for more details.
## Refactored Collector API (LUCENE-5299)
The Collector API has been refactored to use a different Collector instance
per segment. It is possible to migrate existing collectors painlessly by
extending SimpleCollector instead of Collector: SimpleCollector is a
specialization of Collector that returns itself as a per-segment Collector.
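The "returns itself" idea can be sketched without Lucene on the classpath. Every type below (SketchCollector, SketchLeafCollector, SketchSimpleCollector, GlobalDocIdCollector) is a hypothetical stand-in written for this illustration and is not the Lucene API; in Lucene 5 the corresponding hooks are Collector.getLeafCollector and SimpleCollector.doSetNextReader, which take a LeafReaderContext rather than the bare docBase assumed here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-segment collector; not the Lucene interface. */
interface SketchLeafCollector {
  void collect(int segmentLocalDoc);
}

/** Hypothetical stand-in for a collector that hands out one collector per segment. */
interface SketchCollector {
  SketchLeafCollector forSegment(int docBase);
}

/** Plays both roles at once and returns itself for every segment. */
abstract class SketchSimpleCollector implements SketchCollector, SketchLeafCollector {

  /** Hook invoked when collection moves on to a new segment. */
  protected void startSegment(int docBase) {}

  @Override
  public final SketchLeafCollector forSegment(int docBase) {
    startSegment(docBase);
    return this;
  }
}

/** Example collector that records index-wide document ids. */
final class GlobalDocIdCollector extends SketchSimpleCollector {
  private final List<Integer> hits = new ArrayList<>();
  private int docBase;

  @Override
  protected void startSegment(int docBase) {
    this.docBase = docBase;
  }

  @Override
  public void collect(int segmentLocalDoc) {
    // Segment-local ids are rebased so that hits from different segments do not collide.
    hits.add(docBase + segmentLocalDoc);
  }

  List<Integer> hits() {
    return hits;
  }
}
```

An existing collector that keeps its state in fields can keep that state unchanged under this pattern; only the per-segment setup moves into the hook.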
## Refactored FieldComparator API (LUCENE-5702)
Like collectors (see above), field comparators have been refactored to
produce a new comparator (called LeafFieldComparator) per segment. It is
possible to migrate existing comparators painlessly by extending
SimpleFieldComparator, which implements both FieldComparator and
LeafFieldComparator and returns itself as a per-segment comparator.
## Removed ChainedFilter (LUCENE-5984)
Users are advised to switch to BooleanFilter instead.
## Removed OpenBitSet (LUCENE-6010)
OpenBitSet only differs from LongBitSet by its ability to grow automatically.
If growth is required, it now needs to be managed externally by the caller.
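What "managed externally" can look like is sketched below with a hypothetical fixed-size bit set. FixedBitsSketch is a stand-in written for this illustration and is not a Lucene class; the assumption is simply that the caller asks for a large enough instance before setting a bit beyond the current size and keeps using whatever instance comes back.

```java
import java.util.Arrays;

/** Hypothetical fixed-size bit set for illustration; not a Lucene class. */
final class FixedBitsSketch {
  private final long[] words;
  private final long numBits;

  FixedBitsSketch(long numBits) {
    this(new long[wordsFor(numBits)], numBits);
  }

  private FixedBitsSketch(long[] words, long numBits) {
    this.words = words;
    this.numBits = numBits;
  }

  private static int wordsFor(long numBits) {
    return (int) ((numBits + 63) >>> 6);
  }

  long length() {
    return numBits;
  }

  void set(long index) {
    checkIndex(index);
    words[(int) (index >>> 6)] |= 1L << (index & 63);
  }

  boolean get(long index) {
    checkIndex(index);
    return (words[(int) (index >>> 6)] & (1L << (index & 63))) != 0;
  }

  private void checkIndex(long index) {
    // No automatic growth: an out-of-range index is an error.
    if (index < 0 || index >= numBits) {
      throw new IndexOutOfBoundsException("index=" + index + ", numBits=" + numBits);
    }
  }

  /** Caller-managed growth: returns the same instance if it is large enough, else a larger copy. */
  static FixedBitsSketch ensureCapacity(FixedBitsSketch bits, long numBits) {
    if (numBits <= bits.numBits) {
      return bits;
    }
    return new FixedBitsSketch(Arrays.copyOf(bits.words, wordsFor(numBits)), numBits);
  }
}
```

The caller has to replace its reference with the returned instance, since growing never changes the size of the original object.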
## FunctionValues.exists() Behavior Changes due to ValueSource bug fixes (LUCENE-5961)
Bugs fixed in several ValueSource functions may result in different behavior in
situations where some documents do not have values for fields wrapped in other
ValueSources. Users who want to preserve the previous behavior may need to wrap
their ValueSources in a "DefFunction" along with a ConstValueSource of "0.0".
## PayloadAttributeImpl.clone() (LUCENE-6055)
PayloadAttributeImpl.clone() did a shallow clone which was incorrect, and was
fixed to do a deep clone. If you require shallow cloning of the underlying bytes,
you should override PayloadAttributeImpl.clone() to do a shallow clone instead.
## Removed out-of-order scoring (LUCENE-6179)
Bulk scorers must now always collect documents in order. If you have custom
collectors, the acceptsDocsOutOfOrder method has been removed and collectors
can safely assume that documents will be collected in order.
## Renamed "Atomic" to "Leaf" for segment readers (LUCENE-5569)
AtomicReader and AtomicReaderContext are now called LeafReader and LeafReaderContext, respectively.
## Removed custom Analyzer per-document indexing APIs from IndexWriter (LUCENE-6212)
These methods were removed because they are dangerous: they let
you analyze each document arbitrarily differently, making it difficult
to properly analyze text at query time and easy to accidentally "lose"
search hits. Instead, you should break out text into separate fields
and use a different analyzer for each field with
PerFieldAnalyzerWrapper.