| commit | a1418d9433c6cdd4102b1334a413f1efd49e2507 | |
|---|---|---|
| author | Michael McCandless <mikemccand@apache.org> | Sun Sep 15 15:26:28 2024 -0400 |
| committer | GitHub <noreply@github.com> | Sun Sep 15 15:26:28 2024 -0400 |
| tree | 0eea26e49eb1cff50abea123509bcee7f3b6366c | |
| parent | 111ccf6c30a74e3b257a53fa8703452b44954a9b | |
Remove 8 bit quantization for HNSW/KNN vector indexing (it is buggy today) (#13767)

4 and 7 bit quantization still work.

It's a bit tricky because 9.11 indices may have 8 bit compressed vectors, which are buggy at search time (and users may not realize it, or may not be using them at search time). But the index is still intact, since we keep the original full float precision vectors. So users can force a rewrite of all their 9.11-written segments (or reindex those docs), and can change to 4 or 7 bit quantization for newly indexed documents. The 9.11 index is still usable.

(I added a couple of test cases confirming that one can indeed change their mind, indexing a given vector field first with 4 bit quantization, then later (new IndexWriter / Codec) with 7 bit or with no quantization.)

I added a MIGRATE.md explanation.

Separately, I also tightened up the `compress` boolean to throw an exception unless bits=4. Previously it silently ignored `compress=true` for 7 and 8 bit quantization. I also tried to improve its javadocs a bit.

Closes #13519.
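The argument rules described in this commit message can be illustrated with a small standalone sketch. It is not Lucene's actual code: the class `QuantizationSettings` and its helper `isValid` are hypothetical names invented for this example, and the exception type and messages are assumptions. It only mirrors the stated behavior: 4 and 7 bits are accepted, 8 bits is rejected, and `compress=true` is rejected unless bits=4.

```java
// Hypothetical sketch: this class does not exist in Lucene. It only mirrors
// the validation rules stated in the commit message.
final class QuantizationSettings {
    final int bits;
    final boolean compress;

    QuantizationSettings(int bits, boolean compress) {
        // 8 bit quantization was removed; only 4 and 7 bits remain valid.
        if (bits != 4 && bits != 7) {
            throw new IllegalArgumentException("bits must be 4 or 7; got: " + bits);
        }
        // compress=true is no longer silently ignored: it must be paired with bits=4.
        if (compress && bits != 4) {
            throw new IllegalArgumentException(
                "compress=true is only supported with bits=4; got bits=" + bits);
        }
        this.bits = bits;
        this.compress = compress;
    }

    // Convenience helper (also hypothetical): reports whether a combination is accepted.
    static boolean isValid(int bits, boolean compress) {
        try {
            new QuantizationSettings(bits, compress);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Under these assumptions, `(4, true)`, `(4, false)` and `(7, false)` are accepted, while `(7, true)` and any 8 bit combination throw instead of being silently accepted.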

Apache Lucene is a high-performance, full-featured text search engine library written in Java.
This README file only contains basic setup instructions. For more comprehensive documentation, visit:
gradlew). We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.
Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.
#lucene and #lucene-dev on freenode.net