Stop exploring HNSW graph if scores are not getting better. (#12770)

I noticed while testing lower dimensionality and quantization, we would explore the HNSW graph way too much. I was stuck figuring out why until I noticed the searcher checks for distance equality (not just if the distance is better) when exploring neighbors-of-neighbors. This seems like a bad heuristic, but to double check I looked at what nmslib does. This pointed me back to this commit: nmslib/nmslib#106

Seems like this performance hitch was discovered awhile ago :).

This commit adjusts HNSW to only explore the graph layer if the distance is actually better.
3 files changed
tree: 287b51b7ca0a19b10dc5b5c6893f8725382066f4
  1. .github/
  2. buildSrc/
  3. dev-docs/
  4. dev-tools/
  5. gradle/
  6. help/
  7. lucene/
  8. .asf.yaml
  9. .dir-locals.el
  10. .git-blame-ignore-revs
  11. .gitattributes
  12. .gitignore
  13. .hgignore
  14. .lift.toml
  15. build.gradle
  16. CONTRIBUTING.md
  17. gradlew
  18. gradlew.bat
  19. LICENSE.txt
  20. NOTICE.txt
  21. README.md
  22. settings.gradle
  23. versions.lock
  24. versions.props
README.md

Apache Lucene

Lucene Logo

Apache Lucene is a high-performance, full-featured text search engine library written in Java.

Build Status

Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building

Basic steps:

  1. Install OpenJDK 17 or 18.
  2. Clone Lucene's git repository (or download the source distribution).
  3. Run gradle launcher script (gradlew).

We‘ll assume that you know how to get and set up the JDK - if you don’t, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.

See Contributing Guide for details.

Contributing

Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.

Discussion and Support