commit | 6c5dcc1795057f604fa43b9715dcf83b7c8ddb5c | |
---|---|---|
author | Kaival Parikh <46070017+kaivalnp@users.noreply.github.com> | Wed Dec 13 20:41:45 2023 +0530 |
committer | GitHub <noreply@github.com> | Wed Dec 13 10:11:45 2023 -0500 |
tree | d3f888c64ed4a0f9fbd797c5aecfb593eaa97559 | |
parent | 98d2df17d593a17f833eb8172b285cc0298b4aad | |
Fix failing BaseVectorSimilarityQueryTestCase#testApproximate (#12922)

Discovered in #12921, and introduced in #12679.

The first issue is that we weren't advancing the `VectorScorer` [here](https://github.com/apache/lucene/blob/cf13a9295052288b748ed8f279f05ee26f3bfd5f/lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java#L257-L262) -- so it was still un-positioned while trying to compute the similarity score.

Earlier in the PR, the underlying delegate of the `FilteredDocIdSetIterator` was `scorer.iterator()` (see [here](https://github.com/apache/lucene/blob/cad565439be512ac6e95a698007b1fc971173f00/lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java#L107)) -- so we didn't need to explicitly advance it.

Later, we decided to maintain parity with `AbstractKnnVectorQuery` and introduce filtering in `AbstractVectorSimilarityQuery` (see [this commit](https://github.com/apache/lucene/commit/5096790f281e477c529a7c8311aeb353ccdffdeb)) to determine the `visitLimit` of approximate search -- after which the underlying iterator changed to the accepted docs (see [here](https://github.com/apache/lucene/blob/5096790f281e477c529a7c8311aeb353ccdffdeb/lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java#L255)) and I missed advancing the `VectorScorer` explicitly.

After doing so, we no longer get the original `java.lang.ArrayIndexOutOfBoundsException` -- but `BaseVectorSimilarityQueryTestCase#testApproximate` starts failing because it falls back to exact search, as the limit of the prefilter is met during graph search.

Relaxed the parameters of the test to fix this (making the filter less restrictive, and trying to visit fewer nodes so that approximate search completes without hitting its limit).

Sorry for missing this earlier!
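To illustrate the first issue, here is a minimal, self-contained sketch of the "advance the scorer before scoring" pattern. It does not use Lucene's classes: `ToyVectorScorer`, `scoreAccepted`, and the `advanceScorer` flag are hypothetical names invented for this example. It also assumes that every doc has a vector and that an un-positioned scorer sits at doc `-1`. Whether the actual exception in Lucene arose through exactly this mechanism is an assumption.

```java
final class AdvanceBeforeScoreSketch {

  /** Hypothetical stand-in for a vector scorer; this is NOT Lucene's VectorScorer. */
  static final class ToyVectorScorer {
    private final float[] similarities; // precomputed similarity for each doc ID
    private int doc = -1;               // -1 means "not positioned on any doc yet"

    ToyVectorScorer(float[] similarities) {
      this.similarities = similarities;
    }

    /** Positions the scorer on the target doc (every doc has a vector in this toy). */
    int advance(int target) {
      doc = target;
      return doc;
    }

    /** Scores the current doc; throws ArrayIndexOutOfBoundsException if never advanced. */
    float score() {
      return similarities[doc];
    }
  }

  /**
   * Iterates over the accepted (filtered) doc IDs and scores each one.
   * When advanceScorer is false, the scorer is left un-positioned, which
   * mimics the kind of bug described in the commit message.
   */
  static float[] scoreAccepted(float[] similarities, int[] acceptedDocs, boolean advanceScorer) {
    ToyVectorScorer scorer = new ToyVectorScorer(similarities);
    float[] scores = new float[acceptedDocs.length];
    for (int i = 0; i < acceptedDocs.length; i++) {
      int doc = acceptedDocs[i];
      if (advanceScorer) {
        scorer.advance(doc); // keep the scorer in step with the accepted-docs iterator
      }
      scores[i] = scorer.score();
    }
    return scores;
  }

  public static void main(String[] args) {
    float[] sims = {0.1f, 0.9f, 0.5f, 0.7f};
    System.out.println(java.util.Arrays.toString(scoreAccepted(sims, new int[] {1, 3}, true)));
  }
}
```

In this sketch, passing `false` for `advanceScorer` makes the first `score()` call index the array at `-1` and fail. When the iteration is driven by the scorer's own iterator, no explicit advance is needed, which is consistent with the earlier state of the PR described above.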
Apache Lucene is a high-performance, full-featured text search engine library written in Java.
This README file only contains basic setup instructions. For more comprehensive documentation, visit:
Run the Gradle launcher script (`gradlew`).

We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.
See Contributing Guide for details.
Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.
IRC: `#lucene` and `#lucene-dev` on freenode.net