libclucene-v2.4.11
[Improvement](index compaction) Improve index compaction perf by priority queue (#59)

This pull request improves index compaction performance by using a priority queue to drive the merge. It changes IndexWriter.cpp, SegmentMergeInfo.cpp, and _SegmentMergeInfo.h. The key modifications are:

1. Replacing the previous approach for finding the smallest term and constructing a dest_idx_bitmap with a more efficient priority-queue-based approach.
2. Introducing a new postingQueue class to manage the document merging process.
3. Implementing a new DestDoc struct to store relevant information for each document in the merging process.
4. Refactoring the mergeTerms() method to use the new priority-queue-based approach.
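
The general idea behind items 1 and 4 can be illustrated with a minimal, self-contained sketch of a k-way merge over sorted term lists. This is not the code from the pull request: SegmentCursor, CursorGreater, MergedTerm, and mergeSortedTerms are hypothetical names chosen for illustration, and the sketch assumes each segment exposes its terms as a sorted list of unique strings. It only shows why a priority queue helps: the smallest pending term is always at the top of the queue, so no linear scan over all segments is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical cursor over one segment's sorted term list (not a CLucene type).
struct SegmentCursor {
    const std::vector<std::string>* terms; // assumed sorted ascending, no duplicates
    std::size_t pos;                       // index of the current term
    std::size_t segment;                   // index of the segment being read
    const std::string& current() const { return (*terms)[pos]; }
};

// Comparator that keeps the cursor with the smallest current term on top.
struct CursorGreater {
    bool operator()(const SegmentCursor& a, const SegmentCursor& b) const {
        if (a.current() != b.current()) return a.current() > b.current();
        return a.segment > b.segment; // tie-break by segment index for a stable order
    }
};

// One merged term together with the segments that contained it.
struct MergedTerm {
    std::string term;
    std::vector<std::size_t> segments;
};

// Merges the term lists of all segments in ascending term order.
std::vector<MergedTerm> mergeSortedTerms(
        const std::vector<std::vector<std::string>>& segments) {
    std::priority_queue<SegmentCursor, std::vector<SegmentCursor>, CursorGreater> queue;
    for (std::size_t i = 0; i < segments.size(); ++i) {
        if (!segments[i].empty()) queue.push(SegmentCursor{&segments[i], 0, i});
    }

    std::vector<MergedTerm> merged;
    while (!queue.empty()) {
        SegmentCursor top = queue.top();
        queue.pop();
        assert(top.pos < top.terms->size());

        // Start a new output entry when the term differs from the previous one.
        if (merged.empty() || merged.back().term != top.current()) {
            merged.push_back(MergedTerm{top.current(), {}});
        }
        merged.back().segments.push_back(top.segment);

        // Advance the cursor and re-queue it if the segment has more terms.
        if (++top.pos < top.terms->size()) queue.push(top);
    }
    return merged;
}
```

With k segments and N terms in total, each push or pop costs O(log k), so the merge runs in O(N log k) rather than the O(N k) of rescanning every segment for the smallest term.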

These changes improve performance during index compaction, which shortens indexing time and reduces resource usage.

Co-authored-by: airborne12 <airborne12@gmail.com>
3 files changed
tree: 3f6f21083227ce32a7ffabc60de6a3e249c87653
  1. .github/
  2. cmake/
  3. doc/
  4. src/
  5. .clang-format
  6. APACHE.license
  7. AUTHORS
  8. ChangeLog
  9. CMakeLists.txt
  10. COPYING
  11. dist-test.sh
  12. INSTALL
  13. LGPL.license
  14. NEWS
  15. README
  16. README.md
  17. README.PACKAGE
  18. REQUESTS
README.md

CLucene README

CLucene is a C++ port of Lucene, a high-performance, full-featured text search engine. Because it is written in C++, CLucene aims to be faster than Java Lucene.

CLucene has contributions from many people; see AUTHORS.

CLucene is distributed under the GNU Lesser General Public License (LGPL) or the Apache License, Version 2.0. See LGPL.license and APACHE.license for the respective license information. Read COPYING for more about the license.

Installation

Read the INSTALL file

Mailing List

Questions and discussion should be directed to the CLucene mailing list at clucene-developers@lists.sourceforge.net

Subscription instructions are at http://lists.sourceforge.net/lists/listinfo/clucene-developers

Suggestions and bug reports can be made in our bug tracking database (http://sourceforge.net/tracker/?group_id=80013&atid=558446).

The latest version

Details of the latest version can be found on the CLucene SourceForge project web site: http://www.sourceforge.net/projects/clucene

Documentation

You can build your own documentation by running ‘make DoxygenDoc’ from your ‘out-of-source’ cmake-configured build directory.

CLucene is a very close port of Java Lucene, so you can also consult the Java docs at http://lucene.apache.org/java/

There is also an online version, which may be less up to date than one you build yourself, at http://clucene.sourceforge.net/doc/html/

Acknowledgments

The Apache Lucene project is the basis for this software, so the biggest acknowledgment goes to that project.

We wish to acknowledge the following copyrighted works that make up portions of the CLucene software:

This software contains code derived from the RSA Data Security Inc. MD5 Message-Digest Algorithm.

CLucene relies heavily on the use of cmake to provide a stable build environment.