| <!DOCTYPE html> |
| <!--[if IE]><![endif]--> |
| <html> |
| |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> |
| <title>Namespace Lucene.Net.Misc |
| | Apache Lucene.NET 4.8.0-beta00011 Documentation </title> |
| <meta name="viewport" content="width=device-width"> |
| <meta name="title" content="Namespace Lucene.Net.Misc |
| | Apache Lucene.NET 4.8.0-beta00011 Documentation "> |
| <meta name="generator" content="docfx 2.56.0.0"> |
| |
| <link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css"> |
| <meta property="docfx:navrel" content="toc.html"> |
| <meta property="docfx:tocrel" content="misc/toc.html"> |
| |
| <meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/"> |
| |
| </head> |
| <body data-spy="scroll" data-target="#affix" data-offset="120"> |
| <div id="wrapper"> |
| <header> |
| |
| <nav id="autocollapse" class="navbar ng-scope" role="navigation"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| |
| <a class="navbar-brand" href="/"> |
| <img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt=""> |
| </a> |
| </div> |
| <div class="collapse navbar-collapse" id="navbar"> |
| <form class="navbar-form navbar-right" role="search" id="search"> |
| <div class="form-group"> |
| <input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off"> |
| </div> |
| </form> |
| </div> |
| </div> |
| </nav> |
| |
| <div class="subnav navbar navbar-default"> |
| <div class="container hide-when-search"> |
| <ul class="level0 breadcrumb"> |
| <li> |
| <a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a> |
| <span id="breadcrumb"> |
| <ul class="breadcrumb"> |
| <li></li> |
| </ul> |
| </span> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </header> |
| <div class="container body-content"> |
| |
| <div id="search-results"> |
| <div class="search-list"></div> |
| <div class="sr-items"> |
| <p><i class="glyphicon glyphicon-refresh index-loading"></i></p> |
| </div> |
| <ul id="pagination"></ul> |
| </div> |
| </div> |
| <div role="main" class="container body-content hide-when-search"> |
| |
| <div class="sidenav hide-when-search"> |
| <a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a> |
| <div class="sidetoggle collapse" id="sidetoggle"> |
| <div id="sidetoc"></div> |
| </div> |
| </div> |
| <div class="article row grid-right"> |
| <div class="col-md-10"> |
| <article class="content wrap" id="_content" data-uid="Lucene.Net.Misc"> |
| |
| <h1 id="Lucene_Net_Misc" data-uid="Lucene.Net.Misc" class="text-break">Namespace Lucene.Net.Misc |
| </h1> |
| <div class="markdown level0 summary"><!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <h2 id="misc-tools">Misc Tools</h2> |
| <p>The misc package has various tools for splitting/merging indices, |
| changing norms, finding high freq terms, and others.</p> |
| <h2 id="nativeunixdirectory">NativeUnixDirectory</h2> |
| <p><strong>NOTE</strong>: This uses C++ sources (accessible via JNI), which you'll |
| have to compile on your platform.</p> |
| <p><tt>Lucene.Net.Store.NativeUnixDirectory</tt> is a Directory implementation that bypasses the |
| OS's buffer cache (using direct IO) for any IndexInput and IndexOutput |
| used during merging of segments larger than a specified size (default |
| 10 MB). This avoids evicting hot pages that are still in use for |
| searching, keeping search more responsive while large merges run.</p> |
| <p>See <a href="http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html">this blog post</a> |
| for details.</p> |
| <p>Steps to build:</p> |
| <ul> |
| <li><tt>cd lucene/misc/</tt> |
| |
| </li> |
| <li><p>To compile NativePosixUtil.cpp -&gt; libNativePosixUtil.so, run <tt>ant build-native-unix</tt>.</p> |
| </li> |
| <li><p><tt>libNativePosixUtil.so</tt> will be located in the <tt>lucene/build/native/</tt> folder.</p> |
| </li> |
| <li><p>Make sure libNativePosixUtil.so is on your LD_LIBRARY_PATH so Java can find it (something like <tt>export LD_LIBRARY_PATH=/path/to/dir:$LD_LIBRARY_PATH</tt>, where /path/to/dir contains libNativePosixUtil.so).</p> |
| </li> |
| <li><p>Run <tt>ant jar</tt> to compile the Java source, and put the resulting JAR on your CLASSPATH.</p> |
| </li> |
| </ul> |
| <p>NativePosixUtil.cpp/java also expose access to the posix_madvise, |
| madvise, and posix_fadvise functions, which are somewhat more |
| cross-platform than O_DIRECT. However, in testing (see the blog post linked above), these |
| APIs did not seem to help prevent buffer cache eviction.</p> |
| </div> |
| <div class="markdown level0 conceptual"></div> |
| <div class="markdown level0 remarks"></div> |
| <h3 id="classes">Classes |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Misc.GetTermInfo.html">GetTermInfo</a></h4> |
| <section><p>Utility to get document frequency and total number of occurrences (sum of the tf for each doc) of a term. </p> |
| </section> |
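<p>To make the two statistics concrete, here is a minimal sketch that computes them over a toy in-memory corpus. It does not use the Lucene.NET API; the class name <tt>TermInfoSketch</tt>, its methods, and the sample documents are all assumptions made for illustration.</p>

```java
import java.util.List;

// Toy illustration only; this is not the Lucene.NET API.
final class TermInfoSketch {
    // Document frequency: number of documents containing the term at least once.
    static int docFreq(List<List<String>> docs, String term) {
        int df = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) {
                df++;
            }
        }
        return df;
    }

    // Total occurrences of the term: the sum of its per-document tf.
    static long totalTermFreq(List<List<String>> docs, String term) {
        long total = 0;
        for (List<String> doc : docs) {
            for (String token : doc) {
                if (token.equals(term)) {
                    total++;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical corpus: each inner list is one tokenized document.
        List<List<String>> docs = List.of(
            List.of("lucene", "index", "lucene"),
            List.of("search", "index"),
            List.of("lucene"));
        System.out.println(docFreq(docs, "lucene"));       // prints 2
        System.out.println(totalTermFreq(docs, "lucene")); // prints 3
    }
}
```

<p>A term that appears twice in one document adds 1 to the document frequency but 2 to the total number of occurrences, which is why the two values can differ.</p>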
| <h4><a class="xref" href="Lucene.Net.Misc.HighFreqTerms.html">HighFreqTerms</a></h4> |
| <section><p>The <a class="xref" href="Lucene.Net.Misc.HighFreqTerms.html">HighFreqTerms</a> class extracts the top n most frequent terms |
| (by document frequency) from an existing Lucene index and reports their |
| document frequency.</p> |
| <p> |
| If the -t flag is given, both document frequency and total tf (total |
| number of occurrences) are reported, ordered by descending total tf.</p> |
| </section> |
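<p>The two orderings described for HighFreqTerms can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: <tt>TopTermsSketch</tt>, <tt>Stat</tt>, and <tt>topTerms</tt> are hypothetical names, and the sketch simply sorts the whole list, whereas a real tool might use a bounded priority queue.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy illustration only; the class and method names are hypothetical.
final class TopTermsSketch {
    // Minimal stand-in for a term together with its two statistics.
    static final class Stat {
        final String term;
        final int docFreq;
        final long totalTermFreq;

        Stat(String term, int docFreq, long totalTermFreq) {
            this.term = term;
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    // Returns up to n terms (n must be non-negative), ordered by descending
    // total tf when byTotalTf is true, otherwise by descending document frequency.
    static List<String> topTerms(List<Stat> stats, int n, boolean byTotalTf) {
        Comparator<Stat> order;
        if (byTotalTf) {
            order = Comparator.comparingLong((Stat s) -> s.totalTermFreq);
        } else {
            order = Comparator.comparingInt((Stat s) -> s.docFreq);
        }
        List<Stat> sorted = new ArrayList<>(stats);
        sorted.sort(order.reversed());
        List<String> result = new ArrayList<>();
        for (Stat s : sorted.subList(0, Math.min(n, sorted.size()))) {
            result.add(s.term);
        }
        return result;
    }
}
```

<p>With hypothetical statistics such as ("the", 10, 12), ("lucene", 4, 40), and ("index", 7, 9), the top two by document frequency are "the" and "index", while the top two by total tf are "lucene" and "the".</p>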
| <h4><a class="xref" href="Lucene.Net.Misc.HighFreqTerms.DocFreqComparer.html">HighFreqTerms.DocFreqComparer</a></h4> |
| <section><p>Compares terms by <a class="xref" href="Lucene.Net.Misc.TermStats.html#Lucene_Net_Misc_TermStats_DocFreq">DocFreq</a></p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Misc.HighFreqTerms.TotalTermFreqComparer.html">HighFreqTerms.TotalTermFreqComparer</a></h4> |
| <section><p>Compares terms by <a class="xref" href="Lucene.Net.Misc.TermStats.html#Lucene_Net_Misc_TermStats_TotalTermFreq">TotalTermFreq</a> </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Misc.IndexMergeTool.html">IndexMergeTool</a></h4> |
| <section><p>Merges indices specified on the command line into the index |
| specified as the first command line argument.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Misc.SweetSpotSimilarity.html">SweetSpotSimilarity</a></h4> |
| <section><p> |
| A similarity with a lengthNorm that provides for a "plateau" of |
| equally good lengths, and tf helper functions. |
| </p> |
| <p> |
| For lengthNorm, a min/max can be specified to define the |
| plateau of lengths that should all have a norm of 1.0. |
| Below the min and above the max, the lengthNorm drops off |
| following a sqrt function. |
| </p> |
| <p> |
| For tf, baselineTf and hyperbolicTf functions are provided, which |
| subclasses can choose between. |
| </p> |
| </section> |
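<p>One function with the plateau-and-sqrt shape described for lengthNorm is sketched below. It is a hedged illustration rather than the library's code: the class name <tt>PlateauNormSketch</tt> and the <tt>steepness</tt> parameter are assumptions that do not appear in the text above.</p>

```java
// Toy illustration only; parameter names are assumptions, not the Lucene.NET API.
final class PlateauNormSketch {
    // Returns 1.0 for any length in [min, max]. Outside that range the value
    // decays as 1/sqrt(steepness * overshoot + 1), where overshoot is twice
    // the distance from the length to the nearer edge of the plateau.
    static double lengthNorm(int numTerms, int min, int max, double steepness) {
        int overshoot = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return 1.0 / Math.sqrt(steepness * overshoot + 1.0);
    }
}
```

<p>For example, with min = 3, max = 5, and steepness = 0.5, lengths 3, 4, and 5 all score 1.0, while lengths 1 and 7 lie equally far outside the plateau and therefore receive the same reduced norm.</p>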
| <h4><a class="xref" href="Lucene.Net.Misc.TermStats.html">TermStats</a></h4> |
| <section><p>Holder for a term along with its statistics |
| (<a class="xref" href="Lucene.Net.Misc.TermStats.html#Lucene_Net_Misc_TermStats_DocFreq">DocFreq</a> and <a class="xref" href="Lucene.Net.Misc.TermStats.html#Lucene_Net_Misc_TermStats_TotalTermFreq">TotalTermFreq</a>).</p> |
| </section> |
| </article> |
| </div> |
| |
| <div class="hidden-sm col-md-2" role="complementary"> |
| <div class="sideaffix"> |
| <nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix"> |
| <!-- <p><a class="back-to-top" href="#top">Back to top</a><p> --> |
| </nav> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <footer> |
| <div class="grad-bottom"></div> |
| <div class="footer"> |
| <div class="container"> |
| <span class="pull-right"> |
| <a href="#top">Back to top</a> |
| </span> |
| Copyright © 2020 Licensed to the Apache Software Foundation (ASF) |
| |
| </div> |
| </div> |
| </footer> |
| </div> |
| |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script> |
| </body> |
| </html> |