| <!DOCTYPE html> |
| <!--[if IE]><![endif]--> |
| <html> |
| |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> |
| <title>Namespace Lucene.Net.Codecs.Bloom | Apache Lucene.NET 4.8.0-beta00011 Documentation</title> |
| <meta name="viewport" content="width=device-width"> |
| <meta name="title" content="Namespace Lucene.Net.Codecs.Bloom | Apache Lucene.NET 4.8.0-beta00011 Documentation"> |
| <meta name="generator" content="docfx 2.56.0.0"> |
| |
| <link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css"> |
| <meta property="docfx:navrel" content="toc.html"> |
| <meta property="docfx:tocrel" content="codecs/toc.html"> |
| |
| <meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/"> |
| |
| </head> |
| <body data-spy="scroll" data-target="#affix" data-offset="120"> |
| <div id="wrapper"> |
| <div role="main" class="container body-content hide-when-search"> |
| |
| <div class="article row grid-right"> |
| <div class="col-md-10"> |
| <article class="content wrap" id="_content" data-uid="Lucene.Net.Codecs.Bloom"> |
| |
| <h1 id="Lucene_Net_Codecs_Bloom" data-uid="Lucene.Net.Codecs.Bloom" class="text-break">Namespace Lucene.Net.Codecs.Bloom |
| </h1> |
| <div class="markdown level0 summary"><!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <p>Codec PostingsFormat for fast access to low-frequency terms such as primary key fields.</p> |
| </div> |
| <div class="markdown level0 conceptual"></div> |
| <div class="markdown level0 remarks"></div> |
| <h3 id="classes">Classes |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.BloomFilterFactory.html">BloomFilterFactory</a></h4> |
| <section><p>Creates index-time <a class="xref" href="Lucene.Net.Codecs.Bloom.FuzzySet.html">FuzzySet</a> instances appropriately configured for |
| each field. It is also called to right-size bitsets for serialization.</p> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section> |
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.BloomFilteringPostingsFormat.html">BloomFilteringPostingsFormat</a></h4> |
| <section><p>A <span class="xref">Lucene.Net.Codecs.PostingsFormat</span> useful for low doc-frequency fields such as primary |
| keys. Bloom filters are maintained in a ".blm" file, which offers "fast-fail" |
| for reads in segments known to have no record of the key. A delegate |
| <span class="xref">Lucene.Net.Codecs.PostingsFormat</span> of your choice records all other postings data.</p> |
| <p>A <a class="xref" href="Lucene.Net.Codecs.Bloom.BloomFilterFactory.html">BloomFilterFactory</a> can be passed to tailor Bloom filter |
| settings on a per-field basis. The default configuration is |
| <a class="xref" href="Lucene.Net.Codecs.Bloom.DefaultBloomFilterFactory.html">DefaultBloomFilterFactory</a>, which allocates a ~8 MB bitset and hashes |
| values using <a class="xref" href="Lucene.Net.Codecs.Bloom.MurmurHash2.html">MurmurHash2</a>. This should be suitable for most purposes.</p> |
| <p>The format of the .blm file is as follows:</p> |
| <ul> |
| <li>BloomFilter (.blm) --&gt; Header, DelegatePostingsFormatName, NumFilteredFields, Filter<sup>NumFilteredFields</sup>, Footer</li> |
| <li>Filter --&gt; FieldNumber, FuzzySet</li> |
| <li>FuzzySet --&gt; See <a class="xref" href="Lucene.Net.Codecs.Bloom.FuzzySet.html#Lucene_Net_Codecs_Bloom_FuzzySet_Serialize_Lucene_Net_Store_DataOutput_">Serialize(DataOutput)</a></li> |
| <li>Header --&gt; CodecHeader (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteHeader_Lucene_Net_Store_DataOutput_System_String_System_Int32_">WriteHeader(DataOutput, String, Int32)</a>)</li> |
| <li>DelegatePostingsFormatName --&gt; String (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteString_System_String_">WriteString(String)</a>). The name of a ServiceProvider registered <span class="xref">Lucene.Net.Codecs.PostingsFormat</span></li> |
| <li>NumFilteredFields --&gt; Uint32 (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteInt32_System_Int32_">WriteInt32(Int32)</a>)</li> |
| <li>FieldNumber --&gt; Uint32 (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.DataOutput.html#Lucene_Net_Store_DataOutput_WriteInt32_System_Int32_">WriteInt32(Int32)</a>). The number of the field in this segment</li> |
| <li>Footer --&gt; CodecFooter (<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Codecs.CodecUtil.html#Lucene_Net_Codecs_CodecUtil_WriteFooter_Lucene_Net_Store_IndexOutput_">WriteFooter(IndexOutput)</a>)</li> |
| </ul> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section> |
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.DefaultBloomFilterFactory.html">DefaultBloomFilterFactory</a></h4> |
| <section><p>The default policy is to allocate a bitset with 10% saturation given a unique term per document. |
| Bits are set via the <a class="xref" href="Lucene.Net.Codecs.Bloom.MurmurHash2.html">MurmurHash2</a> hashing function.</p> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section> |
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.FuzzySet.html">FuzzySet</a></h4> |
| <section><p>A class used to represent a set of many, potentially large, values (e.g. many |
| long strings such as URLs), using a significantly smaller amount of memory.</p> |
| <p>The set is "lossy" in that it cannot definitively state that it does contain |
| a value, but it <em>can</em> definitively say if a value is <em>not</em> in |
| the set. It can therefore be used as a Bloom filter.</p> |
| <p>The set can also be used to perform fuzzy counting, because |
| it can estimate reasonably accurately how many unique values it contains.</p> |
| <p>This class is NOT thread-safe.</p> |
| <p>Internally, a bitset is used to record values. Once a client has finished recording |
| a stream of values, the <a class="xref" href="Lucene.Net.Codecs.Bloom.FuzzySet.html#Lucene_Net_Codecs_Bloom_FuzzySet_Downsize_System_Single_">Downsize(Single)</a> method can be used to create a smaller set that |
| is sized appropriately for the number of values recorded and the desired saturation level.</p> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section> |
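| <section><p>As a rough illustration of the "maybe present / definitely absent" behavior described for FuzzySet above, here is a minimal, self-contained C# sketch of a toy Bloom filter. It is not the Lucene.NET implementation: <code>ToyBloomFilter</code>, <code>Slot</code>, and <code>MightContain</code> are hypothetical names introduced for this example, and a seeded FNV-1a hash stands in for MurmurHash2.</p> |
| <pre><code class="lang-csharp">using System; |
| using System.Collections; |
| using System.Text; |
|  |
| // Toy Bloom filter for illustration only. This is NOT FuzzySet; all names are hypothetical. |
| public sealed class ToyBloomFilter |
| { |
|     private readonly BitArray bits; |
|  |
|     public ToyBloomFilter(int sizeInBits) |
|     { |
|         bits = new BitArray(sizeInBits); |
|     } |
|  |
|     // Seeded FNV-1a over the UTF-8 bytes (an assumed stand-in for MurmurHash2), |
|     // reduced to a bit position inside the bitset. |
|     private int Slot(string value, uint seed) |
|     { |
|         uint hash = 2166136261u ^ seed; |
|         foreach (byte b in Encoding.UTF8.GetBytes(value)) |
|         { |
|             hash = unchecked((hash ^ b) * 16777619u); |
|         } |
|         return (int)(hash % (uint)bits.Length); |
|     } |
|  |
|     // Record a value by setting one bit per hash seed. |
|     public void Add(string value) |
|     { |
|         bits[Slot(value, 0u)] = true; |
|         bits[Slot(value, 0x9E3779B9u)] = true; |
|     } |
|  |
|     // false means "definitely absent"; true means only "maybe present". |
|     public bool MightContain(string value) |
|     { |
|         if (!bits[Slot(value, 0u)]) |
|         { |
|             return false; |
|         } |
|         return bits[Slot(value, 0x9E3779B9u)]; |
|     } |
| } |
|  |
| public static class Program |
| { |
|     public static void Main() |
|     { |
|         var filter = new ToyBloomFilter(1024); |
|         filter.Add("doc-42"); |
|  |
|         // An added key is never reported absent, so this prints True. |
|         Console.WriteLine(filter.MightContain("doc-42")); |
|  |
|         // For a key that was never added, False is a definite answer; |
|         // True would be a false positive caused by colliding bits. |
|         Console.WriteLine(filter.MightContain("doc-99")); |
|     } |
| } |
| </code></pre> |
| <p>The sketch shows why a negative answer can be trusted while a positive one cannot: bits are only ever set, never cleared, so a clear bit proves the value was never added, whereas set bits may belong to other values. A smaller bitset or more recorded values raises the saturation and with it the false-positive rate.</p></section> |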
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.HashFunction.html">HashFunction</a></h4> |
| <section><p>Base class for hashing functions that can be referred to by name. |
| Subclasses are expected to provide thread-safe implementations of the hash function |
| on the range of bytes referenced in the provided <span class="xref">Lucene.Net.Util.BytesRef</span>.</p> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section> |
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.MurmurHash2.html">MurmurHash2</a></h4> |
| <section><p>This is a very fast, non-cryptographic hash suitable for general hash-based |
| lookup. See <a href="http://murmurhash.googlepages.com/">http://murmurhash.googlepages.com/</a> for more details.</p> |
| <p>The C version of MurmurHash 2.0 found at that site was ported to Java by |
| Andrzej Bialecki (ab at getopt org).</p> |
| <p>Mark Harwood adapted the code from getopt.org into the form here, as one of a pluggable choice of |
| hashing functions, because the core function had to work with <span class="xref">Lucene.Net.Util.BytesRef</span>s, which carry offsets and lengths, |
| rather than raw byte arrays.</p> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section> |
| <h3 id="enums">Enums |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Codecs.Bloom.FuzzySet.ContainsResult.html">FuzzySet.ContainsResult</a></h4> |
| <section></section> |
| </article> |
| </div> |
| |
| <div class="hidden-sm col-md-2" role="complementary"> |
| <div class="sideaffix"> |
| <nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix"> |
| </nav> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <footer> |
| <div class="grad-bottom"></div> |
| <div class="footer"> |
| <div class="container"> |
| Copyright © 2020 Licensed to the Apache Software Foundation (ASF) |
| |
| </div> |
| </div> |
| </footer> |
| </div> |
| |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script> |
| </body> |
| </html> |