<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Index.Memory
| Apache Lucene.NET 4.8.0-beta00010 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Index.Memory
| Apache Lucene.NET 4.8.0-beta00010 Documentation ">
<meta name="generator" content="docfx 2.56.0.0">
<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
<meta property="docfx:navrel" content="toc.html">
<meta property="docfx:tocrel" content="memory/toc.html">
<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="/">
<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search">
<ul class="level0 breadcrumb">
<li>
<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
<span id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</span>
</li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Index.Memory">
<h1 id="Lucene_Net_Index_Memory" data-uid="Lucene.Net.Index.Memory" class="text-break">Namespace Lucene.Net.Index.Memory
</h1>
<div class="markdown level0 summary"><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>High-performance single-document main memory Apache Lucene fulltext search index.</p>
</div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Index.Memory.MemoryIndex.html">MemoryIndex</a></h4>
<section><p>High-performance single-document main memory Apache Lucene fulltext search index. </p>
<h4>Overview</h4>
<p>This class is a replacement/substitute for a large subset of
<a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a> functionality. It is designed to
enable maximum efficiency for on-the-fly matchmaking combining structured and
fuzzy fulltext search in realtime streaming applications such as Nux XQuery based XML
message queues, publish-subscribe systems for Blogs/newsfeeds, text chat, data acquisition and
distribution systems, application level routers, firewalls, classifiers, etc.
Rather than targeting fulltext search of infrequent queries over huge persistent
data archives (historic search), this class targets fulltext search of huge
numbers of queries over comparatively small transient realtime data (prospective
search).
For example, as in:</p>
<pre><code>float score = Search(string text, Query query)</code></pre>
<p>
Each instance can hold at most one Lucene &quot;document&quot;, with a document containing
zero or more &quot;fields&quot;, each field having a name and a fulltext value. The
fulltext value is tokenized (split and transformed) into zero or more index terms
(aka words) on <code>AddField()</code>, according to the policy implemented by an
Analyzer. For example, Lucene analyzers can split on whitespace, normalize to lower case
for case insensitivity, ignore common terms with little discriminatory value such as &quot;he&quot;, &quot;in&quot;, &quot;and&quot; (stop
words), reduce the terms to their natural linguistic root form such as &quot;fishing&quot;
being reduced to &quot;fish&quot; (stemming), resolve synonyms/inflexions/thesauri
(upon indexing and/or querying), etc. For details, see
<a target="_blank" href="http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html">Lucene Analyzer Intro</a>.
</p>
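<p>
To make the analysis steps above concrete, here is a minimal sketch that uses only the .NET base class library. It is not how Lucene.NET implements analysis: <code>ToyAnalyzer</code>, its stop word list, and its suffix-stripping rule are hypothetical stand-ins for a real Analyzer.
</p>
<pre><code>using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an Analyzer; not part of Lucene.NET.
public static class ToyAnalyzer
{
    // Illustrative stop word list (an assumption, not Lucene's default set).
    private static readonly HashSet&lt;string&gt; StopWords =
        new HashSet&lt;string&gt; { &quot;he&quot;, &quot;in&quot;, &quot;and&quot; };

    public static List&lt;string&gt; Tokenize(string text)
    {
        return text
            // A null separator array makes Split break on whitespace.
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Select(t =&gt; t.ToLowerInvariant())     // normalize case
            .Where(t =&gt; !StopWords.Contains(t))    // drop stop words
            .Select(t =&gt; Stem(t))                  // crude stemming
            .ToList();
    }

    // Naive stemmer: strips a trailing &quot;ing&quot; from longer terms only.
    // Real stemmers handle far more cases.
    private static string Stem(string term)
    {
        return term.Length &gt; 5 &amp;&amp; term.EndsWith(&quot;ing&quot;, StringComparison.Ordinal)
            ? term.Substring(0, term.Length - 3)
            : term;
    }
}</code></pre>
<p>
Under these assumptions, <code>ToyAnalyzer.Tokenize(&quot;He went Fishing in Alaska&quot;)</code> yields the terms <code>went</code>, <code>fish</code>, and <code>alaska</code>.
</p>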
<p>
Arbitrary Lucene queries can be run against this class; see the Lucene Query Syntax section of the classic QueryParser package documentation
as well as <a target="_blank" href="http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html">Query Parser Rules</a>.
Note that a Lucene query selects on the field names and associated (indexed)
tokenized terms, not on the original fulltext(s) - the latter are not stored
but rather thrown away immediately after tokenization.
</p>
<p>
For some interesting background information on search technology, see Bob Wyman&apos;s
<a target="_blank" href="http://bobwyman.pubsub.com/main/2005/05/mary_hodder_poi.html">Prospective Search</a>,
Jim Gray&apos;s
<a target="_blank" href="http://www.acmqueue.org/modules.php?name=Content&amp;pa=showpage&amp;pid=293&amp;page=4">
A Call to Arms - Custom subscriptions</a>, and Tim Bray&apos;s
<a target="_blank" href="http://www.tbray.org/ongoing/When/200x/2003/07/30/OnSearchTOC">On Search, the Series</a>.
</p>
<h4>Example Usage</h4>
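<pre><code>// Requires: using System; using Lucene.Net.Analysis; using Lucene.Net.Analysis.Core;
// using Lucene.Net.Index.Memory; using Lucene.Net.QueryParsers.Classic; using Lucene.Net.Util;
LuceneVersion version = LuceneVersion.LUCENE_48;
Analyzer analyzer = new SimpleAnalyzer(version);

// Build a single-document index with two fields.
MemoryIndex index = new MemoryIndex();
index.AddField(&quot;content&quot;, &quot;Readings about Salmons and other select Alaska fishing Manuals&quot;, analyzer);
index.AddField(&quot;author&quot;, &quot;Tales of James&quot;, analyzer);

// Parse a query and score it against the document; a score above zero is a match.
QueryParser parser = new QueryParser(version, &quot;content&quot;, analyzer);
float score = index.Search(parser.Parse(&quot;+author:james +salmon~ +fish* manual~&quot;));
if (score &gt; 0.0f) {
    Console.WriteLine(&quot;it&apos;s a match&quot;);
} else {
    Console.WriteLine(&quot;no match found&quot;);
}
Console.WriteLine(&quot;indexData=&quot; + index.ToString());</code></pre>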
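<p>
The prospective search scenario described in the overview, one transient document checked against many standing queries, might be sketched as follows. <code>ProspectiveSearch</code>, <code>Match</code>, and the <code>subscriptions</code> dictionary are hypothetical names introduced for illustration; only <code>MemoryIndex</code>, <code>SimpleAnalyzer</code>, <code>LuceneVersion</code>, and <code>Query</code> come from Lucene.NET.
</p>
<pre><code>using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Index.Memory;
using Lucene.Net.Search;
using Lucene.Net.Util;

// Hypothetical helper; not part of Lucene.NET.
public static class ProspectiveSearch
{
    // Returns the names of all subscriptions whose query matches the given text.
    public static List&lt;string&gt; Match(string text, IDictionary&lt;string, Query&gt; subscriptions)
    {
        Analyzer analyzer = new SimpleAnalyzer(LuceneVersion.LUCENE_48);

        // One index per incoming document; it is discarded after matching.
        var index = new MemoryIndex();
        index.AddField(&quot;content&quot;, text, analyzer);

        var matches = new List&lt;string&gt;();
        foreach (var entry in subscriptions)
        {
            // A score above zero means the query matched the document.
            if (index.Search(entry.Value) &gt; 0.0f)
            {
                matches.Add(entry.Key);
            }
        }
        return matches;
    }
}</code></pre>
<p>
For instance, with a subscription named <code>fishing</code> mapped to <code>new TermQuery(new Term(&quot;content&quot;, &quot;fishing&quot;))</code>, calling <code>ProspectiveSearch.Match(&quot;Alaska fishing manuals&quot;, subscriptions)</code> should return that name.
</p>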
<h4>Example XQuery Usage</h4>
<pre><code>(: An XQuery that finds all books authored by James that have something to do with &quot;salmon fishing manuals&quot;, sorted by relevance :)
declare namespace lucene = &quot;java:nux.xom.pool.FullTextUtil&quot;;
declare variable $query := &quot;+salmon~ +fish* manual~&quot;; (: any arbitrary Lucene query can go here :)
for $book in /books/book[author=&quot;James&quot; and lucene:match(abstract, $query) > 0.0]
let $score := lucene:match($book/abstract, $query)
order by $score descending
return $book</code></pre>
<h4>No thread safety guarantees</h4>
<p>An instance can be queried multiple times with the same or different queries,
but an instance is not thread-safe. If needed, guard access with an idiom such as:</p>
<pre><code>MemoryIndex index = ...
lock (index) {
// read and/or write index (i.e. add fields and/or query)
} </code></pre>
<h4>Performance Notes</h4>
<p>
Internally there&apos;s a new data structure geared towards efficient indexing
and searching, plus the necessary support code to seamlessly plug into the Lucene
framework.
</p>
<p>
This class performs very well for very small texts (e.g. 10 chars)
as well as for large texts (e.g. 10 MB) and everything in between.
Typically, it is about 10-100 times faster than <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a>.
Note that <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a> has particularly
large efficiency overheads for small to medium sized texts, both in time and space.
Indexing a field with N tokens takes O(N) in the best case, and O(N log N) in the worst
case. Memory consumption is probably larger than for <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a>.
</p>
<p>
Example throughput of many simple term queries over a single MemoryIndex:
roughly 500,000 queries/sec on a MacBook Pro (JDK 1.5.0_06, server VM), a figure measured on the Java implementation of Lucene.
As always, your mileage may vary.
</p>
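<p>
To obtain a comparable figure on your own hardware and runtime, a rough measurement might look like the sketch below. <code>ThroughputProbe</code> and <code>Measure</code> are hypothetical names, and the loop makes no attempt at warm-up or statistical rigor, so treat the result as a ballpark only.
</p>
<pre><code>using System.Diagnostics;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Index;
using Lucene.Net.Index.Memory;
using Lucene.Net.Search;
using Lucene.Net.Util;

// Hypothetical helper; not part of Lucene.NET.
public static class ThroughputProbe
{
    // Runs the same term query repeatedly and returns queries per second.
    public static double Measure(int iterations)
    {
        var index = new MemoryIndex();
        index.AddField(
            &quot;content&quot;,
            &quot;Readings about Salmons and other select Alaska fishing Manuals&quot;,
            new SimpleAnalyzer(LuceneVersion.LUCENE_48));

        Query query = new TermQuery(new Term(&quot;content&quot;, &quot;fishing&quot;));

        var watch = Stopwatch.StartNew();
        for (int i = 0; i &lt; iterations; i++)
        {
            index.Search(query);
        }
        watch.Stop();

        return iterations / watch.Elapsed.TotalSeconds;
    }
}</code></pre>
<p>
A call such as <code>ThroughputProbe.Measure(100000)</code> returns the measured rate; run it several times and discard the first result, since the initial pass includes JIT compilation.
</p>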
<p>
If you&apos;re curious about
the whereabouts of bottlenecks, run java 1.5 with the non-perturbing &apos;-server
-agentlib:hprof=cpu=samples,depth=10&apos; flags, then study the trace log and
correlate its hotspot trailer with its call stack headers (see <a target="_blank" href="http://java.sun.com/developer/technicalArticles/Programming/HPROF.html">
hprof tracing </a>).
</p>
</section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<div class="contribution">
<ul class="nav">
<li>
<a href="https://github.com/apache/lucenenet/blob/docs/4.8.0-beta00010/src/Lucene.Net.Memory/package.md/#L2" class="contribution-link">Improve this Doc</a>
</li>
</ul>
</div>
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 Licensed to the Apache Software Foundation (ASF)
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
</body>
</html>