| <!DOCTYPE html> |
| <!--[if IE]><![endif]--> |
| <html> |
| |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> |
| <title>Namespace Lucene.Net.Index.Memory |
| | Apache Lucene.NET 4.8.0-beta00010 Documentation </title> |
| <meta name="viewport" content="width=device-width"> |
| <meta name="title" content="Namespace Lucene.Net.Index.Memory |
| | Apache Lucene.NET 4.8.0-beta00010 Documentation "> |
| <meta name="generator" content="docfx 2.56.0.0"> |
| |
| <link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css"> |
| <meta property="docfx:navrel" content="toc.html"> |
| <meta property="docfx:tocrel" content="memory/toc.html"> |
| |
| <meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/"> |
| |
| </head> |
| <body data-spy="scroll" data-target="#affix" data-offset="120"> |
| <div id="wrapper"> |
| <header> |
| |
| <nav id="autocollapse" class="navbar ng-scope" role="navigation"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| |
| <a class="navbar-brand" href="/"> |
| <img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt=""> |
| </a> |
| </div> |
| <div class="collapse navbar-collapse" id="navbar"> |
| <form class="navbar-form navbar-right" role="search" id="search"> |
| <div class="form-group"> |
| <input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off"> |
| </div> |
| </form> |
| </div> |
| </div> |
| </nav> |
| |
| <div class="subnav navbar navbar-default"> |
| <div class="container hide-when-search"> |
| <ul class="level0 breadcrumb"> |
| <li> |
| <a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a> |
| <span id="breadcrumb"> |
| <ul class="breadcrumb"> |
| <li></li> |
| </ul> |
| </span> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </header> |
| <div class="container body-content"> |
| |
| <div id="search-results"> |
| <div class="search-list"></div> |
| <div class="sr-items"> |
| <p><i class="glyphicon glyphicon-refresh index-loading"></i></p> |
| </div> |
| <ul id="pagination"></ul> |
| </div> |
| </div> |
| <div role="main" class="container body-content hide-when-search"> |
| |
| <div class="sidenav hide-when-search"> |
| <a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a> |
| <div class="sidetoggle collapse" id="sidetoggle"> |
| <div id="sidetoc"></div> |
| </div> |
| </div> |
| <div class="article row grid-right"> |
| <div class="col-md-10"> |
| <article class="content wrap" id="_content" data-uid="Lucene.Net.Index.Memory"> |
| |
| <h1 id="Lucene_Net_Index_Memory" data-uid="Lucene.Net.Index.Memory" class="text-break">Namespace Lucene.Net.Index.Memory |
| </h1> |
| <div class="markdown level0 summary"><!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <p>High-performance single-document main memory Apache Lucene fulltext search index.</p> |
| </div> |
| <div class="markdown level0 conceptual"></div> |
| <div class="markdown level0 remarks"></div> |
| <h3 id="classes">Classes |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Index.Memory.MemoryIndex.html">MemoryIndex</a></h4> |
| <section><p>High-performance single-document main memory Apache Lucene fulltext search index. </p> |
| <h4>Overview</h4> |
| |
| <p>This class is a replacement/substitute for a large subset of |
| <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a> functionality. It is designed to |
| enable maximum efficiency for on-the-fly matchmaking combining structured and |
| fuzzy fulltext search in realtime streaming applications such as Nux XQuery based XML |
| message queues, publish-subscribe systems for Blogs/newsfeeds, text chat, data acquisition and |
| distribution systems, application level routers, firewalls, classifiers, etc. |
| Rather than targeting fulltext search of infrequent queries over huge persistent |
| data archives (historic search), this class targets fulltext search of huge |
| numbers of queries over comparatively small transient realtime data (prospective |
| search). |
| For example as in </p> |
| <pre><code>float score = Search(string text, Query query)</code></pre> |
| <p> |
| Each instance can hold at most one Lucene "document", with a document containing |
| zero or more "fields", each field having a name and a fulltext value. The |
| fulltext value is tokenized (split and transformed) into zero or more index terms |
| (aka words) on <pre><code>AddField()</code></pre>, according to the policy implemented by an |
| Analyzer. For example, Lucene analyzers can split on whitespace, normalize to lower case |
| for case insensitivity, ignore common terms with little discriminatory value such as "he", "in", "and" (stop |
| words), reduce the terms to their natural linguistic root form such as "fishing" |
| being reduced to "fish" (stemming), resolve synonyms/inflexions/thesauri |
| (upon indexing and/or querying), etc. For details, see |
| <a target="_blank" href="http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html">Lucene Analyzer Intro</a>. |
| </p> |
| <p> |
| Arbitrary Lucene queries can be run against this class - see <a target="_blank" href="{@docRoot}/../queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description"> |
| Lucene Query Syntax</a> |
| as well as <a target="_blank" href="http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html">Query Parser Rules</a>. |
| Note that a Lucene query selects on the field names and associated (indexed) |
| tokenized terms, not on the original fulltext(s) - the latter are not stored |
| but rather thrown away immediately after tokenization. |
| </p> |
| <p> |
| For some interesting background information on search technology, see Bob Wyman's |
| <a target="_blank" href="http://bobwyman.pubsub.com/main/2005/05/mary_hodder_poi.html">Prospective Search</a>, |
| Jim Gray's |
| <a target="_blank" href="http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=293&page=4"> |
| A Call to Arms - Custom subscriptions</a>, and Tim Bray's |
| <a target="_blank" href="http://www.tbray.org/ongoing/When/200x/2003/07/30/OnSearchTOC">On Search, the Series</a>. |
| |
| |
| <h4>Example Usage</h4> |
| |
| <pre><code>Analyzer analyzer = new SimpleAnalyzer(version); |
| MemoryIndex index = new MemoryIndex(); |
| index.AddField("content", "Readings about Salmons and other select Alaska fishing Manuals", analyzer); |
| index.AddField("author", "Tales of James", analyzer); |
| QueryParser parser = new QueryParser(version, "content", analyzer); |
| float score = index.Search(parser.Parse("+author:james +salmon~ +fish* manual~")); |
| if (score > 0.0f) { |
| Console.WriteLine("it's a match"); |
| } else { |
| Console.WriteLine("no match found"); |
| } |
| Console.WriteLine("indexData=" + index.toString());</code></pre> |
| |
| |
| <h4>Example XQuery Usage</h4> |
| |
| <pre><code>(: An XQuery that finds all books authored by James that have something to do with "salmon fishing manuals", sorted by relevance :) |
| declare namespace lucene = "java:nux.xom.pool.FullTextUtil"; |
| declare variable $query := "+salmon~ +fish* manual~"; (: any arbitrary Lucene query can go here :) |
| |
| for $book in /books/book[author="James" and lucene:match(abstract, $query) > 0.0] |
| let $score := lucene:match($book/abstract, $query) |
| order by $score descending |
| return $book</code></pre> |
| |
| |
| <h4>No thread safety guarantees</h4> |
| |
| An instance can be queried multiple times with the same or different queries, |
| but an instance is not thread-safe. If desired use idioms such as: |
| <pre><code>MemoryIndex index = ... |
| lock (index) { |
| // read and/or write index (i.e. add fields and/or query) |
| } </code></pre> |
| |
| |
| <h4>Performance Notes</h4> |
| |
| Internally there's a new data structure geared towards efficient indexing |
| and searching, plus the necessary support code to seamlessly plug into the Lucene |
| framework. |
| </p> |
| <p> |
| This class performs very well for very small texts (e.g. 10 chars) |
| as well as for large texts (e.g. 10 MB) and everything in between. |
| Typically, it is about 10-100 times faster than <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a>. |
| Note that <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a> has particularly |
| large efficiency overheads for small to medium sized texts, both in time and space. |
| Indexing a field with N tokens takes O(N) in the best case, and O(N logN) in the worst |
| case. Memory consumption is probably larger than for <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.RAMDirectory.html">RAMDirectory</a>. |
| </p> |
| <p> |
| Example throughput of many simple term queries over a single MemoryIndex: |
| ~500000 queries/sec on a MacBook Pro, jdk 1.5.0_06, server VM. |
| As always, your mileage may vary. |
| </p> |
| <p> |
| If you're curious about |
| the whereabouts of bottlenecks, run java 1.5 with the non-perturbing '-server |
| -agentlib:hprof=cpu=samples,depth=10' flags, then study the trace log and |
| correlate its hotspot trailer with its call stack headers (see <a target="_blank" href="http://java.sun.com/developer/technicalArticles/Programming/HPROF.html"> |
| hprof tracing </a>). |
| |
| </p> |
| </section> |
| </article> |
| </div> |
| |
| <div class="hidden-sm col-md-2" role="complementary"> |
| <div class="sideaffix"> |
| <div class="contribution"> |
| <ul class="nav"> |
| <li> |
| <a href="https://github.com/apache/lucenenet/blob/docs/4.8.0-beta00010/src/Lucene.Net.Memory/package.md/#L2" class="contribution-link">Improve this Doc</a> |
| </li> |
| </ul> |
| </div> |
| <nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix"> |
| <!-- <p><a class="back-to-top" href="#top">Back to top</a><p> --> |
| </nav> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <footer> |
| <div class="grad-bottom"></div> |
| <div class="footer"> |
| <div class="container"> |
| <span class="pull-right"> |
| <a href="#top">Back to top</a> |
| </span> |
| Copyright © 2020 Licensed to the Apache Software Foundation (ASF) |
| |
| </div> |
| </div> |
| </footer> |
| </div> |
| |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script> |
| </body> |
| </html> |