<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Demo
| Apache Lucene.NET 4.8.0-beta00010 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Demo
| Apache Lucene.NET 4.8.0-beta00010 Documentation ">
<meta name="generator" content="docfx 2.56.0.0">
<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
<meta property="docfx:navrel" content="toc.html">
<meta property="docfx:tocrel" content="demo/toc.html">
<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="/">
<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search">
<ul class="level0 breadcrumb">
<li>
<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
<span id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</span>
</li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Demo">
<h1 id="Lucene_Net_Demo" data-uid="Lucene.Net.Demo" class="text-break">Namespace Lucene.Net.Demo
</h1>
<div class="markdown level0 summary"><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>The demo module offers simple example code to show the features of Lucene.</p>
<h1 id="apache-lucene---building-and-installing-the-basic-demo">Apache Lucene - Building and Installing the Basic Demo</h1>
<ul>
<li><p><a href="#about-this-document">About this Document</a></p>
</li>
<li><p><a href="#about-the-demo">About the Demo</a></p>
</li>
<li><p><a href="#indexing-files">Indexing Files</a></p>
</li>
<li><p><a href="#about-the-code">About the code</a></p>
</li>
<li><p><a href="#location-of-the-source">Location of the source</a></p>
</li>
<li><p><a href="#indexfiles">IndexFiles</a></p>
</li>
<li><p><a href="#searching-files">Searching Files</a></p>
</li>
</ul>
<h2 id="about-this-document">About this Document</h2>
<p>This document is intended as a &quot;getting started&quot; guide to using and running the Lucene demos. It walks you through some basic installation and configuration.</p>
<h2 id="about-the-demo">About the Demo</h2>
<p>The Lucene command-line demo code consists of an application that demonstrates various functionalities of Lucene and how you can add Lucene to your applications.</p>
<h2 id="indexing-files">Indexing Files</h2>
<p>Once you&#39;ve gotten this far you&#39;re probably itching to go. Let&#39;s <strong>build an index!</strong> Assuming you&#39;ve set your CLASSPATH correctly, just type:</p>
<pre><code> java org.apache.lucene.demo.IndexFiles -docs {path-to-lucene}/src
</code></pre><p>This will produce a subdirectory called <span class="codefrag">index</span>
which will contain an index of all of the Lucene source code.</p>
<p>To <strong>search the index</strong> type:</p>
<pre><code> java org.apache.lucene.demo.SearchFiles
</code></pre><p>You&#39;ll be prompted for a query. Type in a gibberish or made-up word (for example:
&quot;supercalifragilisticexpialidocious&quot;).
You&#39;ll see that there are no matching results in the Lucene source code.
Now try entering the word &quot;string&quot;. That should return a whole bunch
of documents. The results will page at every tenth result and ask you whether
you want more results.</p>
<h2 id="about-the-code">About the code</h2>
<p>In this section we walk through the sources behind the command-line Lucene demo: where to find them, their parts and their function. This section is intended for Java developers wishing to understand how to use Lucene in their applications.</p>
<h2 id="location-of-the-source">Location of the source</h2>
<p>The files discussed here are linked into this documentation directly:</p>
<ul>
<li><p><a class="xref" href="Lucene.Net.Demo.IndexFiles.html">IndexFiles</a>: code to create a Lucene index.</p>
</li>
<li><p><a class="xref" href="Lucene.Net.Demo.SearchFiles.html">SearchFiles</a>: code to search a Lucene index.</p>
</li>
</ul>
<h2 id="indexfiles">IndexFiles</h2>
<p>As we discussed in the previous walk-through, the <a class="xref" href="Lucene.Net.Demo.IndexFiles.html">IndexFiles</a> class creates a Lucene Index. Let&#39;s take a look at how it does this.</p>
<p>The <span class="codefrag">main()</span> method parses the command-line parameters, then in preparation for instantiating <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.IndexWriter.html">IndexWriter</a>, opens a <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.Directory.html">Directory</a>, and instantiates <a href="xref:Lucene.Net.Analysis.Standard.StandardAnalyzer">StandardAnalyzer</a> and <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.IndexWriterConfig.html">IndexWriterConfig</a>.</p>
<p>The value of the <span class="codefrag">-index</span> command-line parameter is the name of the filesystem directory where all index information should be stored. If <span class="codefrag">IndexFiles</span> is invoked with a relative path in the <span class="codefrag">-index</span> parameter, or if the parameter is omitted so that the default relative index path &quot;<span class="codefrag">index</span>&quot; is used, the index path will be created as a subdirectory of the current working directory (if it does not already exist). On some platforms, the index path may be created in a different directory (such as the user&#39;s home directory).</p>
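<p>The path rule described above can be sketched in a few lines of plain Python. This is only an illustration, not the demo&#39;s own code: <span class="codefrag">resolve_index_path</span> is a hypothetical helper name, and the sketch ignores the platform-specific exceptions mentioned above.</p>

```python
import os

def resolve_index_path(index_arg=None):
    """Return the absolute directory an index path would resolve to."""
    # Hypothetical helper: fall back to the default relative path "index"
    # when no -index value was supplied.
    chosen = index_arg if index_arg else "index"
    # A relative path resolves against the current working directory;
    # an absolute path is returned unchanged (apart from normalization).
    return os.path.abspath(chosen)

print(resolve_index_path())            # the "index" folder under the cwd
print(resolve_index_path("my-index"))  # "my-index" under the cwd
```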
<p>The <span class="codefrag">-docs</span> command-line parameter value is the location of the directory containing files to be indexed.</p>
<p>The <span class="codefrag">-update</span> command-line parameter tells <span class="codefrag">IndexFiles</span> not to delete the index if it already exists. When <span class="codefrag">-update</span> is not given, <span class="codefrag">IndexFiles</span> will first wipe the slate clean before indexing any documents.</p>
<p>Lucene <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.Directory.html">Directory</a>s are used by the <span class="codefrag">IndexWriter</span> to store information in the index. In addition to the <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Store.FSDirectory.html">FSDirectory</a> implementation we are using, there are several other <span class="codefrag">Directory</span> subclasses that can write to RAM, to databases, etc.</p>
<p>Lucene <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Analysis.Analyzer.html">Analyzer</a>s are processing pipelines that break up text into indexed tokens, a.k.a. terms, and optionally perform other operations on these tokens, e.g. downcasing, synonym insertion, filtering out unwanted tokens, etc. The <span class="codefrag">Analyzer</span> we are using is <span class="codefrag">StandardAnalyzer</span>, which creates tokens using the Word Break rules from the Unicode Text Segmentation algorithm specified in <a href="http://unicode.org/reports/tr29/">Unicode Standard Annex #29</a>; converts tokens to lowercase; and then filters out stopwords. Stopwords are common language words such as articles (a, an, the, etc.) and other tokens that may have less value for searching. It should be noted that there are different rules for every language, and you should use the proper analyzer for each. Lucene currently provides Analyzers for a number of different languages (see the javadocs under <a href="../analyzers-common/overview-summary.html">lucene/analysis/common/src/java/org/apache/lucene/analysis</a>).</p>
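<p>To make the pipeline idea concrete, here is a minimal sketch in plain Python that mimics the three steps described above: tokenize, lowercase, filter stopwords. It does not use Lucene. The <span class="codefrag">analyze</span> function and the stopword list are assumptions made for illustration, and the regular expression is only a crude stand-in for the Unicode word-break rules.</p>

```python
import re

# Illustrative stopword list (an assumption); the real analyzer has its own set.
STOPWORDS = {"a", "an", "the", "and", "of", "to"}

def analyze(text):
    """Split text into word tokens, lowercase them, and drop stopwords."""
    # \w+ is a rough approximation of word-break segmentation.
    tokens = re.findall(r"\w+", text)
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in STOPWORDS]

print(analyze("The Quick Brown Fox jumps over an Index"))
```

<p>Each stage consumes the output of the previous one, which is why the order matters: lowercasing before the stopword check lets &quot;The&quot; and &quot;the&quot; be filtered alike.</p>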
<p>The <span class="codefrag">IndexWriterConfig</span> instance holds all configuration for <span class="codefrag">IndexWriter</span>. For example, we set the <span class="codefrag">OpenMode</span> to use here based on the value of the <span class="codefrag">-update</span> command-line parameter.</p>
<p>Looking further down in the file, after <span class="codefrag">IndexWriter</span> is instantiated, you should see the <span class="codefrag">indexDocs()</span> code. This recursive function crawls the directories and creates <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Documents.Document.html">Document</a> objects. The <span class="codefrag">Document</span> is simply a data object to represent the text content from the file as well as its creation time and location. These instances are added to the <span class="codefrag">IndexWriter</span>. If the <span class="codefrag">-update</span> command-line parameter is given, the <span class="codefrag">IndexWriterConfig</span> <span class="codefrag">OpenMode</span> will be set to <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Index.IndexWriterConfig.html#methods">OpenMode.CREATE_OR_APPEND</a>, and rather than adding documents to the index, the <span class="codefrag">IndexWriter</span> will <strong>update</strong> them in the index by attempting to find an already-indexed document with the same identifier (in our case, the file path serves as the identifier); deleting it from the index if it exists; and then adding the new document to the index.</p>
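<p>The crawl-and-update behaviour can be sketched without Lucene. In the plain Python below, a list of dictionaries stands in for the index, and the file path plays the role of the identifier. The names <span class="codefrag">index_docs</span>, <span class="codefrag">update</span>, and the dictionary keys are hypothetical, chosen only for this illustration.</p>

```python
import os
import tempfile

def index_docs(index, root, update):
    """Walk root recursively and add or update one document per file."""
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                doc = {"path": path,
                       "timestamp": os.path.getmtime(path),
                       "contents": handle.read()}
            if update:
                # Update: delete any document with the same path, then add.
                index[:] = [old for old in index if old["path"] != path]
            index.append(doc)

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w", encoding="utf-8") as handle:
            handle.write("hello " + rel)
    index = []
    index_docs(index, root, update=False)  # fresh index: two documents
    index_docs(index, root, update=True)   # re-run as update: still two
    print(len(index))
```

<p>Running the crawl a second time without the update flag would simply append duplicates, which is why the demo either starts from an empty index or replaces documents by identifier.</p>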
<h2 id="searching-files">Searching Files</h2>
<p>The <a class="xref" href="Lucene.Net.Demo.SearchFiles.html">SearchFiles</a> class is quite simple. It primarily collaborates with an <a href="xref:Lucene.Net.Search.IndexSearcher">IndexSearcher</a>, a <a href="xref:Lucene.Net.Analysis.Standard.StandardAnalyzer">StandardAnalyzer</a> (which is used in the <a class="xref" href="Lucene.Net.Demo.IndexFiles.html">IndexFiles</a> class as well), and a <a href="xref:Lucene.Net.QueryParsers.Classic.QueryParser">QueryParser</a>. The query parser is constructed with an analyzer used to interpret your query text in the same way the documents are interpreted: finding word boundaries, downcasing, and removing useless words like &#39;a&#39;, &#39;an&#39; and &#39;the&#39;. The <a href="xref:Lucene.Net.Search.Query">Query</a> object contains the results from the <a href="xref:Lucene.Net.QueryParsers.Classic.QueryParser">QueryParser</a> and is passed to the searcher. Note that it&#39;s also possible to programmatically construct a rich <a href="xref:Lucene.Net.Search.Query">Query</a> object without using the query parser. The query parser just enables decoding the <a href="../queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description">Lucene query syntax</a> into the corresponding <a href="xref:Lucene.Net.Search.Query">Query</a> object.</p>
<p><span class="codefrag">SearchFiles</span> uses the <a href="xref:Lucene.Net.Search.IndexSearcher#methods">IndexSearcher.search</a> method, which returns <a href="xref:Lucene.Net.Search.TopDocs">TopDocs</a> with at most <span class="codefrag">n</span> hits. The results are printed in pages, sorted by score (i.e. relevance).</p>
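<p>Why must the query go through the same analysis as the documents? The toy inverted index below, written in plain Python rather than against the Lucene API, shows the effect. All names (<span class="codefrag">analyze</span>, <span class="codefrag">build_index</span>, <span class="codefrag">search</span>) are hypothetical, and matching any query term is a deliberate simplification of what a real query parser supports.</p>

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "an", "the"}  # illustrative list, an assumption

def analyze(text):
    """Crude analyzer stand-in: word split, lowercase, stopword removal."""
    words = (word.lower() for word in re.findall(r"\w+", text))
    return [word for word in words if word not in STOPWORDS]

def build_index(docs):
    """Map each analyzed term to the set of document names containing it."""
    postings = defaultdict(set)
    for name, text in docs.items():
        for term in analyze(text):
            postings[term].add(name)
    return postings

def search(postings, query_text):
    """Return names of documents containing any analyzed query term."""
    terms = analyze(query_text)
    return set().union(*(postings.get(term, set()) for term in terms))

docs = {"readme.txt": "The String class", "notes.txt": "An index of strings"}
postings = build_index(docs)
print(search(postings, "the STRING"))  # only readme.txt contains "string"
print(search(postings, "supercalifragilisticexpialidocious"))  # empty set
```

<p>Because the query &quot;the STRING&quot; is lowercased and stripped of stopwords before lookup, it finds the term that was stored at indexing time. Looking up the raw text would miss it.</p>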
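<p>The &quot;top n hits, shown in pages&quot; behaviour can be sketched in plain Python as follows. This is not the <span class="codefrag">SearchFiles</span> implementation: <span class="codefrag">top_docs</span> and <span class="codefrag">pages</span> are hypothetical helpers, the scores are made up, and the page size of ten is taken from the walkthrough earlier in this document.</p>

```python
import heapq

def top_docs(scored_hits, n):
    """Return at most n (score, doc) pairs, highest score first."""
    return heapq.nlargest(n, scored_hits, key=lambda hit: hit[0])

def pages(hits, page_size=10):
    """Split the ranked hits into consecutive pages of page_size."""
    return [hits[start:start + page_size]
            for start in range(0, len(hits), page_size)]

# Made-up scores for 25 documents; a higher score means more relevant.
hits = [(0.1 * i, "doc%d.txt" % i) for i in range(25)]
ranked = top_docs(hits, 23)
for number, page in enumerate(pages(ranked), start=1):
    print("page", number, "holds", len(page), "hits")
```

<p>Asking for at most 23 hits out of 25 candidates yields two full pages and a final page of three.</p>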
</div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Demo.IndexFiles.html">IndexFiles</a></h4>
<section><p>Index all text files under a directory.</p>
<p>This is a command-line application demonstrating simple Lucene indexing.
Run it with no command-line arguments for usage information.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Demo.SearchFiles.html">SearchFiles</a></h4>
<section><p>Simple command-line based search demo.</p>
</section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 Licensed to the Apache Software Foundation (ASF)
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
</body>
</html>