Apache Lucene is a high-performance, full-featured text search engine library. Here is a simple example of how to use Lucene for indexing and searching (using JUnit to check that the results are what we expect):
```java
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

// Store the index in memory:
Directory directory = new RAMDirectory();
// To store an index on disk, use this instead:
//Directory directory = FSDirectory.open(new File("/tmp/testindex"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
IndexWriter iwriter = new IndexWriter(directory, config);
Document doc = new Document();
String text = "This is the text to be indexed.";
doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
iwriter.addDocument(doc);
iwriter.close();

// Now search the index:
DirectoryReader ireader = DirectoryReader.open(directory);
IndexSearcher isearcher = new IndexSearcher(ireader);
// Parse a simple query that searches for "text":
QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "fieldname", analyzer);
Query query = parser.parse("text");
ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
assertEquals(1, hits.length);
// Iterate through the results:
for (int i = 0; i < hits.length; i++) {
  Document hitDoc = isearcher.doc(hits[i].doc);
  assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
}
ireader.close();
directory.close();
```
The Lucene API is divided into several packages:
xref:Lucene.Net.Analysis defines an abstract Analyzer API for converting text from a java.io.Reader into a TokenStream, an enumeration of token Attributes. A TokenStream can be composed by applying TokenFilters to the output of a Tokenizer. Tokenizers and TokenFilters are strung together and applied with an Analyzer. analyzers-common provides a number of Analyzer implementations, including StopAnalyzer and the grammar-based StandardAnalyzer.
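To make the analysis chain concrete, here is a minimal sketch (assuming the same Lucene 4.x Java API as the example above) that runs StandardAnalyzer over a string and collects the tokens it emits; the reset/incrementToken/end/close sequence is the standard TokenStream consumer workflow:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
TokenStream ts = analyzer.tokenStream("fieldname",
    new StringReader("This is the text to be indexed."));
// The CharTermAttribute exposes the text of the current token:
CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
List<String> tokens = new ArrayList<String>();
ts.reset();                      // must be called before incrementToken()
while (ts.incrementToken()) {
  tokens.add(termAtt.toString());
}
ts.end();
ts.close();
// StandardAnalyzer lower-cases and removes English stop words,
// so tokens should be [text, indexed].
```

Swapping in a different Analyzer (for example StopAnalyzer) changes what survives this loop without changing the consuming code.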
xref:Lucene.Net.Codecs provides an abstraction over the encoding and decoding of the inverted index structure, as well as different implementations that can be chosen depending upon application needs.
xref:Lucene.Net.Documents provides a simple Document class. A Document is simply a set of named Fields, whose values may be strings or instances of java.io.Reader.
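As a sketch (the field names here are invented for illustration), a Document mixing an exact-match keyword field with analyzed full text might be built like this:

```java
import java.io.StringReader;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

Document doc = new Document();
// StringField: indexed as a single token (not analyzed), good for IDs:
doc.add(new StringField("id", "doc-1", Field.Store.YES));
// TextField with a String value: analyzed and, here, also stored:
doc.add(new TextField("title", "Lucene in Action", Field.Store.YES));
// TextField with a Reader value: analyzed but not stored:
doc.add(new TextField("body", new StringReader("A book about Lucene.")));
// Stored string values can be read back from the document:
String title = doc.get("title");   // "Lucene in Action"
```

Only stored fields can be retrieved later from search hits; Reader-valued fields are indexed but their content is not kept.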
xref:Lucene.Net.Index provides two primary classes: IndexWriter, which creates and adds documents to indices; and xref:Lucene.Net.Index.IndexReader, which accesses the data in the index.
xref:Lucene.Net.Search provides data structures to represent queries (e.g. TermQuery for individual words, PhraseQuery for phrases, and BooleanQuery for boolean combinations of queries) and the IndexSearcher, which turns queries into TopDocs. A number of QueryParsers are provided for producing query structures from strings or XML.
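For example, the query structures named above can also be built programmatically instead of going through a QueryParser (the field name "body" is just an illustration; this uses the mutable 4.x-era query classes, matching the example at the top):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// A single-term query:
Query term = new TermQuery(new Term("body", "lucene"));
// A phrase query matches terms at adjacent positions:
PhraseQuery phrase = new PhraseQuery();
phrase.add(new Term("body", "apache"));
phrase.add(new Term("body", "lucene"));
// Boolean combination: the phrase is required, the extra term is optional:
BooleanQuery bool = new BooleanQuery();
bool.add(phrase, BooleanClause.Occur.MUST);
bool.add(term, BooleanClause.Occur.SHOULD);
```

Any of these Query objects can be passed to IndexSearcher exactly as the parsed query is in the example above.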
xref:Lucene.Net.Store defines an abstract class for storing persistent data, the Directory, which is a collection of named files written by an IndexOutput and read by an IndexInput. Multiple implementations are provided, including FSDirectory, which uses a file system directory to store files, and RAMDirectory, which implements files as memory-resident data structures.
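The Directory abstraction can be exercised directly; this sketch (the file name is arbitrary) writes a string through an IndexOutput and reads it back with an IndexInput:

```java
import java.io.IOException;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.RAMDirectory;

Directory dir = new RAMDirectory();   // memory-resident; use FSDirectory for disk
IndexOutput out = dir.createOutput("hello.bin", IOContext.DEFAULT);
out.writeString("hello, directory");
out.close();

IndexInput in = dir.openInput("hello.bin", IOContext.DEFAULT);
String roundTripped = in.readString();
in.close();
dir.close();
// roundTripped now holds "hello, directory"
```

Because IndexWriter and IndexReader only ever talk to a Directory, the same indexing code works unchanged over RAM, the file system, or any custom implementation.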
xref:Lucene.Net.Util contains a few handy data structures and utility classes, e.g. OpenBitSet and PriorityQueue.
To use Lucene, an application should:
1. Create an IndexWriter and add documents to it with AddDocument;
2. Call QueryParser.parse() to build a query from a string; and
3. Create an IndexSearcher and pass the query to its Search method.
Some simple examples of code which does this are:
- IndexFiles.java creates an index for all the files contained in a directory.
- SearchFiles.java prompts for queries and searches an index.