src/java/overview.html - lucene-solr - Git at Google

 <html>
 <head>
    <title>Jakarta Lucene API</title>
 </head>
 <body>

 <h1>Jakarta Lucene API</h1>
 The Jakarta Lucene API is divided into several packages:
 <ul>
 <li>
 <b><a href="org/apache/lucene/util/package-summary.html">com.lucene.util</a></b>
 contains a few handy data structures, e.g., <a href="org/apache/lucene/util/BitVector.html">BitVector</a>
 and <a href="org/apache/lucene/util/PriorityQueue.html">PriorityQueue</a>.</li>

 <li>
 <b><a href="org/apache/lucene/store/package-summary.html">com.lucene.store</a></b>
 defines an abstract class for storing persistent data, the <a href="org/apache/lucene/store/Directory.html">Directory</a>,
 a collection of named files written by an <a href="org/apache/lucene/store/OutputStream.html">OutputStream</a>
 and read by an <a href="org/apache/lucene/store/InputStream.html">InputStream</a>.&nbsp;
 Two implementations are provided, <a href="org/apache/lucene/store/FSDirectory.html">FSDirectory</a>,
 which uses a file system directory to store files, and <a href="org/apache/lucene/store/RAMDirectory.html">RAMDirectory</a>
 which implements files as memory-resident data structures.</li>

 <li>
 <b><a href="org/apache/lucene/document/package-summary.html">com.lucene.document</a></b>
 provides a simple <a href="org/apache/lucene/document/Document.html">Document</a>
 class.&nbsp; A document is simply a set of named <a href="org/apache/lucene/document/Field.html">Field</a>'s,
 whose values may be strings or instances of <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>.</li>

 <li>
 <b><a href="org/apache/lucene/analysis/package-summary.html">com.lucene.analysis</a></b>
 defines an abstract <a href="org/apache/lucene/analysis/Analyzer.html">Analyzer</a>
 API for converting text from a <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>
 into a <a href="org/apache/lucene/analysis/TokenStream.html">TokenStream</a>,
 an enumeration of&nbsp; <a href="org/apache/lucene/analysis/Token.html">Token</a>'s.&nbsp;
 A TokenStream is composed by applying <a href="org/apache/lucene/analysis/TokenFilter.html">TokenFilter</a>'s
 to the output of a <a href="org/apache/lucene/analysis/Tokenizer.html">Tokenizer</a>.&nbsp;
 A few simple implemenations are provided, including <a href="org/apache/lucene/analysis/StopAnalyzer.html">StopAnalyzer</a>
 and the grammar-based <a href="org/apache/lucene/analysis/standard/StandardAnalyzer.html">StandardAnalyzer</a>.</li>

 <li>
 <b><a href="org/apache/lucene/index/package-summary.html">com.lucene.index</a></b>
 provides two primary classes: <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>,
 which creates and adds documents to indices; and <a href="org/apache/lucene/index/IndexReader.html">IndexReader</a>,
 which accesses the data in the index.</li>

 <li>
 <b><a href="org/apache/lucene/search/package-summary.html">com.lucene.search</a></b>
 provides data structures to represent queries (<a href="org/apache/lucene/search/TermQuery.html">TermQuery</a>
 for individual words, <a href="org/apache/lucene/search/PhraseQuery.html">PhraseQuery</a>
 for phrases, and <a href="org/apache/lucene/search/BooleanQuery.html">BooleanQuery</a>
 for boolean combinations of queries) and the abstract <a href="org/apache/lucene/search/Searcher.html">Searcher</a>
 which turns queries into <a href="org/apache/lucene/search/Hits.html">Hits</a>.
 <a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>
 implements search over a single IndexReader.</li>

 <li>
 <b><a href="org/apache/lucene/queryParser/package-summary.html">com.lucene.queryParser</a></b>
 uses <a href="http://www.suntest.com/JavaCC/">JavaCC</a> to implement a
 <a href="org/apache/lucene/queryParser/QueryParser.html">QueryParser</a>.</li>
 </ul>
 To use Lucene, an application should:
 <ol>
 <li>
 Create <a href="org/apache/lucene/document/Document.html">Document</a>'s by
 adding
 <a href="org/apache/lucene/document/Field.html">Field</a>'s.</li>

 <li>
 Create an <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>
 and add documents to to it with <a href="org/apache/lucene/index/IndexWriter.html#addDocument(com.lucene.document.Document)">addDocument()</a>;</li>

 <li>
 Call <a href="org/apache/lucene/queryParser/QueryParser.html#parse(java.lang.String)">QueryParser.parse()</a>
 to build a query from a string; and</li>

 <li>
 Create an <a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>
 and pass the query to it's <a href="org/apache/lucene/search/Searcher.html#search(com.lucene.search.Query)">search()</a>
 method.</li>
 </ol>
 Some simple examples of code which does this are:
 <ul>
 <li>
 &nbsp;<a href="../../demo/src/org/apache/lucene/FileDocument.java">FileDocument.java</a> contains
 code to create a Document for a file.</li>

 <li>
 &nbsp;<a href="../../demo/src/org/apache/lucene/IndexFiles.java">IndexFiles.java</a> creates an
 index for all the files contained in a directory.</li>

 <li>
 &nbsp;<a href="../../demo/src/org/apache/lucene/DeleteFiles.java">DeleteFiles.java</a> deletes some
 of these files from the index.</li>

 <li>
 &nbsp;<a href="../../demo/src/org/apache/lucene/SearchFiles.java">SearchFiles.java</a> prompts for
 queries and searches an index.</li>
 </ul>
 To demonstrate these, try something like:
 <blockquote><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFiles rec.food.recipes/soups</b></tt>
 <br><tt>adding rec.food.recipes/soups/abalone-chowder</tt>
 <br><tt>&nbsp; </tt>[ ... ]
 <p><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFilesSearchFiles</b></tt>
 <br><tt>Query: <b>chowder</b></tt>
 <br><tt>Searching for: chowder</tt>
 <br><tt>34 total matching documents</tt>
 <br><tt>0. rec.food.recipes/soups/spam-chowder</tt>
 <br><tt>&nbsp; </tt>[ ... thirty-four documents contain the word "chowder",
 "spam-chowder" with the greatest density.]
 <p><tt>Query: <b>path:chowder</b></tt>
 <br><tt>Searching for: path:chowder</tt>
 <br><tt>31 total matching documents</tt>
 <br><tt>0. rec.food.recipes/soups/abalone-chowder</tt>
 <br><tt>&nbsp; </tt>[ ... only thrity-one have "chowder" in the "path"
 field. ]
 <p><tt>Query: <b>path:"clam chowder"</b></tt>
 <br><tt>Searching for: path:"clam chowder"</tt>
 <br><tt>10 total matching documents</tt>
 <br><tt>0. rec.food.recipes/soups/clam-chowder</tt>
 <br><tt>&nbsp; </tt>[ ... only ten have "clam chowder" in the "path" field.
 ]
 <p><tt>Query: <b>path:"clam chowder" AND manhattan</b></tt>
 <br><tt>Searching for: +path:"clam chowder" +manhattan</tt>
 <br><tt>2 total matching documents</tt>
 <br><tt>0. rec.food.recipes/soups/clam-chowder</tt>
 <br><tt>&nbsp; </tt>[ ... only two also have "manhattan" in the contents.
 ]
 <br>&nbsp;&nbsp;&nbsp; [ Note: "+" and "-" are canonical, but "AND", "OR"
 and "NOT" may be used. ]</blockquote>
 The <a href="../../demo/src/org/apache/lucene/IndexHTML.java">IndexHtml</a> demo is more sophisticated.&nbsp;
 It incrementally maintains an index of HTML files, adding new files as
 they appear, deleting old files as they disappear and re-indexing files
 as they change.
 <blockquote><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFilesIndexHTML -create java/jdk1.1.6/docs/relnotes</b></tt>
 <br><tt>adding java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt>
 <br><tt>&nbsp; </tt>[ ... create an index containing all the relnotes ]
 <p><tt>> <b>rm java/jdk1.1.6/docs/relnotes/smicopyright.html</b></tt>
 <p><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFilesIndexHTML java/jdk1.1.6/docs/relnotes</b></tt>
 <br><tt>deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt></blockquote>
 HTML indexes are searched using SUN's <a href="http://jserv.javasoft.com/products/webserver/index.html">JavaWebServer</a>
 (JWS) and <a href="../../demo/src/org/apache/lucene/Search.jhtml">Search.jhtml</a>.&nbsp; To use
 this:
 <ul>
 <li>
 copy <tt>Search.html</tt> and <tt>Search.jhtml</tt> to JWS's <tt>public_html</tt>
 directory;</li>

 <li>
 copy lucene.jar to JWS's lib directory;</li>

 <li>
 create and maintain your indexes with demo.IndexHTML in JWS's top-level
 directory;</li>

 <li>
 launch JWS, with the <tt>demo</tt> directory on CLASSPATH (only one class
 is actually needed);</li>

 <li>
 visit <a href="../../demo/src/org/apache/lucene/Search.html">Search.html</a>.</li>
 </ul>
 Note that indexes can be updated while searches are going on.&nbsp; <tt>Search.jhtml</tt>
 will re-open the index when it is updated so that the latest version is
 immediately available.
 <br>&nbsp;
 </body>
 </html>
	<html>
	<head>
	<title>Jakarta Lucene API</title>
	</head>
	<body>

	<h1>Jakarta Lucene API</h1>
	The Jakarta Lucene API is divided into several packages:
	<ul>
	<li>
	<b><a href="org/apache/lucene/util/package-summary.html">com.lucene.util</a></b>
	contains a few handy data structures, e.g., <a href="org/apache/lucene/util/BitVector.html">BitVector</a>
	and <a href="org/apache/lucene/util/PriorityQueue.html">PriorityQueue</a>.</li>

	<li>
	<b><a href="org/apache/lucene/store/package-summary.html">com.lucene.store</a></b>
	defines an abstract class for storing persistent data, the <a href="org/apache/lucene/store/Directory.html">Directory</a>,
	a collection of named files written by an <a href="org/apache/lucene/store/OutputStream.html">OutputStream</a>
	and read by an <a href="org/apache/lucene/store/InputStream.html">InputStream</a>.
	Two implementations are provided, <a href="org/apache/lucene/store/FSDirectory.html">FSDirectory</a>,
	which uses a file system directory to store files, and <a href="org/apache/lucene/store/RAMDirectory.html">RAMDirectory</a>
	which implements files as memory-resident data structures.</li>

	<li>
	<b><a href="org/apache/lucene/document/package-summary.html">com.lucene.document</a></b>
	provides a simple <a href="org/apache/lucene/document/Document.html">Document</a>
	class.  A document is simply a set of named <a href="org/apache/lucene/document/Field.html">Field</a>'s,
	whose values may be strings or instances of <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>.</li>

	<li>
	<b><a href="org/apache/lucene/analysis/package-summary.html">com.lucene.analysis</a></b>
	defines an abstract <a href="org/apache/lucene/analysis/Analyzer.html">Analyzer</a>
	API for converting text from a <a href="http://java.sun.com/products/jdk/1.2/docs/api/java/io/Reader.html">java.io.Reader</a>
	into a <a href="org/apache/lucene/analysis/TokenStream.html">TokenStream</a>,
	an enumeration of  <a href="org/apache/lucene/analysis/Token.html">Token</a>'s.
	A TokenStream is composed by applying <a href="org/apache/lucene/analysis/TokenFilter.html">TokenFilter</a>'s
	to the output of a <a href="org/apache/lucene/analysis/Tokenizer.html">Tokenizer</a>.
	A few simple implemenations are provided, including <a href="org/apache/lucene/analysis/StopAnalyzer.html">StopAnalyzer</a>
	and the grammar-based <a href="org/apache/lucene/analysis/standard/StandardAnalyzer.html">StandardAnalyzer</a>.</li>

	<li>
	<b><a href="org/apache/lucene/index/package-summary.html">com.lucene.index</a></b>
	provides two primary classes: <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>,
	which creates and adds documents to indices; and <a href="org/apache/lucene/index/IndexReader.html">IndexReader</a>,
	which accesses the data in the index.</li>

	<li>
	<b><a href="org/apache/lucene/search/package-summary.html">com.lucene.search</a></b>
	provides data structures to represent queries (<a href="org/apache/lucene/search/TermQuery.html">TermQuery</a>
	for individual words, <a href="org/apache/lucene/search/PhraseQuery.html">PhraseQuery</a>
	for phrases, and <a href="org/apache/lucene/search/BooleanQuery.html">BooleanQuery</a>
	for boolean combinations of queries) and the abstract <a href="org/apache/lucene/search/Searcher.html">Searcher</a>
	which turns queries into <a href="org/apache/lucene/search/Hits.html">Hits</a>.
	<a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>
	implements search over a single IndexReader.</li>

	<li>
	<b><a href="org/apache/lucene/queryParser/package-summary.html">com.lucene.queryParser</a></b>
	uses <a href="http://www.suntest.com/JavaCC/">JavaCC</a> to implement a
	<a href="org/apache/lucene/queryParser/QueryParser.html">QueryParser</a>.</li>
	</ul>
	To use Lucene, an application should:
	<ol>
	<li>
	Create <a href="org/apache/lucene/document/Document.html">Document</a>'s by
	adding
	<a href="org/apache/lucene/document/Field.html">Field</a>'s.</li>

	<li>
	Create an <a href="org/apache/lucene/index/IndexWriter.html">IndexWriter</a>
	and add documents to to it with <a href="org/apache/lucene/index/IndexWriter.html#addDocument(com.lucene.document.Document)">addDocument()</a>;</li>

	<li>
	Call <a href="org/apache/lucene/queryParser/QueryParser.html#parse(java.lang.String)">QueryParser.parse()</a>
	to build a query from a string; and</li>

	<li>
	Create an <a href="org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>
	and pass the query to it's <a href="org/apache/lucene/search/Searcher.html#search(com.lucene.search.Query)">search()</a>
	method.</li>
	</ol>
	Some simple examples of code which does this are:
	<ul>
	<li>
	<a href="../../demo/src/org/apache/lucene/FileDocument.java">FileDocument.java</a> contains
	code to create a Document for a file.</li>

	<li>
	<a href="../../demo/src/org/apache/lucene/IndexFiles.java">IndexFiles.java</a> creates an
	index for all the files contained in a directory.</li>

	<li>
	<a href="../../demo/src/org/apache/lucene/DeleteFiles.java">DeleteFiles.java</a> deletes some
	of these files from the index.</li>

	<li>
	<a href="../../demo/src/org/apache/lucene/SearchFiles.java">SearchFiles.java</a> prompts for
	queries and searches an index.</li>
	</ul>
	To demonstrate these, try something like:
	<blockquote><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFiles rec.food.recipes/soups</b></tt>
	<br><tt>adding rec.food.recipes/soups/abalone-chowder</tt>
	<br><tt>  </tt>[ ... ]
	<p><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFilesSearchFiles</b></tt>
	<br><tt>Query: <b>chowder</b></tt>
	<br><tt>Searching for: chowder</tt>
	<br><tt>34 total matching documents</tt>
	<br><tt>0. rec.food.recipes/soups/spam-chowder</tt>
	<br><tt>  </tt>[ ... thirty-four documents contain the word "chowder",
	"spam-chowder" with the greatest density.]
	<p><tt>Query: <b>path:chowder</b></tt>
	<br><tt>Searching for: path:chowder</tt>
	<br><tt>31 total matching documents</tt>
	<br><tt>0. rec.food.recipes/soups/abalone-chowder</tt>
	<br><tt>  </tt>[ ... only thrity-one have "chowder" in the "path"
	field. ]
	<p><tt>Query: <b>path:"clam chowder"</b></tt>
	<br><tt>Searching for: path:"clam chowder"</tt>
	<br><tt>10 total matching documents</tt>
	<br><tt>0. rec.food.recipes/soups/clam-chowder</tt>
	<br><tt>  </tt>[ ... only ten have "clam chowder" in the "path" field.
	]
	<p><tt>Query: <b>path:"clam chowder" AND manhattan</b></tt>
	<br><tt>Searching for: +path:"clam chowder" +manhattan</tt>
	<br><tt>2 total matching documents</tt>
	<br><tt>0. rec.food.recipes/soups/clam-chowder</tt>
	<br><tt>  </tt>[ ... only two also have "manhattan" in the contents.
	]
	<br>    [ Note: "+" and "-" are canonical, but "AND", "OR"
	and "NOT" may be used. ]</blockquote>
	The <a href="../../demo/src/org/apache/lucene/IndexHTML.java">IndexHtml</a> demo is more sophisticated.
	It incrementally maintains an index of HTML files, adding new files as
	they appear, deleting old files as they disappear and re-indexing files
	as they change.
	<blockquote><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFilesIndexHTML -create java/jdk1.1.6/docs/relnotes</b></tt>
	<br><tt>adding java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt>
	<br><tt>  </tt>[ ... create an index containing all the relnotes ]
	<p><tt>> <b>rm java/jdk1.1.6/docs/relnotes/smicopyright.html</b></tt>
	<p><tt>> <b>java -cp lucene.jar:demo/classes org.apache.lucene.IndexFilesIndexHTML java/jdk1.1.6/docs/relnotes</b></tt>
	<br><tt>deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html</tt></blockquote>
	HTML indexes are searched using SUN's <a href="http://jserv.javasoft.com/products/webserver/index.html">JavaWebServer</a>
	(JWS) and <a href="../../demo/src/org/apache/lucene/Search.jhtml">Search.jhtml</a>.  To use
	this:
	<ul>
	<li>
	copy <tt>Search.html</tt> and <tt>Search.jhtml</tt> to JWS's <tt>public_html</tt>
	directory;</li>

	<li>
	copy lucene.jar to JWS's lib directory;</li>

	<li>
	create and maintain your indexes with demo.IndexHTML in JWS's top-level
	directory;</li>

	<li>
	launch JWS, with the <tt>demo</tt> directory on CLASSPATH (only one class
	is actually needed);</li>

	<li>
	visit <a href="../../demo/src/org/apache/lucene/Search.html">Search.html</a>.</li>
	</ul>
	Note that indexes can be updated while searches are going on.  <tt>Search.jhtml</tt>
	will re-open the index when it is updated so that the latest version is
	immediately available.
	<br>
	</body>
	</html>