blob: 22f86badc983cc85df4798c88214e6827f1c2f35 [file] [log] [blame]
<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Util.Fst
| Apache Lucene.NET 4.8.0-beta00013 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Util.Fst
| Apache Lucene.NET 4.8.0-beta00013 Documentation ">
<meta name="generator" content="docfx 2.56.2.0">
<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
<meta property="docfx:navrel" content="toc.html">
<meta property="docfx:tocrel" content="core/toc.html">
<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<span id="forkongithub"><a href="https://github.com/apache/lucenenet" target="_blank">Fork me on GitHub</a></span>
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="/">
<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search">
<ul class="level0 breadcrumb">
<li>
<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
<span id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</span>
</li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Util.Fst">
<h1 id="Lucene_Net_Util_Fst" data-uid="Lucene.Net.Util.Fst" class="text-break">Namespace Lucene.Net.Util.Fst
</h1>
<div class="markdown level0 summary"><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>Finite state transducers</p>
<p>This package implements <a href="http://en.wikipedia.org/wiki/Finite_state_transducer">
Finite State Transducers</a> with the following characteristics:</p>
<ul>
<li><p>Fast and low memory overhead construction of the minimal FST
(but inputs must be provided in sorted order)</p>
</li>
<li><p>Low object overhead and quick deserialization (byte[] representation)</p>
</li>
<li><p>Optional two-pass compression: <a class="xref" href="Lucene.Net.Util.Fst.FST.html#methods">FST.pack</a></p>
</li>
<li><p><a class="xref" href="Lucene.Net.Util.Fst.Util.html#methods">Lookup-by-output</a> when the
outputs are in sorted order (e.g., ordinals or file pointers)</p>
</li>
<li><p>Pluggable <a href="xref:Lucene.Net.Util.Fst.Outputs">Outputs</a> representation</p>
</li>
<li><p><a class="xref" href="Lucene.Net.Util.Fst.Util.html#methods">N-shortest-paths</a> search by
weight</p>
</li>
<li><p>Enumerators (<a href="xref:Lucene.Net.Util.Fst.IntsRefFSTEnum">IntsRef</a> and <a class="xref" href="Lucene.Net.Util.Fst.BytesRefFSTEnum.html">BytesRef</a>) that behave like {@link java.util.SortedMap SortedMap} iterators</p>
</li>
</ul>
<p>FST Construction example:</p>
<pre><code> // Input values (keys). These must be provided to Builder in Unicode sorted order!
String inputValues[] = {&quot;cat&quot;, &quot;dog&quot;, &quot;dogs&quot;};
long outputValues[] = {5, 7, 12};
PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
Builder&lt;Long&gt; builder = new Builder&lt;Long&gt;(INPUT_TYPE.BYTE1, outputs);
BytesRef scratchBytes = new BytesRef();
IntsRef scratchInts = new IntsRef();
for (int i = 0; i &lt; inputValues.length; i++) {
scratchBytes.copyChars(inputValues[i]);
builder.add(Util.toIntsRef(scratchBytes, scratchInts), outputValues[i]);
}
FST&lt;Long&gt; fst = builder.finish();
</code></pre><p>Retrieval by key:</p>
<pre><code> Long value = Util.get(fst, new BytesRef(&quot;dog&quot;));
System.out.println(value); // 7
</code></pre><p>Retrieval by value:</p>
<pre><code> // Only works because outputs are also in sorted order
IntsRef key = Util.getByOutput(fst, 12);
System.out.println(Util.toBytesRef(key, scratchBytes).utf8ToString()); // dogs
</code></pre><p>Iterate over key-value pairs in sorted order:</p>
<pre><code> // Like TermsEnum, this also supports seeking (advance)
BytesRefFSTEnum&lt;Long&gt; iterator = new BytesRefFSTEnum&lt;Long&gt;(fst);
while (iterator.next() != null) {
InputOutput&lt;Long&gt; mapEntry = iterator.current();
System.out.println(mapEntry.input.utf8ToString());
System.out.println(mapEntry.output);
}
</code></pre><p>N-shortest paths by weight:</p>
<pre><code> Comparator&lt;Long&gt; comparator = new Comparator&lt;Long&gt;() {
public int compare(Long left, Long right) {
return left.compareTo(right);
}
};
Arc&lt;Long&gt; firstArc = fst.getFirstArc(new Arc&lt;Long&gt;());
MinResult&lt;Long&gt; paths[] = Util.shortestPaths(fst, firstArc, comparator, 2);
System.out.println(Util.toBytesRef(paths[0].input, scratchBytes).utf8ToString()); // cat
System.out.println(paths[0].output); // 5
System.out.println(Util.toBytesRef(paths[1].input, scratchBytes).utf8ToString()); // dog
System.out.println(paths[1].output); // 7
</code></pre></div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder.html">Builder</a></h4>
<section><p>LUCENENET specific type used to access nested types of <a class="xref" href="Lucene.Net.Util.Fst.Builder-1.html">Builder&lt;T&gt;</a>
without referring to its generic closing type.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder.Arc-1.html">Builder.Arc&lt;S&gt;</a></h4>
<section><p>Expert: holds a pending (seen but not yet serialized) arc. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder.CompiledNode.html">Builder.CompiledNode</a></h4>
<section></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder.FreezeTail-1.html">Builder.FreezeTail&lt;S&gt;</a></h4>
<section><p>Expert: this is invoked by Builder whenever a suffix
is serialized.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder.UnCompiledNode-1.html">Builder.UnCompiledNode&lt;S&gt;</a></h4>
<section><p>Expert: holds a pending (seen but not yet serialized) Node. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder-1.html">Builder&lt;T&gt;</a></h4>
<section><p>Builds a minimal FST (maps an <a class="xref" href="Lucene.Net.Util.Int32sRef.html">Int32sRef</a> term to an arbitrary
output) from pre-sorted terms with outputs. The FST
becomes an FSA if you use NoOutputs. The FST is written
on-the-fly into a compact serialized format byte array, which can
be saved to / loaded from a Directory or used directly
for traversal. The FST is always finite (no cycles).</p>
<p><p>NOTE: The algorithm is described at
<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3698">http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3698</a></p>
<p><p>The parameterized type <typeparam name="T"></typeparam> is the output type. See the
subclasses of <a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a>.</p>
<p><p>FSTs larger than 2.1GB are now possible (as of Lucene
4.2). FSTs containing more than 2.1B nodes are also now
possible, however they cannot be packed.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.ByteSequenceOutputs.html">ByteSequenceOutputs</a></h4>
<section><p>An FST <span class="xref">Outputs{BytesRef}</span> implementation where each output
is a sequence of bytes.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.BytesRefFSTEnum.html">BytesRefFSTEnum</a></h4>
<section><p>LUCENENET specific. This class is to mimic Java&apos;s ability to specify
nested classes of Generics without having to specify the generic type
(i.e. BytesRefFSTEnum.InputOutput{T} rather than BytesRefFSTEnum{T}.InputOutput{T})</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.BytesRefFSTEnum.InputOutput-1.html">BytesRefFSTEnum.InputOutput&lt;T&gt;</a></h4>
<section><p>Holds a single input (<a class="xref" href="Lucene.Net.Util.BytesRef.html">BytesRef</a>) + output pair. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.BytesRefFSTEnum-1.html">BytesRefFSTEnum&lt;T&gt;</a></h4>
<section><p>Enumerates all input (<a class="xref" href="Lucene.Net.Util.BytesRef.html">BytesRef</a>) + output pairs in an
FST.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.CharSequenceOutputs.html">CharSequenceOutputs</a></h4>
<section><p>An FST <a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a> implementation where each output
is a sequence of characters.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.FST.html">FST</a></h4>
<section><p>LUCENENET specific: This new base class is to mimic Java&apos;s ability to use nested types without specifying
a type parameter. i.e. FST.BytesReader instead of FST&lt;BytesRef&gt;.BytesReader</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.FST.Arc-1.html">FST.Arc&lt;T&gt;</a></h4>
<section><p>Represents a single arc.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.FST.BytesReader.html">FST.BytesReader</a></h4>
<section><p>Reads bytes stored in an FST.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.FST-1.html">FST&lt;T&gt;</a></h4>
<section><p>Represents an finite state machine (FST), using a
compact <span class="xref">byte[]</span> format.
<p> The format is similar to what&apos;s used by Morfologik
(<a href="http://sourceforge.net/projects/morfologik">http://sourceforge.net/projects/morfologik</a>).</p>
<p><p> See the <a href="https://lucene.apache.org/core/4_8_0/core/org/apache/lucene/util/fst/package-summary.html">
FST package documentation</a> for some simple examples.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.FSTEnum-1.html">FSTEnum&lt;T&gt;</a></h4>
<section><p>Can Next() and Advance() through the terms in an FST
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Int32SequenceOutputs.html">Int32SequenceOutputs</a></h4>
<section><p>An FST <a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a> implementation where each output
is a sequence of <span class="xref">System.Int32</span>s.
<p>
NOTE: This was IntSequenceOutputs in Lucene
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Int32sRefFSTEnum.html">Int32sRefFSTEnum</a></h4>
<section><p>LUCENENET specific. This class is to mimic Java&apos;s ability to specify
nested classes of Generics without having to specify the generic type
(i.e. <code>Int32sRefFSTEnum.InputOutput{T}</code> rather than <code>Int32sRefFSTEnum{T}.InputOutput{T}</code>)
<p>
NOTE: This was Int32sRefFSTEnum{T} in Lucene</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Int32sRefFSTEnum.InputOutput-1.html">Int32sRefFSTEnum.InputOutput&lt;T&gt;</a></h4>
<section><p>Holds a single input (<a class="xref" href="Lucene.Net.Util.Int32sRef.html">Int32sRef</a>) + output pair. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Int32sRefFSTEnum-1.html">Int32sRefFSTEnum&lt;T&gt;</a></h4>
<section><p>Enumerates all input (<a class="xref" href="Lucene.Net.Util.Int32sRef.html">Int32sRef</a>) + output pairs in an
FST.
<p>
NOTE: This was IntsRefFSTEnum{T} in Lucene
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.NoOutputs.html">NoOutputs</a></h4>
<section><p>A null FST <a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a> implementation; use this if
you just want to build an FSA.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a></h4>
<section><p>Represents the outputs for an FST, providing the basic
algebra required for building and traversing the FST.</p>
<p>Note that any operation that returns NO_OUTPUT must
return the same singleton object from
<a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html#Lucene_Net_Util_Fst_Outputs_1_NoOutput">NoOutput</a>.</p>
<p>LUCENENET IMPORTANT: If <code data-dev-comment-type="typeparamref" class="typeparamref">T</code> is a collection type,
it must implement <span class="xref">System.Collections.IStructuralEquatable</span>
in order to properly compare its nested values.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.PairOutputs-2.html">PairOutputs&lt;A, B&gt;</a></h4>
<section><p>An FST <a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a> implementation, holding two other outputs.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.PairOutputs-2.Pair.html">PairOutputs&lt;A, B&gt;.Pair</a></h4>
<section><p>Holds a single pair of two outputs. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.PositiveInt32Outputs.html">PositiveInt32Outputs</a></h4>
<section><p>An FST <a class="xref" href="Lucene.Net.Util.Fst.Outputs-1.html">Outputs&lt;T&gt;</a> implementation where each output
is a non-negative <see cref="T:long?"></see> value.
<p>
NOTE: This was PositiveIntOutputs in Lucene
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Util.html">Util</a></h4>
<section><p>Static helper methods.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Util.FSTPath-1.html">Util.FSTPath&lt;T&gt;</a></h4>
<section><p>Represents a path in TopNSearcher.
<p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Util.Result-1.html">Util.Result&lt;T&gt;</a></h4>
<section><p>Holds a single input (<a class="xref" href="Lucene.Net.Util.Int32sRef.html">Int32sRef</a>) + output, returned by
<a class="xref" href="Lucene.Net.Util.Fst.Util.html#Lucene_Net_Util_Fst_Util_ShortestPaths__1_Lucene_Net_Util_Fst_FST___0__Lucene_Net_Util_Fst_FST_Arc___0____0_System_Collections_Generic_IComparer___0__System_Int32_System_Boolean_">ShortestPaths&lt;T&gt;(FST&lt;T&gt;, FST.Arc&lt;T&gt;, T, IComparer&lt;T&gt;, Int32, Boolean)</a>.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Util.TopNSearcher-1.html">Util.TopNSearcher&lt;T&gt;</a></h4>
<section><p>Utility class to find top N shortest paths from start
point(s).</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Util.TopResults-1.html">Util.TopResults&lt;T&gt;</a></h4>
<section><p>Holds the results for a top N search using <a class="xref" href="Lucene.Net.Util.Fst.Util.TopNSearcher-1.html">Util.TopNSearcher&lt;T&gt;</a></p>
</section>
<h3 id="interfaces">Interfaces
</h3>
<h4><a class="xref" href="Lucene.Net.Util.Fst.Builder.INode.html">Builder.INode</a></h4>
<section></section>
<h3 id="enums">Enums
</h3>
<h4><a class="xref" href="Lucene.Net.Util.Fst.FST.INPUT_TYPE.html">FST.INPUT_TYPE</a></h4>
<section><p>Specifies allowed range of each int input label for this FST.</p>
</section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<div class="contribution">
<ul class="nav">
<li>
<a href="https://github.com/apache/lucenenet/blob/docs/4.8.0-beta00013/src/Lucene.Net/Util/Fst/package.md/#L2" class="contribution-link">Improve this Doc</a>
</li>
</ul>
</div>
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 The Apache Software Foundation, Licensed under the <a href='http://www.apache.org/licenses/LICENSE-2.0' target='_blank'>Apache License, Version 2.0</a><br> <small>Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation. <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</small>
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
</body>
</html>