<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Util.Automaton
| Apache Lucene.NET 4.8.0-beta00010 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Util.Automaton
| Apache Lucene.NET 4.8.0-beta00010 Documentation ">
<meta name="generator" content="docfx 2.56.0.0">
<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
<meta property="docfx:navrel" content="toc.html">
<meta property="docfx:tocrel" content="core/toc.html">
<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="/">
<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search">
<ul class="level0 breadcrumb">
<li>
<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
<span id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</span>
</li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Util.Automaton">
<h1 id="Lucene_Net_Util_Automaton" data-uid="Lucene.Net.Util.Automaton" class="text-break">Namespace Lucene.Net.Util.Automaton
</h1>
<div class="markdown level0 summary"><!--
dk.brics.automaton
Copyright (c) 2001-2009 Anders Moeller
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->
<p>Finite-state automaton for regular expressions.</p>
<p>This package contains a full DFA/NFA implementation with Unicode
alphabet and support for all standard (and a number of non-standard)
regular expression operations.</p>
<p>The most commonly used functionality is located in the classes
<tt><a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a></tt> and
<tt><a class="xref" href="Lucene.Net.Util.Automaton.RegExp.html">RegExp</a></tt>.</p>
<p>For more information, go to the package home page at
<tt><a href="http://www.brics.dk/automaton/">http://www.brics.dk/automaton/</a></tt>.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a></h4>
<section><p>Finite-state automaton with regular expression operations.</p>
<p>
Class invariants:</p>
<ul><li>An automaton is either represented explicitly (with <a class="xref" href="Lucene.Net.Util.Automaton.State.html">State</a> and
<a class="xref" href="Lucene.Net.Util.Automaton.Transition.html">Transition</a> objects) or with a singleton string (see
<a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_Singleton">Singleton</a> and <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_ExpandSingleton">ExpandSingleton()</a>) in case the automaton
is known to accept exactly one string. (Implicitly, all states and
transitions of an automaton are reachable from its initial state.)</li><li>Automata are always reduced (see <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_Reduce">Reduce()</a>) and have no
transitions to dead states (see <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_RemoveDeadTransitions">RemoveDeadTransitions()</a>).</li><li>If an automaton is nondeterministic, then <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_IsDeterministic">IsDeterministic</a>
returns <code>false</code> (but the converse is not required).</li><li>Automata provided as input to operations are generally assumed to be
disjoint.</li></ul>
<p>
If the states or transitions are manipulated manually, the
<a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_RestoreInvariant">RestoreInvariant()</a> method and <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html#Lucene_Net_Util_Automaton_Automaton_IsDeterministic">IsDeterministic</a> setter
should be used afterwards to restore representation invariants that are
assumed by the built-in automata operations.</p>
<p>
Note: this class has internal mutable state and is not thread safe. The
caller is responsible for any necessary synchronization when the same
Automaton is used from multiple threads. For multithreaded matching, a
<a class="xref" href="Lucene.Net.Util.Automaton.RunAutomaton.html">RunAutomaton</a> is generally recommended instead: it is immutable,
thread safe, and much faster.
</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
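<section><p>The two representations named in the class invariants (a singleton string versus explicit states and transitions) can be illustrated with a minimal Python sketch. It is independent of Lucene.NET and does not reproduce its implementation; <code>State</code>, <code>TinyAutomaton</code>, <code>expand_singleton</code> and <code>run</code> are hypothetical names chosen for this illustration only.</p>
<pre><code class="lang-python">class State:
    """Explicit state: an accept flag plus outgoing interval transitions."""

    def __init__(self):
        self.accept = False
        self.transitions = []  # list of (low, high, target_state) tuples


class TinyAutomaton:
    """Hypothetical model: holds a singleton string or explicit states, not both."""

    def __init__(self, singleton=None, initial=None):
        self.singleton = singleton
        self.initial = initial

    def expand_singleton(self):
        """Replace the singleton string with an equivalent chain of states."""
        if self.singleton is None:
            return
        start = State()
        current = start
        for ch in self.singleton:
            following = State()
            # One transition per character; the interval covers a single codepoint.
            current.transitions.append((ord(ch), ord(ch), following))
            current = following
        current.accept = True
        self.initial = start
        self.singleton = None

    def run(self, text):
        """Return True if the automaton accepts the given text."""
        if self.singleton is not None:
            return text == self.singleton
        state = self.initial
        for ch in text:
            code = ord(ch)
            for low, high, target in state.transitions:
                if code in range(low, high + 1):
                    state = target
                    break
            else:
                return False  # no transition covers this codepoint
        return state.accept
</code></pre>
<p>In this sketch, an automaton built with <code>TinyAutomaton(singleton="abc")</code> accepts the same strings before and after <code>expand_singleton()</code> is called; only the internal representation changes.</p></section>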
<h4><a class="xref" href="Lucene.Net.Util.Automaton.BasicAutomata.html">BasicAutomata</a></h4>
<section><p>Construction of basic automata.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.BasicOperations.html">BasicOperations</a></h4>
<section><p>Basic automata operations.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.ByteRunAutomaton.html">ByteRunAutomaton</a></h4>
<section><p>Automaton representation for matching UTF-8 <span class="xref">byte[]</span>.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.CharacterRunAutomaton.html">CharacterRunAutomaton</a></h4>
<section><p>Automaton representation for matching <span class="xref">char[]</span>.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.CompiledAutomaton.html">CompiledAutomaton</a></h4>
<section><p>Immutable class holding compiled details for a given
<a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a>. The <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a> is deterministic and must not have
dead states, but it is not necessarily minimal.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.LevenshteinAutomata.html">LevenshteinAutomata</a></h4>
<section><p>Class to construct DFAs that match a word within some edit distance.</p>
<p>
Implements the algorithm described in
Schulz and Mihov, <em>Fast String Correction with Levenshtein Automata</em>.
</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
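<section><p>To make "match a word within some edit distance" concrete, the following Python sketch decides the same membership question with the classic dynamic-programming edit distance. It is not the Schulz and Mihov construction and builds no DFA; it only shows which strings such an automaton would be expected to accept. The function names <code>edit_distance</code> and <code>accepts</code> are hypothetical.</p>
<pre><code class="lang-python">def edit_distance(a, b):
    """Levenshtein distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def accepts(word, candidate, max_edits):
    """True if candidate lies within max_edits edits of word."""
    return edit_distance(word, candidate) in range(max_edits + 1)
</code></pre>
<p>For example, <code>accepts("lucene", "lucent", 1)</code> holds because one substitution suffices. A precomputed DFA answers the same question in a single pass over the candidate, without filling a distance table per comparison.</p></section>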
<h4><a class="xref" href="Lucene.Net.Util.Automaton.MinimizationOperations.html">MinimizationOperations</a></h4>
<section><p>Operations for minimizing automata.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.RegExp.html">RegExp</a></h4>
<section><p>Regular Expression extension to <a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a>.</p>
<p>
Regular expressions are built from the following abstract syntax:</p>
<p>
<table><tbody>
<tr><td><em>regexp</em></td><td>::=</td><td><em>unionexp</em></td><td></td></tr>
<tr><td></td><td>|</td><td></td><td></td></tr>
<tr><td><em>unionexp</em></td><td>::=</td><td><em>interexp</em> <tt><strong>|</strong></tt> <em>unionexp</em></td><td>(union)</td></tr>
<tr><td></td><td>|</td><td><em>interexp</em></td><td></td></tr>
<tr><td><em>interexp</em></td><td>::=</td><td><em>concatexp</em> <tt><strong>&amp;</strong></tt> <em>interexp</em></td><td>(intersection) <small>[OPTIONAL]</small></td></tr>
<tr><td></td><td>|</td><td><em>concatexp</em></td><td></td></tr>
<tr><td><em>concatexp</em></td><td>::=</td><td><em>repeatexp</em> <em>concatexp</em></td><td>(concatenation)</td></tr>
<tr><td></td><td>|</td><td><em>repeatexp</em></td><td></td></tr>
<tr><td><em>repeatexp</em></td><td>::=</td><td><em>repeatexp</em> <tt><strong>?</strong></tt></td><td>(zero or one occurrence)</td></tr>
<tr><td></td><td>|</td><td><em>repeatexp</em> <tt><strong>*</strong></tt></td><td>(zero or more occurrences)</td></tr>
<tr><td></td><td>|</td><td><em>repeatexp</em> <tt><strong>+</strong></tt></td><td>(one or more occurrences)</td></tr>
<tr><td></td><td>|</td><td><em>repeatexp</em> <tt><strong>{</strong><em>n</em><strong>}</strong></tt></td><td>(<tt><em>n</em></tt> occurrences)</td></tr>
<tr><td></td><td>|</td><td><em>repeatexp</em> <tt><strong>{</strong><em>n</em><strong>,}</strong></tt></td><td>(<tt><em>n</em></tt> or more occurrences)</td></tr>
<tr><td></td><td>|</td><td><em>repeatexp</em> <tt><strong>{</strong><em>n</em><strong>,</strong><em>m</em><strong>}</strong></tt></td><td>(<tt><em>n</em></tt> to <tt><em>m</em></tt> occurrences, including both)</td></tr>
<tr><td></td><td>|</td><td><em>complexp</em></td><td></td></tr>
<tr><td><em>complexp</em></td><td>::=</td><td><tt><strong>~</strong></tt> <em>complexp</em></td><td>(complement) <small>[OPTIONAL]</small></td></tr>
<tr><td></td><td>|</td><td><em>charclassexp</em></td><td></td></tr>
<tr><td><em>charclassexp</em></td><td>::=</td><td><tt><strong>[</strong></tt> <em>charclasses</em> <tt><strong>]</strong></tt></td><td>(character class)</td></tr>
<tr><td></td><td>|</td><td><tt><strong>[^</strong></tt> <em>charclasses</em> <tt><strong>]</strong></tt></td><td>(negated character class)</td></tr>
<tr><td></td><td>|</td><td><em>simpleexp</em></td><td></td></tr>
<tr><td><em>charclasses</em></td><td>::=</td><td><em>charclass</em> <em>charclasses</em></td><td></td></tr>
<tr><td></td><td>|</td><td><em>charclass</em></td><td></td></tr>
<tr><td><em>charclass</em></td><td>::=</td><td><em>charexp</em> <tt><strong>-</strong></tt> <em>charexp</em></td><td>(character range, including end-points)</td></tr>
<tr><td></td><td>|</td><td><em>charexp</em></td><td></td></tr>
<tr><td><em>simpleexp</em></td><td>::=</td><td><em>charexp</em></td><td></td></tr>
<tr><td></td><td>|</td><td><tt><strong>.</strong></tt></td><td>(any single character)</td></tr>
<tr><td></td><td>|</td><td><tt><strong>#</strong></tt></td><td>(the empty language) <small>[OPTIONAL]</small></td></tr>
<tr><td></td><td>|</td><td><tt><strong>@</strong></tt></td><td>(any string) <small>[OPTIONAL]</small></td></tr>
<tr><td></td><td>|</td><td><tt><strong>&quot;</strong></tt> &lt;Unicode string without double-quotes&gt; <tt><strong>&quot;</strong></tt></td><td>(a string)</td></tr>
<tr><td></td><td>|</td><td><tt><strong>(</strong></tt> <tt><strong>)</strong></tt></td><td>(the empty string)</td></tr>
<tr><td></td><td>|</td><td><tt><strong>(</strong></tt> <em>unionexp</em> <tt><strong>)</strong></tt></td><td>(precedence override)</td></tr>
<tr><td></td><td>|</td><td><tt><strong>&lt;</strong></tt> &lt;identifier&gt; <tt><strong>&gt;</strong></tt></td><td>(named automaton) <small>[OPTIONAL]</small></td></tr>
<tr><td></td><td>|</td><td><tt><strong>&lt;</strong><em>n</em>-<em>m</em><strong>&gt;</strong></tt></td><td>(numerical interval) <small>[OPTIONAL]</small></td></tr>
<tr><td><em>charexp</em></td><td>::=</td><td>&lt;Unicode character&gt;</td><td>(a single non-reserved character)</td></tr>
<tr><td></td><td>|</td><td><tt><strong>\</strong></tt> &lt;Unicode character&gt;</td><td>(a single character)</td></tr>
</tbody></table></p>
<p>
The productions marked <small>[OPTIONAL]</small> are allowed only if
enabled by the syntax flags passed to the <a class="xref" href="Lucene.Net.Util.Automaton.RegExp.html">RegExp</a> constructor.
Reserved characters used in the (enabled) syntax must be escaped with a
backslash (<code>\</code>) or enclosed in double quotes (<code>&quot;...&quot;</code>). (In
contrast to other regexp syntaxes, this is also required inside character
classes.) Be aware that the dash (<code>-</code>) has a special meaning in
<em>charclass</em> expressions. An identifier is a string that contains neither a right
angle bracket (<code>&gt;</code>) nor a dash (<code>-</code>). Numerical
intervals are specified by non-negative decimal integers and include both end
points; if <code>n</code> and <code>m</code> have the same number
of digits, conforming strings must have exactly that length (that is, they
are left-padded with 0&apos;s).
</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
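<section><p>The numerical interval rule described for <a class="xref" href="Lucene.Net.Util.Automaton.RegExp.html">RegExp</a> can be modelled with a short Python sketch. It covers only the case the text spells out, where both bounds have the same number of digits, and it is a model of the accepted language rather than the library&apos;s implementation. The function name <code>fixed_width_interval_accepts</code> is hypothetical.</p>
<pre><code class="lang-python">ASCII_DIGITS = "0123456789"


def fixed_width_interval_accepts(low_text, high_text, candidate):
    """Model an interval whose bounds share one digit count, such as 001 to 100."""
    if len(low_text) != len(high_text):
        raise ValueError("this sketch only covers bounds with equal digit counts")
    # Conforming strings must have exactly the width of the bounds.
    if len(candidate) != len(low_text):
        return False
    if not all(ch in ASCII_DIGITS for ch in candidate):
        return False
    # Both end points are included.
    return int(candidate) in range(int(low_text), int(high_text) + 1)
</code></pre>
<p>Under this model, with bounds <code>001</code> and <code>100</code>, the string <code>042</code> conforms while <code>42</code> does not, because it lacks the zero padding.</p></section>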
<h4><a class="xref" href="Lucene.Net.Util.Automaton.RunAutomaton.html">RunAutomaton</a></h4>
<section><p>Finite-state automaton with fast run operation.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.SpecialOperations.html">SpecialOperations</a></h4>
<section><p>Special automata operations.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.State.html">State</a></h4>
<section><p><a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a> state.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.StatePair.html">StatePair</a></h4>
<section><p>Pair of states.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.Transition.html">Transition</a></h4>
<section><p><a class="xref" href="Lucene.Net.Util.Automaton.Automaton.html">Automaton</a> transition.</p>
<p>
A transition, which belongs to a source state, consists of a Unicode
codepoint interval and a destination state.
</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
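<section><p>As an illustration of transitions keyed by Unicode codepoint intervals, the Python sketch below looks up the destination state for a codepoint among the outgoing transitions of one source state. The table contents, the state ids and the function name <code>destination</code> are assumptions made for this example, and the binary search is one possible lookup strategy, not necessarily the one used by the library.</p>
<pre><code class="lang-python">import bisect

# Hypothetical outgoing transitions of a single source state, sorted by
# interval start: (low codepoint, high codepoint, destination state id).
TRANSITIONS = [
    (ord("0"), ord("9"), 1),  # ASCII digits
    (ord("a"), ord("z"), 2),  # lowercase ASCII letters
    (0x1F600, 0x1F64F, 3),    # Emoticons block, outside the BMP
]
STARTS = [low for low, _high, _target in TRANSITIONS]


def destination(codepoint):
    """Return the destination state id, or None if no interval covers the codepoint."""
    # Find the last interval whose start is not greater than the codepoint.
    index = bisect.bisect_right(STARTS, codepoint) - 1
    if index == -1:
        return None
    low, high, target = TRANSITIONS[index]
    return target if codepoint in range(low, high + 1) else None
</code></pre>
<p>Because intervals are expressed in codepoints, a character outside the Basic Multilingual Plane is matched by one transition rather than by a pair of UTF-16 code units.</p></section>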
<h4><a class="xref" href="Lucene.Net.Util.Automaton.UTF32ToUTF8.html">UTF32ToUTF8</a></h4>
<section><p>Converts UTF-32 automata to the equivalent UTF-8 representation.</p>
<div class="lucene-block lucene-internal">This is a Lucene.NET INTERNAL API, use at your own risk</div></section>
<h3 id="interfaces">Interfaces
</h3>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.IAutomatonProvider.html">IAutomatonProvider</a></h4>
<section><p>Automaton provider for <a class="xref" href="Lucene.Net.Util.Automaton.RegExp.html">RegExp</a>.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h3 id="enums">Enums
</h3>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.CompiledAutomaton.AUTOMATON_TYPE.html">CompiledAutomaton.AUTOMATON_TYPE</a></h4>
<section><p>Automata are compiled into different internal forms for the
most efficient execution depending upon the language they accept.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Util.Automaton.RegExpSyntax.html">RegExpSyntax</a></h4>
<section></section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 Licensed to the Apache Software Foundation (ASF)
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
</body>
</html>