<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Analysis
| Apache Lucene.NET 4.8.0 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Analysis
| Apache Lucene.NET 4.8.0 Documentation ">
<meta name="generator" content="docfx 2.47.0.0">
<link rel="shortcut icon" href="../../logo/favicon.ico">
<link rel="stylesheet" href="../../styles/docfx.vendor.css">
<link rel="stylesheet" href="../../styles/docfx.css">
<link rel="stylesheet" href="../../styles/main.css">
<meta property="docfx:navrel" content="../../toc.html">
<meta property="docfx:tocrel" content="../toc.html">
<meta property="docfx:rel" content="../../">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../../index.html">
<img id="logo" class="svg" src="../../logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search" id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Analysis">
<h1 id="Lucene_Net_Analysis" data-uid="Lucene.Net.Analysis" class="text-break">Namespace Lucene.Net.Analysis
</h1>
<div class="markdown level0 summary"><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>Support for testing analysis components.</p>
<p>The main classes of interest are:</p>
<ul>
<li><a class="xref" href="Lucene.Net.Analysis.BaseTokenStreamTestCase.html">BaseTokenStreamTestCase</a>: Using its helper methods is highly recommended (especially in conjunction with <a class="xref" href="Lucene.Net.Analysis.MockAnalyzer.html">MockAnalyzer</a> or <a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html">MockTokenizer</a>), as they contain many assertions and checks to catch bugs.</li>
<li><a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html">MockTokenizer</a>: Tokenizer for testing. It serves as a replacement for the WHITESPACE, SIMPLE, and KEYWORD tokenizers. If you are writing a component such as a TokenFilter, it's a great idea to test it by wrapping this tokenizer, which adds extra checks.</li>
<li><a class="xref" href="Lucene.Net.Analysis.MockAnalyzer.html">MockAnalyzer</a>: Analyzer for testing. It uses MockTokenizer for additional verification. If you are testing a custom component such as a query parser or analyzer wrapper that consumes analysis streams, it's a great idea to test it with this analyzer.</li>
</ul>
</div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Analysis.BaseTokenStreamTestCase.html">BaseTokenStreamTestCase</a></h4>
<section><p>Base class for all Lucene unit tests that use <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenStream.html">TokenStream</a>s.</p>
<p>
When writing unit tests for analysis components, it's highly recommended
to use the helper methods here (especially in conjunction with <a class="xref" href="Lucene.Net.Analysis.MockAnalyzer.html">MockAnalyzer</a> or
<a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html">MockTokenizer</a>), as they contain many assertions and checks to
catch bugs.</p>
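<p>As a rough illustration of the recommendation above, the following sketch tests a token filter by wrapping a <code>MockTokenizer</code> and checking the output with <code>AssertTokenStreamContents</code>. It assumes an NUnit test project that references the test framework and the Lucene.Net.Analysis.Common package (for <code>LowerCaseFilter</code>); the class name, method name, and input text are hypothetical.</p>
<pre><code class="lang-csharp">using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using NUnit.Framework;

// Hypothetical test class; names and inputs are illustrative only.
public class TestLowerCaseChain : BaseTokenStreamTestCase
{
    [Test]
    public void TestLowerCasesTokens()
    {
        // Split on whitespace; 'false' leaves case untouched so that the
        // filter under test is the component doing the lowercasing.
        Tokenizer tokenizer = new MockTokenizer(
            new StringReader("Hello WORLD"), MockTokenizer.WHITESPACE, false);
        TokenStream stream = new LowerCaseFilter(TEST_VERSION_CURRENT, tokenizer);

        // Consumes the stream and asserts the emitted terms in order.
        AssertTokenStreamContents(stream, new[] { "hello", "world" });
    }
}
</code></pre>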
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.BinaryTermAttribute.html">BinaryTermAttribute</a></h4>
<section><p>Implementation for <a class="xref" href="Lucene.Net.Analysis.IBinaryTermAttribute.html">IBinaryTermAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.BinaryToken.html">BinaryToken</a></h4>
<section><p>Represents a binary token. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.CannedBinaryTokenStream.html">CannedBinaryTokenStream</a></h4>
<section><p><a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenStream.html">TokenStream</a> from a canned list of binary (<a class="xref" href="../Lucene.Net/Lucene.Net.Util.BytesRef.html">BytesRef</a>-based)
tokens.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.CannedTokenStream.html">CannedTokenStream</a></h4>
<section><p><a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenStream.html">TokenStream</a> from a canned list of <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.Token.html">Token</a>s.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.CheckClearAttributesAttribute.html">CheckClearAttributesAttribute</a></h4>
<section><p>Attribute that records whether it was cleared. This is used
for testing that <a class="xref" href="../Lucene.Net/Lucene.Net.Util.AttributeSource.html#Lucene_Net_Util_AttributeSource_ClearAttributes">ClearAttributes()</a> was called correctly.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.CollationTestBase.html">CollationTestBase</a></h4>
<section><p>Base test class for testing Unicode collation.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.LookaheadTokenFilter.html">LookaheadTokenFilter</a></h4>
<section><p>LUCENENET specific abstraction so we can reference <a class="xref" href="Lucene.Net.Analysis.LookaheadTokenFilter.Position.html">LookaheadTokenFilter.Position</a> without
specifying a generic closing type.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.LookaheadTokenFilter.Position.html">LookaheadTokenFilter.Position</a></h4>
<section><p>Holds all state for a single position; subclass this
to record other state at each position.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.LookaheadTokenFilter-1.html">LookaheadTokenFilter&lt;T&gt;</a></h4>
<section><p>An abstract <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a> to make it easier to build graph
token filters requiring some lookahead. This class handles
the details of buffering up tokens, recording them by
position, restoring them, providing access to them, etc.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockAnalyzer.html">MockAnalyzer</a></h4>
<section><p>Analyzer for testing.</p>
<p>
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers
for unit tests. If you are testing a custom component such as a query parser
or analyzer wrapper that consumes analysis streams, it's a great idea to test
it with this analyzer instead. MockAnalyzer has the following behavior:</p>
<ul><li>
By default, the assertions in <a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html">MockTokenizer</a> are turned on for extra
checks that the consumer is consuming properly. These checks can be disabled
with <a class="xref" href="Lucene.Net.Analysis.MockAnalyzer.html#Lucene_Net_Analysis_MockAnalyzer_EnableChecks">EnableChecks</a>.
</li><li>
Payload data is randomly injected into the stream for more thorough testing
of payloads.
</li></ul>
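<p>A minimal sketch of using <code>MockAnalyzer</code> in place of a whitespace analyzer, assuming an NUnit test project. The fixed seed passed to <code>System.Random</code> is an assumption made for reproducibility, and the class name, method name, and input text are hypothetical.</p>
<pre><code class="lang-csharp">using Lucene.Net.Analysis;
using NUnit.Framework;

// Hypothetical test class; names and inputs are illustrative only.
public class TestWithMockAnalyzer : BaseTokenStreamTestCase
{
    [Test]
    public void TestWhitespaceSplitting()
    {
        // Whitespace-style tokenization without lowercasing.
        // The fixed seed is an assumption, chosen only for reproducibility.
        Analyzer analyzer = new MockAnalyzer(
            new System.Random(42), MockTokenizer.WHITESPACE, false);

        // Randomly injected payloads do not change the term text.
        AssertAnalyzesTo(analyzer, "Quick brown Fox",
            new[] { "Quick", "brown", "Fox" });
    }
}
</code></pre>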
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockBytesAnalyzer.html">MockBytesAnalyzer</a></h4>
<section><p><a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.Analyzer.html">Analyzer</a> for testing that encodes terms as UTF-16 bytes.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockBytesAttributeFactory.html">MockBytesAttributeFactory</a></h4>
<section><p><a class="xref" href="../Lucene.Net/Lucene.Net.Util.AttributeSource.AttributeFactory.html">AttributeSource.AttributeFactory</a> that implements <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenAttributes.ICharTermAttribute.html">ICharTermAttribute</a> with
<a class="xref" href="Lucene.Net.Analysis.MockUTF16TermAttributeImpl.html">MockUTF16TermAttributeImpl</a>.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockCharFilter.html">MockCharFilter</a></h4>
<section><p>The purpose of this charfilter is to send offsets out of bounds
if the analyzer doesn&apos;t use <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.CharFilter.html#Lucene_Net_Analysis_CharFilter_CorrectOffset_System_Int32_">CorrectOffset(Int32)</a> or does incorrect offset math.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockFixedLengthPayloadFilter.html">MockFixedLengthPayloadFilter</a></h4>
<section><p><a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a> that adds random fixed-length payloads.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockGraphTokenFilter.html">MockGraphTokenFilter</a></h4>
<section><p>Randomly inserts overlapped (posInc=0) tokens with
posLength sometimes &gt; 1. The chain must have
an <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenAttributes.IOffsetAttribute.html">IOffsetAttribute</a>.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockHoleInjectingTokenFilter.html">MockHoleInjectingTokenFilter</a></h4>
<section><p>Randomly injects holes (similar to what a stop filter would do).</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockPayloadAnalyzer.html">MockPayloadAnalyzer</a></h4>
<section><p>Wraps a whitespace tokenizer with a filter that sets the position
increment of the first token and of odd tokens to 1 and of all others
to 0, encoding the position as <code>pos: XXX</code> in the payload.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockRandomLookaheadTokenFilter.html">MockRandomLookaheadTokenFilter</a></h4>
<section><p>Uses <a class="xref" href="Lucene.Net.Analysis.LookaheadTokenFilter.html">LookaheadTokenFilter</a> to randomly peek at future tokens.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockReaderWrapper.html">MockReaderWrapper</a></h4>
<section><p>Wraps a <span class="xref">System.IO.TextReader</span> and can throw random or fixed
exceptions and spoon-feed the characters it reads.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockTokenFilter.html">MockTokenFilter</a></h4>
<section><p>A <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a> for testing that removes terms accepted by a DFA.
<ul><li>Union a list of singletons to act like a <a class="xref" href="../Lucene.Net.Analysis.Common/Lucene.Net.Analysis.Core.StopFilter.html">StopFilter</a>.</li><li>Use the complement to act like a <a class="xref" href="../Lucene.Net.Analysis.Common/Lucene.Net.Analysis.Miscellaneous.KeepWordFilter.html">KeepWordFilter</a>.</li><li>Use a regex like <code>.{12,}</code> to act like a <a class="xref" href="../Lucene.Net.Analysis.Common/Lucene.Net.Analysis.Miscellaneous.LengthFilter.html">LengthFilter</a>.</li></ul></p>
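<p>The first option above (a union of singletons acting like a stop filter) might look like the following sketch. It assumes an NUnit test project; the stop words, class name, and method name are hypothetical.</p>
<pre><code class="lang-csharp">using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Util.Automaton;
using NUnit.Framework;

// Hypothetical test class; the stop words are illustrative only.
public class TestMockTokenFilterAsStopFilter : BaseTokenStreamTestCase
{
    [Test]
    public void TestRemovesStopWords()
    {
        // Union of two singleton automata: accepts exactly "the" or "a".
        var stopWords = new CharacterRunAutomaton(
            BasicOperations.Union(
                BasicAutomata.MakeString("the"),
                BasicAutomata.MakeString("a")));

        Tokenizer tokenizer = new MockTokenizer(
            new StringReader("the quick a fox"), MockTokenizer.WHITESPACE, false);

        // Terms accepted by the automaton are removed from the stream.
        TokenStream stream = new MockTokenFilter(tokenizer, stopWords);

        AssertTokenStreamContents(stream, new[] { "quick", "fox" });
    }
}
</code></pre>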
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html">MockTokenizer</a></h4>
<section><p>Tokenizer for testing.</p>
<p>
This tokenizer is a replacement for <a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html#Lucene_Net_Analysis_MockTokenizer_WHITESPACE">WHITESPACE</a>, <a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html#Lucene_Net_Analysis_MockTokenizer_SIMPLE">SIMPLE</a>, and <a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html#Lucene_Net_Analysis_MockTokenizer_KEYWORD">KEYWORD</a>
tokenizers. If you are writing a component such as a <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a>, it's a great idea to test
it by wrapping this tokenizer for extra checks. This tokenizer has the following behavior:</p>
<ul><li>
An internal state machine is used for checking consumer consistency. These checks can
be disabled with <a class="xref" href="Lucene.Net.Analysis.MockTokenizer.html#Lucene_Net_Analysis_MockTokenizer_EnableChecks">EnableChecks</a>.
</li><li>
For convenience, it optionally lowercases the terms that it outputs.
</li></ul>
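<p>One way to exercise a filter against many inputs is to build a reusable analyzer around <code>MockTokenizer</code> and feed it random text. The sketch below assumes an NUnit test project that references Lucene.Net.Analysis.Common (for <code>LowerCaseFilter</code>); the seed, iteration count, class name, and method name are hypothetical choices.</p>
<pre><code class="lang-csharp">using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using NUnit.Framework;

// Hypothetical test class; the seed and iteration count are illustrative only.
public class TestFilterOnRandomText : BaseTokenStreamTestCase
{
    [Test]
    public void TestRandomStrings()
    {
        // Build an analyzer whose tokenizer is MockTokenizer so that its
        // consumer-consistency checks run for every analyzed string.
        Analyzer analyzer = Analyzer.NewAnonymous(createComponents: (fieldName, reader) =&gt;
        {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            TokenStream filter = new LowerCaseFilter(TEST_VERSION_CURRENT, tokenizer);
            return new TokenStreamComponents(tokenizer, filter);
        });

        // Runs random text through the analyzer and checks the results for consistency.
        CheckRandomData(new System.Random(42), analyzer, 100);
    }
}
</code></pre>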
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockUTF16TermAttributeImpl.html">MockUTF16TermAttributeImpl</a></h4>
<section><p>Extension of <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenAttributes.CharTermAttribute.html">CharTermAttribute</a> that encodes the term
text as UTF-16 bytes instead of as UTF-8 bytes.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.MockVariableLengthPayloadFilter.html">MockVariableLengthPayloadFilter</a></h4>
<section><p><a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a> that adds random variable-length payloads.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenStreamToDot.html">TokenStreamToDot</a></h4>
<section><p>Consumes a <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenStream.html">TokenStream</a> and outputs the dot (graphviz) string (graph). </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.ValidatingTokenFilter.html">ValidatingTokenFilter</a></h4>
<section><p>A <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a> that checks consistency of the tokens (e.g.,
offsets are consistent with one another).</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.VocabularyAssert.html">VocabularyAssert</a></h4>
<section><p>Utility class for doing vocabulary-based stemming tests. </p>
</section>
<h3 id="interfaces">Interfaces
</h3>
<h4><a class="xref" href="Lucene.Net.Analysis.IBinaryTermAttribute.html">IBinaryTermAttribute</a></h4>
<section><p>An attribute extending <a class="xref" href="../Lucene.Net/Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html">ITermToBytesRefAttribute</a>
but exposing a <a class="xref" href="Lucene.Net.Analysis.IBinaryTermAttribute.html#Lucene_Net_Analysis_IBinaryTermAttribute_BytesRef">BytesRef</a> property.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.ICheckClearAttributesAttribute.html">ICheckClearAttributesAttribute</a></h4>
<section><p>Attribute that records whether it was cleared. This is used
for testing that <a class="xref" href="../Lucene.Net/Lucene.Net.Util.AttributeSource.html#Lucene_Net_Util_AttributeSource_ClearAttributes">ClearAttributes()</a> was called correctly.</p>
</section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 Licensed to the Apache Software Foundation (ASF)
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="../../styles/docfx.vendor.js"></script>
<script type="text/javascript" src="../../styles/docfx.js"></script>
<script type="text/javascript" src="../../styles/main.js"></script>
</body>
</html>