<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Analysis.TokenAttributes
| Apache Lucene.NET 4.8.0-beta00013 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Analysis.TokenAttributes
| Apache Lucene.NET 4.8.0-beta00013 Documentation ">
<meta name="generator" content="docfx 2.56.2.0">
<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
<meta property="docfx:navrel" content="toc.html">
<meta property="docfx:tocrel" content="core/toc.html">
<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<span id="forkongithub"><a href="https://github.com/apache/lucenenet" target="_blank">Fork me on GitHub</a></span>
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="/">
<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search">
<ul class="level0 breadcrumb">
<li>
<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
<span id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</span>
</li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Analysis.TokenAttributes">
<h1 id="Lucene_Net_Analysis_TokenAttributes" data-uid="Lucene.Net.Analysis.TokenAttributes" class="text-break">Namespace Lucene.Net.Analysis.TokenAttributes
</h1>
<div class="markdown level0 summary"><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>General-purpose attributes for text analysis.</p>
</div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.CharTermAttribute.html">CharTermAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ICharTermAttribute.html">ICharTermAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.FlagsAttribute.html">FlagsAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IFlagsAttribute.html">IFlagsAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.KeywordAttribute.html">KeywordAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html">IKeywordAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.OffsetAttribute.html">OffsetAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IOffsetAttribute.html">IOffsetAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.PayloadAttribute.html">PayloadAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPayloadAttribute.html">IPayloadAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.PositionIncrementAttribute.html">PositionIncrementAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionIncrementAttribute.html">IPositionIncrementAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.PositionLengthAttribute.html">PositionLengthAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionLengthAttribute.html">IPositionLengthAttribute</a>. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.TypeAttribute.html">TypeAttribute</a></h4>
<section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITypeAttribute.html">ITypeAttribute</a>. </p>
</section>
<h3 id="interfaces">Interfaces
</h3>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ICharTermAttribute.html">ICharTermAttribute</a></h4>
<section><p>The term text of a <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a>.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IFlagsAttribute.html">IFlagsAttribute</a></h4>
<section><p>This attribute can be used to pass different flags down the <a class="xref" href="Lucene.Net.Analysis.Tokenizer.html">Tokenizer</a> chain,
e.g., from one TokenFilter to another.
<p>
This is completely distinct from <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.TypeAttribute.html">TypeAttribute</a>, although they do share similar purposes.
The flags can be used to encode information about the token for use by other
<a class="xref" href="Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a>s.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p> While we think this is here to stay, we may want to change it to be a long.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html">IKeywordAttribute</a></h4>
<section><p>This attribute can be used to mark a token as a keyword. Keyword-aware
<a class="xref" href="Lucene.Net.Analysis.TokenStream.html">TokenStream</a>s can check
<a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html#Lucene_Net_Analysis_TokenAttributes_IKeywordAttribute_IsKeyword">IsKeyword</a> to decide whether to modify a token. Stemming filters, for
instance, can use this attribute to skip a term when
<a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html#Lucene_Net_Analysis_TokenAttributes_IKeywordAttribute_IsKeyword">IsKeyword</a> is <code>true</code>.</p>
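<p>As a rough illustration of a keyword-aware filter, the sketch below upper-cases every term except those marked as keywords. The class name <code>UpperCaseUnlessKeywordFilter</code> is hypothetical and not part of Lucene.NET; the sketch assumes the <code>m_input</code> field exposed by <code>TokenFilter</code> in the 4.8.0 betas.</p>
<pre><code>using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical filter for illustration only: upper-cases each term
// unless an earlier component has marked the token as a keyword.
public sealed class UpperCaseUnlessKeywordFilter : TokenFilter
{
    private readonly ICharTermAttribute termAtt;
    private readonly IKeywordAttribute keywordAtt;

    public UpperCaseUnlessKeywordFilter(TokenStream input)
        : base(input)
    {
        termAtt = AddAttribute&lt;ICharTermAttribute&gt;();
        keywordAtt = AddAttribute&lt;IKeywordAttribute&gt;();
    }

    public override bool IncrementToken()
    {
        if (!m_input.IncrementToken())
        {
            return false; // no more tokens
        }

        if (!keywordAtt.IsKeyword)
        {
            // Modify the term in place in the attribute's reusable buffer.
            char[] buffer = termAtt.Buffer;
            int length = termAtt.Length;
            for (int i = 0; i &lt; length; i++)
            {
                buffer[i] = char.ToUpperInvariant(buffer[i]);
            }
        }
        return true;
    }
}</code></pre>
<p>A filter earlier in the chain would set <code>IsKeyword</code> to <code>true</code> for the tokens it wants to protect.</p>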
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IOffsetAttribute.html">IOffsetAttribute</a></h4>
<section><p>The start and end character offset of a <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a>.</p>
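<p>A minimal sketch of reading offsets together with the term text while consuming a stream. It assumes the <code>WhitespaceTokenizer</code> from the separate Lucene.Net.Analysis.Common package; the class name <code>TokenDump</code> and the sample text are made up for the example.</p>
<pre><code>using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class TokenDump
{
    public static void Main()
    {
        // Assumption: WhitespaceTokenizer comes from Lucene.Net.Analysis.Common.
        using (TokenStream ts = new WhitespaceTokenizer(
            LuceneVersion.LUCENE_48, new StringReader("quick brown fox")))
        {
            // Attribute instances are obtained once and reused for every token.
            ICharTermAttribute termAtt = ts.AddAttribute&lt;ICharTermAttribute&gt;();
            IOffsetAttribute offsetAtt = ts.AddAttribute&lt;IOffsetAttribute&gt;();

            ts.Reset();
            while (ts.IncrementToken())
            {
                Console.WriteLine(
                    $"{termAtt.ToString()} start={offsetAtt.StartOffset} end={offsetAtt.EndOffset}");
            }
            ts.End();
        }
    }
}</code></pre>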
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPayloadAttribute.html">IPayloadAttribute</a></h4>
<section><p>The payload of a Token.
<p>
The payload is stored in the index at each position, and can
be used to influence scoring when using Payload-based queries
in the <a class="xref" href="Lucene.Net.Search.Payloads.html">Lucene.Net.Search.Payloads</a> and
<a class="xref" href="Lucene.Net.Search.Spans.html">Lucene.Net.Search.Spans</a> namespaces.
<p>
NOTE: because the payload will be stored at each position, it is usually
best to use the minimum number of bytes necessary. Some codec implementations
may optimize payload storage when all payloads have the same length.</p>
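<p>The following sketch shows one way a filter might attach a small, fixed-length payload to every token, in line with the note above about keeping payloads short. <code>ConstantPayloadFilter</code> and its <code>weight</code> parameter are hypothetical names chosen for the example.</p>
<pre><code>using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical filter for illustration only: stores the same
// one-byte payload on every token that passes through.
public sealed class ConstantPayloadFilter : TokenFilter
{
    private readonly IPayloadAttribute payloadAtt;
    private readonly BytesRef payload;

    public ConstantPayloadFilter(TokenStream input, byte weight)
        : base(input)
    {
        payloadAtt = AddAttribute&lt;IPayloadAttribute&gt;();
        // A single byte keeps the per-position storage cost minimal.
        payload = new BytesRef(new byte[] { weight });
    }

    public override bool IncrementToken()
    {
        if (!m_input.IncrementToken())
        {
            return false;
        }
        payloadAtt.Payload = payload;
        return true;
    }
}</code></pre>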
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionIncrementAttribute.html">IPositionIncrementAttribute</a></h4>
<section><p>Determines the position of this token
relative to the previous <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a> in a <a class="xref" href="Lucene.Net.Analysis.TokenStream.html">TokenStream</a>, used in phrase
searching.</p>
<p>The default value is one.</p>
<p>Some common uses for this are:</p>
<ul>
<li>Set it to zero to put multiple terms in the same position. This is
useful if, e.g., a word has multiple stems. Searches for phrases
including either stem will match. In this case, all but the first stem&apos;s
increment should be set to zero: the increment of the first instance
should be one. Repeating a token with an increment of zero can also be
used to boost the scores of matches on that token.</li>
<li>Set it to values greater than one to inhibit exact phrase matches.
If, for example, one does not want phrases to match across removed stop
words, then one could build a stop word filter that removes stop words and
also increases the increment of each non-stop word by the number of stop
words removed before it. Then exact phrase queries will only match when the terms
occur with no intervening stop words.</li>
</ul>
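<p>The stop word case described above could be sketched roughly as follows. <code>GapPreservingStopFilter</code> is a hypothetical name, and this is a simplified illustration rather than the StopFilter shipped with Lucene.NET; a complete filter would also account for trailing removed tokens in <code>End()</code>.</p>
<pre><code>using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical filter for illustration only: drops stop words and
// leaves a position gap where they were removed.
public sealed class GapPreservingStopFilter : TokenFilter
{
    private readonly ISet&lt;string&gt; stopWords;
    private readonly ICharTermAttribute termAtt;
    private readonly IPositionIncrementAttribute posIncAtt;

    public GapPreservingStopFilter(TokenStream input, ISet&lt;string&gt; stopWords)
        : base(input)
    {
        this.stopWords = stopWords;
        termAtt = AddAttribute&lt;ICharTermAttribute&gt;();
        posIncAtt = AddAttribute&lt;IPositionIncrementAttribute&gt;();
    }

    public override bool IncrementToken()
    {
        int skippedPositions = 0;
        while (m_input.IncrementToken())
        {
            if (!stopWords.Contains(termAtt.ToString()))
            {
                // Add the positions of the removed tokens to this token's own increment.
                posIncAtt.PositionIncrement += skippedPositions;
                return true;
            }
            // Remember how many positions the removed token occupied.
            skippedPositions += posIncAtt.PositionIncrement;
        }
        return false;
    }
}</code></pre>
<p>With this in place, a token that follows two removed stop words ends up with an increment of three, so an exact phrase query cannot match across the gap.</p>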
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionLengthAttribute.html">IPositionLengthAttribute</a></h4>
<section><p>Determines how many positions this
token spans. Very few analyzer components actually
produce this attribute, and indexing ignores it, but
it&apos;s useful to express the graph structure naturally
produced by decompounding, word splitting/joining,
synonym filtering, etc.</p>
<p>NOTE: this is optional, and most analyzers
don&apos;t change the default value (1).</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html">ITermToBytesRefAttribute</a></h4>
<section><p>This attribute is requested by TermsHashPerField to index the contents.
This attribute can be used to customize the final byte[] encoding of terms.
<p>
Consumers of this attribute get <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html#Lucene_Net_Analysis_TokenAttributes_ITermToBytesRefAttribute_BytesRef">BytesRef</a> up front, and then
invoke <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html#Lucene_Net_Analysis_TokenAttributes_ITermToBytesRefAttribute_FillBytesRef">FillBytesRef()</a> for each term. Example:</p>
<pre><code>ITermToBytesRefAttribute termAtt = tokenStream.GetAttribute&lt;ITermToBytesRefAttribute&gt;();
BytesRef bytes = termAtt.BytesRef;
while (tokenStream.IncrementToken())
{
    // You must call termAtt.FillBytesRef() before doing something with the bytes.
    // This encodes the term value (internally it might be a char[], etc.) into the bytes.
    int hashCode = termAtt.FillBytesRef();
    if (IsInteresting(bytes))
    {
        // Because the bytes are reused by the attribute (like CharTermAttribute&apos;s char[] buffer),
        // you should make a copy if you need persistent access to the bytes; otherwise they will
        // be rewritten across calls to IncrementToken().
        DoSomethingWith(BytesRef.DeepCopyOf(bytes));
    }
}
...</code></pre>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p> This is a very expert API; please use
<a class="xref" href="Lucene.Net.Analysis.TokenAttributes.CharTermAttribute.html">CharTermAttribute</a> and its implementation of this attribute
for UTF-8 terms.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITypeAttribute.html">ITypeAttribute</a></h4>
<section><p>A <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a>&apos;s lexical type. The default value is &quot;word&quot;.</p>
</section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<div class="contribution">
<ul class="nav">
</ul>
</div>
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 The Apache Software Foundation, Licensed under the <a href='http://www.apache.org/licenses/LICENSE-2.0' target='_blank'>Apache License, Version 2.0</a><br> <small>Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation. <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</small>
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
</body>
</html>