| <!DOCTYPE html> |
| <!--[if IE]><![endif]--> |
| <html> |
| |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> |
| <title>Namespace Lucene.Net.Analysis.TokenAttributes |
| | Apache Lucene.NET 4.8.0-beta00010 Documentation </title> |
| <meta name="viewport" content="width=device-width"> |
| <meta name="title" content="Namespace Lucene.Net.Analysis.TokenAttributes |
| | Apache Lucene.NET 4.8.0-beta00010 Documentation "> |
| <meta name="generator" content="docfx 2.56.0.0"> |
| |
| <link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css"> |
| <meta property="docfx:navrel" content="toc.html"> |
| <meta property="docfx:tocrel" content="core/toc.html"> |
| |
| <meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/"> |
| |
| </head> |
| <body data-spy="scroll" data-target="#affix" data-offset="120"> |
| <div id="wrapper"> |
| <header> |
| |
| <nav id="autocollapse" class="navbar ng-scope" role="navigation"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| |
| <a class="navbar-brand" href="/"> |
| <img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt=""> |
| </a> |
| </div> |
| <div class="collapse navbar-collapse" id="navbar"> |
| <form class="navbar-form navbar-right" role="search" id="search"> |
| <div class="form-group"> |
| <input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off"> |
| </div> |
| </form> |
| </div> |
| </div> |
| </nav> |
| |
| <div class="subnav navbar navbar-default"> |
| <div class="container hide-when-search"> |
| <ul class="level0 breadcrumb"> |
| <li> |
| <a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a> |
| <span id="breadcrumb"> |
| <ul class="breadcrumb"> |
| <li></li> |
| </ul> |
| </span> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </header> |
| <div class="container body-content"> |
| |
| <div id="search-results"> |
| <div class="search-list"></div> |
| <div class="sr-items"> |
| <p><i class="glyphicon glyphicon-refresh index-loading"></i></p> |
| </div> |
| <ul id="pagination"></ul> |
| </div> |
| </div> |
| <div role="main" class="container body-content hide-when-search"> |
| |
| <div class="sidenav hide-when-search"> |
| <a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a> |
| <div class="sidetoggle collapse" id="sidetoggle"> |
| <div id="sidetoc"></div> |
| </div> |
| </div> |
| <div class="article row grid-right"> |
| <div class="col-md-10"> |
| <article class="content wrap" id="_content" data-uid="Lucene.Net.Analysis.TokenAttributes"> |
| |
| <h1 id="Lucene_Net_Analysis_TokenAttributes" data-uid="Lucene.Net.Analysis.TokenAttributes" class="text-break">Namespace Lucene.Net.Analysis.TokenAttributes |
| </h1> |
| <div class="markdown level0 summary"><!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <p>General-purpose attributes for text analysis.</p> |
| </div> |
| <div class="markdown level0 conceptual"></div> |
| <div class="markdown level0 remarks"></div> |
| <h3 id="classes">Classes |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.CharTermAttribute.html">CharTermAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ICharTermAttribute.html">ICharTermAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.FlagsAttribute.html">FlagsAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IFlagsAttribute.html">IFlagsAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.KeywordAttribute.html">KeywordAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html">IKeywordAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.OffsetAttribute.html">OffsetAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IOffsetAttribute.html">IOffsetAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.PayloadAttribute.html">PayloadAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPayloadAttribute.html">IPayloadAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.PositionIncrementAttribute.html">PositionIncrementAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionIncrementAttribute.html">IPositionIncrementAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.PositionLengthAttribute.html">PositionLengthAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionLengthAttribute.html">IPositionLengthAttribute</a>. </p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.TypeAttribute.html">TypeAttribute</a></h4> |
| <section><p>Default implementation of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITypeAttribute.html">ITypeAttribute</a>. </p> |
| </section> |
| <h3 id="interfaces">Interfaces |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ICharTermAttribute.html">ICharTermAttribute</a></h4> |
| <section><p>The term text of a <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a>.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IFlagsAttribute.html">IFlagsAttribute</a></h4> |
| <section><p>This attribute can be used to pass flags down the <a class="xref" href="Lucene.Net.Analysis.Tokenizer.html">Tokenizer</a> chain, |
| e.g. from one TokenFilter to another.</p> |
| <p> |
| It is completely distinct from <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.TypeAttribute.html">TypeAttribute</a>, although the two serve similar purposes. |
| The flags can be used to encode information about the token for use by other |
| <a class="xref" href="Lucene.Net.Analysis.TokenFilter.html">TokenFilter</a>s.</p> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>While we think this attribute is here to stay, we may want to change the flags to a long.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html">IKeywordAttribute</a></h4> |
| <section><p>This attribute can be used to mark a token as a keyword. Keyword-aware |
| <a class="xref" href="Lucene.Net.Analysis.TokenStream.html">TokenStream</a>s can decide whether to modify a token based on the value |
| of <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html#Lucene_Net_Analysis_TokenAttributes_IKeywordAttribute_IsKeyword">IsKeyword</a>. Stemming filters, for |
| instance, can use this attribute to skip a term when |
| <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute.html#Lucene_Net_Analysis_TokenAttributes_IKeywordAttribute_IsKeyword">IsKeyword</a> is <code>true</code>.</p> |
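| <p>As a rough illustration of the stemming case, the sketch below leaves keyword tokens untouched. <code>NaivePluralStripFilter</code> and its trailing-'s' rule are hypothetical and only stand in for a real stemmer. The sketch assumes the Lucene.NET 4.8 <code>TokenFilter</code> members <code>m_input</code> and <code>AddAttribute&lt;T&gt;()</code>.</p> |
| <pre><code>using Lucene.Net.Analysis; |
| using Lucene.Net.Analysis.TokenAttributes; |
| |
| // Hypothetical filter, for illustration only: strips one trailing 's' |
| // from each term unless the token has been marked as a keyword. |
| public sealed class NaivePluralStripFilter : TokenFilter |
| { |
|     private readonly ICharTermAttribute termAtt; |
|     private readonly IKeywordAttribute keywordAtt; |
| |
|     public NaivePluralStripFilter(TokenStream input) |
|         : base(input) |
|     { |
|         termAtt = AddAttribute&lt;ICharTermAttribute&gt;(); |
|         keywordAtt = AddAttribute&lt;IKeywordAttribute&gt;(); |
|     } |
| |
|     public override bool IncrementToken() |
|     { |
|         if (!m_input.IncrementToken()) |
|         { |
|             return false; // end of stream |
|         } |
| |
|         int length = termAtt.Length; |
|         // Keyword tokens pass through unchanged. |
|         if (!keywordAtt.IsKeyword &amp;&amp; length &gt; 1 &amp;&amp; termAtt.Buffer[length - 1] == 's') |
|         { |
|             termAtt.SetLength(length - 1); |
|         } |
|         return true; |
|     } |
| }</code></pre> |
| <p>For this to have any effect, an earlier component in the chain must set <code>IsKeyword</code> to <code>true</code> on the tokens that should be protected.</p> |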
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IOffsetAttribute.html">IOffsetAttribute</a></h4> |
| <section><p>The start and end character offset of a <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a>.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPayloadAttribute.html">IPayloadAttribute</a></h4> |
| <section><p>The payload of a Token. |
| <p> |
| The payload is stored in the index at each position, and can |
| be used to influence scoring when using Payload-based queries |
| in the <a class="xref" href="Lucene.Net.Search.Payloads.html">Lucene.Net.Search.Payloads</a> and |
| <a class="xref" href="Lucene.Net.Search.Spans.html">Lucene.Net.Search.Spans</a> namespaces. |
| <p> |
| NOTE: because the payload will be stored at each position, it's usually |
| best to use the minimum number of bytes necessary. Some codec implementations |
| may optimize payload storage when all payloads have the same length.</p> |
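| <p>A minimal sketch of keeping payloads small: the hypothetical <code>SingleBytePayloadFilter</code> below attaches the same one-byte payload to every token, so all payloads also have the same length. The class name and the <code>marker</code> parameter are assumptions made for illustration. The sketch relies on <code>IPayloadAttribute.Payload</code> and the <code>BytesRef(byte[])</code> constructor.</p> |
| <pre><code>using Lucene.Net.Analysis; |
| using Lucene.Net.Analysis.TokenAttributes; |
| using Lucene.Net.Util; |
| |
| // Hypothetical filter, for illustration only: stores a single marker byte |
| // as the payload of every token that passes through. |
| public sealed class SingleBytePayloadFilter : TokenFilter |
| { |
|     private readonly IPayloadAttribute payloadAtt; |
|     private readonly BytesRef payload; |
| |
|     public SingleBytePayloadFilter(TokenStream input, byte marker) |
|         : base(input) |
|     { |
|         payloadAtt = AddAttribute&lt;IPayloadAttribute&gt;(); |
|         payload = new BytesRef(new byte[] { marker }); // one byte per position |
|     } |
| |
|     public override bool IncrementToken() |
|     { |
|         if (!m_input.IncrementToken()) |
|         { |
|             return false; // end of stream |
|         } |
| |
|         payloadAtt.Payload = payload; |
|         return true; |
|     } |
| }</code></pre> |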
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionIncrementAttribute.html">IPositionIncrementAttribute</a></h4> |
| <section><p>Determines the position of this token |
| relative to the previous <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a> in a <a class="xref" href="Lucene.Net.Analysis.TokenStream.html">TokenStream</a>, used in phrase |
| searching.</p> |
| <p>The default value is one.</p> |
| <p>Some common uses for this are:</p> |
| <ul><li>Set it to zero to put multiple terms in the same position. This is |
| useful if, e.g., a word has multiple stems. Searches for phrases |
| including either stem will match. In this case, all but the first stem's |
| increment should be set to zero: the increment of the first instance |
| should be one. Repeating a token with an increment of zero can also be |
| used to boost the scores of matches on that token.</li><li>Set it to values greater than one to inhibit exact phrase matches. |
| If, for example, one does not want phrases to match across removed stop |
| words, then one could build a stop word filter that removes stop words and |
| also sets the increment to the number of stop words removed before each |
| non-stop word. Then exact phrase queries will only match when the terms |
| occur with no intervening stop words.</li></ul> |
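| <p>The two uses above can be worked through with plain arithmetic. The sketch below does not use Lucene.NET at all. <code>PositionDemo</code> and <code>ToPositions</code> are hypothetical names, and the sketch assumes the running position starts at -1 so that a first token with an increment of one lands at position 0.</p> |
| <pre><code>using System; |
| using System.Collections.Generic; |
| |
| public static class PositionDemo |
| { |
|     // Converts per-token position increments into absolute positions. |
|     // Assumption: the running position starts at -1. |
|     public static int[] ToPositions(IReadOnlyList&lt;int&gt; increments) |
|     { |
|         var positions = new int[increments.Count]; |
|         int position = -1; |
|         for (int i = 0; i &lt; increments.Count; i++) |
|         { |
|             position += increments[i]; |
|             positions[i] = position; |
|         } |
|         return positions; |
|     } |
| |
|     public static void Main() |
|     { |
|         // "fast" is a synonym stacked on "quick" (increment 0); |
|         // "fox" follows one removed stop word (increment 2). |
|         string[] terms = { "quick", "fast", "fox" }; |
|         int[] increments = { 1, 0, 2 }; |
|         int[] positions = ToPositions(increments); |
| |
|         for (int i = 0; i &lt; terms.Length; i++) |
|         { |
|             Console.WriteLine($"{terms[i]} @ {positions[i]}"); |
|         } |
|     } |
| }</code></pre> |
| <p>Here "quick" and "fast" share position 0, so a phrase query matches with either term, while "fox" sits at position 2, so an exact phrase "quick fox" does not match across the gap.</p> |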
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.IPositionLengthAttribute.html">IPositionLengthAttribute</a></h4> |
| <section><p>Determines how many positions this |
| token spans. Very few analyzer components actually |
| produce this attribute, and indexing ignores it, but |
| it's useful to express the graph structure naturally |
| produced by decompounding, word splitting/joining, |
| synonym filtering, etc.</p> |
| <p>NOTE: this is optional, and most analyzers |
| don't change the default value (1).</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html">ITermToBytesRefAttribute</a></h4> |
| <section><p>This attribute is requested by TermsHashPerField to index the contents. |
| This attribute can be used to customize the final byte[] encoding of terms. |
| <p> |
| Consumers of this attribute read <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html#Lucene_Net_Analysis_TokenAttributes_ITermToBytesRefAttribute_BytesRef">BytesRef</a> up front, and then |
| invoke <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute.html#Lucene_Net_Analysis_TokenAttributes_ITermToBytesRefAttribute_FillBytesRef">FillBytesRef()</a> for each term. Example:</p> |
| <pre><code>ITermToBytesRefAttribute termAtt = tokenStream.GetAttribute&lt;ITermToBytesRefAttribute&gt;(); |
| BytesRef bytes = termAtt.BytesRef; |
| |
| while (tokenStream.IncrementToken()) |
| { |
|     // You must call termAtt.FillBytesRef() before doing something with the bytes. |
|     // This encodes the term value (internally it might be a char[], etc.) into the bytes. |
|     termAtt.FillBytesRef(); |
| |
|     if (IsInteresting(bytes)) |
|     { |
|         // Because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer), |
|         // you should make a copy if you need persistent access to the bytes; otherwise they will |
|         // be rewritten across calls to IncrementToken(). |
| |
|         DoSomethingWith(BytesRef.DeepCopyOf(bytes)); |
|     } |
| } |
| ...</code></pre> |
| <div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div><p>This is a very expert API; please use |
| <a class="xref" href="Lucene.Net.Analysis.TokenAttributes.CharTermAttribute.html">CharTermAttribute</a> and its implementation of this method |
| for UTF-8 terms.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.TokenAttributes.ITypeAttribute.html">ITypeAttribute</a></h4> |
| <section><p>A <a class="xref" href="Lucene.Net.Analysis.Token.html">Token</a>'s lexical type. The default value is "word".</p> |
| </section> |
| </article> |
| </div> |
| |
| <div class="hidden-sm col-md-2" role="complementary"> |
| <div class="sideaffix"> |
| <nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix"> |
| <!-- <p><a class="back-to-top" href="#top">Back to top</a><p> --> |
| </nav> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <footer> |
| <div class="grad-bottom"></div> |
| <div class="footer"> |
| <div class="container"> |
| <span class="pull-right"> |
| <a href="#top">Back to top</a> |
| </span> |
| Copyright © 2020 Licensed to the Apache Software Foundation (ASF) |
| |
| </div> |
| </div> |
| </footer> |
| </div> |
| |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script> |
| </body> |
| </html> |