| <!DOCTYPE html> |
| <!--[if IE]><![endif]--> |
| <html> |
| |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> |
| <title>Namespace Lucene.Net.Analysis.Pt |
| | Apache Lucene.NET 4.8.0-beta00011 Documentation </title> |
| <meta name="viewport" content="width=device-width"> |
| <meta name="title" content="Namespace Lucene.Net.Analysis.Pt |
| | Apache Lucene.NET 4.8.0-beta00011 Documentation "> |
| <meta name="generator" content="docfx 2.56.0.0"> |
| |
| <link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css"> |
| <link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css"> |
| <meta property="docfx:navrel" content="toc.html"> |
| <meta property="docfx:tocrel" content="analysis-common/toc.html"> |
| |
| <meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/"> |
| |
| </head> |
| <body data-spy="scroll" data-target="#affix" data-offset="120"> |
| <div id="wrapper"> |
| <header> |
| |
| <nav id="autocollapse" class="navbar ng-scope" role="navigation"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| |
| <a class="navbar-brand" href="/"> |
| <img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt=""> |
| </a> |
| </div> |
| <div class="collapse navbar-collapse" id="navbar"> |
| <form class="navbar-form navbar-right" role="search" id="search"> |
| <div class="form-group"> |
| <input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off"> |
| </div> |
| </form> |
| </div> |
| </div> |
| </nav> |
| |
| <div class="subnav navbar navbar-default"> |
| <div class="container hide-when-search"> |
| <ul class="level0 breadcrumb"> |
| <li> |
| <a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a> |
| <span id="breadcrumb"> |
| <ul class="breadcrumb"> |
| <li></li> |
| </ul> |
| </span> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </header> |
| <div class="container body-content"> |
| |
| <div id="search-results"> |
| <div class="search-list"></div> |
| <div class="sr-items"> |
| <p><i class="glyphicon glyphicon-refresh index-loading"></i></p> |
| </div> |
| <ul id="pagination"></ul> |
| </div> |
| </div> |
| <div role="main" class="container body-content hide-when-search"> |
| |
| <div class="sidenav hide-when-search"> |
| <a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a> |
| <div class="sidetoggle collapse" id="sidetoggle"> |
| <div id="sidetoc"></div> |
| </div> |
| </div> |
| <div class="article row grid-right"> |
| <div class="col-md-10"> |
| <article class="content wrap" id="_content" data-uid="Lucene.Net.Analysis.Pt"> |
| |
| <h1 id="Lucene_Net_Analysis_Pt" data-uid="Lucene.Net.Analysis.Pt" class="text-break">Namespace Lucene.Net.Analysis.Pt |
| </h1> |
| <div class="markdown level0 summary"><!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <p>Analyzer for Portuguese.</p> |
| </div> |
| <div class="markdown level0 conceptual"></div> |
| <div class="markdown level0 remarks"></div> |
| <h3 id="classes">Classes |
| </h3> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseAnalyzer.html">PortugueseAnalyzer</a></h4> |
| <section><p><span class="xref">Lucene.Net.Analysis.Analyzer</span> for Portuguese. |
| <p>You must specify the required <span class="xref">Lucene.Net.Util.LuceneVersion</span> |
| compatibility when creating <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseAnalyzer.html">PortugueseAnalyzer</a>: |
| <ul><li> As of 3.6, PortugueseLightStemFilter is used for less aggressive stemming.</li></ul> |
| </p></p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseLightStemFilter.html">PortugueseLightStemFilter</a></h4> |
| <section><p>A <span class="xref">Lucene.Net.Analysis.TokenFilter</span> that applies <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseLightStemmer.html">PortugueseLightStemmer</a> to stem |
| Portuguese words. |
| <p> |
| To prevent terms from being stemmed use an instance of |
| <a class="xref" href="Lucene.Net.Analysis.Miscellaneous.SetKeywordMarkerFilter.html">SetKeywordMarkerFilter</a> or a custom <span class="xref">Lucene.Net.Analysis.TokenFilter</span> that sets |
| the <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Analysis.TokenAttributes.KeywordAttribute.html">KeywordAttribute</a> before this <span class="xref">Lucene.Net.Analysis.TokenStream</span>. |
| </p></p> |
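| <p>The keyword-protection idea above can be illustrated with a minimal, language-neutral sketch. It is not the Lucene.NET API: <code>mark_keywords</code>, <code>stem_filter</code>, and <code>toy_stem</code> are hypothetical names, the keyword flag is modeled as a plain boolean paired with each term, and the toy stemmer merely strips a trailing "s".</p> |

```python
from typing import Callable, Iterable, Iterator, Set, Tuple

# A token is modeled as (term, is_keyword); this is an assumption of the
# sketch, standing in for a term plus its keyword attribute.
Token = Tuple[str, bool]


def mark_keywords(terms: Iterable[str], protected: Set[str]) -> Iterator[Token]:
    """Hypothetical marker stage: flag every term found in the protected set."""
    for term in terms:
        yield term, term in protected


def stem_filter(tokens: Iterable[Token], stem: Callable[[str], str]) -> Iterator[str]:
    """Hypothetical stem stage: leave flagged terms untouched, stem the rest."""
    for term, is_keyword in tokens:
        yield term if is_keyword else stem(term)


def toy_stem(term: str) -> str:
    """Toy stand-in for a real stemmer: drop one trailing 's' from longer words."""
    return term[:-1] if term.endswith("s") and len(term) > 2 else term


terms = ["livros", "lápis", "casas"]
# The marker stage must run before the stem stage, as the text above requires.
result = list(stem_filter(mark_keywords(terms, {"lápis"}), toy_stem))
print(result)
```

| <p>The ordering matters: the stem stage can only skip a term if an earlier stage has already flagged it.</p> |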
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseLightStemFilterFactory.html">PortugueseLightStemFilterFactory</a></h4> |
| <section><p>Factory for <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseLightStemFilter.html">PortugueseLightStemFilter</a>.</p> |
| <pre><code><fieldType name="text_ptlgtstem" class="solr.TextField" positionIncrementGap="100"> |
| <analyzer> |
| <tokenizer class="solr.StandardTokenizerFactory"/> |
| <filter class="solr.LowerCaseFilterFactory"/> |
| <filter class="solr.PortugueseLightStemFilterFactory"/> |
| </analyzer> |
| </fieldType></code></pre> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseLightStemmer.html">PortugueseLightStemmer</a></h4> |
| <section><p>Light stemmer for Portuguese. |
| <p> |
| This stemmer implements the "UniNE" algorithm described in |
| <code>Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages</code> |
| by Jacques Savoy. |
| </p></p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseMinimalStemFilter.html">PortugueseMinimalStemFilter</a></h4> |
| <section><p>A <span class="xref">Lucene.Net.Analysis.TokenFilter</span> that applies <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseMinimalStemmer.html">PortugueseMinimalStemmer</a> to stem |
| Portuguese words. |
| <p> |
| To prevent terms from being stemmed use an instance of |
| <a class="xref" href="Lucene.Net.Analysis.Miscellaneous.SetKeywordMarkerFilter.html">SetKeywordMarkerFilter</a> or a custom <span class="xref">Lucene.Net.Analysis.TokenFilter</span> that sets |
| the <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Analysis.TokenAttributes.KeywordAttribute.html">KeywordAttribute</a> before this <span class="xref">Lucene.Net.Analysis.TokenStream</span>. |
| </p></p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseMinimalStemFilterFactory.html">PortugueseMinimalStemFilterFactory</a></h4> |
| <section><p>Factory for <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseMinimalStemFilter.html">PortugueseMinimalStemFilter</a>.</p> |
| <pre><code><fieldType name="text_ptminstem" class="solr.TextField" positionIncrementGap="100"> |
| <analyzer> |
| <tokenizer class="solr.StandardTokenizerFactory"/> |
| <filter class="solr.LowerCaseFilterFactory"/> |
| <filter class="solr.PortugueseMinimalStemFilterFactory"/> |
| </analyzer> |
| </fieldType></code></pre> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseMinimalStemmer.html">PortugueseMinimalStemmer</a></h4> |
| <section><p>Minimal stemmer for Portuguese. |
| <p> |
| This follows the "RSLP-S" algorithm presented in |
| <code>A study on the Use of Stemming for Monolingual Ad-Hoc Portuguese |
| Information Retrieval</code> (Orengo et al.), |
| which is just the plural reduction step of the RSLP |
| algorithm from <code>A Stemming Algorithm for the Portuguese Language</code> |
| (Orengo et al.). |
| </p></p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseStemFilter.html">PortugueseStemFilter</a></h4> |
| <section><p>A <span class="xref">Lucene.Net.Analysis.TokenFilter</span> that applies <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseStemmer.html">PortugueseStemmer</a> to stem |
| Portuguese words. |
| <p> |
| To prevent terms from being stemmed use an instance of |
| <a class="xref" href="Lucene.Net.Analysis.Miscellaneous.SetKeywordMarkerFilter.html">SetKeywordMarkerFilter</a> or a custom <span class="xref">Lucene.Net.Analysis.TokenFilter</span> that sets |
| the <a class="xref" href="http://localhost:8080/api/core/Lucene.Net.Analysis.TokenAttributes.KeywordAttribute.html">KeywordAttribute</a> before this <span class="xref">Lucene.Net.Analysis.TokenStream</span>. |
| </p></p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseStemFilterFactory.html">PortugueseStemFilterFactory</a></h4> |
| <section><p>Factory for <a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseStemFilter.html">PortugueseStemFilter</a>. </p> |
| <pre><code><fieldType name="text_ptstem" class="solr.TextField" positionIncrementGap="100"> |
| <analyzer> |
| <tokenizer class="solr.StandardTokenizerFactory"/> |
| <filter class="solr.LowerCaseFilterFactory"/> |
| <filter class="solr.PortugueseStemFilterFactory"/> |
| </analyzer> |
| </fieldType></code></pre> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.PortugueseStemmer.html">PortugueseStemmer</a></h4> |
| <section><p>Portuguese stemmer implementing the RSLP (Removedor de Sufixos da Língua Portuguesa) |
| algorithm. This is sometimes also referred to as the Orengo stemmer.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.html">RSLPStemmerBase</a></h4> |
| <section><p>Base class for stemmers that use a set of RSLP-like stemming steps. |
| <p> |
| RSLP (Removedor de Sufixos da Língua Portuguesa) is an algorithm originally |
| designed for stemming the Portuguese language, described in the paper |
| <code>A Stemming Algorithm for the Portuguese Language</code> by Orengo et al. |
| </p> |
| <p> |
| Since then, a plural-only modification (RSLP-S) and a modification |
| for the Galician language have been implemented. This class parses a configuration |
| file that describes <a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.Step.html">RSLPStemmerBase.Step</a>s, where each <a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.Step.html">RSLPStemmerBase.Step</a> contains a set of <a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.Rule.html">RSLPStemmerBase.Rule</a>s. |
| </p> |
| <p> |
| The general rule format is: </p> |
| <pre><code>{ "suffix", N, "replacement", { "exception1", "exception2", ...}}</code></pre> |
| <p>where: |
| <ul><li><code>suffix</code> is the suffix to be removed (such as "inho").</li><li><code>N</code> is the minimum stem size, where the stem is the candidate stem |
| after removing the suffix (but before appending the replacement).</li><li><code>replacement</code> is an optional string to append after removing the suffix. |
| This can be the empty string.</li><li><code>exceptions</code> is an optional list of exceptions: patterns that should |
| not be stemmed. These patterns can be specified as whole-word or suffix (ends-with) |
| patterns, depending on the exceptions format flag in the step header.</li></ul> |
| </p> |
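| <p>As a rough illustration of the rule format above, the following sketch applies a single rule to a word. It is a minimal model and not the Lucene.NET implementation: the <code>Rule</code> class and its <code>apply</code> method are hypothetical, the minimum stem size is assumed to be inclusive, and the sample rules (including the value of <code>N</code> and the exception "caminho") are made up for the example rather than taken from the real rule file.</p> |

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical model of: { "suffix", N, "replacement", { exceptions... } }."""
    suffix: str
    min_stem_size: int                 # N: minimum length of the candidate stem
    replacement: str = ""              # appended after the suffix is removed
    exceptions: Tuple[str, ...] = ()   # patterns that must not be stemmed

    def apply(self, word: str, whole_word_exceptions: bool = True) -> str:
        if not word.endswith(self.suffix):
            return word
        # Candidate stem: the word minus the suffix, before the replacement.
        stem = word[: len(word) - len(self.suffix)]
        if len(stem) < self.min_stem_size:   # assumption: N is inclusive
            return word
        if whole_word_exceptions:
            excluded = word in self.exceptions
        else:
            excluded = any(word.endswith(e) for e in self.exceptions)
        return word if excluded else stem + self.replacement


# Illustrative rules only; not copied from the actual RSLP configuration.
diminutive = Rule("inho", 3, "", ("caminho",))
plural = Rule("ões", 3, "ão")

print(diminutive.apply("carrinho"))
print(diminutive.apply("caminho"))
print(plural.apply("canções"))
```

| <p>Note that the size check is made against the stem before the replacement is appended, as the description of <code>N</code> states.</p> |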
| <p> |
| A step is an ordered list of rules, with a structure in this format: |
| <blockquote>{ "name", N, B, { "cond1", "cond2", ... } |
| ... rules ... }; |
| </blockquote> |
| where: |
| <ul><li><code>name</code> is a name for the step (such as "Plural").</li><li><code>N</code> is the minimum word size. Words shorter than this length bypass |
| the step completely, as an optimization. Note: N can be zero, in which case this |
| implementation automatically calculates the appropriate value from the underlying |
| rules.</li><li><code>B</code> is a "boolean" flag specifying how exceptions in the rules are matched. |
| A value of 1 indicates whole-word pattern matching; a value of 0 indicates that |
| exceptions are actually suffixes and should be matched with ends-with.</li><li><code>conds</code> is an optional list of conditions for entering the step at all. If |
| the list is non-empty, a word must end with one of these conditions or it will |
| bypass the step completely, as an optimization.</li></ul> |
| </p> |
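| <p>The step structure described above can be sketched in the same spirit. This is an illustrative model under stated assumptions, not the Lucene.NET code: the <code>Rule</code> and <code>Step</code> classes are hypothetical, the first matching rule is assumed to win, a zero <code>N</code> is assumed to resolve to the smallest "minimum stem size plus suffix length" across the rules, and the sample rules are invented for the example.</p> |

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: suffix, minimum stem size, replacement, exceptions."""
    suffix: str
    min_stem_size: int
    replacement: str = ""
    exceptions: Tuple[str, ...] = ()


@dataclass
class Step:
    """Hypothetical model of: { "name", N, B, { conds... } ... rules ... };"""
    name: str
    min_word_size: int            # N; 0 means "derive it from the rules"
    whole_word_exceptions: bool   # B; True = whole word, False = ends-with
    conditions: Tuple[str, ...]   # conds; empty means no entry condition
    rules: Tuple[Rule, ...]

    def __post_init__(self) -> None:
        # Assumption: a zero N becomes the shortest word any rule could match.
        if self.min_word_size == 0 and self.rules:
            self.min_word_size = min(
                r.min_stem_size + len(r.suffix) for r in self.rules
            )

    def _is_exception(self, rule: Rule, word: str) -> bool:
        if self.whole_word_exceptions:
            return word in rule.exceptions
        return any(word.endswith(e) for e in rule.exceptions)

    def apply(self, word: str) -> str:
        if len(word) < self.min_word_size:
            return word                       # too short: bypass the step
        if self.conditions and not word.endswith(self.conditions):
            return word                       # entry condition not met
        for rule in self.rules:               # rules are tried in order
            stem_len = len(word) - len(rule.suffix)
            if (word.endswith(rule.suffix)
                    and stem_len >= rule.min_stem_size
                    and not self._is_exception(rule, word)):
                return word[:stem_len] + rule.replacement
        return word


# Illustrative rules only; not copied from the actual RSLP configuration.
plural_step = Step("Plural", 0, True, ("s",),
                   (Rule("ões", 3, "ão"), Rule("s", 2, "", ("lápis",))))

for w in ["canções", "casas", "lápis", "as", "carro"]:
    print(w, "->", plural_step.apply(w))
```

| <p>Both bypass checks run before any rule is examined, which is what makes them an optimization rather than part of the stemming logic.</p> |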
| <a href="http://www.inf.ufrgs.br/~viviane/rslp/index.htm">RSLP description</a></p> |
| <div class="lucene-block lucene-internal">This is a Lucene.NET INTERNAL API, use at your own risk</div></section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.Rule.html">RSLPStemmerBase.Rule</a></h4> |
| <section><p>A basic rule, with no exceptions.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.RuleWithSetExceptions.html">RSLPStemmerBase.RuleWithSetExceptions</a></h4> |
| <section><p>A rule with a set of whole-word exceptions.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.RuleWithSuffixExceptions.html">RSLPStemmerBase.RuleWithSuffixExceptions</a></h4> |
| <section><p>A rule with a set of exceptional suffixes.</p> |
| </section> |
| <h4><a class="xref" href="Lucene.Net.Analysis.Pt.RSLPStemmerBase.Step.html">RSLPStemmerBase.Step</a></h4> |
| <section><p>A step containing a list of rules.</p> |
| </section> |
| </article> |
| </div> |
| |
| <div class="hidden-sm col-md-2" role="complementary"> |
| <div class="sideaffix"> |
| <div class="contribution"> |
| <ul class="nav"> |
| <li> |
| <a href="https://github.com/apache/lucenenet/blob/docs/4.8.0-beta00011/src/Lucene.Net.Analysis.Common/Analysis/Pt/package.md/#L2" class="contribution-link">Improve this Doc</a> |
| </li> |
| </ul> |
| </div> |
| <nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix"> |
| <!-- <p><a class="back-to-top" href="#top">Back to top</a><p> --> |
| </nav> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <footer> |
| <div class="grad-bottom"></div> |
| <div class="footer"> |
| <div class="container"> |
| <span class="pull-right"> |
| <a href="#top">Back to top</a> |
| </span> |
| Copyright © 2020 Licensed to the Apache Software Foundation (ASF) |
| |
| </div> |
| </div> |
| </footer> |
| </div> |
| |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script> |
| <script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script> |
| </body> |
| </html> |