<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia Site Renderer 1.11.1 from src/site/apt/matcher_def.apt.vm at 2024-05-14
| Rendered using Apache Maven Default Skin
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="generator" content="Apache Maven Doxia Site Renderer 1.11.1" />
<title>Apache Rat&trade; &#x2013; How to define new Matchers</title>
<link rel="stylesheet" href="./css/maven-base.css" />
<link rel="stylesheet" href="./css/maven-theme.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<link href="https://creadur.apache.org/font/matesc.css" type="text/css" rel="stylesheet" />
</head>
<body class="composite">
<div id="banner">
<a href="https://www.apache.org/" id="bannerLeft"><img src="https://www.apache.org/img/asf_logo.png" alt="The Apache Software Foundation" title="The Apache Software Foundation"/></a> <div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xleft">
<span id="publishDate">Last Published: 2024-05-14</span>
| <span id="projectVersion">Version: 0.17-SNAPSHOT</span>
| <a href="https://www.apache.org/" class="externalLink" title="Apache">Apache</a> &gt;
<a href="https://creadur.apache.org/" class="externalLink" title="Creadur">Creadur</a> &gt;
<a href="https://creadur.apache.org/rat/" class="externalLink" title="Rat">Rat</a> &gt;
How to define new Matchers
</div>
<div class="xright"> </div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>Apache Rat&trade;</h5>
<ul>
<li class="none"><a href="index.html" title="Introducing Rat">Introducing Rat</a></li>
<li class="none"><a href="apidocs/index.html" title="Javadocs">Javadocs</a></li>
<li class="none"><a href="download_rat.cgi" title="Downloads">Downloads</a></li>
<li class="none"><a href="RELEASE_NOTES.txt" title="Changes">Changes</a></li>
</ul>
<h5>Running Rat</h5>
<ul>
<li class="none"><a href="apache-rat/index.html" title="From The Command Line">From The Command Line</a></li>
<li class="none"><a href="apache-rat-tasks/index.html" title="With Ant">With Ant</a></li>
<li class="none"><a href="apache-rat-plugin/index.html" title="With Maven">With Maven</a></li>
</ul>
<h5>Apache Creadur&trade;</h5>
<ul>
<li class="none"><a href="https://creadur.apache.org" class="externalLink" title="Creadur Project Home">Creadur Project Home</a></li>
<li class="none"><a href="https://creadur.apache.org/tentacles" class="externalLink" title="Apache Tentacles">Apache Tentacles</a></li>
<li class="none"><a href="https://creadur.apache.org/whisker" class="externalLink" title="Apache Whisker">Apache Whisker</a></li>
<li class="none"><a href="https://www.apache.org/security/" class="externalLink" title="Security">Security</a></li>
<li class="none"><a href="https://www.apache.org/licenses/" class="externalLink" title="License">License</a></li>
<li class="none"><a href="https://privacy.apache.org/policies/privacy-policy-public.html" class="externalLink" title="Privacy">Privacy</a></li>
<li class="none"><a href="https://www.apache.org/foundation/sponsorship.html" class="externalLink" title="Sponsorship">Sponsorship</a></li>
<li class="none"><a href="https://www.apache.org/foundation/thanks.html" class="externalLink" title="Thanks">Thanks</a></li>
</ul>
<h5>The Apache Software Foundation</h5>
<ul>
<li class="none"><a href="https://www.apache.org/foundation" class="externalLink" title="About the Foundation">About the Foundation</a></li>
<li class="none"><a href="https://projects.apache.org" class="externalLink" title="The projects">The projects</a></li>
<li class="none"><a href="https://people.apache.org" class="externalLink" title="The people">The people</a></li>
<li class="none"><a href="https://www.apache.org/foundation/how-it-works.html" class="externalLink" title="How we work">How we work</a></li>
<li class="none"><a href="https://www.apache.org/foundation/how-it-works.html#history" class="externalLink" title="Our history">Our history</a></li>
<li class="none"><a href="https://blogs.apache.org/foundation/" class="externalLink" title="News">News</a></li>
</ul>
<h5>Contribute</h5>
<ul>
<li class="none"><a href="https://www.apache.org/foundation/getinvolved.html" class="externalLink" title="Get Involved">Get Involved</a></li>
</ul>
<h5>Committer Info</h5>
<ul>
<li class="none"><a href="https://www.apache.org/dev/committers.html" class="externalLink" title="ASF Committers' FAQ">ASF Committers' FAQ</a></li>
<li class="none"><a href="https://www.apache.org/dev/new-committers-guide.html" class="externalLink" title="New Committers Guide">New Committers Guide</a></li>
<li class="none"><a href="site-publish.html" title="Howto publish this site">Howto publish this site</a></li>
<li class="none"><a href="https://community.apache.org/" class="externalLink" title="Community">Community</a></li>
<li class="none"><a href="https://www.apache.org/legal/" class="externalLink" title="Legal">Legal</a></li>
<li class="none"><a href="https://www.apache.org/foundation/marks/" class="externalLink" title="Branding">Branding</a></li>
<li class="none"><a href="https://www.apache.org/press/" class="externalLink" title="Media Relations">Media Relations</a></li>
</ul>
<h5>Modules</h5>
<ul>
<li class="none"><a href="apache-rat-core/index.html" title="Apache Creadur Rat::Core">Apache Creadur Rat::Core</a></li>
<li class="none"><a href="apache-rat-plugin/index.html" title="Apache Creadur Rat::Plugin4Maven">Apache Creadur Rat::Plugin4Maven</a></li>
<li class="none"><a href="apache-rat-tasks/index.html" title="Apache Creadur Rat::Tasks4Ant">Apache Creadur Rat::Tasks4Ant</a></li>
<li class="none"><a href="apache-rat/index.html" title="Apache Creadur Rat::Command Line">Apache Creadur Rat::Command Line</a></li>
<li class="none"><a href="apache-rat-tools/index.html" title="Apache Creadur Rat::Tools">Apache Creadur Rat::Tools</a></li>
</ul>
<h5>Project Documentation</h5>
<ul>
<li class="collapsed"><a href="project-info.html" title="Project Information">Project Information</a></li>
<li class="collapsed"><a href="project-reports.html" title="Project Reports">Project Reports</a></li>
</ul>
<a href="https://maven.apache.org/" title="Maven" class="poweredBy">
<img class="poweredBy" alt="Maven" src="https://maven.apache.org/images/logos/maven-feather.png" />
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<section>
<h2><a name="How_to_define_matchers_in_Apache_Rat"></a>How to define matchers in Apache Rat</h2>
<p>Matchers in Apache Rat are paired with builders. A matcher must implement the <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/analysis/IHeaderMatcher.java">IHeaderMatcher</a> interface and its associated builder must implement the IHeaderMatcher.Builder interface.</p><section>
<h3><a name="A_simple_example"></a>A simple example</h3><section>
<h4><a name="The_Matcher_implementation"></a>The Matcher implementation</h4>
<p>For our example we will implement a Matcher that checks for the phrase &quot;Quality, speed and cost, pick any two&quot; by looking for the occurrence of all three words anywhere in the header. In most cases it is simplest to extend the <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/analysis/matchers/AbstractHeaderMatcher.java">AbstractHeaderMatcher</a> class, as this class handles setting the unique id for instances that do not otherwise have one.</p>
<p>So let's start by creating our matcher class and implementing the matches method. The matches method takes an <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/IHeaders.java">IHeaders</a> argument. IHeaders is an object that contains the header text in two formats:</p>
<ul>
<li>raw - just as read from the file.</li>
<li>pruned - containing only letters and digits, and with all letters lowercased.</li></ul>
<div class="source">
<pre>package com.example.ratMatcher;

import org.apache.rat.analysis.IHeaders;
import org.apache.rat.analysis.matchers.AbstractHeaderMatcher;
import org.apache.rat.config.parameters.Component;
import org.apache.rat.config.parameters.ConfigComponent;

@ConfigComponent(type = Component.Type.MATCHER, name = &quot;QSC&quot;, desc = &quot;Reports if the 'Quality, speed and cost, pick any two' rule is violated&quot;)
public class QSCMatcher extends AbstractHeaderMatcher {

    public QSCMatcher(String id) {
        super(id);
    }

    @Override
    public boolean matches(IHeaders headers) {
        // The pruned text holds only lowercased letters and digits.
        String text = headers.pruned();
        return text.contains(&quot;quality&quot;) &amp;&amp; text.contains(&quot;speed&quot;) &amp;&amp; text.contains(&quot;cost&quot;);
    }
}</pre></div>
<p>In the above example we use the <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/config/parameters/ConfigComponent.java">ConfigComponent</a> annotation to identify that this is a MATCHER component, that it has the name 'QSC' and a description of what it does. If the &quot;name&quot; was not specified the name would have been extracted from the class name by removing the &quot;Matcher&quot; from &quot;QSCMatcher&quot; and making the first character lowercase: &quot;qSC&quot;.</p>
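<p>The default naming rule can be illustrated with a small stand-alone sketch. The MatcherNames class and its defaultName method are hypothetical helpers written for this page, not part of the Rat API, and the sketch assumes that only a trailing &quot;Matcher&quot; is removed.</p>

```java
import java.util.Locale;

/** Hypothetical helper that mimics the default naming rule; not part of the Rat API. */
final class MatcherNames {
    private MatcherNames() { }

    /** Strips a trailing "Matcher" (an assumption) and lowercases the first character. */
    static String defaultName(String simpleClassName) {
        String suffix = "Matcher";
        String base = simpleClassName.endsWith(suffix)
                ? simpleClassName.substring(0, simpleClassName.length() - suffix.length())
                : simpleClassName;
        if (base.isEmpty()) {
            return base;
        }
        return base.substring(0, 1).toLowerCase(Locale.ROOT) + base.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(defaultName("QSCMatcher"));        // qSC
        System.out.println(defaultName("HighlanderMatcher")); // highlander
    }
}
```

<p>Because only the first character is lowercased, an all-caps prefix such as QSC yields the slightly odd &quot;qSC&quot;, which is one reason to set the name explicitly in the annotation.</p>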
<p>The constructor calls the AbstractHeaderMatcher constructor with an id value. A null argument passed to AbstractHeaderMatcher will generate a UUID-based id.</p>
<p>The matcher uses the pruned text to check for the strings. There is an issue with this matcher in that it would match the string: &quot;The quality of Dennis Hopper's acting, as Keanu Reeves's costar in 'Speed', is outstanding.&quot;</p>
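<p>The false positive can be reproduced without Rat on the classpath. In the sketch below the prune method is only an approximation of the pruned header text described earlier (ASCII letters and digits, lowercased); it and the FalsePositiveDemo class are assumptions made for illustration.</p>

```java
import java.util.Locale;

/** Stand-alone demo; prune() only approximates the pruned header text. */
final class FalsePositiveDemo {
    private FalsePositiveDemo() { }

    /** Keeps ASCII letters and digits only and lowercases them (an assumption about pruning). */
    static String prune(String raw) {
        return raw.replaceAll("[^A-Za-z0-9]", "").toLowerCase(Locale.ROOT);
    }

    /** The same test as QSCMatcher.matches, applied to a plain string. */
    static boolean qscMatches(String raw) {
        String text = prune(raw);
        return text.contains("quality") && text.contains("speed") && text.contains("cost");
    }

    public static void main(String[] args) {
        String review = "The quality of Dennis Hopper's acting, as Keanu Reeves's costar in 'Speed', is outstanding.";
        System.out.println(prune(review));
        // Prints true: "cost" is found inside "costar" once spaces and punctuation are gone.
        System.out.println(qscMatches(review));
    }
}
```

<p>Pruning removes the word boundaries, so a substring test on the pruned text cannot tell &quot;cost&quot; from &quot;costar&quot;.</p>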
<p>The correction of that is left as an exercise for the reader. Hint: matching the pruned text can be a quick gating check for a set of more expensive regular expression checks against the raw text. </p></section><section>
<h4><a name="The_Matcher.Builder_implementation"></a>The Matcher.Builder implementation</h4>
<p>The builder must implement the IHeaderMatcher.Builder interface.</p>
<p>Handling the id, along with some other tasks, is taken care of by the <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/configuration/builders/AbstractBuilder.java">AbstractBuilder</a> class.</p>
<p>So we have:</p>
<div class="source">
<pre>package com.example.ratMatcher;

import org.apache.rat.configuration.builders.AbstractBuilder;

public class QSCBuilder extends AbstractBuilder {
    @Override
    public QSCMatcher build() {
        // getId() is inherited from AbstractBuilder.
        return new QSCMatcher(getId());
    }
}</pre></div></section><section>
<h4><a name="Registering_the_builder_for_use_in_XML_configuration"></a>Registering the builder for use in XML configuration </h4>
<p>In order to use the matcher in a Rat configuration it has to be registered with the system. This can be done by creating an XML configuration file with the builder specified and passing it to the command line runner as a license file (the &quot;--licenses&quot; option). The name of the matcher is &quot;QSC&quot;, so &quot;QSC&quot; is also the XML element that the parser will accept in license definitions. Since this is a joke license we should create a &quot;Joke&quot; family to contain any such licenses and a QSC license that uses the QSC matcher. The new configuration file now looks like:</p>
<div class="source">
<pre>&lt;rat-config&gt;
    &lt;families&gt;
        &lt;family id=&quot;Joke&quot; name=&quot;A joke license&quot; /&gt;
    &lt;/families&gt;
    &lt;licenses&gt;
        &lt;!-- the family attribute below references the family id specified above --&gt;
        &lt;license family=&quot;Joke&quot; id=&quot;QSC&quot; name=&quot;The QSC license&quot;&gt;
            &lt;!-- the QSC below is the name specified in the QSCMatcher ConfigComponent annotation --&gt;
            &lt;QSC/&gt;
        &lt;/license&gt;
    &lt;/licenses&gt;
    &lt;matchers&gt;
        &lt;!-- establishes QSC as a matcher --&gt;
        &lt;matcher class=&quot;com.example.ratMatcher.QSCBuilder&quot; /&gt;
    &lt;/matchers&gt;
&lt;/rat-config&gt;</pre></div>
<p>If the license entry did not have an &quot;id&quot; attribute its id would be the same as the family id. If it did not have a &quot;name&quot; attribute its name would be the same as the family name.</p></section></section><section>
<h3><a name="A_more_complex_example"></a>A more complex example</h3>
<p>In many cases it is necessary to set properties on the matcher. So let's write a generalized version of the QSC matcher that accepts any 3 strings and triggers if all 3 are found in the header.</p>
<div class="source">
<pre>package com.example.ratMatcher;

import org.apache.commons.lang3.StringUtils;
import org.apache.rat.ConfigurationException;
import org.apache.rat.analysis.IHeaders;
import org.apache.rat.analysis.matchers.AbstractHeaderMatcher;
import org.apache.rat.config.parameters.Component;
import org.apache.rat.config.parameters.ConfigComponent;

@ConfigComponent(type = Component.Type.MATCHER, name = &quot;TPM&quot;, desc = &quot;Checks that the three strings are found in the header&quot;)
public class TPMatcher extends AbstractHeaderMatcher {

    @ConfigComponent(type = Component.Type.PARAMETER, desc = &quot;The first parameter&quot;, required = true)
    private final String one;

    @ConfigComponent(type = Component.Type.PARAMETER, desc = &quot;The second parameter&quot;, required = true)
    private final String two;

    @ConfigComponent(type = Component.Type.PARAMETER, desc = &quot;The third parameter&quot;, required = true)
    private final String three;

    public TPMatcher(String id, String one, String two, String three) {
        super(id);
        if (StringUtils.isEmpty(one) || StringUtils.isEmpty(two) || StringUtils.isEmpty(three)) {
            throw new ConfigurationException(&quot;None of the three properties (one, two, or three) may be empty&quot;);
        }
        this.one = one;
        this.two = two;
        this.three = three;
    }

    // One getter per PARAMETER; the builder has the matching setters.
    public String getOne() { return one; }
    public String getTwo() { return two; }
    public String getThree() { return three; }

    @Override
    public boolean matches(IHeaders headers) {
        // The pruned text holds only lowercased letters and digits.
        String text = headers.pruned();
        return text.contains(one) &amp;&amp; text.contains(two) &amp;&amp; text.contains(three);
    }
}</pre></div>
<p>The ConfigComponents with the PARAMETER type indicate that the members specify properties of the component. The matcher must have a &quot;get&quot; method for each parameter and the builder must have a corresponding &quot;set&quot; method. The names of the methods and the attributes in the XML parser can be changed by adding a 'name' attribute to the ConfigComponent.</p>
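<p>To see why the builder needs a &quot;set&quot; method for each parameter, here is a rough sketch of how XML attributes could be applied to a builder by reflection. It is not Rat's actual parser: SetterMappingDemo, DemoBuilder, setterName and apply are all hypothetical names used only to illustrate the naming convention.</p>

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of attribute-to-setter mapping; not Rat's actual parser. */
final class SetterMappingDemo {
    private SetterMappingDemo() { }

    /** Stand-in builder with fluent setters, mirroring the shape of a parameterised builder. */
    static final class DemoBuilder {
        private String one;
        private String two;

        public DemoBuilder setOne(String one) { this.one = one; return this; }
        public DemoBuilder setTwo(String two) { this.two = two; return this; }
        public String describe() { return one + "," + two; }
    }

    /** Turns an attribute name such as "one" into the setter name "setOne". */
    static String setterName(String attribute) {
        return "set" + attribute.substring(0, 1).toUpperCase(Locale.ROOT) + attribute.substring(1);
    }

    /** Calls the matching String setter on the builder for every attribute. */
    static void apply(Object builder, Map<String, String> attributes) {
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            try {
                Method setter = builder.getClass().getMethod(setterName(entry.getKey()), String.class);
                setter.invoke(builder, entry.getValue());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("No usable setter for attribute " + entry.getKey(), e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("one", "once");
        attributes.put("two", "upon");
        DemoBuilder builder = new DemoBuilder();
        apply(builder, attributes);
        System.out.println(builder.describe()); // once,upon
    }
}
```

<p>An attribute with no matching setter fails fast in this sketch, which is the kind of error a missing &quot;set&quot; method would cause.</p>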
<p>The builder now looks like:</p>
<div class="source">
<pre>package com.example.ratMatcher;

import org.apache.rat.configuration.builders.AbstractBuilder;

public class TPBuilder extends AbstractBuilder {
    private String one;
    private String two;
    private String three;

    @Override
    public TPMatcher build() {
        return new TPMatcher(getId(), one, two, three);
    }

    // Each setter returns the builder so that calls can be chained.
    public TPBuilder setOne(String one) { this.one = one; return this; }
    public TPBuilder setTwo(String two) { this.two = two; return this; }
    public TPBuilder setThree(String three) { this.three = three; return this; }
}</pre></div>
<p>And the new configuration file looks like:</p>
<div class="source">
<pre>&lt;rat-config&gt;
    &lt;families&gt;
        &lt;family id=&quot;Joke&quot; name=&quot;A joke license&quot; /&gt;
    &lt;/families&gt;
    &lt;licenses&gt;
        &lt;license family=&quot;Joke&quot; id=&quot;QSC&quot; name=&quot;The QSC license&quot;&gt;
            &lt;QSC/&gt;
        &lt;/license&gt;
        &lt;license family=&quot;Joke&quot; id=&quot;TPM&quot; name=&quot;The TPM Check&quot;&gt;
            &lt;TPM one=&quot;once&quot; two=&quot;upon&quot; three=&quot;time&quot; /&gt;
        &lt;/license&gt;
    &lt;/licenses&gt;
    &lt;matchers&gt;
        &lt;matcher class=&quot;com.example.ratMatcher.QSCBuilder&quot; /&gt;
        &lt;matcher class=&quot;com.example.ratMatcher.TPBuilder&quot; /&gt;
    &lt;/matchers&gt;
&lt;/rat-config&gt;</pre></div></section><section>
<h3><a name="Embedded_matchers."></a>Embedded matchers</h3>
<p>It is possible to create matchers that embed other matchers. The examples in the codebase are the <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/analysis/matchers/AndMatcher.java">All</a>, <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/analysis/matchers/OrMatcher.java">Any</a> and <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/analysis/matchers/NotMatcher.java">Not</a> matchers and their associated builders. As an example we will build a &quot;Highlander&quot; matcher that will be true if one and only one enclosed matcher is true; there can be only one. The Highlander matcher will extend <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/analysis/matchers/AbstractMatcherContainer">AbstractMatcherContainer</a> which will handle the enclosed resources and the option of reading text matchers from a file.</p>
<div class="source">
<pre>package com.example.ratMatcher;

import java.util.Collection;

import org.apache.rat.analysis.IHeaderMatcher;
import org.apache.rat.analysis.IHeaders;
import org.apache.rat.analysis.matchers.AbstractMatcherContainer;
import org.apache.rat.config.parameters.Component;
import org.apache.rat.config.parameters.ConfigComponent;

@ConfigComponent(type = Component.Type.MATCHER, desc = &quot;Checks that there can be only one matching enclosed matcher&quot;)
public class HighlanderMatcher extends AbstractMatcherContainer {

    public HighlanderMatcher(String id, Collection&lt;? extends IHeaderMatcher&gt; enclosed, String resource) {
        super(id, enclosed, resource);
    }

    @Override
    public boolean matches(IHeaders headers) {
        boolean foundOne = false;
        for (IHeaderMatcher matcher : getEnclosed()) {
            if (matcher.matches(headers)) {
                if (foundOne) {
                    // A second match: there can be only one.
                    return false;
                }
                foundOne = true;
            }
        }
        return foundOne;
    }
}</pre></div>
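<p>The &quot;one and only one&quot; rule in the matches method above can be exercised on its own. The following sketch uses java.util.function.Predicate as a stand-in for IHeaderMatcher and a plain String as a stand-in for IHeaders; the ExactlyOneDemo class is hypothetical and exists only for this illustration.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Stand-alone check of the "exactly one" rule; Predicate stands in for IHeaderMatcher. */
final class ExactlyOneDemo {
    private ExactlyOneDemo() { }

    /** Returns true only if exactly one predicate accepts the text; stops at the second hit. */
    static boolean exactlyOne(List<Predicate<String>> enclosed, String text) {
        boolean foundOne = false;
        for (Predicate<String> matcher : enclosed) {
            if (matcher.test(text)) {
                if (foundOne) {
                    return false;
                }
                foundOne = true;
            }
        }
        return foundOne;
    }

    public static void main(String[] args) {
        Predicate<String> hasOnce = text -> text.contains("once");
        Predicate<String> hasTime = text -> text.contains("time");
        List<Predicate<String>> enclosed = Arrays.asList(hasOnce, hasTime);
        System.out.println(exactlyOne(enclosed, "nothing here"));     // false: no match
        System.out.println(exactlyOne(enclosed, "once is enough"));   // true: one match
        System.out.println(exactlyOne(enclosed, "once upon a time")); // false: two matches
    }
}
```

<p>Returning as soon as a second match is seen avoids evaluating the remaining enclosed matchers.</p>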
<p>We create a simple builder that extends <a class="externalLink" href="https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/configuration/builders/ChildContainerBuilder.java">ChildContainerBuilder</a> which will handle setting the id, enclosed matchers, and the resource.</p>
<div class="source">
<pre>package com.example.ratMatcher;

import org.apache.rat.configuration.builders.ChildContainerBuilder;

public class HighlanderBuilder extends ChildContainerBuilder {
    @Override
    public HighlanderMatcher build() {
        return new HighlanderMatcher(getId(), getEnclosed(), resource);
    }
}</pre></div></section></section><section>
<h2><a name="Add_the_above_to_the_configuration_and_we_have:"></a>Add the above to the configuration and we have:</h2>
<div class="source">
<pre>&lt;rat-config&gt;
    &lt;families&gt;
        &lt;family id=&quot;Joke&quot; name=&quot;A joke license&quot; /&gt;
    &lt;/families&gt;
    &lt;licenses&gt;
        &lt;license family=&quot;Joke&quot; id=&quot;QSC&quot; name=&quot;The QSC license&quot;&gt;
            &lt;QSC/&gt;
        &lt;/license&gt;
        &lt;license family=&quot;Joke&quot; id=&quot;TPM&quot; name=&quot;The TPM Check&quot;&gt;
            &lt;TPM one=&quot;once&quot; two=&quot;upon&quot; three=&quot;time&quot; /&gt;
        &lt;/license&gt;
        &lt;license family=&quot;Joke&quot;&gt;
            &lt;highlander&gt;
                &lt;QSC/&gt;
                &lt;TPM one=&quot;once&quot; two=&quot;upon&quot; three=&quot;time&quot; /&gt;
            &lt;/highlander&gt;
        &lt;/license&gt;
    &lt;/licenses&gt;
    &lt;matchers&gt;
        &lt;matcher class=&quot;com.example.ratMatcher.HighlanderBuilder&quot; /&gt;
        &lt;matcher class=&quot;com.example.ratMatcher.QSCBuilder&quot; /&gt;
        &lt;matcher class=&quot;com.example.ratMatcher.TPBuilder&quot; /&gt;
    &lt;/matchers&gt;
&lt;/rat-config&gt;</pre></div>
<p>The HighlanderBuilder builds a HighlanderMatcher object. HighlanderMatcher is annotated with a ConfigComponent that does not specify a name, so the system strips &quot;Matcher&quot; from the simple class name and lowercases the first character, giving the matcher the name &quot;highlander&quot;. The last &quot;license&quot; entry does not have an id or name set, so it takes the id &quot;Joke&quot; and the name &quot;A joke license&quot; from the family.</p>
<p>Since there is no &quot;approved&quot; section in the rat-config, all the licenses are assumed to be approved.</p></section>
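<p>The fallback from license to family can be pictured with a few lines of code. This is only a model of the rule as described here; LicenseDefaultsDemo and orFamily are hypothetical names and do not correspond to Rat's license classes.</p>

```java
/** Hypothetical model of the defaulting rule; not Rat's license classes. */
final class LicenseDefaultsDemo {
    private LicenseDefaultsDemo() { }

    /** Returns the explicit value when present, otherwise the value taken from the family. */
    static String orFamily(String explicit, String fromFamily) {
        return (explicit == null || explicit.isEmpty()) ? fromFamily : explicit;
    }

    public static void main(String[] args) {
        String familyId = "Joke";
        String familyName = "A joke license";
        // A license entry that sets both id and name keeps its own values.
        System.out.println(orFamily("QSC", familyId) + " / " + orFamily("The QSC license", familyName));
        // A license entry with neither attribute falls back to the family values.
        System.out.println(orFamily(null, familyId) + " / " + orFamily(null, familyName));
    }
}
```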
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">
Copyright &copy; 2016-2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache Creadur, Creadur, Apache Rat, Apache Tentacles, Apache Whisker, Apache and the Apache feather logo are trademarks
of The Apache Software Foundation.
Oracle and Java are registered trademarks of Oracle and/or its affiliates.
All other marks mentioned may be trademarks or registered trademarks of their respective owners.
</div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>