<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia Site Renderer 1.8 from src/site/markdown/metron-platform/metron-solr/index.md at 2019-05-14
| Rendered using Apache Maven Fluido Skin 1.7
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20190514" />
<meta http-equiv="Content-Language" content="en" />
<title>Metron &#x2013; Solr in Metron</title>
<link rel="stylesheet" href="../../css/apache-maven-fluido-1.7.min.css" />
<link rel="stylesheet" href="../../css/site.css" />
<link rel="stylesheet" href="../../css/print.css" media="print" />
<script type="text/javascript" src="../../js/apache-maven-fluido-1.7.min.js"></script>
<script type="text/javascript">
$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );
</script>
</head>
<body class="topBarDisabled">
<div class="container-fluid">
<div id="banner">
<div class="pull-left"><a href="http://metron.apache.org/" id="bannerLeft"><img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/></a></div>
<div class="pull-right"></div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class=""><a href="http://www.apache.org" class="externalLink" title="Apache">Apache</a><span class="divider">/</span></li>
<li class=""><a href="http://metron.apache.org/" class="externalLink" title="Metron">Metron</a><span class="divider">/</span></li>
<li class=""><a href="../../index.html" title="Documentation">Documentation</a><span class="divider">/</span></li>
<li class="active ">Solr in Metron</li>
<li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2019-05-14</li>
<li id="projectVersion" class="pull-right">Version: 0.7.1</li>
</ul>
</div>
<div class="row-fluid">
<div id="leftColumn" class="span2">
<div class="well sidebar-nav">
<ul class="nav nav-list">
<li class="nav-header">User Documentation</li>
<li><a href="../../index.html" title="Metron"><span class="icon-chevron-down"></span>Metron</a>
<ul class="nav nav-list">
<li><a href="../../CONTRIBUTING.html" title="CONTRIBUTING"><span class="none"></span>CONTRIBUTING</a></li>
<li><a href="../../Upgrading.html" title="Upgrading"><span class="none"></span>Upgrading</a></li>
<li><a href="../../metron-analytics/index.html" title="Analytics"><span class="icon-chevron-right"></span>Analytics</a></li>
<li><a href="../../metron-contrib/metron-docker/index.html" title="Docker"><span class="none"></span>Docker</a></li>
<li><a href="../../metron-contrib/metron-performance/index.html" title="Performance"><span class="none"></span>Performance</a></li>
<li><a href="../../metron-deployment/index.html" title="Deployment"><span class="icon-chevron-right"></span>Deployment</a></li>
<li><a href="../../metron-interface/index.html" title="Interface"><span class="icon-chevron-right"></span>Interface</a></li>
<li><a href="../../metron-platform/index.html" title="Platform"><span class="icon-chevron-down"></span>Platform</a>
<ul class="nav nav-list">
<li><a href="../../metron-platform/Performance-tuning-guide.html" title="Performance-tuning-guide"><span class="none"></span>Performance-tuning-guide</a></li>
<li><a href="../../metron-platform/metron-common/index.html" title="Common"><span class="none"></span>Common</a></li>
<li><a href="../../metron-platform/metron-data-management/index.html" title="Data-management"><span class="none"></span>Data-management</a></li>
<li><a href="../../metron-platform/metron-elasticsearch/index.html" title="Elasticsearch"><span class="none"></span>Elasticsearch</a></li>
<li><a href="../../metron-platform/metron-enrichment/index.html" title="Enrichment"><span class="icon-chevron-right"></span>Enrichment</a></li>
<li><a href="../../metron-platform/metron-hbase-server/index.html" title="Hbase-server"><span class="none"></span>Hbase-server</a></li>
<li><a href="../../metron-platform/metron-indexing/index.html" title="Indexing"><span class="none"></span>Indexing</a></li>
<li><a href="../../metron-platform/metron-job/index.html" title="Job"><span class="none"></span>Job</a></li>
<li><a href="../../metron-platform/metron-management/index.html" title="Management"><span class="none"></span>Management</a></li>
<li><a href="../../metron-platform/metron-parsing/index.html" title="Parsing"><span class="icon-chevron-right"></span>Parsing</a></li>
<li><a href="../../metron-platform/metron-pcap-backend/index.html" title="Pcap-backend"><span class="none"></span>Pcap-backend</a></li>
<li class="active"><a href="#"><span class="none"></span>Solr</a></li>
<li><a href="../../metron-platform/metron-writer/index.html" title="Writer"><span class="none"></span>Writer</a></li>
</ul>
</li>
<li><a href="../../metron-sensors/index.html" title="Sensors"><span class="icon-chevron-right"></span>Sensors</a></li>
<li><a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"><span class="none"></span>Stellar-3rd-party-example</a></li>
<li><a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"><span class="icon-chevron-right"></span>Stellar-common</a></li>
<li><a href="../../metron-stellar/stellar-zeppelin/index.html" title="Stellar-zeppelin"><span class="none"></span>Stellar-zeppelin</a></li>
<li><a href="../../use-cases/index.html" title="Use-cases"><span class="icon-chevron-right"></span>Use-cases</a></li>
</ul>
</li>
</ul>
<hr />
<div id="poweredBy">
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a>
</div>
</div>
</div>
<div id="bodyColumn" class="span10" >
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<h1>Solr in Metron</h1>
<p><a name="Solr_in_Metron"></a></p>
<div class="section">
<h2><a name="Table_of_Contents"></a>Table of Contents</h2>
<ul>
<li><a href="#Introduction">Introduction</a></li>
<li><a href="#Configuration">Configuration</a></li>
<li><a href="#Installing">Installing</a></li>
<li><a href="#Enabling_Solr">Enabling Solr</a></li>
<li><a href="#Schemas">Schemas</a></li>
<li><a href="#Collections">Collections</a></li>
</ul></div>
<div class="section">
<h2><a name="Introduction"></a>Introduction</h2>
<p>Metron ships with Solr 6.6.2 support. Solr Cloud can be used as the real-time portion of the datastore resulting from <a href="../metron-indexing/index.html">metron-indexing</a>.</p></div>
<div class="section">
<h2><a name="Configuration"></a>Configuration</h2>
<div class="section">
<h3><a name="The_Indexing_Topology"></a>The Indexing Topology</h3>
<p>Solr is a viable option for the <tt>random access topology</tt> and, similar to the Elasticsearch Writer, can be configured via the global config. The following settings are possible as part of the global config:</p>
<ul>
<li><tt>solr.zookeeper</tt>
<ul>
<li>The Zookeeper quorum associated with the SolrCloud instance. This is a required field with no default.</li>
</ul>
</li>
<li><tt>solr.commitPerBatch</tt>
<ul>
<li>This is a boolean which defines whether the writer commits every batch. The default is <tt>true</tt>.</li>
<li><i>WARNING</i>: If you set this to <tt>false</tt>, then commits will happen based on the SolrClient&#x2019;s internal mechanism and worker failure <i>may</i> result in data being acknowledged in Storm but not written to Solr.</li>
</ul>
</li>
<li><tt>solr.commit.soft</tt>
<ul>
<li>This is a boolean which defines whether the writer makes a soft commit or a durable commit. See <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6/near-real-time-searching.html#NearRealTimeSearching-AutoCommits">here</a>. The default is <tt>false</tt>.</li>
<li><i>WARNING</i>: If you set this to <tt>true</tt>, then commits will happen based on the SolrClient&#x2019;s internal mechanism and worker failure <i>may</i> result in data being acknowledged in Storm but not written to Solr.</li>
</ul>
</li>
<li><tt>solr.commit.waitSearcher</tt>
<ul>
<li>This is a boolean which defines whether the writer blocks the commit until the data is available to search. See <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6/near-real-time-searching.html#NearRealTimeSearching-AutoCommits">here</a>. The default is <tt>true</tt>.</li>
<li><i>WARNING</i>: If you set this to <tt>false</tt>, then commits will happen based on the SolrClient&#x2019;s internal mechanism and worker failure <i>may</i> result in data being acknowledged in Storm but not written to Solr.</li>
</ul>
</li>
<li><tt>solr.commit.waitFlush</tt>
<ul>
<li>This is a boolean which defines whether the writer blocks the commit until the data is flushed. See <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6/near-real-time-searching.html#NearRealTimeSearching-AutoCommits">here</a>. The default is <tt>true</tt>.</li>
<li><i>WARNING</i>: If you set this to <tt>false</tt>, then commits will happen based on the SolrClient&#x2019;s internal mechanism and worker failure <i>may</i> result in data being acknowledged in Storm but not written to Solr.</li>
</ul>
</li>
<li><tt>solr.collection</tt>
<ul>
<li>The default Solr collection (if unspecified, the name is <tt>metron</tt>). By default, sensors will write to a collection associated with the index name in the indexing config for that sensor. If that index name is the empty string, then the default collection will be used.</li>
</ul>
</li>
<li><tt>solr.http.config</tt>
<ul>
<li>This is a map which allows users to configure the Solr client&#x2019;s HTTP client.</li>
<li>Possible fields here are:
<ul>
<li><tt>socketTimeout</tt> : Socket timeout in ms. The socket is closed if a read takes longer than this to complete, throwing <tt>java.net.SocketTimeoutException: Read timed out</tt></li>
<li><tt>connTimeout</tt> : Connection timeout in ms. The socket is closed if a connection cannot be established within this time, throwing <tt>java.net.SocketTimeoutException: Connection timed out</tt></li>
<li><tt>maxConnectionsPerHost</tt> : Maximum connections allowed per host</li>
<li><tt>maxConnections</tt> : Maximum total connections allowed</li>
<li><tt>retry</tt> : Retry http requests on error</li>
<li><tt>allowCompression</tt> : Allow compression (deflate,gzip) if server supports it</li>
<li><tt>followRedirects</tt> : Follow redirects</li>
<li><tt>httpBasicAuthUser</tt> : Basic auth username</li>
<li><tt>httpBasicAuthPassword</tt> : Basic auth password</li>
<li><tt>solr.ssl.checkPeerName</tt> : Check peer name</li>
</ul>
</li>
</ul>
</li>
</ul></div></div>
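<p>As a rough illustration of how these settings fit together, the sketch below writes an example fragment to a scratch file for review; it does not touch any live configuration. The file path, the host name and the HTTP client values are assumptions chosen for illustration, while the boolean values and the collection name mirror the defaults listed above. The fragment would still need to be merged into the global config by hand.</p>
<div>
<div>
<pre class="source">#!/usr/bin/env bash
# Hypothetical scratch file; nothing here modifies a running Metron install.
SNIPPET="${SNIPPET:-/tmp/solr-global-snippet.json}"

# Illustrative values only. solr.zookeeper is the one required setting.
printf '%s\n' \
  '{' \
  '  "solr.zookeeper": "node1:2181/solr",' \
  '  "solr.collection": "metron",' \
  '  "solr.commitPerBatch": true,' \
  '  "solr.commit.soft": false,' \
  '  "solr.commit.waitSearcher": true,' \
  '  "solr.commit.waitFlush": true,' \
  '  "solr.http.config": {' \
  '    "socketTimeout": 60000,' \
  '    "connTimeout": 10000,' \
  '    "maxConnections": 100' \
  '  }' \
  '}' | tee "$SNIPPET"
</pre></div></div>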
<div class="section">
<h2><a name="Installing"></a>Installing</h2>
<p>Solr is installed in the <a href="../../metron-deployment/development/centos6/index.html">full dev environment for CentOS</a> by default but is not started initially. Navigate to <tt>$METRON_HOME/bin</tt> and start Solr Cloud by running <tt>start_solr.sh</tt>.</p>
<p>Metron&#x2019;s Ambari MPack installs several scripts in <tt>$METRON_HOME/bin</tt> that can be used to manage Solr. A script is also provided for installing Solr Cloud outside of full dev. The script performs the following tasks:</p>
<ul>
<li>Stops Elasticsearch and Kibana</li>
<li>Downloads Solr</li>
<li>Installs Solr</li>
<li>Starts Solr Cloud</li>
</ul>
<p><i>Note: for details on setting up Solr Cloud in production mode, see <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6/taking-solr-to-production.html">https://lucene.apache.org/solr/guide/6_6/taking-solr-to-production.html</a></i></p>
<p>Navigate to <tt>$METRON_HOME/bin</tt> and spin up Solr Cloud by running <tt>install_solr.sh</tt>. After running this script, Elasticsearch and Kibana will have been stopped and you should now have an instance of Solr Cloud up and running at <a class="externalLink" href="http://localhost:8983/solr/#/~cloud">http://localhost:8983/solr/#/~cloud</a>. This manner of starting Solr will also spin up an embedded Zookeeper instance at port 9983. More information can be found <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6/getting-started-with-solrcloud.html">here</a>.</p>
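<p>One way to confirm that Solr Cloud came up is to query the Collections API. The following is a minimal sketch that assumes the default address mentioned above and that <tt>curl</tt> is installed; the <tt>SOLR_URL</tt> variable and the <tt>solr_is_up</tt> function are illustrative names, not part of Metron.</p>
<div>
<div>
<pre class="source">#!/usr/bin/env bash
# Base URL taken from the text above; override SOLR_URL for other hosts.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Query CLUSTERSTATUS. -G sends the -d pairs as a query string,
# -f makes HTTP errors fail, and -o discards the response body.
solr_is_up() {
  curl -s -f -G --max-time 5 -o /dev/null \
    -d action=CLUSTERSTATUS -d wt=json \
    "$SOLR_URL/admin/collections"
}

if solr_is_up; then
  echo "Solr Cloud is responding at $SOLR_URL"
else
  echo "Solr Cloud is not reachable at $SOLR_URL"
fi
</pre></div></div>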
<p>Solr can also be installed using <a class="externalLink" href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_solr-search-installation/content/ch_hdp_search_30.html">HDP Search 3</a>. HDP Search 3 sets the Zookeeper root to <tt>/solr</tt>, so this root must be appended to each URL in the comma-separated list in Ambari UI -&gt; Services -&gt; Metron -&gt; Configs -&gt; Index Settings -&gt; Solr Zookeeper Urls. For example, in full dev this would be <tt>node1:2181/solr</tt>.</p></div>
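<p>For a multi-node quorum, the Zookeeper root has to be repeated for every entry. The helper below is a small sketch of that transformation; the function name <tt>with_zk_root</tt> and the hosts <tt>node2</tt> and <tt>node3</tt> are hypothetical and only serve the example.</p>
<div>
<div>
<pre class="source">#!/usr/bin/env bash
# Append a Zookeeper root to every host:port in a comma-separated list.
# $1 is the list, $2 is the root (for example /solr).
with_zk_root() {
  local hosts="$1" root="$2" out="" h
  local IFS=','
  for h in $hosts; do
    out="${out:+$out,}${h}${root}"
  done
  printf '%s\n' "$out"
}

# Full dev has a single node; the three-node list is made up.
with_zk_root "node1:2181" "/solr"
with_zk_root "node1:2181,node2:2181,node3:2181" "/solr"
</pre></div></div>
<p>The resulting string is what would be entered in the Solr Zookeeper Urls field.</p>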
<div class="section">
<h2><a name="Enabling_Solr"></a>Enabling Solr</h2>
<p>Elasticsearch is the real-time store used by default in Metron. Solr can be enabled following these steps:</p>
<ol style="list-style-type: decimal">
<li>Stop the Metron Indexing component in Ambari.</li>
<li>Update Ambari UI -&gt; Services -&gt; Metron -&gt; Configs -&gt; Index Settings -&gt; Solr Zookeeper Urls to match the Solr installation described in the previous section.</li>
<li>Change Ambari UI -&gt; Services -&gt; Metron -&gt; Configs -&gt; Indexing -&gt; Index Writer - Random Access -&gt; Random Access Search Engine to <tt>Solr</tt>.</li>
<li>Change Ambari UI -&gt; Services -&gt; Metron -&gt; Configs -&gt; REST -&gt; Source Type Field Name to <tt>source.type</tt>.</li>
<li>Change Ambari UI -&gt; Services -&gt; Metron -&gt; Configs -&gt; REST -&gt; Threat Triage Score Field Name to <tt>threat.triage.score</tt>.</li>
<li>Start the Metron Indexing component in Ambari.</li>
<li>Restart Metron REST and the Alerts UI in Ambari.</li>
</ol>
<p>This will automatically create collections for the schemas shipped with Metron:</p>
<ul>
<li>bro</li>
<li>snort</li>
<li>yaf</li>
<li>error (used internally by Metron)</li>
<li>metaalert (used internally by Metron)</li>
</ul>
<p>Any other collections must be created manually before starting the Indexing component. Alerts should be present in the Alerts UI after enabling Solr.</p></div>
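<p>To check that the expected collections exist after enabling Solr, the Collections API <tt>LIST</tt> action can be compared against the names above. This is a sketch only: the Solr address and the <tt>missing_collections</tt> helper are assumptions, and the name matching is a plain substring test rather than real JSON parsing.</p>
<div>
<div>
<pre class="source">#!/usr/bin/env bash
# Assumed Solr address; override SOLR_URL to match your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print each expected name that does not appear, quoted, in the response.
# $1 is the LIST response body; remaining arguments are collection names.
missing_collections() {
  local response="$1" name
  shift
  for name in "$@"; do
    case "$response" in
      *"\"$name\""*) ;;
      *) printf '%s\n' "$name" ;;
    esac
  done
}

# An unreachable Solr yields an empty response, so every name is reported.
response="$(curl -s -f -G --max-time 5 -d action=LIST -d wt=json \
  "$SOLR_URL/admin/collections" || true)"
missing_collections "$response" bro snort yaf error metaalert
</pre></div></div>
<p>No output means all five collections were found in the response.</p>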
<div class="section">
<h2><a name="Schemas"></a>Schemas</h2>
<p>As of now, we have mapped out the schemas in <tt>src/main/config/schema</tt>. Ambari will eventually install these, but at the moment installation is manual. Refer to the <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6">Solr documentation</a> in general, and to <a class="externalLink" href="https://lucene.apache.org/solr/guide/6_6/documents-fields-and-schema-design.html">Documents, Fields, and Schema Design</a> if you&#x2019;d like to know more about schemas in Solr.</p>
<p>In Metron&#x2019;s Solr DAO implementation, document updates involve reading a document, applying the update and replacing the original by reindexing the whole document.<br />
Indexing LatLonType and PointType field types stores data in internal fields that should not be returned in search results. For these fields a dynamic field type matching the suffix needs to be added to store the data points. Solr 6+ comes with a new LatLonPointSpatialField field type that should be used instead of LatLonType if possible. Otherwise, a LatLonType field should be defined as:</p>
<div>
<div>
<pre class="source">&lt;dynamicField name=&quot;*.location_point&quot; type=&quot;location&quot; multiValued=&quot;false&quot; docValues=&quot;false&quot;/&gt;
&lt;dynamicField name=&quot;*_coordinate&quot; type=&quot;pdouble&quot; indexed=&quot;true&quot; stored=&quot;false&quot; docValues=&quot;false&quot;/&gt;
&lt;fieldType name=&quot;location&quot; class=&quot;solr.LatLonType&quot; subFieldSuffix=&quot;_coordinate&quot;/&gt;
</pre></div></div>
<p>A PointType field should be defined as:</p>
<div>
<div>
<pre class="source">&lt;dynamicField name=&quot;*.point&quot; type=&quot;point&quot; multiValued=&quot;false&quot; docValues=&quot;false&quot;/&gt;
&lt;dynamicField name=&quot;*_point&quot; type=&quot;pdouble&quot; indexed=&quot;true&quot; stored=&quot;false&quot; docValues=&quot;false&quot;/&gt;
&lt;fieldType name=&quot;point&quot; class=&quot;solr.PointType&quot; subFieldSuffix=&quot;_point&quot;/&gt;
</pre></div></div>
<p>If any copy fields are defined, stored and docValues should be set to false.</p></div>
<div class="section">
<h2><a name="Collections"></a>Collections</h2>
<p>Convenience scripts are provided with Metron to create and delete collections. Ambari uses these scripts to automatically create collections. To use them outside of Ambari, a few environment variables must be set first:</p>
<div>
<div>
<pre class="source"># Path to the zookeeper node used by Solr
export ZOOKEEPER=node1:2181/solr
# Set to true if Kerberos is enabled
export SECURITY_ENABLED=true
</pre></div></div>
<p>The scripts can then be called directly with the collection name as the first argument. For example, to create the bro collection:</p>
<div>
<div>
<pre class="source">$METRON_HOME/bin/create_collection.sh bro
</pre></div></div>
<p>To delete the bro collection:</p>
<div>
<div>
<pre class="source">$METRON_HOME/bin/delete_collection.sh bro
</pre></div></div>
<p>The <tt>create_collection.sh</tt> script depends on schemas installed in <tt>$METRON_HOME/config/schema</tt>. There are several schemas that come with Metron:</p>
<ul>
<li>bro</li>
<li>snort</li>
<li>yaf</li>
<li>metaalert</li>
<li>error</li>
</ul>
<p>Additional schemas should be installed in that location if using the <tt>create_collection.sh</tt> script. Any collection can be deleted with the <tt>delete_collection.sh</tt> script. These scripts use the <a class="externalLink" href="http://lucene.apache.org/solr/guide/6_6/collections-api.html">Solr Collection API</a>.</p></div>
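<p>The steps above can be combined into a small loop that creates every shipped collection. The sketch below assumes that each schema lives in a directory named after its collection under <tt>$METRON_HOME/config/schema</tt>, and uses placeholder defaults for <tt>METRON_HOME</tt>, <tt>ZOOKEEPER</tt> and <tt>SECURITY_ENABLED</tt>; adjust all of these to your environment. The <tt>has_schema</tt> helper is hypothetical.</p>
<div>
<div>
<pre class="source">#!/usr/bin/env bash
# Placeholder defaults; replace with the values for your cluster.
METRON_HOME="${METRON_HOME:-/usr/metron/0.7.1}"
export ZOOKEEPER="${ZOOKEEPER:-node1:2181/solr}"
export SECURITY_ENABLED="${SECURITY_ENABLED:-false}"

# True when a schema directory exists for the collection (assumed layout).
# $1 is the Metron home directory, $2 is the collection name.
has_schema() {
  [ -d "$1/config/schema/$2" ]
}

# Create each collection whose schema is present; report the rest.
for name in bro snort yaf metaalert error; do
  if has_schema "$METRON_HOME" "$name"; then
    "$METRON_HOME/bin/create_collection.sh" "$name"
  else
    echo "skipping $name: no schema under $METRON_HOME/config/schema"
  fi
done
</pre></div></div>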
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row-fluid">
© 2015-2016 The Apache Software Foundation. Apache Metron, Metron, Apache, the Apache feather logo,
and the Apache Metron project logo are trademarks of The Apache Software Foundation.
</div>
</div>
</footer>
</body>
</html>