<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia Site Renderer 1.8 from src/site/markdown/metron-analytics/metron-profiler-storm/index.md at 2019-05-14
| Rendered using Apache Maven Fluido Skin 1.7
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20190514" />
<meta http-equiv="Content-Language" content="en" />
<title>Metron &#x2013; Metron Profiler for Storm</title>
<link rel="stylesheet" href="../../css/apache-maven-fluido-1.7.min.css" />
<link rel="stylesheet" href="../../css/site.css" />
<link rel="stylesheet" href="../../css/print.css" media="print" />
<script type="text/javascript" src="../../js/apache-maven-fluido-1.7.min.js"></script>
<script type="text/javascript">
$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );
</script>
</head>
<body class="topBarDisabled">
<div class="container-fluid">
<div id="banner">
<div class="pull-left"><a href="http://metron.apache.org/" id="bannerLeft"><img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/></a></div>
<div class="pull-right"></div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class=""><a href="http://www.apache.org" class="externalLink" title="Apache">Apache</a><span class="divider">/</span></li>
<li class=""><a href="http://metron.apache.org/" class="externalLink" title="Metron">Metron</a><span class="divider">/</span></li>
<li class=""><a href="../../index.html" title="Documentation">Documentation</a><span class="divider">/</span></li>
<li class="active ">Metron Profiler for Storm</li>
<li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2019-05-14</li>
<li id="projectVersion" class="pull-right">Version: 0.7.1</li>
</ul>
</div>
<div class="row-fluid">
<div id="leftColumn" class="span2">
<div class="well sidebar-nav">
<ul class="nav nav-list">
<li class="nav-header">User Documentation</li>
<li><a href="../../index.html" title="Metron"><span class="icon-chevron-down"></span>Metron</a>
<ul class="nav nav-list">
<li><a href="../../CONTRIBUTING.html" title="CONTRIBUTING"><span class="none"></span>CONTRIBUTING</a></li>
<li><a href="../../Upgrading.html" title="Upgrading"><span class="none"></span>Upgrading</a></li>
<li><a href="../../metron-analytics/index.html" title="Analytics"><span class="icon-chevron-down"></span>Analytics</a>
<ul class="nav nav-list">
<li><a href="../../metron-analytics/metron-maas-service/index.html" title="Maas-service"><span class="none"></span>Maas-service</a></li>
<li><a href="../../metron-analytics/metron-profiler-client/index.html" title="Profiler-client"><span class="none"></span>Profiler-client</a></li>
<li><a href="../../metron-analytics/metron-profiler-common/index.html" title="Profiler-common"><span class="none"></span>Profiler-common</a></li>
<li><a href="../../metron-analytics/metron-profiler-repl/index.html" title="Profiler-repl"><span class="none"></span>Profiler-repl</a></li>
<li><a href="../../metron-analytics/metron-profiler-spark/index.html" title="Profiler-spark"><span class="none"></span>Profiler-spark</a></li>
<li class="active"><a href="#"><span class="none"></span>Profiler-storm</a></li>
<li><a href="../../metron-analytics/metron-statistics/index.html" title="Statistics"><span class="icon-chevron-right"></span>Statistics</a></li>
</ul>
</li>
<li><a href="../../metron-contrib/metron-docker/index.html" title="Docker"><span class="none"></span>Docker</a></li>
<li><a href="../../metron-contrib/metron-performance/index.html" title="Performance"><span class="none"></span>Performance</a></li>
<li><a href="../../metron-deployment/index.html" title="Deployment"><span class="icon-chevron-right"></span>Deployment</a></li>
<li><a href="../../metron-interface/index.html" title="Interface"><span class="icon-chevron-right"></span>Interface</a></li>
<li><a href="../../metron-platform/index.html" title="Platform"><span class="icon-chevron-right"></span>Platform</a></li>
<li><a href="../../metron-sensors/index.html" title="Sensors"><span class="icon-chevron-right"></span>Sensors</a></li>
<li><a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"><span class="none"></span>Stellar-3rd-party-example</a></li>
<li><a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"><span class="icon-chevron-right"></span>Stellar-common</a></li>
<li><a href="../../metron-stellar/stellar-zeppelin/index.html" title="Stellar-zeppelin"><span class="none"></span>Stellar-zeppelin</a></li>
<li><a href="../../use-cases/index.html" title="Use-cases"><span class="icon-chevron-right"></span>Use-cases</a></li>
</ul>
</li>
</ul>
<hr />
<div id="poweredBy">
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a>
</div>
</div>
</div>
<div id="bodyColumn" class="span10" >
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<h1>Metron Profiler for Storm</h1>
<p><a name="Metron_Profiler_for_Storm"></a></p>
<p>This project allows profiles to be executed using <a class="externalLink" href="https://storm.apache.org">Apache Storm</a>. This is a port of the Profiler to Storm that builds low-latency profiles over streaming data sets.</p>
<ul>
<li><a href="#Introduction">Introduction</a></li>
<li><a href="#Getting_Started">Getting Started</a></li>
<li><a href="#Installation">Installation</a></li>
<li><a href="#Configuring_the_Profiler">Configuring the Profiler</a></li>
<li><a href="#Implementation">Implementation</a></li>
</ul>
<div class="section">
<h2><a name="Introduction"></a>Introduction</h2>
<p>The Profiler is a feature extraction mechanism that can generate a profile describing the behavior of an entity. An entity might be a server, user, subnet or application. Once a profile has been generated defining what normal behavior looks like, models can be built that identify anomalous behavior.</p>
<p>This is achieved by summarizing the streaming telemetry data consumed by Metron over sliding windows. A summary statistic is applied to the data received within a given window. Collecting this summary across many windows results in a time series that is useful for analysis.</p>
<p>Any field contained within a message can be used to generate a profile. A profile can even be produced by combining fields that originate in different data sources. A user has considerable power to transform the data used in a profile by leveraging the Stellar language. A user need only configure the desired profiles and ensure that the Profiler topology is running.</p>
<p>For an introduction to the Profiler, see the <a href="../metron-profiler-common/index.html">Profiler README</a>.</p></div>
<div class="section">
<h2><a name="Getting_Started"></a>Getting Started</h2>
<p>This section describes the steps required to get your first &#x201c;Hello, World!&#x201d; profile running. This assumes that you have completed a Profiler <a href="#Installation">Installation</a> and that the Profiler is running. You can deploy profiles in two different ways.</p>
<ul>
<li><a href="#Deploying_Profiles_with_the_Stellar_Shell">Deploying Profiles with the Stellar Shell</a></li>
<li><a href="#Deploying_Profiles_from_the_Command_Line">Deploying Profiles from the Command Line</a></li>
</ul>
<div class="section">
<h3><a name="Deploying_Profiles_with_the_Stellar_Shell"></a>Deploying Profiles with the Stellar Shell</h3>
<p>Once you have seen how your profile behaves against real, live telemetry in a controlled execution environment, the next step is to deploy your profile to the live, actively running Profiler topology.</p>
<ol style="list-style-type: decimal">
<li>
<p>Start the Stellar Shell with the <tt>-z</tt> command line argument so that a connection to Zookeeper is established. This is required when deploying a new profile definition as shown in the steps below.</p>
<div>
<div>
<pre class="source">[root@node1 ~]# source /etc/default/metron
[root@node1 ~]# $METRON_HOME/bin/stellar -z $ZOOKEEPER
Stellar, Go!
[Stellar]&gt;&gt;&gt;
[Stellar]&gt;&gt;&gt; %functions CONFIG CONFIG_GET, CONFIG_PUT
</pre></div></div>
</li>
<li>
<p>If you haven&#x2019;t already, define your profile.</p>
<div>
<div>
<pre class="source">[Stellar]&gt;&gt;&gt; conf := SHELL_EDIT()
[Stellar]&gt;&gt;&gt; conf
{
  &quot;profiles&quot;: [
    {
      &quot;profile&quot;: &quot;hello-world&quot;,
      &quot;onlyif&quot;: &quot;exists(ip_src_addr)&quot;,
      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
      &quot;init&quot;: { &quot;count&quot;: &quot;0&quot; },
      &quot;update&quot;: { &quot;count&quot;: &quot;count + 1&quot; },
      &quot;result&quot;: &quot;count&quot;
    }
  ]
}
</pre></div></div>
</li>
<li>
<p>Check what is already deployed.</p>
<p>Pushing a new profile configuration is destructive; it overwrites any existing configuration. Check what is currently deployed, then manually merge the existing configuration with your new profile definition.</p>
<div>
<div>
<pre class="source">[Stellar]&gt;&gt;&gt; existing := CONFIG_GET(&quot;PROFILER&quot;)
</pre></div></div>
</li>
<li>
<p>Deploy your profile. This will push the configuration to the live, actively running Profiler topology. This will overwrite any existing profile definitions.</p>
<div>
<div>
<pre class="source">[Stellar]&gt;&gt;&gt; CONFIG_PUT(&quot;PROFILER&quot;, conf)
</pre></div></div>
</li>
</ol></div>
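<p>As a minimal sketch of such a manual merge, assume that <tt>CONFIG_GET</tt> returned a single existing profile. The name <tt>existing-profile</tt> and its expressions below are hypothetical placeholders, not taken from a real deployment. The merged definition passed to <tt>CONFIG_PUT</tt> would then list both profiles in the same <tt>profiles</tt> array, so that the existing profile is not lost when the new one is pushed.</p>
<div>
<div>
<pre class="source">{
  &quot;profiles&quot;: [
    {
      &quot;profile&quot;: &quot;existing-profile&quot;,
      &quot;foreach&quot;: &quot;ip_dst_addr&quot;,
      &quot;init&quot;: { &quot;count&quot;: &quot;0&quot; },
      &quot;update&quot;: { &quot;count&quot;: &quot;count + 1&quot; },
      &quot;result&quot;: &quot;count&quot;
    },
    {
      &quot;profile&quot;: &quot;hello-world&quot;,
      &quot;onlyif&quot;: &quot;exists(ip_src_addr)&quot;,
      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
      &quot;init&quot;: { &quot;count&quot;: &quot;0&quot; },
      &quot;update&quot;: { &quot;count&quot;: &quot;count + 1&quot; },
      &quot;result&quot;: &quot;count&quot;
    }
  ]
}
</pre></div></div>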
<div class="section">
<h3><a name="Deploying_Profiles_from_the_Command_Line"></a>Deploying Profiles from the Command Line</h3>
<ol style="list-style-type: decimal">
<li>
<p>Create the profile definition in a file located at <tt>$METRON_HOME/config/zookeeper/profiler.json</tt>. This file will likely not exist if you have never created profiles before.</p>
<p>The following example will create a profile that simply counts the number of messages per <tt>ip_src_addr</tt>.</p>
<div>
<div>
<pre class="source">{
  &quot;profiles&quot;: [
    {
      &quot;profile&quot;: &quot;hello-world&quot;,
      &quot;onlyif&quot;: &quot;exists(ip_src_addr)&quot;,
      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
      &quot;init&quot;: { &quot;count&quot;: &quot;0&quot; },
      &quot;update&quot;: { &quot;count&quot;: &quot;count + 1&quot; },
      &quot;result&quot;: &quot;count&quot;
    }
  ]
}
</pre></div></div>
</li>
<li>
<p>Upload the profile definition to Zookeeper.</p>
<div>
<div>
<pre class="source">$ source /etc/default/metron
$ cd $METRON_HOME
$ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z $ZOOKEEPER
</pre></div></div>
<p>You can validate this by reading back the Metron configuration from Zookeeper using the same script. The result should look like the following.</p>
<div>
<div>
<pre class="source">$ bin/zk_load_configs.sh -m DUMP -z $ZOOKEEPER
...
PROFILER Config: profiler
{
  &quot;profiles&quot;: [
    {
      &quot;profile&quot;: &quot;hello-world&quot;,
      &quot;onlyif&quot;: &quot;exists(ip_src_addr)&quot;,
      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
      &quot;init&quot;: { &quot;count&quot;: &quot;0&quot; },
      &quot;update&quot;: { &quot;count&quot;: &quot;count + 1&quot; },
      &quot;result&quot;: &quot;count&quot;
    }
  ]
}
</pre></div></div>
</li>
<li>
<p>Ensure that test messages are being sent to the Profiler&#x2019;s input topic in Kafka. The Profiler will consume messages from the input topic defined in the Profiler&#x2019;s configuration (see <a href="#Configuring_the_Profiler">Configuring the Profiler</a>). By default this is the <tt>indexing</tt> topic.</p>
</li>
<li>
<p>Check the HBase table to validate that the Profiler is writing the profile. Remember that, with the default period duration, the Profiler flushes the profile every 15 minutes. You will need to wait at least this long to start seeing profile data in HBase.</p>
<div>
<div>
<pre class="source">$ /usr/hdp/current/hbase-client/bin/hbase shell
hbase(main):001:0&gt; count 'profiler'
</pre></div></div>
</li>
<li>
<p>Use the <a href="../metron-profiler-client/index.html">Profiler Client</a> to read the profile data. The following <tt>PROFILE_GET</tt> command will read the data written by the <tt>hello-world</tt> profile. This assumes that <tt>10.0.0.1</tt> is one of the values for <tt>ip_src_addr</tt> contained within the telemetry consumed by the Profiler.</p>
<div>
<div>
<pre class="source">$ source /etc/default/metron
$ bin/stellar -z $ZOOKEEPER
[Stellar]&gt;&gt;&gt; PROFILE_GET( &quot;hello-world&quot;, &quot;10.0.0.1&quot;, PROFILE_FIXED(30, &quot;MINUTES&quot;))
[451, 448]
</pre></div></div>
<p>This result indicates that over the past 30 minutes, the Profiler stored two values related to the source IP address &#x201c;10.0.0.1&#x201d;. In the first 15 minute period, the IP <tt>10.0.0.1</tt> was seen in 451 telemetry messages. In the second 15 minute period, the same IP was seen in 448 telemetry messages.</p>
<p>It is assumed that the <tt>PROFILE_GET</tt> client is correctly configured to match the Profile configuration before using it to read that Profile. More information on configuring and using the Profiler client can be found <a href="../metron-profiler-client/index.html">here</a>.</p>
</li>
</ol></div></div>
<div class="section">
<h2><a name="Installation"></a>Installation</h2>
<p>The Profiler can be installed with either of these two methods.</p>
<ul>
<li><a href="#Ambari_Installation">Ambari Installation</a></li>
<li><a href="#Manual_Installation">Manual Installation</a></li>
</ul>
<div class="section">
<h3><a name="Ambari_Installation"></a>Ambari Installation</h3>
<p>The Metron Profiler is installed automatically when installing Metron using the Ambari MPack. If this is the case, you can skip the rest of the <a href="#Installation">Installation</a> section and move ahead to <a href="#Getting_Started">Getting Started</a>.</p></div>
<div class="section">
<h3><a name="Manual_Installation"></a>Manual Installation</h3>
<p>This section describes the steps necessary to manually install the Profiler on an RPM-based Linux distribution. This assumes that core Metron has already been installed and validated. If you installed Metron using the <a href="#Ambari_Installation">Ambari MPack</a>, then the Profiler has already been installed and you can skip this section.</p>
<ol style="list-style-type: decimal">
<li>
<p>Build the Metron RPMs (see Building the <a href="../../metron-deployment/index.html#RPMs">RPMs</a>).</p>
<p>You may have already built the Metron RPMs when core Metron was installed.</p>
<div>
<div>
<pre class="source">$ find metron-deployment/ -name &quot;metron-profiler*.rpm&quot;
metron-deployment//packaging/docker/rpm-docker/RPMS/noarch/metron-profiler-0.4.1-201707131420.noarch.rpm
</pre></div></div>
</li>
<li>
<p>Copy the Profiler RPM to the installation host.</p>
<p>The installation host must be the same host on which core Metron was installed. Depending on how you installed Metron, the Profiler RPM might have already been copied to this host with the other Metron RPMs.</p>
<div>
<div>
<pre class="source">[root@node1 ~]# find /localrepo/ -name &quot;metron-profiler*.rpm&quot;
/localrepo/metron-profiler-0.4.1-201707112313.noarch.rpm
</pre></div></div>
</li>
<li>
<p>Install the RPM.</p>
<div>
<div>
<pre class="source">[root@node1 ~]# rpm -ivh metron-profiler-*.noarch.rpm
Preparing... ########################################### [100%]
1:metron-profiler ########################################### [100%]
</pre></div></div>
<div>
<div>
<pre class="source">[root@node1 ~]# rpm -ql metron-profiler
/usr/metron
/usr/metron/0.4.2
/usr/metron/0.4.2/bin
/usr/metron/0.4.2/bin/start_profiler_topology.sh
/usr/metron/0.4.2/config
/usr/metron/0.4.2/config/profiler.properties
/usr/metron/0.4.2/flux
/usr/metron/0.4.2/flux/profiler
/usr/metron/0.4.2/flux/profiler/remote.yaml
/usr/metron/0.4.2/lib
/usr/metron/0.4.2/lib/metron-profiler-0.4.2-uber.jar
</pre></div></div>
</li>
<li>
<p>Edit the configuration file located at <tt>$METRON_HOME/config/profiler.properties</tt>.</p>
<div>
<div>
<pre class="source">kafka.zk=node1:2181
kafka.broker=node1:6667
</pre></div></div>
<ul>
<li>Change <tt>kafka.zk</tt> to refer to Zookeeper in your environment.</li>
<li>Change <tt>kafka.broker</tt> to refer to a Kafka Broker in your environment.</li>
</ul>
</li>
<li>
<p>Create a table within HBase that will store the profile data. By default, the table is named <tt>profiler</tt> with a column family <tt>P</tt>. The table name and column family must match the Profiler&#x2019;s configuration (see <a href="#Configuring_the_Profiler">Configuring the Profiler</a>).</p>
<div>
<div>
<pre class="source">$ /usr/hdp/current/hbase-client/bin/hbase shell
hbase(main):001:0&gt; create 'profiler', 'P'
</pre></div></div>
</li>
<li>
<p>Start the Profiler topology.</p>
<div>
<div>
<pre class="source">$ cd $METRON_HOME
$ bin/start_profiler_topology.sh
</pre></div></div>
</li>
</ol>
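<p>The table created in the steps above has to agree with the HBase settings in <tt>$METRON_HOME/config/profiler.properties</tt>. As a sketch, assuming the default names used in this document, the relevant lines would look like the following. If you create the table under a different name or column family, change these values to match.</p>
<div>
<div>
<pre class="source"># Must match the table and column family created in the HBase shell
profiler.hbase.table=profiler
profiler.hbase.column.family=P
</pre></div></div>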
<p>At this point the Profiler is running and consuming telemetry messages. We have not defined any profiles yet, so it is not doing anything very useful. The <a href="#Getting_Started">Getting Started</a> section walks you through the steps to create your very first &#x201c;Hello, World!&#x201d; profile.</p></div></div>
<div class="section">
<h2><a name="Configuring_the_Profiler"></a>Configuring the Profiler</h2>
<p>The Profiler runs as an independent Storm topology. The configuration for the Profiler topology is stored on the local filesystem at <tt>$METRON_HOME/config/profiler.properties</tt>. After changing these values, the Profiler topology must be restarted for the changes to take effect.</p>
<table border="0" class="table table-striped">
<thead>
<tr class="a">
<th> Setting </th>
<th> Description</th></tr>
</thead><tbody>
<tr class="b">
<td> <a href="#profiler.input.topic"><tt>profiler.input.topic</tt></a> </td>
<td> The name of the input Kafka topic.</td></tr>
<tr class="a">
<td> <a href="#profiler.output.topic"><tt>profiler.output.topic</tt></a> </td>
<td> The name of the output Kafka topic.</td></tr>
<tr class="b">
<td> <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a> </td>
<td> The duration of each profile period.</td></tr>
<tr class="a">
<td> <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a> </td>
<td> The units used to specify the <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a>.</td></tr>
<tr class="b">
<td> <a href="#profiler.window.duration"><tt>profiler.window.duration</tt></a> </td>
<td> The duration of each profile window.</td></tr>
<tr class="a">
<td> <a href="#profiler.window.duration.units"><tt>profiler.window.duration.units</tt></a> </td>
<td> The units used to specify the <a href="#profiler.window.duration"><tt>profiler.window.duration</tt></a>.</td></tr>
<tr class="b">
<td> <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a> </td>
<td> The maximum time lag for timestamps.</td></tr>
<tr class="a">
<td> <a href="#profiler.window.lag.units"><tt>profiler.window.lag.units</tt></a> </td>
<td> The units used to specify the <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a>.</td></tr>
<tr class="b">
<td> <a href="#profiler.workers"><tt>profiler.workers</tt></a> </td>
<td> The number of worker processes for the topology.</td></tr>
<tr class="a">
<td> <a href="#profiler.executors"><tt>profiler.executors</tt></a> </td>
<td> The number of executors to spawn per component.</td></tr>
<tr class="b">
<td> <a href="#profiler.ttl"><tt>profiler.ttl</tt></a> </td>
<td> If a message has not been applied to a Profile in this period of time, the Profile will be forgotten and its resources will be cleaned up.</td></tr>
<tr class="a">
<td> <a href="#profiler.ttl.units"><tt>profiler.ttl.units</tt></a> </td>
<td> The units used to specify the <tt>profiler.ttl</tt>.</td></tr>
<tr class="b">
<td> <a href="#profiler.hbase.salt.divisor"><tt>profiler.hbase.salt.divisor</tt></a> </td>
<td> A salt is prepended to the row key to help prevent hot-spotting.</td></tr>
<tr class="a">
<td> <a href="#profiler.hbase.table"><tt>profiler.hbase.table</tt></a> </td>
<td> The name of the HBase table that profiles are written to.</td></tr>
<tr class="b">
<td> <a href="#profiler.hbase.column.family"><tt>profiler.hbase.column.family</tt></a> </td>
<td> The column family used to store profiles.</td></tr>
<tr class="a">
<td> <a href="#profiler.hbase.batch"><tt>profiler.hbase.batch</tt></a> </td>
<td> The number of puts that are written to HBase in a single batch.</td></tr>
<tr class="b">
<td> <a href="#profiler.hbase.flush.interval.seconds"><tt>profiler.hbase.flush.interval.seconds</tt></a> </td>
<td> The maximum number of seconds between batch writes to HBase.</td></tr>
<tr class="a">
<td> <a href="#topology.kryo.register"><tt>topology.kryo.register</tt></a> </td>
<td> Storm will use Kryo serialization for these classes.</td></tr>
<tr class="b">
<td> <a href="#profiler.writer.batchSize"><tt>profiler.writer.batchSize</tt></a> </td>
<td> The number of records to batch when writing to Kafka.</td></tr>
<tr class="a">
<td> <a href="#profiler.writer.batchTimeout"><tt>profiler.writer.batchTimeout</tt></a> </td>
<td> The timeout in ms for batching when writing to Kafka.</td></tr>
</tbody>
</table>
<div class="section">
<h3><a name="profiler.input.topic"></a><tt>profiler.input.topic</tt></h3>
<p><i>Default</i>: indexing</p>
<p>The name of the Kafka topic from which to consume data. By default, the Profiler consumes data from the <tt>indexing</tt> topic so that it has access to fully enriched telemetry.</p></div>
<div class="section">
<h3><a name="profiler.output.topic"></a><tt>profiler.output.topic</tt></h3>
<p><i>Default</i>: enrichments</p>
<p>The name of the Kafka topic to which profile data is written. This property is only applicable to profiles that define the <a href="#result"><tt>result</tt> <tt>triage</tt> field</a>. This allows Profile data to be selectively triaged like any other source of telemetry in Metron.</p></div>
<div class="section">
<h3><a name="profiler.period.duration"></a><tt>profiler.period.duration</tt></h3>
<p><i>Default</i>: 15</p>
<p>The duration of each profile period. This value should be defined along with <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a>.</p>
<p><i>Important</i>: To read a profile using the <a href="../metron-profiler-client/index.html">Profiler Client</a>, the Profiler Client&#x2019;s <tt>profiler.client.period.duration</tt> property must match this value. Otherwise, the Profiler Client will be unable to read the profile data.</p></div>
<div class="section">
<h3><a name="profiler.period.duration.units"></a><tt>profiler.period.duration.units</tt></h3>
<p><i>Default</i>: MINUTES</p>
<p>The units used to specify the <tt>profiler.period.duration</tt>. This value should be defined along with <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a>.</p>
<p><i>Important</i>: To read a profile using the Profiler Client, the Profiler Client&#x2019;s <tt>profiler.client.period.duration.units</tt> property must match this value. Otherwise, the <a href="../metron-profiler-client/index.html">Profiler Client</a> will be unable to read the profile data.</p></div>
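<p>The pairing between the Profiler and Profiler Client period settings can be sketched as follows, using the defaults listed in this document. The client properties are written here as key/value pairs purely for illustration; where they are actually set is covered by the Profiler Client documentation, not by this sketch.</p>
<div>
<div>
<pre class="source"># Profiler topology, in $METRON_HOME/config/profiler.properties
profiler.period.duration=15
profiler.period.duration.units=MINUTES

# Profiler Client (illustrative notation only); values must match the above
profiler.client.period.duration=15
profiler.client.period.duration.units=MINUTES
</pre></div></div>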
<div class="section">
<h3><a name="profiler.window.duration"></a><tt>profiler.window.duration</tt></h3>
<p><i>Default</i>: 30</p>
<p>The duration of each profile window. Telemetry that arrives within a slice of time is processed within a single window.</p>
<p>Many windows of telemetry will be processed during a single profile period. This does not change the output of the Profiler; it only changes how the Profiler processes data. The window defines how much data the Profiler processes in a single pass.</p>
<p>This value should be defined along with <a href="#profiler.window.duration.units"><tt>profiler.window.duration.units</tt></a>.</p>
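<p>As a worked sketch of how the window and period relate, assume the default values listed in this document are left unchanged in <tt>profiler.properties</tt>. A 15-minute period is 900 seconds, so with 30-second windows the Profiler processes 900 / 30 = 30 windows of telemetry in each profile period.</p>
<div>
<div>
<pre class="source"># One profile value is produced per period
profiler.period.duration=15
profiler.period.duration.units=MINUTES

# Telemetry is processed in 30-second slices within each period
profiler.window.duration=30
profiler.window.duration.units=SECONDS
</pre></div></div>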
<p>This value must be less than the period duration as defined by <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a> and <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a>.</p></div>
<div class="section">
<h3><a name="profiler.window.duration.units"></a><tt>profiler.window.duration.units</tt></h3>
<p><i>Default</i>: SECONDS</p>
<p>The units used to specify the <tt>profiler.window.duration</tt>. This value should be defined along with <a href="#profiler.window.duration"><tt>profiler.window.duration</tt></a>.</p></div>
<div class="section">
<h3><a name="profiler.window.lag"></a><tt>profiler.window.lag</tt></h3>
<p><i>Default</i>: 1</p>
<p>The maximum time lag for timestamps. Timestamps cannot arrive out-of-order by more than this amount. This value should be defined along with <a href="#profiler.window.lag.units"><tt>profiler.window.lag.units</tt></a>.</p></div>
<div class="section">
<h3><a name="profiler.window.lag.units"></a><tt>profiler.window.lag.units</tt></h3>
<p><i>Default</i>: SECONDS</p>
<p>The units used to specify the <tt>profiler.window.lag</tt>. This value should be defined along with <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a>.</p></div>
<div class="section">
<h3><a name="profiler.workers"></a><tt>profiler.workers</tt></h3>
<p><i>Default</i>: 1</p>
<p>The number of worker processes to create for the Profiler topology. This property is useful for performance tuning the Profiler.</p></div>
<div class="section">
<h3><a name="profiler.executors"></a><tt>profiler.executors</tt></h3>
<p><i>Default</i>: 0</p>
<p>The number of executors to spawn per component for the Profiler topology. This property is useful for performance tuning the Profiler.</p></div>
<div class="section">
<h3><a name="profiler.ttl"></a><tt>profiler.ttl</tt></h3>
<p><i>Default</i>: 30</p>
<p>If a message has not been applied to a Profile in this period of time, the Profile will be terminated and its resources will be cleaned up. This value should be defined along with <a href="#profiler.ttl.units"><tt>profiler.ttl.units</tt></a>.</p>
<p>This time-to-live does not affect the persisted Profile data in HBase. It only affects the state stored in memory during the execution of the latest profile period. This state will be deleted if the time-to-live is exceeded.</p></div>
<div class="section">
<h3><a name="profiler.ttl.units"></a><tt>profiler.ttl.units</tt></h3>
<p><i>Default</i>: MINUTES</p>
<p>The units used to specify the <a href="#profiler.ttl"><tt>profiler.ttl</tt></a>.</p></div>
<div class="section">
<h3><a name="profiler.hbase.salt.divisor"></a><tt>profiler.hbase.salt.divisor</tt></h3>
<p><i>Default</i>: 1000</p>
<p>A salt is prepended to the row key to help prevent hot-spotting. This constant is used to generate the salt. This constant should be roughly equal to the number of nodes in the HBase cluster to ensure even distribution of data.</p></div>
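<p>For example, following the sizing guidance above, a hypothetical HBase cluster of about 10 nodes might use the setting below. The cluster size is an assumption made for illustration only; pick a value that reflects your own environment.</p>
<div>
<div>
<pre class="source"># Assumes a hypothetical HBase cluster of roughly 10 nodes
profiler.hbase.salt.divisor=10
</pre></div></div>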
<div class="section">
<h3><a name="profiler.hbase.table"></a><tt>profiler.hbase.table</tt></h3>
<p><i>Default</i>: profiler</p>
<p>The name of the HBase table that profile data is written to. The Profiler expects that the table exists and is writable. It will not create the table.</p></div>
<div class="section">
<h3><a name="profiler.hbase.column.family"></a><tt>profiler.hbase.column.family</tt></h3>
<p><i>Default</i>: P</p>
<p>The column family used to store profile data in HBase.</p></div>
<div class="section">
<h3><a name="profiler.hbase.batch"></a><tt>profiler.hbase.batch</tt></h3>
<p><i>Default</i>: 10</p>
<p>The number of puts that are written to HBase in a single batch.</p></div>
<div class="section">
<h3><a name="profiler.hbase.flush.interval.seconds"></a><tt>profiler.hbase.flush.interval.seconds</tt></h3>
<p><i>Default</i>: 30</p>
<p>The maximum number of seconds between batch writes to HBase.</p></div>
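<p>Taken together, <tt>profiler.hbase.batch</tt> and <tt>profiler.hbase.flush.interval.seconds</tt> bound how writes to HBase are batched. A sketch using the defaults listed in this document: puts are grouped into batches of 10, and no more than 30 seconds pass between batch writes.</p>
<div>
<div>
<pre class="source"># Number of puts written to HBase in a single batch
profiler.hbase.batch=10

# Maximum number of seconds between batch writes
profiler.hbase.flush.interval.seconds=30
</pre></div></div>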
<div class="section">
<h3><a name="topology.kryo.register"></a><tt>topology.kryo.register</tt></h3>
<p><i>Default</i>:</p>
<div>
<div>
<pre class="source">[ org.apache.metron.profiler.ProfileMeasurement, \
org.apache.metron.profiler.ProfilePeriod, \
org.apache.metron.common.configuration.profiler.ProfileResult, \
org.apache.metron.common.configuration.profiler.ProfileResultExpressions, \
org.apache.metron.common.configuration.profiler.ProfileTriageExpressions, \
org.apache.metron.common.configuration.profiler.ProfilerConfig, \
org.apache.metron.common.configuration.profiler.ProfileConfig, \
org.json.simple.JSONObject, \
java.util.LinkedHashMap, \
org.apache.metron.statistics.OnlineStatisticsProvider ]
</pre></div></div>
<p>Storm will use Kryo serialization for these classes. Kryo serialization is more performant than Java serialization, in most cases.</p>
<p>For these classes, Storm will use Kryo&#x2019;s <tt>FieldSerializer</tt> as defined in the <a class="externalLink" href="http://storm.apache.org/releases/1.1.2/Serialization.html">Storm Serialization docs</a>. For all other classes not in this list, Storm defaults to using Java serialization which is slower and not recommended for a production topology.</p>
<p>This value should only need to be altered if you have defined a profile that results in a non-primitive, user-defined type that is not in this list. If the class is not defined in this list, Java serialization will be used and the class must adhere to Java&#x2019;s serialization requirements.</p>
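<p>As a sketch of such a change, suppose a profile produces a user-defined class named <tt>com.example.profiler.MyProfileResult</tt>. This class name is hypothetical and is not part of Metron. Registering it would mean appending it to the default list, assuming the list is set under the <tt>topology.kryo.register</tt> key in <tt>profiler.properties</tt>.</p>
<div>
<div>
<pre class="source"># com.example.profiler.MyProfileResult is a hypothetical class used for illustration
topology.kryo.register=[ org.apache.metron.profiler.ProfileMeasurement, \
    org.apache.metron.profiler.ProfilePeriod, \
    org.apache.metron.common.configuration.profiler.ProfileResult, \
    org.apache.metron.common.configuration.profiler.ProfileResultExpressions, \
    org.apache.metron.common.configuration.profiler.ProfileTriageExpressions, \
    org.apache.metron.common.configuration.profiler.ProfilerConfig, \
    org.apache.metron.common.configuration.profiler.ProfileConfig, \
    org.json.simple.JSONObject, \
    java.util.LinkedHashMap, \
    org.apache.metron.statistics.OnlineStatisticsProvider, \
    com.example.profiler.MyProfileResult ]
</pre></div></div>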
<p>The performance of the entire Profiler topology can be negatively impacted if any profile produces results that undergo Java serialization.</p></div></div>
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row-fluid">
© 2015-2016 The Apache Software Foundation. Apache Metron, Metron, Apache, the Apache feather logo,
and the Apache Metron project logo are trademarks of The Apache Software Foundation.
</div>
</div>
</footer>
</body>
</html>