<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia Site Renderer 1.8 from src/site/markdown/metron-platform/metron-writer/index.md at 2018-12-14
| Rendered using Apache Maven Fluido Skin 1.7
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20181214" />
<meta http-equiv="Content-Language" content="en" />
<title>Metron &#x2013; Writer</title>
<link rel="stylesheet" href="../../css/apache-maven-fluido-1.7.min.css" />
<link rel="stylesheet" href="../../css/site.css" />
<link rel="stylesheet" href="../../css/print.css" media="print" />
<script type="text/javascript" src="../../js/apache-maven-fluido-1.7.min.js"></script>
<script type="text/javascript">
$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );
</script>
</head>
<body class="topBarDisabled">
<div class="container-fluid">
<div id="banner">
<div class="pull-left"><a href="http://metron.apache.org/" id="bannerLeft"><img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/></a></div>
<div class="pull-right"></div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li class=""><a href="http://www.apache.org" class="externalLink" title="Apache">Apache</a><span class="divider">/</span></li>
<li class=""><a href="http://metron.apache.org/" class="externalLink" title="Metron">Metron</a><span class="divider">/</span></li>
<li class=""><a href="../../index.html" title="Documentation">Documentation</a><span class="divider">/</span></li>
<li class="active ">Writer</li>
<li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-12-14</li>
<li id="projectVersion" class="pull-right">Version: 0.7.0</li>
</ul>
</div>
<div class="row-fluid">
<div id="leftColumn" class="span2">
<div class="well sidebar-nav">
<ul class="nav nav-list">
<li class="nav-header">User Documentation</li>
<li><a href="../../index.html" title="Metron"><span class="icon-chevron-down"></span>Metron</a>
<ul class="nav nav-list">
<li><a href="../../CONTRIBUTING.html" title="CONTRIBUTING"><span class="none"></span>CONTRIBUTING</a></li>
<li><a href="../../Upgrading.html" title="Upgrading"><span class="none"></span>Upgrading</a></li>
<li><a href="../../metron-analytics/index.html" title="Analytics"><span class="icon-chevron-right"></span>Analytics</a></li>
<li><a href="../../metron-contrib/metron-docker/index.html" title="Docker"><span class="none"></span>Docker</a></li>
<li><a href="../../metron-contrib/metron-performance/index.html" title="Performance"><span class="none"></span>Performance</a></li>
<li><a href="../../metron-deployment/index.html" title="Deployment"><span class="icon-chevron-right"></span>Deployment</a></li>
<li><a href="../../metron-interface/metron-alerts/index.html" title="Alerts"><span class="none"></span>Alerts</a></li>
<li><a href="../../metron-interface/metron-config/index.html" title="Config"><span class="none"></span>Config</a></li>
<li><a href="../../metron-interface/metron-rest/index.html" title="Rest"><span class="none"></span>Rest</a></li>
<li><a href="../../metron-platform/index.html" title="Platform"><span class="icon-chevron-down"></span>Platform</a>
<ul class="nav nav-list">
<li><a href="../../metron-platform/Performance-tuning-guide.html" title="Performance-tuning-guide"><span class="none"></span>Performance-tuning-guide</a></li>
<li><a href="../../metron-platform/metron-common/index.html" title="Common"><span class="none"></span>Common</a></li>
<li><a href="../../metron-platform/metron-data-management/index.html" title="Data-management"><span class="none"></span>Data-management</a></li>
<li><a href="../../metron-platform/metron-elasticsearch/index.html" title="Elasticsearch"><span class="none"></span>Elasticsearch</a></li>
<li><a href="../../metron-platform/metron-enrichment/index.html" title="Enrichment"><span class="icon-chevron-right"></span>Enrichment</a></li>
<li><a href="../../metron-platform/metron-indexing/index.html" title="Indexing"><span class="none"></span>Indexing</a></li>
<li><a href="../../metron-platform/metron-job/index.html" title="Job"><span class="none"></span>Job</a></li>
<li><a href="../../metron-platform/metron-management/index.html" title="Management"><span class="none"></span>Management</a></li>
<li><a href="../../metron-platform/metron-parsers/index.html" title="Parsers"><span class="icon-chevron-right"></span>Parsers</a></li>
<li><a href="../../metron-platform/metron-pcap-backend/index.html" title="Pcap-backend"><span class="none"></span>Pcap-backend</a></li>
<li><a href="../../metron-platform/metron-solr/index.html" title="Solr"><span class="none"></span>Solr</a></li>
<li class="active"><a href="#"><span class="none"></span>Writer</a></li>
</ul>
</li>
<li><a href="../../metron-sensors/index.html" title="Sensors"><span class="icon-chevron-right"></span>Sensors</a></li>
<li><a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"><span class="none"></span>Stellar-3rd-party-example</a></li>
<li><a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"><span class="icon-chevron-right"></span>Stellar-common</a></li>
<li><a href="../../metron-stellar/stellar-zeppelin/index.html" title="Stellar-zeppelin"><span class="none"></span>Stellar-zeppelin</a></li>
<li><a href="../../use-cases/index.html" title="Use-cases"><span class="icon-chevron-right"></span>Use-cases</a></li>
</ul>
</li>
</ul>
<hr />
<div id="poweredBy">
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a>
</div>
</div>
</div>
<div id="bodyColumn" class="span10" >
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<h1>Writer</h1>
<p><a name="Writer"></a></p>
<div class="section">
<h2><a name="Introduction"></a>Introduction</h2>
<p>The writer module provides utilities for writing to outside components from within Storm, including management of bulk writes. This module includes an implementation for writing to HDFS. Other writers can be found in their own modules.</p></div>
<div class="section">
<h2><a name="Kafka_Writer"></a>Kafka Writer</h2>
<p>This module includes a writer that writes batches of messages to Kafka. It can be configured so that a field in each message names the topic that the message is written to.</p>
<p>The configuration for this writer is held in the individual Sensor Configurations:</p>
<ul>
<li><a href="../metron-enrichment/index.html#sensor-enrichment-configuration">Enrichment</a> under the <tt>config</tt> element</li>
<li><a href="../metron-parsers/index.html#parser-configuration">Parsers</a> in the <tt>parserConfig</tt> element</li>
<li>Profiler - Unsupported currently</li>
</ul>
<p>In each of these, the kafka writer can be configured via a map which has the following elements:</p>
<ul>
<li><tt>kafka.brokerUrl</tt> : The broker URL</li>
<li><tt>kafka.keySerializer</tt> : The key serializer (defaults to <tt>StringSerializer</tt>)</li>
<li><tt>kafka.valueSerializer</tt> : The value serializer (defaults to <tt>StringSerializer</tt>)</li>
<li><tt>kafka.zkQuorum</tt> : The zookeeper quorum</li>
<li><tt>kafka.requiredAcks</tt> : Whether to require acks.</li>
<li><tt>kafka.topic</tt> : The topic to write to</li>
<li><tt>kafka.topicField</tt> : The message field to pull the topic from. If this is specified, the producer uses the value of that field as the topic. If it is unspecified, the producer falls back to the <tt>kafka.topic</tt> property. If neither is specified, an error occurs.</li>
<li><tt>kafka.producerConfigs</tt> : A map of kafka producer configs for advanced customization.</li>
</ul></div>
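<p>The precedence between <tt>kafka.topicField</tt> and <tt>kafka.topic</tt> described above can be sketched as follows. This is an illustrative Python sketch, not the writer's actual code: <tt>resolve_topic</tt> is a hypothetical helper, and raising an error when a message lacks the configured field is an assumption, since the text above does not say how that case is handled.</p>

```python
from typing import Any, Mapping, Optional


def resolve_topic(writer_config: Mapping[str, Any], message: Mapping[str, Any]) -> str:
    """Pick the destination topic for one message (hypothetical helper)."""
    topic_field: Optional[str] = writer_config.get("kafka.topicField")
    if topic_field is not None:
        # A configured topic field takes precedence over the static topic.
        # Assumption: a message without that field is treated as an error.
        if topic_field not in message:
            raise KeyError(f"message has no field {topic_field!r}")
        return str(message[topic_field])

    topic = writer_config.get("kafka.topic")
    if topic is None:
        # Neither setting is present, which the documentation calls an error.
        raise ValueError("neither kafka.topicField nor kafka.topic is configured")
    return str(topic)
```

<p>With a configuration of <tt>{"kafka.topicField": "dest", "kafka.topic": "default"}</tt>, a message whose <tt>dest</tt> field is <tt>alerts</tt> would be routed to <tt>alerts</tt>; dropping <tt>kafka.topicField</tt> from the configuration routes every message to <tt>default</tt>. The field name <tt>dest</tt> is made up for this illustration.</p>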
<div class="section">
<h2><a name="HDFS_Writer"></a>HDFS Writer</h2>
<p>The HDFS writer included here extends what Storm provides in several ways. It allows customization of syncing to HDFS, the rotation policy, etc. In addition, the writer lets users define output paths based on the fields in the provided JSON message. The path is defined using Stellar.</p>
<p>To manage the output path, a base path argument is provided by the Flux file, with the FileNameFormat as follows:</p>
<div>
<div>
<pre class="source">- id: &quot;fileNameFormat&quot;
  className: &quot;org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat&quot;
  configMethods:
    - name: &quot;withPrefix&quot;
      args:
        - &quot;enrichment-&quot;
    - name: &quot;withExtension&quot;
      args:
        - &quot;.json&quot;
    - name: &quot;withPath&quot;
      args:
        - &quot;/apps/metron/&quot;
</pre></div></div>
<p>This means that all output will land under <tt>/apps/metron/</tt>. With no further adjustment, each sensor&#x2019;s output goes to <tt>/apps/metron/&lt;sensor&gt;/</tt>. However, by modifying the sensor&#x2019;s JSON config, it is possible to provide additional pathing based on the message itself.</p>
<p>E.g.</p>
<div>
<div>
<pre class="source">{
  &quot;index&quot;: &quot;bro&quot;,
  &quot;batchSize&quot;: 5,
  &quot;outputPathFunction&quot;: &quot;FORMAT('uid-%s', uid)&quot;
}
</pre></div></div>
<p>will land data in <tt>/apps/metron/uid-&lt;uid&gt;/</tt>.</p>
<p>For example, if the data contains the <tt>uid</tt> values 1, 3, and 5, there will be three output folders in HDFS:</p>
<div>
<div>
<pre class="source">/apps/metron/uid-1/
/apps/metron/uid-3/
/apps/metron/uid-5/
</pre></div></div>
<p>The Stellar function must return a String, but is not limited to FORMAT functions. Other functions, such as <tt>TO_LOWER</tt>, <tt>TO_UPPER</tt>, etc. are all available for use. Typically, it&#x2019;s preferable to do nontrivial transformations as part of enrichment and simply reference the output here.</p>
<p>If no Stellar function is provided, it will default to putting the sensor in a folder, as above.</p>
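<p>The path selection described above can be modeled with a short Python sketch. It is only an approximation of the behavior, not the writer&#x2019;s implementation: <tt>output_dir</tt> and <tt>uid_path</tt> are hypothetical names, and <tt>uid_path</tt> is a plain Python stand-in for the Stellar expression <tt>FORMAT('uid-%s', uid)</tt>.</p>

```python
import posixpath
from typing import Any, Callable, Mapping, Optional

# A path function maps one message to a sub-directory name (must be a string).
PathFunction = Callable[[Mapping[str, Any]], str]


def output_dir(base_path: str, sensor: str, message: Mapping[str, Any],
               path_function: Optional[PathFunction] = None) -> str:
    """Return the directory a message would be written to (hypothetical helper)."""
    if path_function is None:
        # No function configured: fall back to one folder per sensor.
        sub_dir = sensor
    else:
        sub_dir = path_function(message)
    return posixpath.join(base_path, sub_dir) + "/"


def uid_path(message: Mapping[str, Any]) -> str:
    """Python stand-in for the Stellar expression FORMAT('uid-%s', uid)."""
    return "uid-%s" % message["uid"]


messages = [{"uid": 1}, {"uid": 3}, {"uid": 5}, {"uid": 3}]
# Distinct directories produced by the batch; repeated uids share a folder.
dirs = sorted({output_dir("/apps/metron/", "bro", m, uid_path) for m in messages})
print(dirs)
```

<p>Under these assumptions the four messages map to the three <tt>uid-*</tt> folders listed earlier, and calling <tt>output_dir("/apps/metron/", "bro", message)</tt> without a path function yields <tt>/apps/metron/bro/</tt>.</p>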
<p>A caveat is that the writer will only allow a certain number of files to be created at once. HdfsWriter has a function <tt>withMaxOpenFiles</tt> allowing this to be set. The default is 500. This can be set in Flux:</p>
<div>
<div>
<pre class="source">- id: &quot;hdfsWriter&quot;
  className: &quot;org.apache.metron.writer.hdfs.HdfsWriter&quot;
  configMethods:
    - name: &quot;withFileNameFormat&quot;
      args:
        - ref: &quot;fileNameFormat&quot;
    - name: &quot;withRotationPolicy&quot;
      args:
        - ref: &quot;hdfsRotationPolicy&quot;
    - name: &quot;withMaxOpenFiles&quot;
      args:
        - 500
</pre></div></div></div>
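<p>The open-file limit matters most when the output path is derived from a high-cardinality field, because every distinct path needs its own open file. The following Python sketch illustrates one way such a cap could behave. It is hypothetical: <tt>BoundedFileRegistry</tt> is not part of Metron, and rejecting new paths once the limit is reached is an assumption, since this page does not state what the writer does at the limit.</p>

```python
class BoundedFileRegistry:
    """Tracks open output paths and refuses to exceed a fixed cap (hypothetical)."""

    def __init__(self, max_open_files: int = 500) -> None:
        self.max_open_files = max_open_files
        self._open = {}  # path -> stand-in for a real file handle

    def handle_for(self, path: str) -> list:
        """Return the handle for a path, opening it if the cap allows."""
        if path in self._open:
            # Already open: reuse the handle, no new file is counted.
            return self._open[path]
        if len(self._open) >= self.max_open_files:
            # Assumption: hitting the cap is reported as an error.
            raise RuntimeError("too many open files: limit is %d" % self.max_open_files)
        self._open[path] = []
        return self._open[path]

    def open_count(self) -> int:
        return len(self._open)


registry = BoundedFileRegistry(max_open_files=2)
registry.handle_for("/apps/metron/uid-1/").append("message a")
registry.handle_for("/apps/metron/uid-3/").append("message b")
registry.handle_for("/apps/metron/uid-1/").append("message c")  # reuses uid-1
```

<p>In this sketch a third distinct path, such as <tt>/apps/metron/uid-5/</tt>, would raise an error until the limit is raised, which is the kind of situation <tt>withMaxOpenFiles</tt> is meant to tune.</p>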
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row-fluid">
© 2015-2016 The Apache Software Foundation. Apache Metron, Metron, Apache, the Apache feather logo,
and the Apache Metron project logo are trademarks of The Apache Software Foundation.
</div>
</div>
</footer>
</body>
</html>