Apache Metron


Metron integrates a variety of open source big data technologies in order to offer a centralized tool for security monitoring and analysis. Metron provides capabilities for log aggregation, full packet capture indexing, storage, advanced behavioral analytics and data enrichment, while applying the most current threat intelligence information to security telemetry within a single platform.

For the latest information, please visit our website at http://metron.apache.org/

Metron can be divided into 4 areas:

  1. A mechanism to capture, store, and normalize any type of security telemetry at extremely high rates. Because security telemetry is constantly being generated, it requires a method for ingesting the data at high speeds and pushing it to various processing units for advanced computation and analytics.

  2. Real-time processing and application of enrichments such as threat intelligence, geolocation, and DNS information to telemetry being collected. The immediate application of this information to incoming telemetry provides the context and situational awareness, as well as the who and where information, critical for investigation.

  3. Efficient information storage based on how the information will be used:

    • Logs and telemetry are stored such that they can be efficiently mined and analyzed for concise security visibility
    • The ability to extract and reconstruct full packets helps an analyst answer questions such as who the true attacker was, what data was leaked, and where that data was sent
    • Long-term storage not only increases visibility over time, but also enables advanced analytics, such as machine learning techniques that build models from the stored information. Incoming data can then be scored against these stored models for advanced anomaly detection.

  4. An interface that gives a security investigator a centralized view of data and alerts passed through the system. Metron’s interface presents alert summaries with threat intelligence and enrichment data specific to that alert on a single page. Furthermore, advanced search capabilities and full packet extraction tools are presented to the analyst for investigation without the need to pivot into additional tools.

Big data is a natural fit for powerful security analytics. The Metron framework integrates a number of elements from the Hadoop ecosystem to provide a scalable platform for security analytics, incorporating such functionality as full-packet capture, stream processing, batch processing, real-time search, and telemetry aggregation. With Metron, our goal is to tie big data into security analytics and drive towards an extensible centralized platform to effectively enable rapid detection and rapid response for advanced security threats.

Obtaining Metron

To obtain a release of Metron, please visit http://metron.apache.org/documentation/#releases

For convenience, this repository is a collection of submodules that is regularly updated to point to the latest versions. GitHub provides multiple ways to obtain Metron's code:

  1. git clone --recursive https://github.com/apache/metron
  2. Download ZIP
  3. Clone or download each repository individually

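Option 3 is more likely to have the latest code, since the submodule pointers in this repository may lag behind the individual repositories between updates.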
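
If a clone was made without --recursive, or the submodules need refreshing later, standard git commands can bring them in line. The following is a minimal sketch: the function name sync_metron_submodules and the ./metron path in the usage comment are illustrative assumptions, not part of Metron.

```shell
#!/bin/sh
# Hypothetical helper: bring the submodules of an existing clone up to date.
# Usage: sync_metron_submodules <path-to-clone> [--remote]
sync_metron_submodules() {
  repo_dir="$1"
  if [ "$2" = "--remote" ]; then
    # Fetch the tip of each submodule's tracked branch instead of the
    # commit pinned by the superproject.
    git -C "$repo_dir" submodule update --init --recursive --remote
  else
    # Check out exactly the commits that the superproject records.
    git -C "$repo_dir" submodule update --init --recursive
  fi
}

# Example (not run here): sync_metron_submodules ./metron
```

Without --remote the result matches what a fresh recursive clone would give; with --remote it is closer to cloning each repository individually.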

Getting Started

To start exploring the capabilities of Apache Metron, follow these instructions to launch Metron in a single-node VM on your own hardware.

Building Metron

Build the full project and run tests:

$ mvn clean install

Build without tests:

$ mvn clean install -DskipTests

Build with the HDP profile:

$ mvn clean install -PHDP-2.5.0.0

You can swap “install” for “package” in the commands above if you don't want to install the artifacts into your local .m2 repository.
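
As an illustration of that swap, the sketch below prints the Maven command line for a given goal. The metron_mvn_cmd function is a hypothetical helper that only echoes the command (so it can be inspected without Maven installed); the goals, the -DskipTests flag, and the HDP-2.5.0.0 profile are the ones already shown in this section.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the Maven command for a Metron build.
# $1 = goal (install or package)
# $2 = "skip" to skip tests, anything else to run them
# $3 = optional Maven profile name
metron_mvn_cmd() {
  cmd="mvn clean $1"
  if [ "$2" = "skip" ]; then
    cmd="$cmd -DskipTests"
  fi
  if [ -n "$3" ]; then
    cmd="$cmd -P$3"
  fi
  echo "$cmd"
}

metron_mvn_cmd package skip              # prints: mvn clean package -DskipTests
metron_mvn_cmd package test HDP-2.5.0.0  # prints: mvn clean package -PHDP-2.5.0.0
```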

Build Metron Reporting

To build and run reporting with code coverage:

$ mvn clean install
$ mvn site site:stage-deploy site:deploy

Code coverage can be skipped by skipping tests:

$ mvn clean install -DskipTests site site:stage-deploy site:deploy

The staged site is deployed to /tmp/metron/site/index.html, and can be viewed by opening the file in a browser.
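
A quick way to confirm that the site was staged before opening it is to check for the index file. This is a small sketch; staged_site_status is a hypothetical helper name, and the default path is the one stated above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a staged site index file exists.
# $1 = optional path to the index file; defaults to the path in this README.
staged_site_status() {
  index="${1:-/tmp/metron/site/index.html}"
  if [ -f "$index" ]; then
    echo "staged site found: file://$index"
  else
    echo "no staged site at $index; run the mvn site goals first"
    return 1
  fi
}

# Check the default location; ignore the failure status if nothing is staged yet.
staged_site_status || true
```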

Navigating the Architecture

Metron is, at its core, a Kappa architecture with Apache Storm as the processing component and Apache Kafka as the unified data bus.

For more information, see these high-level links to the relevant parts of the architecture:

  • Parsers : Parsing data from Kafka into the Metron data model and passing it downstream to Enrichment.
  • Enrichment : Enriching data post-parsing and providing the ability to tag a message as an alert and assign a risk triage level via a custom rule language.
  • Indexing : Indexing the data post-enrichment into HDFS, Elasticsearch or Solr.
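
The flow through these three stages can be pictured as a simple pipeline. The sketch below is a toy analogy in plain shell, not Metron's Storm topologies: the parse, enrich, and index functions, the sample log lines, the threat IP, and the field names ip_src_addr and is_alert are all chosen for illustration, and a temporary file stands in for the index.

```shell
#!/bin/sh
# Toy stand-ins for the three stages. Each reads messages on stdin and
# passes them on, loosely mirroring how stages hand data downstream.

# Parse: raw "ip method url" lines -> one flat JSON object per line.
parse() {
  awk '{ printf "{\"ip_src_addr\":\"%s\",\"method\":\"%s\",\"url\":\"%s\"}\n", $1, $2, $3 }'
}

# Enrich: tag messages whose source IP matches a (hypothetical) threat IP.
enrich() {
  threat_ip="$1"
  while IFS= read -r msg; do
    case "$msg" in
      *"\"ip_src_addr\":\"$threat_ip\""*) flag=true ;;
      *) flag=false ;;
    esac
    # Drop the closing brace (the last character) and append the new field.
    printf '%s,"is_alert":%s}\n' "${msg%?}" "$flag"
  done
}

# Index: append each enriched message to a file standing in for the index.
index() {
  cat >> "$1"
}

out="$(mktemp)"
printf '%s\n' '10.0.0.5 GET /index.html' '203.0.113.9 POST /login' \
  | parse | enrich 203.0.113.9 | index "$out"
cat "$out"
```

Each stage only depends on the message format of the stage before it, which is the property that lets the real components be developed and scaled independently.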

Some useful utilities that cross all of these parts of the architecture:

  • Stellar : A custom data transformation language that is used throughout Metron, from simple field transformation to expressing triage rules.
  • Model as a Service : A YARN application which can deploy machine learning and statistical models onto the cluster, along with the associated Stellar functions to call out to them in a scalable manner.
  • Data management : A set of data management utilities aimed at getting data into HBase in a format which will allow data flowing through Metron to be enriched with the results. Contains integrations with threat intelligence feeds exposed via TAXII as well as simple flat file structures.
  • Profiler : A feature extraction mechanism that can generate a profile describing the behavior of an entity. An entity might be a server, user, subnet or application. Once a profile has been generated defining what normal behavior looks like, models can be built that identify anomalous behavior.
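
To make the idea of a profile concrete, the sketch below computes a per-entity average from a stream of measurements. It is only an analogy for what a profile is: the real Profiler is configured inside Metron rather than with awk, and the build_profile function, the "entity value" input format, and the sample numbers are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: build a trivial "profile" (mean value per entity)
# from lines of the form "<entity> <value>", e.g. bytes sent per interval.
build_profile() {
  awk '{ n[$1]++; sum[$1] += $2 }
       END { for (e in n) printf "%s %.1f\n", e, sum[e] / n[e] }' | sort
}

# Two observations for web01 and one for db01.
printf '%s\n' 'web01 100' 'web01 300' 'db01 50' | build_profile
```

A model for anomalous behavior could then compare new observations for an entity against its stored average.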