Guide to Install Eagle on the Hortonworks Sandbox

  • Prerequisite
  • Download + Patch + Build
  • Setup Hadoop Environment in Sandbox
  • Stream HDFS audit logs into Kafka
  • Install Eagle
  • Demos: “HDFS File System” & “Hive” monitoring

Prerequisite

Download + Patch + Build
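A minimal sketch of the download-and-build step, assuming a standard Maven build of the source tree (the repository URL and Maven goals are assumptions; the build should produce the eagle-assembly tarball used in the Install Eagle section below):

    # clone the Eagle source (URL is an assumption)
    git clone https://github.com/apache/eagle.git
    cd eagle
    # build the binary tarball under eagle-assembly/target/
    mvn clean package -DskipTests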

Setup Hadoop Environment In Sandbox

  1. Launch Ambari to manage the Hadoop environment.
  2. Grant root HBase superuser rights via Ambari.
  3. Start HBase, Storm & Kafka from Ambari and restart the affected services.
  4. If a NAT network is used in the virtual machine, port 9099 must be added to the forwarding ports (see the sketch after this list).
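
If the sandbox runs in VirtualBox, the forwarding rule can also be added from the command line. A minimal sketch, assuming the VM is named “Hortonworks Sandbox” (an assumption; list your VMs with VBoxManage list vms):

    # add the rule while the VM is running
    VBoxManage controlvm "Hortonworks Sandbox" natpf1 "eagle,tcp,,9099,,9099"
    # or, while the VM is powered off
    VBoxManage modifyvm "Hortonworks Sandbox" --natpf1 "eagle,tcp,,9099,,9099"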

Stream HDFS audit logs into Kafka

Note: This section is only needed for “HDFS File System” Monitoring.

  • Step 1: Configure Advanced hadoop-log4j via the Ambari UI, and add the “KAFKA_HDFS_AUDIT” log4j appender below to HDFS audit logging.

    log4j.appender.KAFKA_HDFS_AUDIT=org.apache.eagle.log4j.kafka.KafkaLog4jAppender
    log4j.appender.KAFKA_HDFS_AUDIT.Topic=sandbox_hdfs_audit_log
    log4j.appender.KAFKA_HDFS_AUDIT.BrokerList=sandbox.hortonworks.com:6667
    log4j.appender.KAFKA_HDFS_AUDIT.KeyClass=org.apache.eagle.log4j.kafka.hadoop.AuditLogKeyer
    log4j.appender.KAFKA_HDFS_AUDIT.Layout=org.apache.log4j.PatternLayout
    log4j.appender.KAFKA_HDFS_AUDIT.Layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
    log4j.appender.KAFKA_HDFS_AUDIT.ProducerType=async

  • Step 2: Edit Advanced hadoop-env via the Ambari UI, and add a reference to KAFKA_HDFS_AUDIT to HADOOP_NAMENODE_OPTS.

    -Dhdfs.audit.logger=INFO,DRFAAUDIT,KAFKA_HDFS_AUDIT
    

  • Step 3: Edit Advanced hadoop-env via the Ambari UI, and append the following line to it.

    export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/usr/hdp/current/eagle/lib/log4jkafka/lib/*
    

  • Step 4: Save the changes.

  • Step 5: “Restart All” Storm & Kafka from Ambari. (Similar to starting the services in the previous step, except make sure to click “Restart All”.)

  • Step 6: Restart the NameNode.

  • Step 7: Check whether audit logs are flowing into the topic sandbox_hdfs_audit_log (if the topic is missing, see the sketch after this list).

    > /usr/hdp/2.2.4.2-2/kafka/bin/kafka-console-consumer.sh --zookeeper sandbox.hortonworks.com:2181 --topic sandbox_hdfs_audit_log      
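
If the consumer reports that the topic does not exist yet, it can be created manually. A minimal sketch, assuming the same HDP Kafka path and ZooKeeper address as above (one partition with replication factor 1 is an illustrative choice for a single-node sandbox):

    # create the audit-log topic on the sandbox's single broker
    /usr/hdp/2.2.4.2-2/kafka/bin/kafka-topics.sh --zookeeper sandbox.hortonworks.com:2181 \
      --create --topic sandbox_hdfs_audit_log --partitions 1 --replication-factor 1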
    

Install Eagle

The following steps install Eagle and set up a sandbox site with the HdfsAuditLog & HiveQueryLog data sources.

    $ scp -P 2222 /eagle-assembly/target/eagle-0.3.0-incubating-bin.tar.gz root@127.0.0.1:/root/
    $ ssh root@127.0.0.1 -p 2222
    $ tar -zxvf eagle-0.3.0-incubating-bin.tar.gz
    $ mv eagle-0.3.0-incubating eagle
    $ mv eagle /usr/hdp/current/
    $ cd /usr/hdp/current/eagle
    $ examples/eagle-sandbox-starter.sh
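
To confirm the Eagle web service came up, a quick probe of the UI endpoint can help. A minimal sketch, assuming the service listens on port 9099 as configured earlier:

    # prints the HTTP status code (e.g. 200) once the service is up
    $ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9099/eagle-service/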

Demos

  • Log in to the Eagle UI at http://localhost:9099/eagle-service/ with username “admin” and password “secret”.

  • HDFS:

    1. Click on the “DAM” menu and select “HDFS” to view HDFS policies.

    2. You should see a policy named “viewPrivate”. This policy generates an alert when any user reads the HDFS file “private” under the “/tmp” folder.

    3. In the sandbox, read the restricted HDFS file “/tmp/private” with the following command (if the file does not exist, see the sketch at the end of this section):

      hadoop fs -cat /tmp/private

    From the UI, click on the alert tab and you should see an alert for the attempt to read the restricted file.

  • Hive:

    1. Click on the “DAM” menu and select “Hive” to view Hive policies.

    2. You should see a policy named “queryPhoneNumber”. This policy generates an alert when a Hive column carrying sensitive (Phone_Number) information is queried.

    3. In the sandbox, query the restricted, sensitive Hive column.

      $ su hive
      $ hive
      hive> set hive.execution.engine=mr;
      hive> use xademo;
      hive> select a.phone_number from customer_details a, call_detail_records b where a.phone_number=b.phone_number;

    From the UI, click on the alert tab and you should see an alert for your attempt to read the restricted column.
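
Note: if “/tmp/private” does not exist in your sandbox, create the demo file before running the HDFS demo. A minimal sketch, assuming a user with write access to /tmp (the file content is an arbitrary placeholder):

    # create a small test file and verify it is in place
    $ echo "sensitive test data" | hadoop fs -put - /tmp/private
    $ hadoop fs -ls /tmp/private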