The Apache Eagle data classification feature lets you classify data by level of sensitivity. Currently this feature is available ONLY for applications that monitor HDFS, Apache Hive, and Apache HBase, such as HdfsAuditLog, HiveQueryLog, and HBaseSecurityLog.
This page covers two topics: connection configuration and data classification.
To monitor a remote cluster, first make sure the connection to the cluster is configured. For more details, refer to Site Management.
Once the connection is configured, data classification has two parts. The first part covers how to add or remove sensitivity marks on files and directories; the second part shows how to monitor the marked data. The following uses HdfsAuditLog as an example.
Add a sensitivity mark to files/directories:
  Basic: label sensitive files directly (recommended)
  Advanced: import a JSON file or JSON content
Remove a sensitivity mark from files/directories:
  Basic: remove the label directly
  Advanced: delete in batch
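For the advanced import option, the content is a JSON document that lists resources together with their sensitivity types. This page does not give the exact schema, so the Python sketch below only illustrates how such content could be generated. The field names (`tags`, `site`, `filedir`, `sensitivityType`), the site name `sandbox`, and the paths are all assumptions; check the format your Eagle version expects before importing.

```python
import json


def build_sensitivity_entries(site, labels):
    """Build import entries from a {path: sensitivity_type} mapping.

    The field names used here are assumed for illustration and are
    not a confirmed Eagle schema.
    """
    return [
        {"tags": {"site": site, "filedir": path}, "sensitivityType": sensitivity}
        for path, sensitivity in sorted(labels.items())
    ]


entries = build_sensitivity_entries(
    "sandbox",  # hypothetical site name
    {"/tmp/private": "PRIVATE", "/user/finance": "PRIVATE"},  # hypothetical paths
)

# Serialize to JSON text that could be pasted or saved as a file.
payload = json.dumps(entries, indent=2)
print(payload)
```

Generating the content from a mapping keeps one entry per path, which makes it easier to review a large batch before importing it.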
You can mark a particular folder or file as “PRIVATE”. Once a resource carries this label, you can create policies that reference it.
For example: the following policy monitors all operations on resources with sensitivity type “PRIVATE”.
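As a rough illustration of what such a policy checks, the Python sketch below filters audit events by the sensitivity type of the resource they touch. It is not Eagle's implementation or policy syntax: the `AuditEvent` fields, the label table, and the rule that a label on a directory also covers everything beneath it are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class AuditEvent:
    # Hypothetical, simplified audit record
    user: str
    cmd: str
    src: str


# Hypothetical label table: resource path -> sensitivity type
SENSITIVITY = {"/tmp/private": "PRIVATE"}


def sensitivity_of(path, labels):
    """Return the type of the longest labeled path containing `path`, or None."""
    best = None
    for labeled, kind in labels.items():
        prefix = labeled.rstrip("/")
        # Match the labeled path itself or anything underneath it.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, kind)
    return best[1] if best else None


def matches_policy(event, labels, wanted="PRIVATE"):
    """True when the event touches a resource labeled with `wanted`."""
    return sensitivity_of(event.src, labels) == wanted


events = [
    AuditEvent("alice", "open", "/tmp/private/a.txt"),
    AuditEvent("bob", "open", "/tmp/public/b.txt"),
]
alerts = [e for e in events if matches_policy(e, SENSITIVITY)]
for e in alerts:
    print(f"ALERT: {e.user} ran {e.cmd} on {e.src}")
```

Matching on a path boundary (the prefix followed by `/`) avoids treating a sibling such as `/tmp/privateer` as part of `/tmp/private`.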