blob: fe5afa56da44a9d309f02ff8716c2c3f32856ea8 [file] [log] [blame]
<!-- vim: set syn=markdown : -->
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
#set($h1='#')
#set($h2='##')
#set($h3='###')
$h1 Apache Log4j Audit
$h2 What is Audit Logging
The basic purpose of audit logging is to provide the information to determine what actions have been performed, or
attempted to be performed, by whom, when they did it and what the data associated with the action was. Audit log
data is most typically used by security teams to monitor for fraud or illegal or otherwise unauthorized activities
and to be able to correct changes that were incorrectly made. Audit logs are often used to demonstrate compliance
with legal obligations such as Sarbanes-Oxley or HIPPA.
$h3 What is the difference between audit logging and "normal" logging?
In a typical application developers add logging statements at key points to help diagnose problems or to
document unexpected occurances, such as a failure to communicate with a key service. These are normally referred
to as diagnostic logs. Diagnostic logs are critical in maintaining the servicability of the application
but generally aren't very useful in helping to determine who made a particular change and when it was done.
On the other hand, audit logs focus on identifying when a change was made, who made it, and what data elements were
changed. While audit logs can sometimes help in troubleshooting problems, that isn't their primary purpose just as
the primary purpose of diagnostic logging is not to record actions taken by people using the system.
A key difference between diagnostic logs and audit logs is that while diagnostic logs are generally free form
with the content left up to the developer, audit logs usually contain the action being performed and the data
elements that have been impacted. In many systems audit logs are written to a database where the values for these
elements may be written to specific columns. In recent times it is more common to see audit logs written to NoSQL
data stores where they can be efficiently queried.
Another key difference is that audit logs are often used to generate reports that are of value to several parts of
the organization. For example, an auditor at a bank might be interested in locating all the accounts with more than
3 failed login attempts the prior day, or all the transfers for more than $10,000.00. A customer service representative
might want to watch the audit events for a specific user while they are on the phone with them troubleshooting a problem.
Generally, diagnostic logs do not provide the information to support these use cases while audit logs do.
Finally, diagnostic logs and audit logs differ in that diagnostic logs typically are categorized via "levels"
while audit logs are not. Diagnostic logging typically includes fairly detailed logging of the internals of the system
via debug level messages. Other messages may be classified as informational or, in response to some kind of an
error, a warning or error log message may be generated. In contrast to this, audit logs are usually always logged
so a level is meaningless.
While audit logs and diagnostic logs may differ in their format and usage, Log4j can still play a part in delivering
both. Unlike systems where audit logs are written directly to a database, the use of Log4j and Log4j-Audit allows
applications to be written to perform audit logging without any knowledge of where those will end up and what they will
be used for. Furthermore, the log events can be easily transformed into various formats that best matches their
intended uses.
$h2 What is Log4j Audit?
Log4j Audit provides a framework for defining audit events and then logging them using Log4j. The framework focuses on
defining the events and providing an easy mechanism for applications to log them, allowing products to provide
consistency and validity to the events that are logged. It does not focus on how the log events are written to a
data store. Log4j itself provides many options for that.
Log4j Audit builds upon Log4j by defining its own AuditMessage. An AuditMessage is a container for
[StructuredDataMessage](http://logging.apache.org/log4j/2.x/manual/messages.html)s which allows a log message to be
generated that contains a set of keys and values. The AuditMessage passes through Log4j as any other log event
would.
$h2 Features
Each application has its own events that need to be audited. Before using Log4j Audit, application owners need to define
the AuditMessages that capture the exact attributes of that need to be captured. The [Getting Started](gettingStarted.html) page
provides a tutorial that explains how to define audit events for an application.
$h3 Audit Event Catalog
Once audit events are defined, they need to be maintained: as the application evolves, developers will inevitably
discover they need to add, remove or change attributes of the audit events. Log4j Audit can persist the audit event
definitions in a JSON file. This file becomes the Audit Event Catalog for the application. Log4j Audit is designed
to store the event definition file in a Git repository so that the evolution of the audit events themselves have an
audit trail in the Git history of the file. Log4j Audit provides a web interface for editing the events.
Log4j Audit uses a catalog of events to determine what events can be logged and to validate the events. Java
interfaces are generated from the catalog for applications to use when generating the audit events. A web UI
is provided to edit and manage the catalog but the UI requires a specific user's Git credentials and
so is best run on the local user's computer.
$h3 Dynamic Event Catalog
The typical event catalog is considered static. It is defined during the development process and cannot be changed in
a running system without a new release.
In addition to the catalog that is managed by the web UI a set of dynamic catalogs can also be used. These catalogs
can be used by multi-tenant applications where each tenant has a different set of events and/or attributes that need
to be logged. The dynamic events are defined while the application is running and because they are specific to a
tenant in the running system. Because they are dynamic they are maintained in a database. Changes to the dynamic
catalog do not affect the static catalog. Changes to the dynamic catalog are performed using the audit service's REST
API.
$h3 Audit Service
Non-Java applications can perform audit logging by generating the event data and passing it to an auditing service
where the events will be validated and logged.
$h3 Sample Project
Users of Log4j Audit must provide a Git repository where the audit events will reside which will be used to generate
the Java event interfaces. Log4j Audit provides a sample project that contains a sample catalog file and creates
the Java interfaces from that file. The project also contains a module that generates a war file containing the
editor for that catalog as well as the REST endpoints to manipulate the dynamic catalogs.
$h2 Requirements
Log4j Audit requires Java 8 and Log4j 2.10.0 or greater.
A Git repo to maintain the JSON catalog of events and attributes is required.
A database is required if the application wants to define dynamic catalog events and attributes. Log4j Audit provides
built in support for PostgresQL.