_threat-model-log4j.adoc - logging-site - Git at Google

 ////
     Licensed to the Apache Software Foundation (ASF) under one or more
     contributor license agreements.  See the NOTICE file distributed with
     this work for additional information regarding copyright ownership.
     The ASF licenses this file to You under the Apache License, Version 2.0
     (the "License"); you may not use this file except in compliance with
     the License.  You may obtain a copy of the License at

          https://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
 ////

 [#threat-log4j]
 === Log4j threat model

 Below we share the threat model specific to link:/log4j[Log4j].

 [#threat-log4j-parametrized-logging]
 ==== Parameterized logging

 When using a log message containing template parameters like `{}`, only the format string is evaluated for parameters to be substituted.
 The message parameters themselves are not evaluated for parameters; they are only included in the format string corresponding to their template position.
 The conversion of message parameters into a string is done on-demand depending on the layout being used.
 When structure-preserving transformations of log message data are required, the `Message` API should be used for logging structured data combined with a structured layout (e.g., `JsonTemplateLayout`).
 Format strings should be compile-time constants, and under no circumstances should format strings be built using user-controlled input data.

 [#threat-log4j-unstructured-logging]
 ==== Unstructured logging

 When using an unstructured layout such as `PatternLayout`, no guarantees can be made about the output format.
 This layout is mainly useful for development purposes and should not be relied on in production applications.
 For example, if a log message contains new lines, these are not escaped or encoded specially unless the configured pattern uses the `%encode\{pattern}\{CRLF}` wrapper pattern converter (which will encode a carriage return as the string `\r` and a line feed as the string `\n`) or some other `%encode` option.
 Note that `%xEx` is appended to the pattern unless already present.
 Similarly, other encoding options are available for other formats, but pattern layouts cannot make assumptions about the entire output.
 As such, when using unstructured layouts, no user-controlled input should be included in logs.
 It is strongly recommended that a structured layout (e.g., `JsonTemplateLayout`) is used instead for these situations.
 Note that `StrLookup` plugins (those referenced by `${...}` templates in configuration files) that contain user-provided input should not be referenced by layouts.

 [#threat-log4j-structured-logging]
 ==== Structured logging

 When using a structured layout (most layouts besides pattern layout), log messages are encoded according to various output formats.
 These safely encode the various fields included in a log message.
 For example, the `JsonTemplateLayout` can be configured to output log messages in various JSON structures where all log data is properly encoded into safely parseable JSON.
 This is the recommended mode of operation for use with log parsing and log collection tools that rely on log files or arbitrary output streams.

 [#threat-log4j-java-security-manager]
 ==== Java Security Manager

 Log4j 3 no longer supports running in or using a custom `SecurityManager`.
 This Java feature has been deprecated for removal in Java 21.
 Log4j 2 includes partial support for running with a Security Manager.

 [#threat-log4j-log-masking]
 ==== Log masking

 Log4j, like any other generic logging library, cannot generically support log masking of sensitive data.
 While custom plugins may be developed to attempt to mask various regular expressions (such as a string that looks like a credit card number), the general problem of log masking is equivalent to the halting problem in computer science where sensitive data can always be obfuscated in such a way as to avoid detection by log masking.
 As such, it is the responsibility of the developer to properly demarcate sensitive data such that it can be consistently masked by log masking plugins.
 This sort of use case should make use of the `Message` API for better control over the output of such data.

 [#threat-log4j-availability]
 ==== Availability

 Log4j goes to great lengths to minimize performance overhead along with options for minimizing latency or maximizing throughput.
 However, we cannot guarantee availability of the application if the appenders cannot keep up with the logs being written.
 Synchronous logging can cause applications to block and wait for a log message to be written.
 Asynchronous logging can also cause applications to block and wait depending on the wait strategy and queue full policy configured.
 Configuring too large or too many buffers in an application can also result in out of memory errors.

 [#threat-log4j-compressing-logs]
 ==== Compressing logs

 If log compression is used along with custom encryption where logs contain user-controlled input, then this can lead to a https://en.wikipedia.org/wiki/CRIME[CRIME attack] style vulnerability where a chosen-plaintext attack is combined with information leakage caused by how the compression algorithm handles different inputs.
 The simplest way to avoid this problem is to never combine compression with encryption when encoding user-controlled input.
	////
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	https://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	////

	[#threat-log4j]
	=== Log4j threat model

	Below we share the threat model specific to link:/log4j[Log4j].

	[#threat-log4j-parametrized-logging]
	==== Parameterized logging

	When using a log message containing template parameters like `{}`, only the format string is evaluated for parameters to be substituted.
	The message parameters themselves are not evaluated for parameters; they are only included in the format string corresponding to their template position.
	The conversion of message parameters into a string is done on-demand depending on the layout being used.
	When structure-preserving transformations of log message data are required, the `Message` API should be used for logging structured data combined with a structured layout (e.g., `JsonTemplateLayout`).
	Format strings should be compile-time constants, and under no circumstances should format strings be built using user-controlled input data.

	[#threat-log4j-unstructured-logging]
	==== Unstructured logging

	When using an unstructured layout such as `PatternLayout`, no guarantees can be made about the output format.
	This layout is mainly useful for development purposes and should not be relied on in production applications.
	For example, if a log message contains new lines, these are not escaped or encoded specially unless the configured pattern uses the `%encode\{pattern}\{CRLF}` wrapper pattern converter (which will encode a carriage return as the string `\r` and a line feed as the string `\n`) or some other `%encode` option.
	Note that `%xEx` is appended to the pattern unless already present.
	Similarly, other encoding options are available for other formats, but pattern layouts cannot make assumptions about the entire output.
	As such, when using unstructured layouts, no user-controlled input should be included in logs.
	It is strongly recommended that a structured layout (e.g., `JsonTemplateLayout`) is used instead for these situations.
	Note that `StrLookup` plugins (those referenced by `${...}` templates in configuration files) that contain user-provided input should not be referenced by layouts.

	[#threat-log4j-structured-logging]
	==== Structured logging

	When using a structured layout (most layouts besides pattern layout), log messages are encoded according to various output formats.
	These safely encode the various fields included in a log message.
	For example, the `JsonTemplateLayout` can be configured to output log messages in various JSON structures where all log data is properly encoded into safely parseable JSON.
	This is the recommended mode of operation for use with log parsing and log collection tools that rely on log files or arbitrary output streams.

	[#threat-log4j-java-security-manager]
	==== Java Security Manager

	Log4j 3 no longer supports running in or using a custom `SecurityManager`.
	This Java feature has been deprecated for removal in Java 21.
	Log4j 2 includes partial support for running with a Security Manager.

	[#threat-log4j-log-masking]
	==== Log masking

	Log4j, like any other generic logging library, cannot generically support log masking of sensitive data.
	While custom plugins may be developed to attempt to mask various regular expressions (such as a string that looks like a credit card number), the general problem of log masking is equivalent to the halting problem in computer science where sensitive data can always be obfuscated in such a way as to avoid detection by log masking.
	As such, it is the responsibility of the developer to properly demarcate sensitive data such that it can be consistently masked by log masking plugins.
	This sort of use case should make use of the `Message` API for better control over the output of such data.

	[#threat-log4j-availability]
	==== Availability

	Log4j goes to great lengths to minimize performance overhead along with options for minimizing latency or maximizing throughput.
	However, we cannot guarantee availability of the application if the appenders cannot keep up with the logs being written.
	Synchronous logging can cause applications to block and wait for a log message to be written.
	Asynchronous logging can also cause applications to block and wait depending on the wait strategy and queue full policy configured.
	Configuring too large or too many buffers in an application can also result in out of memory errors.

	[#threat-log4j-compressing-logs]
	==== Compressing logs

	If log compression is used along with custom encryption where logs contain user-controlled input, then this can lead to a https://en.wikipedia.org/wiki/CRIME[CRIME attack] style vulnerability where a chosen-plaintext attack is combined with information leakage caused by how the compression algorithm handles different inputs.
	The simplest way to avoid this problem is to never combine compression with encryption when encoding user-controlled input.