blob: 3b7dbe14c9f86e2ca9210edcae3105f56848ede3 [file] [log] [blame]
[[grok-dataformat]]
= Grok DataFormat
:page-source: components/camel-grok/src/main/docs/grok-dataformat.adoc
*Available as of Camel version 3.0*
This component provides dataformat for processing inputs with grok patterns.
Grok patterns are used to process unstructured data into structured objects - `List<Map<String, Object>>`.
This component is based on https://github.com/thekrakken/java-grok[Java Grok library]
Maven users will need to add the following dependency to their `pom.xml`
for this component:
[source,xml]
------------------------------------------------------------
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-grok</artifactId>
<version>x.x.x</version>
<!-- use the same version as your Camel core version -->
</dependency>
------------------------------------------------------------
== Basic usage
Extract all IP adresses from input
[source,java]
--------------------------------------------------------------------------------
from("direct:in")
.unmarshal().grok("%{IP:ip}")
.to("log:out");
--------------------------------------------------------------------------------
Parse Apache logs and process only 4xx responses
[source,java]
--------------------------------------------------------------------------------
from("file://apacheLogs")
.unmarshal().grok("%{COMBINEDAPACHELOG")
.split(body()).filter(simple("${body[response]} starts with '4'"))
.to("log:4xx")
--------------------------------------------------------------------------------
== Predefined patterns
This component comes with predefined patterns, which are based on Logstash patterns.
Full list can be found at https://github.com/thekrakken/java-grok/tree/master/src/main/resources/patterns[Java Grok repository]
== Custom patterns
Camel Grok DataFormat supports plugable patterns, which are auto loaded from Camel Registry.
You can register patterns with Java DSL and Spring DSL
Spring DSL:
[source,xml]
--------------------------------------------------------------------------------
<beans>
<bean id="myCustomPatternBean" class="org.apache.camel.component.grok.GrokPattern">
<constructor-arg value="FOOBAR"/>
<constructor-arg value="foo|bar"/>
</bean>
<beans>
<camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">
<route>
<from uri="direct:in"/>
<unmarshal>
<grok pattern="%{FOOBAR:fooBar}"/>
</unmarshal>
<to uri="log:out"/>
</route>
</camelContext>
--------------------------------------------------------------------------------
Java DSL:
[source,java]
--------------------------------------------------------------------------------
public class MyRouteBuilder extends RouteBuilder {
@Override
public void configure() throws Exception {
bindToRegistry("myCustomPatternBean", new GrokPattern("FOOBAR", "foo|bar"));
from("direct:in")
.unmarshal().grok("%{FOOBAR:fooBar}")
.to("log:out");
}
}
--------------------------------------------------------------------------------
== Grok Dataformat Options
// dataformat options: START
The Grok dataformat supports 5 options, which are listed below.
[width="100%",cols="2s,1m,1m,6",options="header"]
|===
| Name | Default | Java Type | Description
| pattern | | String | The grok pattern to match lines of input
| flattened | false | Boolean | Turns on flattened mode. In flattened mode the exception is thrown when there are multiple pattern matches with same key.
| allowMultipleMatchesPerLine | true | Boolean | If false, every line of input is matched for pattern only once. Otherwise the line can be scanned multiple times when non-terminal pattern is used.
| namedOnly | false | Boolean | Whether to capture named expressions only or not (i.e. %{IP:ip} but not \$\{IP\})
| contentTypeHeader | false | Boolean | Whether the data format should set the Content-Type header with the type from the data format if the data format is capable of doing so. For example application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSon etc.
|===
// dataformat options: END
// spring-boot-auto-configure options: START
== Spring Boot Auto-Configuration
When using Spring Boot make sure to use the following Maven dependency to have support for auto configuration:
[source,xml]
----
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-grok-starter</artifactId>
<version>x.x.x</version>
<!-- use the same version as your Camel core version -->
</dependency>
----
The component supports 6 options, which are listed below.
[width="100%",cols="2,5,^1,2",options="header"]
|===
| Name | Description | Default | Type
| *camel.dataformat.grok.allow-multiple-matches-per-line* | If false, every line of input is matched for pattern only once. Otherwise the line can be scanned multiple times when non-terminal pattern is used. | true | Boolean
| *camel.dataformat.grok.content-type-header* | Whether the data format should set the Content-Type header with the type from the data format if the data format is capable of doing so. For example application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSon etc. | false | Boolean
| *camel.dataformat.grok.enabled* | Whether to enable auto configuration of the grok data format. This is enabled by default. | | Boolean
| *camel.dataformat.grok.flattened* | Turns on flattened mode. In flattened mode the exception is thrown when there are multiple pattern matches with same key. | false | Boolean
| *camel.dataformat.grok.named-only* | Whether to capture named expressions only or not (i.e. %{IP:ip} but not \$\{IP\}) | false | Boolean
| *camel.dataformat.grok.pattern* | The grok pattern to match lines of input | | String
|===
// spring-boot-auto-configure options: END
ND