blob: 752ea426260a911796dbb547db047126d5875019 [file] [log] [blame]
// Do not edit directly!
// This file was generated by camel-quarkus-maven-plugin:update-extension-doc-page
= Tika
:page-aliases: extensions/tika.adoc
:linkattrs:
:cq-artifact-id: camel-quarkus-tika
:cq-native-supported: true
:cq-status: Stable
:cq-status-deprecation: Stable
:cq-description: Parse documents and extract metadata and text using Apache Tika.
:cq-deprecated: false
:cq-jvm-since: 1.0.0
:cq-native-since: 1.0.0
[.badges]
[.badge-key]##JVM since##[.badge-supported]##1.0.0## [.badge-key]##Native since##[.badge-supported]##1.0.0##
Parse documents and extract metadata and text using Apache Tika.
== What's inside
* xref:{cq-camel-components}::tika-component.adoc[Tika component], URI syntax: `tika:operation`
Please refer to the above link for usage and configuration details.
== Maven coordinates
https://code.quarkus.io/?extension-search=camel-quarkus-tika[Create a new project with this extension on code.quarkus.io, window="_blank"]
Or add the coordinates to your existing project:
[source,xml]
----
<dependency>
<groupId>org.apache.camel.quarkus</groupId>
<artifactId>camel-quarkus-tika</artifactId>
</dependency>
----
Check the xref:user-guide/index.adoc[User guide] for more information about writing Camel Quarkus applications.
== Camel Quarkus limitations
Parameters `tikaConfig` and `tikaConfigUri` are not available in quarkus camel tika extension. Configuration
can be changed only via `application.properties`.
While you can use any of the available https://tika.apache.org/1.24.1/formats.html[Tika parsers] in JVM mode,
only some of those are supported in native mode - see the https://quarkiverse.github.io/quarkiverse-docs/quarkus-tika/dev/index.html[Quarkus Tika guide].
Use of the Tika parser without any configuration will initialize all available parsers. Unfortunately as some of them
don't work in the native mode, the whole execution will fail.
In order to make the Tika parser work in the native mode, selection of parsers for initialization should be used.
* `quarkus.tika.parsers` Comma separated list of parsers (abbreviations). There are two predefined parsers:
`pdf` and `odf`.
* `quarkus.tika.parser.*` Adds new parser abbreviation to be used with previous property. Value is the full class of
the parser.
Example of `application.properties`:
[source,properties]
----
quarkus.tika.parsers = pdf,odf,office
quarkus.tika.parser.office = org.apache.tika.parser.microsoft.OfficeParser
----
For more information about selecting parsers see the https://quarkiverse.github.io/quarkiverse-docs/quarkus-tika/dev/index.html[Quarkus Tika guide].
You may need to add the `quarkus-awt` extension to build the native image. For more information, see https://quarkiverse.github.io/quarkiverse-docs/quarkus-tika/dev/index.html[Quarkus Tika guide].