| [[any23-dataformat]] |
| = Any23 DataFormat |
| :page-source: components/camel-any23/src/main/docs/any23-dataformat.adoc |
| Camel Any23 is a DataFormat that uses the Apache Anything To Triples (Any23) library to extract structured data in RDF from a variety of documents on the web. |
| *Available as of Camel version 3.0* |
| |
| |
| *Available as of Camel version 3.0* |
| |
| The main functionality of this DataFormat focuses on its Unmarshal method which extracts RDF triplets from compatible pages, in a wide variety of RDF syntaxes. |
| Any23 is a Data Format that is intended to convert HTML from a site (or file) into rdf. |
| |
| |
| == Any23 Options |
| |
| // dataformat options: START |
| The Any23 dataformat supports 4 options, which are listed below. |
| |
| |
| |
| [width="100%",cols="2s,1m,1m,6",options="header"] |
| |=== |
| | Name | Default | Java Type | Description |
| | outputFormat | RDF4JMODEL | Any23Type | What RDF syntax to unmarshal as, can be: NTRIPLES, TURTLE, NQUADS, RDFXML, JSONLD, RDFJSON, RDF4JMODEL. It is by default: RDF4JMODEL. |
| | extractors | | List | List of Any23 extractors to be used in the unmarshal operation. A list of the available extractors can be found here here. If not provided, all the available extractors are used. |
| | baseURI | | String | The URI to use as base for building RDF entities if only relative paths are provided. |
| | contentTypeHeader | false | Boolean | Whether the data format should set the Content-Type header with the type from the data format if the data format is capable of doing so. For example application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSon etc. |
| |=== |
| // dataformat options: END |
| // spring-boot-auto-configure options: START |
| == Spring Boot Auto-Configuration |
| |
| When using Spring Boot make sure to use the following Maven dependency to have support for auto configuration: |
| |
| [source,xml] |
| ---- |
| <dependency> |
| <groupId>org.apache.camel</groupId> |
| <artifactId>camel-any23-starter</artifactId> |
| <version>x.x.x</version> |
| <!-- use the same version as your Camel core version --> |
| </dependency> |
| ---- |
| |
| |
| The component supports 4 options, which are listed below. |
| |
| |
| |
| [width="100%",cols="2,5,^1,2",options="header"] |
| |=== |
| | Name | Description | Default | Type |
| | *camel.dataformat.any23.base-u-r-i* | The URI to use as base for building RDF entities if only relative paths are provided. | | String |
| | *camel.dataformat.any23.content-type-header* | Whether the data format should set the Content-Type header with the type from the data format if the data format is capable of doing so. For example application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSon etc. | false | Boolean |
| | *camel.dataformat.any23.enabled* | Whether to enable auto configuration of the any23 data format. This is enabled by default. | | Boolean |
| | *camel.dataformat.any23.extractors* | List of Any23 extractors to be used in the unmarshal operation. A list of the available extractors can be found here here. If not provided, all the available extractors are used. | | List |
| |=== |
| // spring-boot-auto-configure options: END |
| |
| |
| |
| |
| == Java DSL Example |
| |
| An example where the consumer provides some HTML |
| |
| [source,java] |
| --------------------------------------------------------------------------- |
| from("direct:start").unmarshal().any23("http://mock.foo/bar").to("mock:result"); |
| --------------------------------------------------------------------------- |
| |
| == Spring XML Example |
| |
| The following example shows how to use TidyMarkup |
| to unmarshal using Spring |
| |
| [source,java] |
| ----------------------------------------------------------------------- |
| <camelContext id="camel" xmlns="http://camel.apache.org/schema/spring"> |
| <dataFormats> |
| <any23 id="any23" baseURI ="http://mock.foo/bar" outputFormat="TURTLE" > |
| <configurations> |
| <entry> |
| <key>any23.extraction.metadata.nesting</key> |
| <value>off</value> |
| </entry> |
| </configurations> |
| <extractors>html-head-title</extractors> |
| </any23> |
| </dataFormats> |
| <route> |
| <from uri="direct:start"/> |
| <to uri="http://microformats.org/2009/08"/> |
| <unmarshal> |
| <custom ref="any23"/> |
| </unmarshal> |
| <to uri="mock:result"/> |
| </route> |
| </camelContext> |
| ----------------------------------------------------------------------- |
| |
| == Dependencies |
| |
| To use Any23 in your camel routes you need to add the a dependency |
| on *camel-any23* which implements this data format. |
| |
| If you use maven you could just add the following to your pom.xml, |
| substituting the version number for the latest & greatest release (see |
| the download page for the latest versions). |
| |
| [source,java] |
| ---------------------------------------- |
| <dependency> |
| <groupId>org.apache.camel</groupId> |
| <artifactId>camel-any23</artifactId> |
| <version>x.x.x</version> |
| </dependency> |
| ---------------------------------------- |