------
Apache Any23 - REST Service
------
The Apache Software Foundation
------
2011-2012
~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements. See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.
Apache Any23 REST Service
<Apache Any23> provides a REST service module, <any23-service>, that exposes its extraction and conversion functionality over HTTP.
* Compact API
HTTP GET requests can be made to IRIs of the shape:
+---------------------------------------------------
http://<any23-service-host>/<output-format>/<input-uri>
+---------------------------------------------------
Where <input-uri> is the input HTTP resource to be processed and <output-format> is the desired output format for the extracted RDF data.
Example requests:
+---------------------------------------------------
http://any23.org/best/twitter.com/cygri
http://any23.org/rdfxml/http://data.gov
http://any23.org/ttl/http://www.w3.org/People/Berners-Lee/card
http://any23.org/?uri=http://dbpedia.org/resource/Berlin
http://any23.org/?format=nt&uri=http://dbpedia.org/resource/Berlin
+---------------------------------------------------
Supported input and output formats are described {{{./supported-formats.html} here}}.
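As a minimal sketch, a Compact API request IRI can be assembled by plain string concatenation. The helper name <compact_url> and its <host> default are illustrative assumptions, not part of the service. The input URI is passed through unencoded, as in the example requests above.

```python
def compact_url(output_format: str, input_uri: str, host: str = "any23.org") -> str:
    """Build a Compact API IRI of the shape http://<host>/<output-format>/<input-uri>.

    Hypothetical helper: the input URI is appended as-is, mirroring the
    example requests in this document.
    """
    return f"http://{host}/{output_format}/{input_uri}"


if __name__ == "__main__":
    print(compact_url("ttl", "http://www.w3.org/People/Berners-Lee/card"))
```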
* Form-style GET API
HTTP GET requests can be made to the IRI http://any23.org/ with the following query parameters:
+---------------------------------------------------
uri      IRI of an input document
format   Desired output format; defaults to best
+---------------------------------------------------
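The two query parameters can be percent-encoded with the Python standard library. This is only a sketch: the function name <form_get_url> and the <base> default are assumptions made for illustration.

```python
from urllib.parse import urlencode


def form_get_url(input_uri: str, output_format: str = "best",
                 base: str = "http://any23.org/") -> str:
    """Build a form-style GET IRI with the 'format' and 'uri' query parameters.

    Hypothetical helper; urlencode percent-encodes the input URI so that it
    can be carried safely inside the query string.
    """
    query = urlencode({"format": output_format, "uri": input_uri})
    return f"{base}?{query}"


if __name__ == "__main__":
    print(form_get_url("http://dbpedia.org/resource/Berlin", "nt"))
```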
* Direct POST API
HTTP POSTing a document body to http://any23.org/<format> converts the document to the specified output format. The media type of the input must be specified in the Content-Type HTTP header. Depending on the servlet container, a Content-Length header giving the length of the input document in bytes might also be required. Typical media types for supported input formats are:
+---------------------------------------------------
Input format   Media type
-------------------------------
HTML           text/html
RDF/XML        application/rdf+xml
Turtle         text/turtle
N-Triples      text/plain
N-Quads        text/plain
+---------------------------------------------------
Example POST request:
+---------------------------------------------------
POST /rdfxml HTTP/1.0
Host: any23.org
Content-Type: text/turtle
Content-Length: 174

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
[] a foaf:Person;
foaf:name "John X. Foobar";
foaf:mbox_sha1sum "cef817456278b70cee8e5a1611539ef9d928810e";
.
+---------------------------------------------------
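A request of this kind could be prepared with Python's <urllib.request> as sketched below. The helper <build_direct_post> and the shortened Turtle document are assumptions for illustration. The request is only constructed here; sending it is left to the caller.

```python
import urllib.request

# Shortened, illustrative Turtle document (not the exact body shown above).
TURTLE_DOC = b"""@prefix foaf: <http://xmlns.com/foaf/0.1/> .
[] a foaf:Person;
  foaf:name "John X. Foobar" .
"""


def build_direct_post(body: bytes, media_type: str, output_format: str,
                      base: str = "http://any23.org") -> urllib.request.Request:
    """Prepare a Direct POST request; hypothetical helper, nothing is sent."""
    return urllib.request.Request(
        url=f"{base}/{output_format}",
        data=body,
        method="POST",
        headers={
            # Media type of the input document.
            "Content-Type": media_type,
            # Length in bytes; some servlet containers may require it.
            "Content-Length": str(len(body)),
        },
    )


request = build_direct_post(TURTLE_DOC, "text/turtle", "rdfxml")
# To actually send it: urllib.request.urlopen(request)
```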
* Form-style POST API
A document body can also be converted by HTTP POSTing form data to http://any23.org/.
The Content-Type HTTP header must be set to <application/x-www-form-urlencoded>.
The following parameters are supported:
*----------+------------------------------------------------------------------------------------------------------------+
|type |Media type of the input, see the table above. If not present, auto-detection will be attempted. |
*----------+------------------------------------------------------------------------------------------------------------+
|body |Document body to be converted. |
*----------+------------------------------------------------------------------------------------------------------------+
|format |Desired output format; defaults to <<best>>. |
*----------+------------------------------------------------------------------------------------------------------------+
|validation|The validation level to be applied, supported values: <<none>> (default), <<validate>> and <<validate-fix>>.|
*----------+------------------------------------------------------------------------------------------------------------+
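The parameters in the table can be packed into a form-encoded request body as in the following sketch. The helper name <build_form_post> and its defaults are assumptions; the <type> field is omitted when no media type is given, which leaves it to the auto-detection described above.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def build_form_post(body: str, output_format: str = "best",
                    media_type: Optional[str] = None, validation: str = "none",
                    base: str = "http://any23.org/") -> urllib.request.Request:
    """Prepare a form-style POST request; hypothetical helper, nothing is sent."""
    fields = {"body": body, "format": output_format, "validation": validation}
    if media_type is not None:
        # Omitting 'type' leaves the input format to auto-detection.
        fields["type"] = media_type
    return urllib.request.Request(
        url=base,
        data=urlencode(fields).encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


form_request = build_form_post("<html></html>", output_format="ttl",
                               media_type="text/html")
```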
* Output Formats
Supported input and output formats are described {{{./supported-formats.html} here}}.
* Error reporting
Processing errors are indicated via HTTP status codes and brief text/plain error messages. The following status codes can be returned:
+---------------------------------------------------
Code                        Reason
-------------------------------------------------------------------------------------------------------------
200 OK                      Success.
400 Bad Request             Missing or malformed input parameter.
404 Not Found               Malformed request IRI.
406 Not Acceptable          None of the media types specified in the Accept header are supported.
415 Unsupported Media Type  Document body with unsupported media type was POSTed.
501 Not Implemented         Extraction from input was successful, but yielded zero triples.
502 Bad Gateway             Input document from a remote server could not be fetched or parsed.
+---------------------------------------------------
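A client might keep the table above as a lookup so that failures can be logged with a readable reason. This is a sketch only: the names <ANY23_STATUS> and <describe_status> are illustrative, and the fallback text for unlisted codes is an assumption.

```python
# Status codes and reasons copied from the table above.
ANY23_STATUS = {
    200: "Success.",
    400: "Missing or malformed input parameter.",
    404: "Malformed request IRI.",
    406: "None of the media types specified in the Accept header are supported.",
    415: "Document body with unsupported media type was POSTed.",
    501: "Extraction from input was successful, but yielded zero triples.",
    502: "Input document from a remote server could not be fetched or parsed.",
}


def describe_status(code: int) -> str:
    """Return the documented reason for a status code (hypothetical helper)."""
    return ANY23_STATUS.get(code, f"Undocumented status code {code}.")


if __name__ == "__main__":
    print(describe_status(501))
```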
* XML Report Format
{report-format}
The <Apache Any23 Service> can optionally return an XML report and attempt to fix errors
when the flags <fix> and <report> are activated (<fix=on&report=on>).
The following URL shows an example request with the <report> flag enabled.
+---------------------------------------------------
http://any23.org/any23-service/any23/?format=best&uri=http%3A%2F%2Fpath%2Fto%2Fresource&validation=none&report=on
+---------------------------------------------------
The <fix> functionality is described {{{./dev-validation-fix.html}here}}.
An example of the report format is listed below.
The path <response/extractors/extractor> lists all extractors
activated during page processing.
The <response/report/message> section contains the error message, if any, and the
<response/report/error> section contains the error stack trace, if available.
The result of validation is contained within the <response/report/validationReport> node.
Within that node there is the list of the activated rules, the issues detected and
the errors generated.
+-----------------------------------------------------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8" ?>
<response>
    <!-- List of activated extractors. -->
    <extractors>
        <extractor><!-- Extractor name. --></extractor>
        <!-- ... -->
    </extractors>
    <report>
        <message></message>
        <error></error>
        <!-- Validation specific report, contains all errors and issues detected within the document. -->
        <validationReport>
            <!-- List of errors found while validating the document. -->
            <errors>
            </errors>
            <!-- List of issues found while validating the document. -->
            <issues>
            </issues>
            <!-- List of rules activated to solve the detected issues. -->
            <ruleActivations>
            </ruleActivations>
        </validationReport>
    </report>
    <data>
        <![CDATA[
            -- Actual Data in the format specified as output. --
        ]]>
    </data>
</response>
+-----------------------------------------------------------------------------------------------------------------
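A report with this layout can be read with Python's <xml.etree.ElementTree>, as in the sketch below. The function <parse_report>, the extractor name and the triple in the sample are made up for illustration. The children of <errors>, <issues> and <ruleActivations> are not described here, so the sketch does not interpret them.

```python
import xml.etree.ElementTree as ET

# Hand-written sample following the layout above; the extractor name and
# the N-Triples line are invented for illustration.
SAMPLE_REPORT = b"""<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <extractors>
    <extractor>example-extractor</extractor>
  </extractors>
  <report>
    <message></message>
    <error></error>
    <validationReport>
      <errors></errors>
      <issues></issues>
      <ruleActivations></ruleActivations>
    </validationReport>
  </report>
  <data><![CDATA[
<http://example.org/s> <http://example.org/p> "o" .
]]></data>
</response>
"""


def parse_report(xml_bytes: bytes) -> dict:
    """Extract the extractor list, error fields and data payload from a report."""
    root = ET.fromstring(xml_bytes)
    return {
        "extractors": [e.text or "" for e in root.findall("extractors/extractor")],
        "message": root.findtext("report/message", default="") or "",
        "error": root.findtext("report/error", default="") or "",
        # The CDATA payload is returned with surrounding whitespace removed.
        "data": (root.findtext("data", default="") or "").strip(),
    }


report = parse_report(SAMPLE_REPORT)
```

An empty <message> and <error> in the parsed result would suggest that processing completed without a reported error.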