------
Apache Any23 - Extractors
------
The Apache Software Foundation
------
2011-2012
~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements. See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.
Apache Any23 Extractors
This page lists all the Apache Any23 extractors (see the source code {{{./apidocs/org/apache/any23/extractor/package-summary.html}package}}).
* Microformat Extractors
The following extractors refer to the {{{http://microformats.org/}Microformats specifications}}.
Specific details about *Microformats* extractors can be found {{{./dev-microformat-extractors.html}here}}.
In particular, the *Microformats Nesting* representation policy is described {{{./dev-microformat-extractors.html#microformat-nesting}here}}.
{{{./apidocs/org/apache/any23/extractor/html/AdrExtractor.html}AdrExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/GeoExtractor.html}GeoExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/HCalendarExtractor.html}HCalendar}}
{{{./apidocs/org/apache/any23/extractor/html/HCardExtractor.html}HCard}}
{{{./apidocs/org/apache/any23/extractor/html/HListingExtractor.html}HListing}}
{{{./apidocs/org/apache/any23/extractor/html/HResumeExtractor.html}HResume}}
{{{./apidocs/org/apache/any23/extractor/html/HReviewExtractor.html}HReview}}
{{{./apidocs/org/apache/any23/extractor/html/SpeciesExtractor.html}SpeciesExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/LicenseExtractor.html}LicenseExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/XFNExtractor.html}XFNExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/HRecipeExtractor.html}HRecipeExtractor}}
* RDFa [1.0, 1.1]
The following extractor refers to the {{{http://www.w3.org/TR/rdfa-syntax/}RDFa 1.0}}
and {{{http://www.w3.org/TR/rdfa-core/}RDFa 1.1}} specifications.
{{{./apidocs/org/apache/any23/extractor/rdfa/RDFaExtractor.html}RDFaExtractor}}
* Microdata
The following extractor refers to the {{{http://dev.w3.org/html5/md/}Microdata specification}}.
{{{./apidocs/org/apache/any23/extractor/microdata/MicrodataExtractor.html}MicrodataExtractor}}
* RDF
{{{./apidocs/org/apache/any23/extractor/rdf/RDFXMLExtractor.html}RDFXMLExtractor}}
{{{./apidocs/org/apache/any23/extractor/rdf/NQuadsExtractor.html}NQuadsExtractor}}
{{{./apidocs/org/apache/any23/extractor/rdf/TurtleExtractor.html}TurtleExtractor}}
{{{./apidocs/org/apache/any23/extractor/rdf/NTriplesExtractor.html}NTriplesExtractor}}
* Metadata Extractors
{{{./apidocs/org/apache/any23/extractor/html/TitleExtractor.html}TitleExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/HTMLMetaExtractor.html}HTMLMetaExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/HeadLinkExtractor.html}HeadLinkExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/ICBMExtractor.html}ICBMExtractor}}
{{{./apidocs/org/apache/any23/extractor/html/TurtleHTMLExtractor.html}TurtleHTMLExtractor}}
* Content Extractors
{{{./apidocs/org/apache/any23/extractor/xpath/XPathExtractor.html}XPath Extractor}} (<<Experimental>>)
{{{./apidocs/org/apache/any23/extractor/csv/CSVExtractor.html}CSV Extractor}} (See the extraction {{{./dev-csv-extractor.html}algorithm}}.)
Get more documentation
The list of all available extractors can be generated by invoking the following command:
+------------------------------------------------------------
<any23-core>/bin$ any23tools ExtractorDocumentation -list
+------------------------------------------------------------