| ------ |
| Apache Any23 - Extractors |
| ------ |
| The Apache Software Foundation |
| ------ |
| 2011-2012 |
| |
| ~~ Licensed to the Apache Software Foundation (ASF) under one or more |
| ~~ contributor license agreements. See the NOTICE file distributed with |
| ~~ this work for additional information regarding copyright ownership. |
| ~~ The ASF licenses this file to You under the Apache License, Version 2.0 |
| ~~ (the "License"); you may not use this file except in compliance with |
| ~~ the License. You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, software |
| ~~ distributed under the License is distributed on an "AS IS" BASIS, |
| ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| ~~ See the License for the specific language governing permissions and |
| ~~ limitations under the License. |
| |
| Apache Any23 Extractors |
| |
| This page lists all the Apache Any23 extractors (see the source code {{{./apidocs/org/apache/any23/extractor/package-summary.html}package}}). |
| |
| * Microformat Extractors |
| |
| The following extractors refer to the {{{http://microformats.org/}Microformats specifications}}. |
| |
| Specific details about *Microformats* extractors can be found {{{./dev-microformat-extractors.html}here}}. |
| In particular, the *Microformats Nesting* representation policy is described {{{./dev-microformat-extractors.html#microformat-nesting}here}}. |
| |
| {{{./apidocs/org/apache/any23/extractor/html/AdrExtractor.html}AdrExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/GeoExtractor.html}GeoExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HCalendarExtractor.html}HCalendar}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HCardExtractor.html}HCard}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HListingExtractor.html}HListing}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HResumeExtractor.html}HResume}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HReviewExtractor.html}HReview}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/SpeciesExtractor.html}SpeciesExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/LicenseExtractor.html}LicenseExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/XFNExtractor.html}XFNExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HRecipeExtractor.html}HRecipeExtractor}} |
| |
| * RDFa [1.0, 1.1] |
| |
| The following extractor refers to the {{{http://www.w3.org/TR/rdfa-syntax/}RDFa 1.0}} |
| and {{{http://www.w3.org/TR/rdfa-core/}RDFa 1.1}} specifications. |
| |
| {{{./apidocs/org/apache/any23/extractor/rdfa/RDFaExtractor.html}RDFaExtractor}} |
| |
| * Microdata |
| |
| The following extractor refers to the {{{http://dev.w3.org/html5/md/}Microdata specification}}. |
| |
| {{{./apidocs/org/apache/any23/extractor/microdata/MicrodataExtractor.html}MicrodataExtractor}} |
| |
| * RDF |
| |
| {{{./apidocs/org/apache/any23/extractor/rdf/RDFXMLExtractor.html}RDFXMLExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/rdf/NQuadsExtractor.html}NQuadsExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/rdf/TurtleExtractor.html}TurtleExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/rdf/NTriplesExtractor.html}NTriplesExtractor}} |
| |
| * Metadata Extractors |
| |
| {{{./apidocs/org/apache/any23/extractor/html/TitleExtractor.html}TitleExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HTMLMetaExtractor.html}HTMLMetaExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/HeadLinkExtractor.html}HeadLinkExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/ICBMExtractor.html}ICBMExtractor}} |
| |
| {{{./apidocs/org/apache/any23/extractor/html/TurtleHTMLExtractor.html}TurtleHTMLExtractor}} |
| |
| * Content Extractors |
| |
| {{{./apidocs/org/apache/any23/extractor/xpath/XPathExtractor.html}XPath Extractor}} (<<Experimental>>) |
| |
| {{{./apidocs/org/apache/any23/extractor/csv/CSVExtractor.html}CSV Extractor}} (See the extraction {{{./dev-csv-extractor.html}algorithm}}.) |
| |
| Get more documentation |
| |
| To generate the list of all available extractors, invoke the following command: |
| |
| +------------------------------------------------------------ |
| <any23-core>/bin$ any23tools ExtractorDocumentation -list |
| +------------------------------------------------------------ |