------
Apache Any23 - XPath Extractor
------
The Apache Software Foundation
------
~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements. See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.
XPath Extractor
The XPath extractor is a specialized extractor that scrapes
data from pages that contain no RDF information.
It is driven by a set of configurable extraction rules, each
activated by a regular expression matched against the page URL.
When an extraction rule is activated, all the variables it defines are
evaluated, and an N-Quads template is then expanded to generate statements.
See {{{./apidocs/org/apache/any23/extractor/xpath/package-summary.html}Javadoc}}.
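
The rule mechanism described above can be illustrated with a minimal, standalone sketch. It is not the Any23 API: the class <<<XPathRuleSketch>>>, the <<<Rule>>> type, the <<<$\{name\}>>> placeholder syntax and the built-in <<<url>>> placeholder are all assumptions made for illustration, and only standard JDK classes (<<<java.util.regex>>>, <<<javax.xml.xpath>>>) are used. The input page is assumed to be well-formed XML.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative sketch only; none of these names come from Any23.
public class XPathRuleSketch {

    // Hypothetical rule: URL pattern, named XPath variables, N-Quads template.
    static final class Rule {
        final Pattern urlPattern;
        final Map<String, String> variables; // variable name -> XPath expression
        final String template;               // one N-Quads line with ${name} placeholders

        Rule(String urlRegex, Map<String, String> variables, String template) {
            this.urlPattern = Pattern.compile(urlRegex);
            this.variables = variables;
            this.template = template;
        }
    }

    // Returns the expanded template, or null when the rule is not activated.
    static String apply(Rule rule, String pageUrl, String wellFormedPage) throws Exception {
        // Step 1: the rule is activated only if the regex matches the whole page URL.
        if (!rule.urlPattern.matcher(pageUrl).matches()) {
            return null;
        }
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormedPage)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Assumed convention: ${url} stands for the page URL.
        String quad = rule.template.replace("${url}", pageUrl);

        // Step 2: evaluate every variable; step 3: substitute it into the template.
        for (Map.Entry<String, String> variable : rule.variables.entrySet()) {
            String value = xpath.evaluate(variable.getValue(), doc);
            // Escape backslashes and quotes so the value is safe inside a literal.
            String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
            quad = quad.replace("${" + variable.getKey() + "}", escaped);
        }
        return quad;
    }

    public static void main(String[] args) throws Exception {
        Rule rule = new Rule(
                "http://example\\.org/page/.*",
                Collections.singletonMap("title", "/html/head/title"),
                "<${url}> <http://purl.org/dc/terms/title> \"${title}\" <${url}> .");
        String page = "<html><head><title>Hello</title></head><body/></html>";
        System.out.println(apply(rule, "http://example.org/page/1", page));
    }
}
```

In this sketch a rule that does not match the URL yields no output at all, which mirrors the activation step; the actual rule configuration format and template syntax are described in the Javadoc linked above.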