                           Apache Any23 0.7.0
                             Release Notes
                         01/05/2013 (dd/mm/yyyy)
                         
Sub-task

    [ANY23-109] - Missing tika-config.xml in o.a.a.mime
    [ANY23-110] - DOAP Vocabulary

Bug

    [ANY23-44] - error when parsing a document from http://www.afdsi.org/docs/test/html/RDFa/_food-stream_.htm
    [ANY23-78] - Download page links are broken
    [ANY23-108] - Broken schema.org microdata extraction
    [ANY23-112] - Fix incubation disclaimer
    [ANY23-113] - Remove dependencies from parent pom.xml file
    [ANY23-116] - Empty values are skipped when reading tab separated CSV.
    [ANY23-156] - Add logging dependencies to plugins and service

Improvement

    [ANY23-2] - Add support for hreview-aggregate microformat.
    [ANY23-26] - Upgrade dependency to Apache Tika 1.2
    [ANY23-46] - Update Any23 web service
    [ANY23-83] - Remove hardcoded formats throughout Any23 to make it useful as a library
    [ANY23-101] - Use RDFFormat.NQUADS in nquads module
    [ANY23-139] - Simplify site deploy by plugging in the maven-scm-publish-plugin
    [ANY23-144] - Implement comprehensive naming of o.a.a.api.vocab classes

New Feature

    [ANY23-4] - Integrate W3C's RDFa test suite and pass all tests
    [ANY23-85] - Split NQuads out into its own module
    [ANY23-96] - Add user agent string to basic-crawler
    [ANY23-117] - Split Mime type detection out into its own module
    [ANY23-118] - Split Encoding detection out into its own module

Task

    [ANY23-41] - Write basic-crawler plugin documentation
    [ANY23-125] - Drop the Incubating DISCLAIMER

==========================================================================

                             Apache Any23 0.7.0-incubating
                              Release Notes
                              25/06/2012

Sub-task

    [ANY23-25] - Update all Maven POMs in trunk
    [ANY23-31] - Move any23 site documentation out of trunk and into its own SVN directory
    [ANY23-53] - Bad Web Service documentation

Bug

    [ANY23-14] - Add support for Extractor sub results
    [ANY23-20] - The Any23 PluginManager fails handling resource paths containing spaces.
    [ANY23-34] - Plugin Integration Test Fails
    [ANY23-37] - LGPL'ed components cannot be included in distribution packages
    [ANY23-42] - RDFa11Parser.java is not resolving relative URIs correctly
    [ANY23-49] - N3/NQ parsers ignoring stopAtFirstError flag
    [ANY23-58] - HCardExtractor infinite loop and memory exhaustion
    [ANY23-62] - ExtractionResultImpl loses all issues generated by sub extractions
    [ANY23-73] - The ToolRunner CLI driver -p (--plugins-dir) option doesn't work because it is parsed after the Tool list loading
    [ANY23-77] - Facing an infinite loop problem in version 0.6.1 - Verify
    [ANY23-78] - Download page links are broken
    [ANY23-87] - Bogus argument in o.a.a.cli.CrawlerTest
    [ANY23-88] - any23 script -v or --version option doesn't display actual version
    [ANY23-94] - The Microdata CLI tool doesn't work anymore
    [ANY23-95] - Activate the IgnoreAccidentalRDFa filter for the Any23 Service instance
    [ANY23-97] - The test suite was not running all tests, minor regressions occurred

Improvement

    [ANY23-18] - Add a new extractor for RDFa using java-rdfa
    [ANY23-28] - Document munging of Any23 history to CHANGES.txt
    [ANY23-32] - Replace hardcoded bash script with one generated via appassembler
    [ANY23-33] - Replace proprietary SUN imports from Any23 classes.
    [ANY23-45] - Improve issue verification support in Extractor tests
    [ANY23-50] - Simplify plugin loading avoiding the classpath scanning
    [ANY23-56] - Change repo-ext to Any23 SVN mirror repo.
    [ANY23-63] - The Any23 web service doesn't return the Issue Report generated by activated Extractors, hiding major metadata issues
    [ANY23-64] - Improve CLI usage aesthetics
    [ANY23-70] - Establish searchable list archives
    [ANY23-71] - Improve the current CLI engine
    [ANY23-74] - Disable domain triple generation in default configuration
    [ANY23-75] - Improve runtime of the Microdata extractor on documents with many relations.
    [ANY23-76] - Improve runtime of the Microformat extractor on documents with many relations.
    [ANY23-82] - Don't use explicit reference to Log4j classes
    [ANY23-86] - Better logging in SiteCrawlerTest

New Feature

    [ANY23-9] - Prepare a dedicated homepage for Any23
    [ANY23-29] - Migrate code base to ASF infrastructure
    [ANY23-57] - Create Any23 History documentation and add to site
    [ANY23-59] - Create KEYS file for Any23
    [ANY23-68] - Create Powered By documentation/page
    [ANY23-102] - Any23 DOAP file

Task

    [ANY23-21] - Migrate all packages and classes to org.apache.any23
    [ANY23-27] - Import revisions r1547 to r1607 from Google Code SVN to ASF SVN
    [ANY23-36] - Merge GCode specific CHANGES.txt report in main changes.xml
    [ANY23-39] - Write down overall architecture document to help new developers maintain the Any23 core
    [ANY23-48] - Update Documentation (Site + READMEs) to reflect changes in shell script usage
    [ANY23-52] - Remove non ASF logos from Any23 Service page
    [ANY23-66] - Fix Javadoc

==========================================================================

                             Apache Any23 0.6.1
                              Release Notes

Fixes

 * Improved MIMEType detection for CSV input. [172, 176]

==========================================================================

                             Apache Any23 0.6.0
                              Release Notes

Fixes

 * Fixed several bugs. [151, 153, 154, 155, 156, 164, 168]
 * Removed unused Apache Any23 dependencies. [162]
 * Introduced parent POM dependencyManagement. [163]
 * Minor code refactoring. [142]
 * Updated project documentation. [161]

Enhancements

 * Added support for Microdata. [114, 141, 144, 145, 152, 157]
 * Added RDFa 1.1 support for new prefix specification. [143]
 * Added CSV Extractor (RDFizer). [150, 165]
 * Added HTML/META Extractor. [148, 149]
 * Improved Configuration programmatic management. [147]
 * Added several flags to control metadata triples generation. [146]
 * Made nesting relationships more explicit in Microformat extractors. [80]
 * Major Extractor interface refactoring. [160, 167]
 * Improved TagSoup Extractor based error reporting. [159]
 * Added command-line tool to print out the Apache Any23 declared vocabularies. [114]

==========================================================================

                              Apache Any23 0.6.0-M2
                                Release Notes

Release 0.6.0-M2 introduces major fixes to the M1 milestone
[154, 155, 156] and improves Configuration [147] and Microdata
error management. [157]

==========================================================================

                             Apache Any23 0.6.0-M1
                               Release Notes

Release 0.6.0-M1 is an early preview of
Microdata support. [114]

==========================================================================

                             Apache Any23 0.5.0
                              Release Notes

Fixes

 * Fixed wrong conversion of a generic XML file to RDF. [131]
 * Fixed usage of 'base' tag when resolving relative URIs
   in RDFa. [75]
 * Fixed error parsing Turtle data. [87]
 * Fixed issue with escaping in NQuads parser. [126]
 * Fixed XML DTD validation attempt. [95]
 * Fixed concurrent modification exception in
   ExtractionContentBlocker filter. [86]
 * Fixed mime type detection of direct input when source
   contains blank chars. [83, 90]
 * Fixed reporting when producing no triples. [79]
 * Fixed any23-service packaging, added profile for excluding
   embedded dependencies. [113]

Enhancements

 * Improved extraction report: added list of 
   activated extractors. [89]
 * Improved extraction of HTML link element. [133]
 * Added XPath HTML extractor. [124]
 * Added HRecipe Microformat extractor. [103]
 * Added plugin support for Apache Any23. [111]
 * Implemented HTML Scraper Plugin. [123]
 * Upgraded to Sesame 2.4.0. [136]
 * Upgraded to Jetty 8.0.0. [138]
 * Upgraded maven-site-plugin. [85]
 * Added flags to exclude metadata triples. [134]
 * Added removal of CSS related triples. [135]
 * Improved overall documentation. [130]
 * Overall POM refactoring. [125]

==========================================================================

                             Apache Any23 0.4.0 
                              Release Notes

* The any23-service module has been separated from the any23-core module,
  and the Ant build system has been dropped. [Issue 44]
* Added support for HTML metadata (RDFa / Microformats) validation
  and correction (validator). [Issue 77]
* Added flag to disable the nesting relationship property 
  enrichment. [Issue 67]
* Improved coverage of Microformats tests. [Issue 65]
* Improved documentation. [Issue 44]
* Various code consolidation. [Issues 68, 69, 70, 71, 72, 73, 74, 77]

==========================================================================

                             Apache Any23 0.3.0
                              Release Notes

* Added detection and enrichment of nested microformats. [Issue #61]
* Added detection and support of N-Quads as input and output format. [Issue #7]
* General Improvements in RDFa extraction. [Issue #12, Issue #14]
* Added support of Turtle embedded in HTML script tag. [Issue #62]
* Improvement in encoding support. [Issue #43]
* Improvement in Core API. [Issue #27]
* Improved support for Species Microformat. [Issue #63]
* General code cleanup.

==========================================================================

                             Apache Any23 0.2.2
                              Release Notes

* Fixed dependency management in Maven. A second-level dependency on Xerces
  introduced a conflict on the javax.xml.transform API, causing wrong XSLT
  transformations within the RDFa extractor.

==========================================================================

                             Apache Any23 0.2.1
                              Release Notes

* Major fix to Tika configuration management. This fix restores
  automatic detection of the main Semantic Web related formats.

==========================================================================

                            Apache Any23 0.2
                             Release Notes

============
Introduction
============

This release features a redesigned API and incorporates enhancements and
bug fixes that have accumulated since the 0.1 release.
Apart from some new or changed dependencies on the underlying libraries,
this version comes with improved unit test coverage and other features
such as automatic charset encoding detection and improved documentation.
A Maven build system has been introduced.


==================================
Summary of major changes since 0.1
==================================

* Redesigned Java API
    - Input from string, stream, file, or URI
    - Allow choosing which extractors to use
    - Report origin of triples (document/extractor) to client processors
    - Various processors/serializers for extracted triples
* Added flexible command-line tool for easy testing
* Vastly improved website and documentation
* Media type and encoding detection via Apache Tika
* Switched RDF library from Jena to Sesame
* Added Maven build
* Better RDF extraction from Microformats
* Extractors now come with an example file to document typical input and output
* Major refactoring
* Lots and lots of bugfixes

=================
Supported formats
=================

* RDF/XML
* Notation3 and Turtle
* N-Triples
* RDFa

Various microformats are also supported; see http://sindice.com/developers/microformat
for details of Sindice Microformats support.

==================
Dependency Upgrade
==================

The CyberNeko HTML parser has been upgraded to 1.9.14.

Apache Tika 0.3 has been replaced with 0.6, which brings
new support for automatic encoding detection.

EOF

