| Release Notes -- Apache PDFBox -- Version 1.4.0 |
| |
| Introduction |
| ------------ |
| |
| PDFBox is an open source Java library for working with PDF documents. |
| |
| This is an incremental feature release based on the earlier 1.x releases. |
| This release contains many improvements and fixes, especially for |
| text extraction, AES decryption and handling of malformed PDFs. |
| For more details on these changes and all the other fixes and improvements |
| included in this release, please refer to the following issues on the |
| PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX. |
| |
| New Features |
| |
| [PDFBOX-865] - Optional Content Groups (OCGs aka layers): initial support |
| [PDFBOX-913] - Add program which decompresses object streams |
| |
| Improvements |
| |
| [PDFBOX-521] - Improved PDF Text Extraction that notes paragraph boundaries |
| [PDFBOX-885] - Add constructors from super class to PDFTextStripperByArea to support encoding |
| [PDFBOX-893] - Performance improvement in PDFStreamEngine and Matrix (patch included) |
| [PDFBOX-909] - Add support for a 6 element matrix |
| [PDFBOX-914] - Using TextToPDF to create a PDF from the empty string produces unreadable PDF file (patch included) |
| |
| Bug Fixes |
| |
| [PDFBOX-28] - Splitting a PDF creates unnecessarily large chunks |
| [PDFBOX-671] - Cannot use PDFToImage to convert Chinese PDF pages into images. |
| [PDFBOX-751] - Text Extraction truncates last character when image page has sideways text |
| [PDFBOX-759] - Special characters not extracted |
| [PDFBOX-779] - All English characters and some Chinese words are separated by a space |
| [PDFBOX-806] - Failure to extract dc:description when the value is the node text |
| [PDFBOX-854] - PDPageContentStream.drawString() doesn't work with all PDFs |
| [PDFBOX-872] - ERROR org.apache.pdfbox.filter.FlateFilter - Stop reading corrupt stream |
| [PDFBOX-881] - Incorrect output when word spacing is achieved by matrix translation |
| [PDFBOX-883] - Special characters are not correctly handled anymore when printing or exporting to image |
| [PDFBOX-887] - CCITTFaxDecodeFilter doesn't use the abbreviated names for image parameters |
| [PDFBOX-888] - Decrypt doesn't allow more than 3 args |
| [PDFBOX-889] - Empty page causes NPE in importPage |
| [PDFBOX-896] - PDFViewer doesn't render landscape mode correctly |
| [PDFBOX-897] - NullPointerException PDFFont#getEncodingFromFont with a PDF book because Type1Encoding is null |
| [PDFBOX-898] - COSStreamArray NullPointerException. firstStream is null if COSArray contains no items |
| [PDFBOX-900] - ArrayIndexOutOfBoundsException with extracting labels from malformed document |
| [PDFBOX-902] - ClassCastException caused by unhandled Markup Annotations. |
| [PDFBOX-907] - Encrypted Key not correctly calculated when the meta data is not encrypted |
| [PDFBOX-910] - Certain sequences (such as endstrea[^m]) are eaten by BaseParser#readUntilEndStream |
| [PDFBOX-918] - Can't parse PDF |
| [PDFBOX-921] - NumberFormatException when parsing a type1 font |
| |
| Release Contents |
| ---------------- |
| |
| This release consists of a single source archive packaged as a zip file. |
| The archive can be unpacked with the jar tool from your JDK installation. |
| See the README.txt file for instructions on how to build this release. |
| |
| The source archive is accompanied by SHA1 and MD5 checksums and a PGP |
| signature that you can use to verify the authenticity of your download. |
| The public key used for the PGP signature can be found at |
| https://svn.apache.org/repos/asf/pdfbox/KEYS. |
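
As an illustration of the checksum step described above, here is a minimal Java sketch that computes the SHA-1 or MD5 digest of a downloaded file and compares it with the published value. The class name ChecksumCheck and the archive file name in the usage note are hypothetical and not part of this release. A matching checksum only shows that the download is intact; checking the PGP signature against the KEYS file requires a separate OpenPGP tool and is not covered here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not shipped with PDFBox.
public class ChecksumCheck {

    // Creates a digest for a standard algorithm name such as "SHA-1" or "MD5".
    static MessageDigest newDigest(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
    }

    // Converts raw digest bytes to a lowercase hexadecimal string.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Digest of an in-memory byte array.
    static String hexDigest(byte[] data, String algorithm) {
        return toHex(newDigest(algorithm).digest(data));
    }

    // Digest of a stream, read in chunks so large archives need not fit in memory.
    static String hexDigest(InputStream in, String algorithm) throws IOException {
        MessageDigest md = newDigest(algorithm);
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            md.update(buffer, 0, read);
        }
        return toHex(md.digest());
    }

    // Arguments: <algorithm> <file> <expected hex digest>
    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ChecksumCheck <SHA-1|MD5> <file> <expected-hex>");
            System.exit(2);
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[1]))) {
            String actual = hexDigest(in, args[0]);
            boolean matches = actual.equalsIgnoreCase(args[2].trim());
            System.out.println(matches ? "OK" : "MISMATCH: computed " + actual);
            if (!matches) {
                System.exit(1);
            }
        }
    }
}
```

A run might look like `java ChecksumCheck SHA-1 <archive>.zip <hex value from the .sha1 file>`, where the archive name and the expected value come from your own download.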
| |
| About Apache PDFBox |
| ------------------- |
| |
| Apache PDFBox is an open source Java library for working with PDF documents. |
| The project supports creating new PDF documents, manipulating existing |
| documents and extracting content from documents. Apache PDFBox also |
| includes several command line utilities. Apache PDFBox is published |
| under the Apache License, Version 2.0. |
| |
| For more information, visit http://pdfbox.apache.org/ |
| |
| About The Apache Software Foundation |
| ------------------------------------ |
| |
| Established in 1999, The Apache Software Foundation provides organizational, |
| legal, and financial support for more than 100 freely-available, |
| collaboratively-developed Open Source projects. The pragmatic Apache License |
| enables individual and commercial users to easily deploy Apache software; |
| the Foundation's intellectual property framework limits the legal exposure |
| of its 2,500+ contributors. |
| |
| For more information, visit http://www.apache.org/ |