| Release Notes -- Apache PDFBox -- Version 1.5.0 |
| |
| Introduction |
| ------------ |
| |
| PDFBox is an open source Java library for working with PDF documents. |
| |
| This is an incremental feature release based on the earlier 1.x releases. |
| This release contains many improvements and fixes especially related to |
| performance, resource usage and fault-tolerance. |
| |
| For more details on these changes and all the other fixes and improvements |
| included in this release, please refer to the following issues on the |
| PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX. |
| |
| Improvements |
| |
| [PDFBOX-274] PDFDocument.save is really slow |
| [PDFBOX-917] Read non-conforming PDFs (attached) without throwing ... |
| [PDFBOX-928] added NPE protection which occurred when reading corrupt PDFs |
| [PDFBOX-947] Avoid using temporary files in PDJpeg |
| [PDFBOX-948] Don't use temporty files by default for all PDF sizes |
| |
| Bug Fixes |
| |
| [PDFBOX-202] Error on text extraction: java.lang.IndexOutOfBoundsExceptio |
| [PDFBOX-328] PDFTextStripper not handling some Japanese |
| [PDFBOX-578] NPE NullPointerException in PDPageNode.getCount |
| [PDFBOX-706] CFFParser.readCharset java.lang.IllegalArgumentException |
| [PDFBOX-708] Failed to create Type1C font. Falling back to Type1 font |
| [PDFBOX-713] PDFont fails to close Font File. |
| [PDFBOX-797] NPE in PDPageNode |
| [PDFBOX-920] PDFStreamEngine.processEncodedText fails on UTF-16 text |
| [PDFBOX-925] ExtractText china pdf ,but pdfbox distinguish Korea,The ... |
| [PDFBOX-935] Text not extracted with PDFBox 1.4 |
| [PDFBOX-938] Wrong extracted text using PDFBox 1.4 |
| [PDFBOX-939] Lost whitespaces when extracting Arabic text |
| [PDFBOX-941] extracting Japanese characters gives garbage |
| [PDFBOX-942] Image quality improvements |
| [PDFBOX-945] PDFBOX may not depend on plattform encoding |
| [PDFBOX-946] RandomAccessBuffer shoud be created empty |
| [PDFBOX-949] ExtractText returns junk |
| [PDFBOX-952] getParent method of class PDField doesn't consider both ... |
| [PDFBOX-959] Text extraction slow and /tmp fills upwith AWT font files |
| [PDFBOX-960] Null Pointer Exception when Annotation is missing the Subtype |
| |
| Release Contents |
| ---------------- |
| |
| This release consists of a single source archive packaged as a zip file. |
| The archive can be unpacked with the jar tool from your JDK installation. |
| See the README.txt file for instructions on how to build this release. |
| |
| The source archive is accompanied by SHA1 and MD5 checksums and a PGP |
| signature that you can use to verify the authenticity of your download. |
| The public key used for the PGP signature can be found at |
| https://svn.apache.org/repos/asf/pdfbox/KEYS. |
| |
| About Apache PDFBox |
| ------------------- |
| |
| Apache PDFBox is an open source Java library for working with PDF documents. |
| This project allows creation of new PDF documents, manipulation of existing |
| documents and the ability to extract content from documents. Apache PDFBox |
| also includes several command line utilities. Apache PDFBox is published |
| under the Apache License, Version 2.0. |
| |
| For more information, visit http://pdfbox.apache.org/ |
| |
| About The Apache Software Foundation |
| ------------------------------------ |
| |
| Established in 1999, The Apache Software Foundation provides organizational, |
| legal, and financial support for more than 100 freely-available, |
| collaboratively-developed Open Source projects. The pragmatic Apache License |
| enables individual and commercial users to easily deploy Apache software; |
| the Foundation's intellectual property framework limits the legal exposure |
| of its 2,500+ contributors. |
| |
| For more information, visit http://www.apache.org/ |