| Release Notes -- Apache PDFBox -- Version 1.8.0 |
| |
| Introduction |
| ------------ |
| |
| The Apache PDFBox library is an open source Java tool for working with PDF documents. |
| |
| This is an incremental feature release based on the earlier 1.x releases. |
| This release contains many improvements and fixes especially related to |
| performance, resource usage and rendering. The most significant |
| new features are the first official release of the new PDF/A preflight module as |
| part of Apache PDFBox and the improved signature abilities. |
| |
| For more details on these changes and all the other fixes and improvements |
| included in this release, please refer to the following issues on the |
| PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX. |
| |
| New features |
| |
| [PDFBOX-46] - Support XFA form submitting |
| [PDFBOX-81] - Excetion while extracting images |
| [PDFBOX-84] - Read PDF XFA Form Contents |
| [PDFBOX-127] - Accessing XML-Forms (patch provided) |
| [PDFBOX-1067] - PDF Scan from Xerox WorkCentre 5030 renders as all black |
| [PDFBOX-1514] - Improved overlay cammand line tool |
| |
| Improvements |
| |
| [PDFBOX-1246] - Allow resolution to be defined when calling ImageIOUtil.writeImage |
| [PDFBOX-1312] - Refactor the PdfA parser |
| [PDFBOX-1352] - xmpbox refactoring |
| [PDFBOX-1367] - Do not generate preflight jar with dependencies at each build |
| [PDFBOX-1369] - support getting file pointer from RandomAccessRead interface |
| [PDFBOX-1377] - Simplify PDF/A schema parsing |
| [PDFBOX-1387] - Create NonSequentialParser with InputStream |
| [PDFBOX-1388] - Create a branch to refactor xmpbox |
| [PDFBOX-1392] - Enable usage of compressionQuality when creating a PDJpeg |
| [PDFBOX-1399] - Add an example on how to extract embedded files |
| [PDFBOX-1418] - Improved font mapping |
| [PDFBOX-1423] - An error exists on this page. Acrobat may not display the page correctly |
| [PDFBOX-1425] - Make PositionWrapper.getTextPosition public |
| [PDFBOX-1439] - Problems with Image Extraction from PDF |
| [PDFBOX-1468] - Decrypting unencrypted strings |
| [PDFBOX-1488] - Add generics to the COSArrayList class |
| [PDFBOX-1492] - Add basic XFA extraction |
| [PDFBOX-1513] - PDF signature improvements |
| [PDFBOX-1536] - Improve the ExtractEmbeddedFiles example to deal with different kind of |
| trees representing the embedded files |
| |
| Bug Fixes |
| |
| [PDFBOX-137] - Does not detect paper format |
| [PDFBOX-811] - EmbeddedFiles example does not work |
| [PDFBOX-819] - PDFBox prints landscape documents as portrait |
| [PDFBOX-927] - Problem on writing some kind of images to a File in filesystem |
| [PDFBOX-969] - IndexOutOfBound whle creating a Type1C font |
| [PDFBOX-985] - PDF Printing Orientation |
| [PDFBOX-992] - IndexOutOfBoundsException: while parsing few pdf's |
| [PDFBOX-1072] - PDFImageWriter extracts black images from arabic PDFs |
| [PDFBOX-1084] - java.lang.NumberFormatException when getting PDF text of some PDF file if |
| dup line does not contains font index |
| [PDFBOX-1130] - ExtractText -html doesn't always close the <p> tags it opens |
| [PDFBOX-1138] - Printing fails for pages in landscape format |
| [PDFBOX-1169] - Images extracted from PDF are loosing color (are shown in blackcolor) |
| [PDFBOX-1191] - Lost information while extracting images from pdf scanned by XEROX |
| [PDFBOX-1298] - java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-2) |
| [PDFBOX-1344] - xml namespace problem in ResourceRef |
| [PDFBOX-1346] - Can't assign an arbitrary string value to an editable acroform combobox |
| [PDFBOX-1359] - stack overflow~~ ExtractText (PDF2TXT) |
| [PDFBOX-1362] - Slovakian characters |
| [PDFBOX-1364] - Error On MetaData |
| [PDFBOX-1365] - Error On MetaData: The Metadata entry doesn't reference a stream object |
| [PDFBOX-1368] - Xmp validation KO if there are complex type in a seq element |
| [PDFBOX-1371] - MetaData : Trapped property |
| [PDFBOX-1373] - Body Syntax Error : Possible Encoding problem |
| [PDFBOX-1374] - Error On MetaData: Title |
| [PDFBOX-1376] - xmpbox cannot parse structured types containing structured types |
| [PDFBOX-1378] - [PATCH] COSArray: Avoid NullPointerException in setString |
| [PDFBOX-1379] - [PATCH] COSDocument: setVersion |
| [PDFBOX-1380] - [PATCH] PDNameTreeNode |
| [PDFBOX-1381] - [PATCH] PDNumberTreeNode |
| [PDFBOX-1382] - [PATCH] PDObjectReference |
| [PDFBOX-1394] - Image streams are lost when adding new images to page |
| [PDFBOX-1395] - Transparency isn't checked in Page dictionary |
| [PDFBOX-1398] - Runtime exception when trying to check PDF/A compliance on non PDF/A document |
| [PDFBOX-1408] - Width of space character is calculated wrong |
| [PDFBOX-1411] - [Patch] PDPixelMap.createImageStream can attempt to close output stream it |
| didn't open, hiding errors. |
| [PDFBOX-1412] - NullPointerException when getting fields from a PDF file |
| [PDFBOX-1421] - TextPosition.getX()returen 0 in case of rotation ==360 |
| [PDFBOX-1424] - Wrong glyph (Persian) is used in extacted text instead of the original glyph |
| (Persian) in PDF file |
| [PDFBOX-1427] - PDF page rotation is not working |
| [PDFBOX-1431] - Some pdfss created by ABBY trigger a NPE |
| [PDFBOX-1432] - PDF rotation problem |
| [PDFBOX-1434] - Font being changed after form field is set |
| [PDFBOX-1440] - Garbled image from PDFToImage |
| [PDFBOX-1443] - Images are rendered blank |
| [PDFBOX-1445] - /ImageMask true does not work. Patch included. |
| [PDFBOX-1447] - wasted work in PDFMarkedContentExtractor.processTextPosition() |
| [PDFBOX-1449] - Preflight doesn't report on non-embedded font |
| [PDFBOX-1456] - wasted work in PublicKeySecurityHandler.prepareForDecryption() |
| [PDFBOX-1458] - wasted work in PDOptionalContentProperties.setGroupEnabled() |
| [PDFBOX-1464] - unnecessary linear searches in "CFFParser.Format0FDSelect.getFd" |
| [PDFBOX-1465] - Preflight crashes on PDF |
| [PDFBOX-1469] - [PATCH] PDPageContentStream incorrectly sets colors in CMYK color space |
| [PDFBOX-1470] - about attribute is serialized more than one time in XmpSerializer |
| [PDFBOX-1471] - Parsing of xmp properties set in xml attributes is not done |
| [PDFBOX-1473] - Incorrect handling of OpenType fonts |
| [PDFBOX-1475] - Exception thrown during rendering page if /DecodeParms specified indirectly |
| (like [9 0 R]) in XObject/Image |
| [PDFBOX-1476] - Isartor tests fails due to bad rdf:about handling |
| [PDFBOX-1477] - PDF/A file is declared invalid on windows and valid with linux |
| [PDFBOX-1481] - Ignore postscript code when parsing a type1 font |
| [PDFBOX-1482] - Java color spaces returned by PDDeviceN do not take tint transformation into account |
| and type mismatch |
| [PDFBOX-1489] - Maven Dependency not resolveable agains central |
| [PDFBOX-1490] - pdf page => inline image not converted |
| [PDFBOX-1491] - Image with colour key masking triggers NPE |
| [PDFBOX-1496] - Can't add multiple form XObjects to a PDF - they become duplicated |
| [PDFBOX-1497] - Preflight throws an exception on DeviceN validation |
| [PDFBOX-1499] - The blank white page is converted with method pdPage.convertToImage(); |
| [PDFBOX-1501] - Width of the character "201" .. inconsistent with the width in the PDF dictionary. |
| [PDFBOX-1504] - Split document issue |
| [PDFBOX-1505] - [PATCH] CharStringRenderer does not render CharString data correctly for Type 2 |
| CFF fonts |
| [PDFBOX-1517] - PDFSplit: split is set to one if no -split argument present |
| [PDFBOX-1518] - ClassCastException writing text to a page |
| [PDFBOX-1522] - Some PDF files are causing exception (java.io.IOException: Error: Could not find |
| font(COSName{F53.0}) in map=) |
| [PDFBOX-1535] - Extract text from PDF cause Nullpointer Exception in PDFStreamEngine.processEncodedText |
| Method |
| |
| Misc |
| |
| [PDFBOX-1366] - Reduce xmpbox code complexity |
| [PDFBOX-1528] - rename org.apache.padaf.xmpbox to org.apache.xmpbox |
| [PDFBOX-1530] - Respect PDFBox coding rules in new modules |
| [PDFBOX-1531] - Reaarange xmpbox and preflight maven modules |
| [PDFBOX-795] - PDPage convertToImage partially generates image file and throws exception |
| |
| Release Contents |
| ---------------- |
| |
| This release consists of a single source archive packaged as a zip file. |
| The archive can be unpacked with the jar tool from your JDK installation. |
| See the README.txt file for instructions on how to build this release. |
| |
| The source archive is accompanied by SHA1 and MD5 checksums and a PGP |
| signature that you can use to verify the authenticity of your download. |
| The public key used for the PGP signature can be found at |
| https://svn.apache.org/repos/asf/pdfbox/KEYS. |
| |
| About Apache PDFBox |
| ------------------- |
| |
| Apache PDFBox is an open source Java library for working with PDF documents. |
| This project allows creation of new PDF documents, manipulation of existing |
| documents and the ability to extract content from documents. Apache PDFBox |
| also includes several command line utilities. Apache PDFBox is published |
| under the Apache License, Version 2.0. |
| |
| For more information, visit http://pdfbox.apache.org/ |
| |
| About The Apache Software Foundation |
| ------------------------------------ |
| |
| Established in 1999, The Apache Software Foundation provides organizational, |
| legal, and financial support for more than 100 freely-available, |
| collaboratively-developed Open Source projects. The pragmatic Apache License |
| enables individual and commercial users to easily deploy Apache software; |
| the Foundation's intellectual property framework limits the legal exposure |
| of its 2,500+ contributors. |
| |
| For more information, visit http://www.apache.org/ |