The Apache PDFBox™ library is an open source Java tool for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. Apache PDFBox also includes several command-line utilities. Apache PDFBox is published under the Apache License v2.0.
<h1 id="pdfbox-4.0-migration-guide" tabindex="-1">PDFBox 4.0 Migration Guide</h1>
<p class="alert alert-warning">Work in progress! No 4.0 release is available yet, but we already provide this
migration guide and will improve it over time. If you believe a topic is missing, open an issue or help us with a
contribution to improve the guide.</p>
<p>This guide describes the changes in Apache PDFBox 4.0. Use it to upgrade your PDFBox 3.x applications
to PDFBox 4.0. It covers the new, deprecated and unsupported features in this release.</p>
<h2 id="java-versions" tabindex="-1">Java Versions</h2>
<p>PDFBox 4.0 requires at least Java 11. Testing has been done up to Java 20.</p>
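<p>An application that embeds PDFBox may want to fail early with a clear message when it is started on an older runtime. The following is only a minimal sketch of such a guard: the class <code>JavaVersionGuard</code> and its methods are hypothetical and not part of PDFBox. It reads the standard <code>java.specification.version</code> system property, which has the form <code>1.8</code> on Java 8 and <code>11</code>, <code>17</code>, ... on later releases. The guard class itself must be compiled for the oldest runtime it is supposed to detect.</p>

```java
final class JavaVersionGuard {
    // Minimum Java feature release stated in this guide for PDFBox 4.0.
    static final int REQUIRED_FEATURE_VERSION = 11;

    /** Extracts the feature release from a specification version such as "1.8" or "11". */
    static int featureVersion(String specVersion) {
        // Java 8 and older report "1.x"; newer releases report the feature number directly.
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    /** Returns true if the given feature release meets the minimum requirement. */
    static boolean isSupported(int featureVersion) {
        return featureVersion >= REQUIRED_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        int current = featureVersion(System.getProperty("java.specification.version"));
        if (!isSupported(current)) {
            throw new IllegalStateException("Java " + REQUIRED_FEATURE_VERSION
                    + " or newer is required, but found Java " + current);
        }
        System.out.println("Running on Java " + current);
    }
}
```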
<h2 id="dependency-updates" tabindex="-1">Dependency Updates</h2>
<p>All libraries on which PDFBox depends are updated to their latest stable versions:</p>
<ul>
<li>Bouncy Castle 1.77</li>
<li>Apache Log4j 2.22.1</li>
<li>picocli 4.7.5</li>
</ul>
<p>The libraries used for testing are updated to:</p>
<ul>
<li>JUnit 5.10.1</li>
<li>JAI Image Core 1.4.0</li>
<li>JAI JPEG2000 1.4.0</li>
<li>Apache JBIG ImageIO Plugin 3.0.4</li>
<li>Apache Commons IO 2.15.0</li>
</ul>
<h2 id="general-changes-for-pdfbox-4.0" tabindex="-1">General Changes for PDFBox 4.0</h2>
<p>This section explains the fundamental differences between PDFBox 4.0 and 3.x releases.</p>
<h3 id="preflight-was-removed" tabindex="-1">Preflight was removed</h3>
<p>The Preflight subproject was removed due to inactivity. It had not seen any substantial changes or improvements in recent years, and its parser
was still limited to PDF/A-1b.</p>
<p>People looking for an open source preflight solution might consider <a href="">VeraPDF</a>. The VeraPDF parser is based on a PDFBox fork and
was streamlined to fit their needs, but VeraPDF can still use the PDFBox parser as an alternative.</p>
<h3 id="switch-to-apache-log4j" tabindex="-1">Switch to Apache Log4j</h3>
<p>Apache Commons Logging was replaced by Apache Log4j. Some of the obvious reasons were:</p>
<ul>
<li>maintainability and performance</li>
<li>JPMS support</li>
<li>lambda logging support</li>
</ul>
<p><a href="">PDFBOX-5695</a> provides more details about the reasons and the transition itself.</p>
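<p>To illustrate what lambda logging buys, here is a minimal sketch of lazy message construction. So that it runs without extra dependencies, it uses <code>java.util.logging</code> from the JDK as a stand-in, not Log4j itself. The Log4j 2 API accepts message suppliers in a comparable way. The class <code>LazyLoggingSketch</code> and the method <code>describeDocument</code> are hypothetical names chosen for this example.</p>

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LazyLoggingSketch {
    // Counts how often the (hypothetical) expensive message is actually built.
    static final AtomicInteger BUILD_COUNT = new AtomicInteger();

    /** Stands in for an expensive computation that is only needed for logging. */
    static String describeDocument() {
        BUILD_COUNT.incrementAndGet();
        return "document summary";
    }

    /**
     * Logs one FINE message through a supplier and returns how many times
     * the message was built while the logger was set to the given level.
     */
    static int logLazily(Level loggerLevel) {
        Logger logger = Logger.getLogger("LazyLoggingSketch");
        logger.setUseParentHandlers(false); // keep the console quiet in this sketch
        logger.setLevel(loggerLevel);
        int before = BUILD_COUNT.get();
        // The lambda is evaluated only if FINE is enabled for this logger.
        logger.fine(() -> "Loaded " + describeDocument());
        return BUILD_COUNT.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("Built at INFO: " + logLazily(Level.INFO));
        System.out.println("Built at FINE: " + logLazily(Level.FINE));
    }
}
```

<p>With the logger set to <code>INFO</code> the supplier is never called, so the cost of building the message is paid only when the message would actually be logged.</p>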
