license: Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.

Command-Line Tools

PDFBox comes with a series of command-line utilities. They are available as standard Java applications.

See the Dependencies page for instructions on how to set your classpath in order to run PDFBox tools as Java applications.

Decrypt

This application will decrypt a PDF document.

NOTE: You must have the owner password to decrypt the document!

Usage: java -jar pdfbox-app-3.y.z.jar decrypt [OPTIONS] -i=<infile>

| Command-Line Parameter | Description |
| --- | --- |
| -alias=<alias> | The alias to the certificate in the keystore. |
| -h, --help | Show help message and exit. |
| -i, --input=<infile> | The PDF file to decrypt. |
| -keyStore=<keyStore> | Path to the keystore that holds the certificate to decrypt the document. This is only required if the document is encrypted with a certificate; otherwise only the password is required. |
| -o, --output=<outfile> | The file to save the decrypted document to. If left blank, it will be the same as the input file. |
| -password=[<password>] | Password to the PDF or certificate in keystore. |
| -V, --version | Print version information and exit. |
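
As an illustration of a password-based call, the following shell sketch assembles a decrypt command line from the -i, -o and -password options listed above and prints it instead of running it. The function name `decrypt_cmd`, the file names and the password are hypothetical, and `3.y.z` stands for whichever PDFBox version is installed.

```shell
# Placeholder jar name: replace 3.y.z with the version you actually downloaded.
PDFBOX_JAR="pdfbox-app-3.y.z.jar"

# decrypt_cmd <infile> <outfile> <password>   (hypothetical helper)
# Prints (does not run) a decrypt invocation built from -i, -o and -password.
decrypt_cmd() {
    echo "java -jar $PDFBOX_JAR decrypt -i=$1 -o=$2 -password=$3"
}

# Hypothetical file names and password, for illustration only.
decrypt_cmd locked.pdf unlocked.pdf secret
```

Printing the command first makes it easy to review before executing it. Passing a separate output file avoids the default behavior of writing back to the input file.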

Encrypt

This application will encrypt a PDF document.

Usage: java -jar pdfbox-app-3.y.z.jar encrypt [OPTIONS] -i=<infile>

| Command-Line Parameter | Default | Description |
| --- | --- | --- |
| -canAssemble | true | Set the assemble permission. |
| -canExtractContent | true | Set the extraction permission. |
| -canExtractForAccessibility | true | Set the extraction for accessibility permission. |
| -canFillInForm | true | Set the fill in form permission. |
| -canModify | true | Set the modify permission. |
| -canModifyAnnotations | true | Set the modify annotations permission. |
| -canPrint | true | Set the print permission. |
| -canPrintFaithful | true | Set the print faithful permission. |
| -certFile=<certFile> | | Path to X.509 cert file. |
| -h, --help | | Show help message and exit. |
| -i, --input=<infile> | | The PDF file to encrypt. |
| -keyLength | 256 | Key length in bits (valid values: 40, 128 or 256). |
| -o, --output=<outfile> | | The encrypted PDF file. If omitted, the original file is overwritten. |
| -O=[<ownerPassword>] | | Set the owner password (ignored if certFile is set). |
| -U=[<userPassword>] | | Set the user password (ignored if certFile is set). |
| -V, --version | | Print version information and exit. |
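
The sketch below shows one way to guard the key length before building an encrypt command. It accepts only the three values listed as valid (40, 128 or 256) and falls back to the documented default of 256. The helper names, file names and passwords are hypothetical, and the `-keyLength=<bits>` spelling is an assumption, since the table lists the option without a value placeholder. The command is printed, not executed.

```shell
# Placeholder jar name: replace 3.y.z with the version you actually downloaded.
PDFBOX_JAR="pdfbox-app-3.y.z.jar"

# is_valid_key_length <bits>   (hypothetical helper)
# Succeeds only for the key lengths the table lists as valid: 40, 128 or 256.
is_valid_key_length() {
    case "$1" in
        40|128|256) return 0 ;;
        *) return 1 ;;
    esac
}

# encrypt_cmd <infile> <outfile> <ownerPassword> <userPassword> [keyLength]
# Prints (does not run) an encrypt invocation; keyLength falls back to 256.
# Assumption: the key length is passed as -keyLength=<bits>.
encrypt_cmd() {
    key_length="${5:-256}"
    if ! is_valid_key_length "$key_length"; then
        echo "unsupported key length: $key_length" >&2
        return 1
    fi
    echo "java -jar $PDFBOX_JAR encrypt -i=$1 -o=$2 -O=$3 -U=$4 -keyLength=$key_length"
}

# Hypothetical file names and passwords, for illustration only.
encrypt_cmd plain.pdf protected.pdf owner-secret user-secret 128
```

Writing to a separate output file keeps the original, because omitting -o overwrites the input.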

ExtractImages

This application will extract all images from the given PDF document.

Usage: java -jar pdfbox-app-3.y.z.jar export:images [OPTIONS] -i=<infile>

| Command-Line Parameter | Description |
| --- | --- |
| -h, --help | Show help message and exit. |
| -i, --input=<infile> | The PDF file to extract images from. |
| -noColorConvert | Images are extracted with their original colorspace if possible. |
| -password=[<password>] | Password for the PDF or certificate in keystore. |
| -prefix=<prefix> | The image prefix (defaults to the PDF name). |
| -useDirectJPEG | Forces the direct extraction of JPEG/JPX images regardless of colorspace or masking. |
| -V, --version | Print version information and exit. |

ExtractText

This application will extract all text from the given PDF document.

Usage: java -jar pdfbox-app-3.y.z.jar export:text [OPTIONS] -i=<infile>

| Command-Line Parameter | Default | Description |
| --- | --- | --- |
| -alwaysNext | false | Process next page (if applicable) despite IOException (ignored when -html). |
| -console | false | Send text to console instead of file. |
| -debug | false | Enables debug output about the time consumption of every stage. |
| -encoding=<encoding> | UTF-8 | The encoding type of the text file, e.g. UTF-8, ISO-8859-1, UTF-16BE, UTF-16LE, etc. |
| -endPage=<endPage> | Integer.MAX_VALUE | The last page to extract (1-based, inclusive). |
| -h, --help | | Show help message and exit. |
| -html | false | Output in HTML format instead of raw text. |
| -i, --input=<infile> | | The PDF file to extract text from. |
| -ignoreBeads | false | Disables the separation by beads. |
| -o, --output=<outfile> | | The exported text file. |
| -password=[<password>] | | Password for the PDF or certificate in keystore. |
| -rotationMagic | false | Analyze each page for rotated/skewed text, rotate to 0° and extract separately (slower, and ignored when -html). |
| -sort | false | Sort the text before writing. |
| -startPage=<startPage> | 1 | The first page to start extraction (1-based, inclusive). |
| -V, --version | | Print version information and exit. |
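
A page range combines -startPage and -endPage, both 1-based and inclusive. The sketch below prints one export:text command per input file, deriving the output name by replacing a trailing .pdf with .txt. The helper name `text_cmd` and the file names are hypothetical, and the commands are printed, not executed.

```shell
# Placeholder jar name: replace 3.y.z with the version you actually downloaded.
PDFBOX_JAR="pdfbox-app-3.y.z.jar"

# text_cmd <infile> <startPage> <endPage>   (hypothetical helper)
# Prints (does not run) an export:text call for an inclusive, 1-based page
# range, with -sort enabled. The output name swaps a trailing .pdf for .txt.
text_cmd() {
    outfile="${1%.pdf}.txt"
    echo "java -jar $PDFBOX_JAR export:text -i=$1 -o=$outfile -startPage=$2 -endPage=$3 -sort"
}

# Hypothetical input files, for illustration only: first three pages of each.
for pdf in report.pdf invoice.pdf; do
    text_cmd "$pdf" 1 3
done
```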

OverlayPDF

This application will overlay one document with the content of another document.

Usage: java -jar pdfbox-app-3.y.z.jar overlayPDF [OPTIONS] -i=<infile> -o=<outfile>

| Command-Line Parameter | Default | Description |
| --- | --- | --- |
| -default=<defaultOverlay> | | The default overlay file. |
| -even=<evenPageOverlay> | | Overlay file used for even pages. |
| -first=<firstPageOverlay> | | Overlay file used for the first page. |
| -h, --help | | Show help message and exit. |
| -i, --input=<infile> | | The PDF file to be overlaid. |
| -last=<lastPageOverlay> | | Overlay file used for the last page. |
| -o, --output=<outfile> | | The resulting PDF file. |
| -odd=<oddPageOverlay> | | Overlay file used for odd pages. |
| -page=<Integer=specificPageOverlay> | | Overlay file used for the given page number; may occur more than once. |
| -position=<position> | background | Where to put the overlay: foreground or background. |
| -useAllPages=<useAllPagesOverlay> | | Overlay file used for overlay; all pages are used by simply repeating them. |
| -V, --version | | Print version information and exit. |

Examples:

  • overlayPDF -i=input.pdf -default=overlay.pdf -o=output.pdf
  • overlayPDF -i=input.pdf -default=defaultOverlay.pdf -page="10=overlayForPage10.pdf" -position=foreground -o=output.pdf
  • overlayPDF -i=input.pdf -odd=oddOverlay.pdf -even=evenOverlay.pdf -o=output.pdf

PDFDebugger

This application opens an existing PDF document and lets you analyze and inspect its internal structure. It replaces the PDFReader, which was removed in 2.0.0.

Usage: java -jar pdfbox-app-3.y.z.jar debug [inputfile]

| Command-Line Parameter | Description |
| --- | --- |
| inputfile | The name of an optional PDF file to open. |
| -h, --help | Show help message and exit. |
| -password=[<password>] | Password to decrypt the PDF. |
| -viewstructure | Activates the “view structure” view on startup. |

PDFMerger

This application will take a list of PDF documents and merge them, saving the result in a new document.

Usage: java -jar pdfbox-app-3.y.z.jar merge [-hV] -o=outfile -i=<infile> [-i=<infile>]

| Command-Line Parameter | Description |
| --- | --- |
| -h, --help | Show help message and exit. |
| -i, --input=<infile> | The PDF files to merge. |
| -o, --output=<outfile> | The merged PDF file. |
| -V, --version | Print version information and exit. |
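
Because -i is given once per input file, a merge command grows with the number of documents. The sketch below builds that repeated -i list from its arguments and prints the resulting command instead of running it. The helper name `merge_cmd` and the file names are hypothetical.

```shell
# Placeholder jar name: replace 3.y.z with the version you actually downloaded.
PDFBOX_JAR="pdfbox-app-3.y.z.jar"

# merge_cmd <outfile> <infile>...   (hypothetical helper)
# Prints (does not run) a merge call with one -i option per input file,
# in the order the inputs were given.
merge_cmd() {
    outfile="$1"
    shift
    inputs=""
    for pdf in "$@"; do
        inputs="$inputs -i=$pdf"
    done
    echo "java -jar $PDFBOX_JAR merge -o=$outfile$inputs"
}

# Hypothetical file names, for illustration only.
merge_cmd combined.pdf part1.pdf part2.pdf part3.pdf
```

This simple string concatenation assumes file names without spaces. Names containing spaces would need quoting before the printed command could be executed.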

PDFSplit

This application will take an existing PDF document and split it into a number of new documents.

By default, the resulting files are named after the original filename with -<nr> appended before the suffix. To override the filename, use the outputPrefix option.

Usage: java -jar pdfbox-app-3.y.z.jar split [OPTIONS] -i=<infile>

| Command-Line Parameter | Description |
| --- | --- |
| -endPage=<endPage> | The page to stop at. |
| -h, --help | Show help message and exit. |
| -i, --input=<infile> | The PDF file to split. |
| --outputPrefix=<outputPrefix> | The filename prefix for split files. |
| -password=[<password>] | Password to the PDF. |
| -split=<split> | Split after this many pages (default 1, if startPage and endPage are unset). |
| -startPage=<startPage> | The page to start at. |
| -V, --version | Print version information and exit. |

Examples:

  • PDFSplit -split=2 -i=sample_with_13_pages.pdf splits the PDF into pieces of 2 pages each, except the last piece, which contains only 1 page.
  • PDFSplit -startPage=5 -i=sample_with_13_pages.pdf produces a PDF containing all pages of the source PDF from page 5 onward.
  • PDFSplit -startPage=5 -endPage=10 -i=sample_with_13_pages.pdf produces a PDF containing pages 5 to 10 of the source PDF.
  • PDFSplit -split=2 -startPage=5 -endPage=10 -i=sample_with_13_pages.pdf produces 3 PDFs of 2 pages each, covering pages 5 to 10 of the source PDF.
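
The number of files in the examples above follows from dividing the selected page range by the split size and rounding up. The sketch below reproduces that arithmetic. It is inferred from the examples and does not describe PDFSplit's implementation, and the helper name `split_count` is hypothetical.

```shell
# split_count <startPage> <endPage> <split>   (hypothetical helper)
# Prints how many documents result when the inclusive page range
# startPage..endPage is cut into chunks of <split> pages; the last
# chunk may be shorter, so the division is rounded up.
split_count() {
    pages=$(( $2 - $1 + 1 ))
    echo $(( (pages + $3 - 1) / $3 ))
}

split_count 1 13 2    # whole 13-page sample, 2 pages per piece: prints 7
split_count 5 10 2    # pages 5 to 10, 2 pages per piece: prints 3
```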

PDFToImage

This application will create an image for every page in the PDF document.

Usage: java -jar pdfbox-app-3.y.z.jar render [OPTIONS] -i=<infile>

| Command-Line Parameter | Default | Description |
| --- | --- | --- |
| -color=<imageType> | rgb | The color depth (valid: BINARY, GRAY, RGB, ARGB, BGR). |
| -cropbox=<int> <int> <int> <int> | | The page area to export. |
| -dpi, -resolution=<dpi> | detected from screen (or 96 if headless) | The DPI of the output image. |
| -endPage=<endPage> | Integer.MAX_VALUE | The last page to convert (1-based, inclusive). |
| -format=<imageFormat> | jpg | The image file format. |
| -h, --help | | Show help message and exit. |
| -i, --input=<infile> | | The PDF file to convert. |
| -page=<page> | | The only page to extract (1-based). |
| -password=[<password>] | | Password for the PDF. |
| -prefix, -outputPrefix=<outputPrefix> | Name of PDF document | The filename prefix for image files. |
| -quality=<quality> | 0 for PNG and 1 for other formats | The quality to be used when compressing the image (0 <= quality <= 1). |
| -startPage=<startPage> | 1 | The first page to start extraction (1-based). |
| -subSampling | | Activate subsampling (for PDFs with huge images). |
| -time | | Print timing information to stdout. |
| -V, --version | | Print version information and exit. |
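
Since the default DPI depends on the environment (screen resolution, or 96 when headless), passing -dpi explicitly gives reproducible output. The sketch below prints a render command that uses the documented jpg default for the format and adds -dpi only when a value is supplied. The helper name `render_cmd` and the file name are hypothetical, and the commands are printed, not executed.

```shell
# Placeholder jar name: replace 3.y.z with the version you actually downloaded.
PDFBOX_JAR="pdfbox-app-3.y.z.jar"

# render_cmd <infile> [format] [dpi]   (hypothetical helper)
# Prints (does not run) a render call. The format falls back to jpg, the
# default listed in the table. -dpi is added only when a value is given,
# so the tool's own default applies otherwise.
render_cmd() {
    format="${2:-jpg}"
    cmd="java -jar $PDFBOX_JAR render -i=$1 -format=$format"
    if [ -n "${3:-}" ]; then
        cmd="$cmd -dpi=$3"
    fi
    echo "$cmd"
}

# Hypothetical file name, for illustration only.
render_cmd slides.pdf
render_cmd slides.pdf png 300
```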

PrintPDF

This application will send a PDF document to the printer.

Usage: java -jar pdfbox-app-3.y.z.jar print [OPTIONS] -i=<infile>

| Command-Line Parameter | Default | Description |
| --- | --- | --- |
| -border | | Print with border. |
| -dpi=<dpi> | | Render into intermediate image with specific DPI and then print. |
| -duplex=<duplex> | DOCUMENT | Print using duplex (SIMPLEX, DUPLEX, TUMBLE, DOCUMENT). |
| -h, --help | | Show help message and exit. |
| -i, --input=<infile> | | The PDF file to print. |
| -mediaSize=<mediaSize> | | Print using media size name. |
| -noColorOpt | | Disable color optimizations (useful when printing barcodes). |
| -orientation | AUTO | Print using orientation (AUTO, LANDSCAPE, PORTRAIT). |
| -password=[<password>] | | Password for the PDF. |
| -printerName=<printerName> | | Print to specified printer. |
| -silentPrint | | Print without printer dialog box. |
| -tray=<tray> | | Print using tray. |
| -V, --version | | Print version information and exit. |

TextToPDF

This application will create a PDF document from a text file.

Usage: java -jar pdfbox-app-3.y.z.jar fromText [OPTIONS] -i=<infile> -o=<outfile>

| Command-Line Parameter | Default | Description |
| --- | --- | --- |
| -charset=<charset> | UTF-8 | The charset to use. |
| -fontSize=<fontSize> | 10 | The size of the font to use. |
| -h, --help | | Show help message and exit. |
| -i, --input=<infile> | | The text file to convert. |
| -landscape | | Set orientation to landscape. |
| -o, --output=<outfile> | | The generated PDF file. |
| -pageSize=<pageSize> | LETTER | The page size to use (LETTER, LEGAL, A0, A1, A2, A3, A4, A5, A6). |
| -standardFont=<standardFont> | Helvetica | The font to use for the text. Either this or -ttf should be specified but not both. |
| -ttf=<ttfFile> | | The TTF font to use for the text. Either this or -standardFont should be specified but not both. |
| -V, --version | | Print version information and exit. |

The following font names can be used for the parameter standardFont:

  • Courier
  • Courier-Bold
  • Courier-Oblique
  • Courier-BoldOblique
  • Helvetica
  • Helvetica-Bold
  • Helvetica-Oblique
  • Helvetica-BoldOblique
  • Symbol
  • Times-Bold
  • Times-Roman
  • Times-Italic
  • Times-BoldItalic
  • ZapfDingbats

WriteDecodedDoc

An application to decompress PDF documents.

Usage: java -jar pdfbox-app-3.y.z.jar decode [OPTIONS] <input-file> <output-file>

| Command-Line Parameter | Description |
| --- | --- |
| input-file | The PDF document to decompress. |
| output-file | The PDF file to save to. |
| -h, --help | Show help message and exit. |
| -password=[<password>] | Password for the PDF. |
| -skipImages | Don't uncompress images. |
| -V, --version | Print version information and exit. |