Learn to create, edit and process PDFs using Java by following this informative Apache PDFBox Tutorial.
Apache PDFBox Tutorial
About Apache PDFBox
Apache PDFBox is an open-source library from the Apache Software Foundation. It is written in Java and is used to create, process and modify (or edit) PDF documents. It also ships with command-line utilities.
Set up a Java project with the PDFBox libraries to start working with PDF files.
Features of Apache PDFBox
The following operations are possible with the tool:
Extract Text and Images
- Extract text from a PDF file.
- Extract the position and size of characters in a PDF file.
- Extract words from a PDF document.
- Extract text line by line from a PDF document.
- Get the position and size of images in a PDF.
- Extract images from a PDF.
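As a rough illustration of the text extraction items above, the sketch below uses `PDFTextStripper` on an already opened `PDDocument`. The class name `TextExtractionSketch` and the helper `extractText` are made up for this example. Opening a file is left to the caller because it differs between releases: PDFBox 2.x uses `PDDocument.load(file)`, while PDFBox 3.x uses `Loader.loadPDF(file)`.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class; not part of PDFBox.
public class TextExtractionSketch {

    /**
     * Returns the text of pages startPage..endPage (1-based, inclusive).
     * The caller is responsible for opening and closing the document.
     */
    static String extractText(PDDocument document, int startPage, int endPage) throws IOException {
        PDFTextStripper stripper = new PDFTextStripper();
        stripper.setSortByPosition(true); // order text by position on the page
        stripper.setStartPage(startPage);
        stripper.setEndPage(endPage);
        return stripper.getText(document);
    }

    public static void main(String[] args) throws IOException {
        // A blank in-memory document stands in for a real PDF file here.
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage());
            String text = extractText(document, 1, 1);
            System.out.println("Extracted " + text.trim().length() + " characters");
        }
    }
}
```

To get the position and size of individual characters, one common approach is to subclass `PDFTextStripper` and override `writeString(String, List<TextPosition>)`, which receives one `TextPosition` per glyph.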
Split and Merge
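A minimal sketch of splitting and merging, assuming the `Splitter` and `PDFMergerUtility` classes from the `org.apache.pdfbox.multipdf` package. The class `SplitMergeSketch` and its methods are hypothetical names, and the example works on blank in-memory pages so that it does not depend on any input file.

```java
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

// Hypothetical helper class; not part of PDFBox.
public class SplitMergeSketch {

    /** Builds an in-memory document with the given number of blank pages. */
    static PDDocument blankDocument(int pages) {
        PDDocument document = new PDDocument();
        for (int i = 0; i < pages; i++) {
            document.addPage(new PDPage());
        }
        return document;
    }

    /**
     * Splits the source into single-page documents, appends them to a new
     * document, and returns the page count of the merged result.
     */
    static int splitThenMerge(PDDocument source) throws IOException {
        List<PDDocument> parts = new Splitter().split(source); // one page per part by default
        PDFMergerUtility merger = new PDFMergerUtility();
        try (PDDocument merged = new PDDocument()) {
            for (PDDocument part : parts) {
                merger.appendDocument(merged, part);
            }
            // A real program would call merged.save(...) here.
            return merged.getNumberOfPages();
        } finally {
            // Close the parts only after the merged document is done with them.
            for (PDDocument part : parts) {
                part.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        try (PDDocument source = blankDocument(3)) {
            System.out.println("Merged pages: " + splitThenMerge(source));
        }
    }
}
```

The source documents are kept open until the merged document has been used, since the appended pages may still refer to data held by them.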
Fill Forms
- Extract data from a PDF form.
- Fill a PDF form.
Print
- Print a PDF file programmatically.
Save as Image
- Save the pages of a PDF file as images.
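Saving pages as images could look roughly like the following sketch, which assumes `PDFRenderer` from `org.apache.pdfbox.rendering` together with the JDK's `ImageIO`. The class `RenderSketch`, the method `savePagesAsPng` and the file naming scheme are illustrative choices, not something prescribed by PDFBox.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

// Hypothetical helper class; not part of PDFBox.
public class RenderSketch {

    /**
     * Renders every page to "<prefix>-<pageNumber>.png" inside outputDir
     * and returns the number of image files written.
     */
    static int savePagesAsPng(PDDocument document, File outputDir, String prefix, float dpi)
            throws IOException {
        PDFRenderer renderer = new PDFRenderer(document);
        int written = 0;
        for (int i = 0; i < document.getNumberOfPages(); i++) {
            BufferedImage image = renderer.renderImageWithDPI(i, dpi); // page index is 0-based
            File target = new File(outputDir, prefix + "-" + (i + 1) + ".png");
            if (ImageIO.write(image, "png", target)) {
                written++;
            }
        }
        return written;
    }
}
```

A higher `dpi` value gives sharper images at the cost of memory and file size; 72 corresponds to one pixel per PDF point.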
Create PDFs
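As a hedged sketch of creating a PDF, the example below writes a single line of text onto one page with `PDPageContentStream`. The class `CreatePdfSketch` and the method `writeSinglePage` are hypothetical. The font is passed in as a parameter because the way to obtain a standard font differs between releases: PDFBox 2.x offers the constant `PDType1Font.HELVETICA`, while PDFBox 3.x uses `new PDType1Font(Standard14Fonts.FontName.HELVETICA)`.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;

// Hypothetical helper class; not part of PDFBox.
public class CreatePdfSketch {

    /** Creates a one-page PDF at target containing the given line of text. */
    static void writeSinglePage(File target, PDFont font, String line) throws IOException {
        try (PDDocument document = new PDDocument()) {
            PDPage page = new PDPage(); // US Letter by default
            document.addPage(page);
            try (PDPageContentStream content = new PDPageContentStream(document, page)) {
                content.beginText();
                content.setFont(font, 12);
                // Coordinates are in points, measured from the bottom-left corner.
                content.newLineAtOffset(72, 720);
                content.showText(line);
                content.endText();
            }
            document.save(target);
        }
    }
}
```

The content stream is closed before `save` is called so that the page content is fully written into the document.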
Signing
- Digitally sign a PDF file.
Conclusion
In this Apache PDFBox Tutorial, we have gone through the operations that can be performed programmatically on PDF files using the PDFBox toolkit from Apache.