

iText is a library to create and manipulate PDF documents in Java and C#.


The Apache PDFBox library is an open source Java tool for working with PDF documents.



I have noticed that content extraction is faster in iText, but searching for words with a regex in the content extracted by iText takes longer than in the content extracted by PDFBox.

from question  

Any difference in content extracted by pdfbox and itext
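
The regex search step mentioned in that quote could look roughly like the sketch below. It assumes the extracted text is already available as a plain String from either library; the class and method names are hypothetical and only `java.util.regex` from the standard library is used. The pattern is compiled once and reused, so the same search can be run over both extractions when comparing them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox or iText.
public class ExtractedTextSearch {

    // Build a case-insensitive, whole-word pattern once so it can be reused.
    public static Pattern wordPattern(String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
    }

    // Collect the start offset of every match in the extracted text.
    public static List<Integer> findOffsets(Pattern pattern, String text) {
        List<Integer> offsets = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            offsets.add(matcher.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Stand-in for text returned by a PDF extractor (assumed input).
        String extracted = "Invoice total: 42 EUR. Total due. Subtotal: 40 EUR";
        Pattern total = wordPattern("total");
        System.out.println(findOffsets(total, extracted));
    }
}
```

Running the same compiled pattern over the output of each extractor, and timing only the `findOffsets` call, separates the cost of searching from the cost of extraction.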

PDFBox contains tools for text extraction; iText has more low-level support for text manipulation, but you'd have to write a considerable amount of code to get text extraction.

from question  

How to read PDF files using Java?

On the downside, PDFBox is less mature than iText, so it has fewer features and less documentation available.

from question  

PDF in XPages without using iText?

PDFBox is a lot slower than iText when it comes to this.

from question  

Faster PDF page dimensions using PDFBox?

Start with PDFBox, as its text extraction abilities are better than iText's.

from question  

PDF to text tool or Java library?

Data comes from Stack Exchange with CC-BY-SA-4.0