Natural Language Processing
How to extract text from PDF files
August 17th, 2020
In this post I present three open-source Python tools that can be used to extract text from PDF files: PyPDF2, pdfminer, and PyMuPDF. I compare their features and point out some of their drawbacks.