This article shows you how to extract text from an online PDF document in Python.
Here’s how:
- Copy the following source code to your Python script.
- Specify the URL of your online PDF document on line 10.
- Replace the Client ID and Secret on lines 5 and 6 if you have your own credentials.
- Install the Requests library, which the sample script depends on:
pip install requests
- Make your script executable:
chmod 755 extract-pdf-text-sync.py
- Run the script to see the result:
./extract-pdf-text-sync.py
If your PDF file is larger than 1 MB, you will need to call the asynchronous API instead. See an example in Python.
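If you are not sure which side of the 1 MB limit your document falls on, you can check its size before choosing an API. The following is a minimal sketch, not part of the sample script. The helper names `exceeds_sync_limit` and `remote_pdf_size` are hypothetical, and the code assumes that 1 MB means 1,000,000 bytes and that the server hosting the PDF answers HEAD requests with a `Content-Length` header. It uses only the Python standard library, so it runs without Requests installed.

```python
import urllib.request

# Assumption: the article does not say whether "1 MB" means 10**6 or 2**20
# bytes, so this sketch uses the smaller, more conservative value.
SYNC_LIMIT_BYTES = 1_000_000


def exceeds_sync_limit(size_bytes, limit_bytes=SYNC_LIMIT_BYTES):
    """Return True if a PDF of size_bytes is larger than the limit."""
    return size_bytes > limit_bytes


def remote_pdf_size(url, timeout=10):
    """Return the size in bytes reported for url, or None if unknown.

    Sends a HEAD request so the PDF itself is not downloaded. Some servers
    omit Content-Length or reject HEAD; callers should handle both cases.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

To use it, pass your document's URL to `remote_pdf_size` and feed the result to `exceeds_sync_limit`. If the size comes back as `None`, the server did not report it, and you would need to download the file or try the synchronous call to find out.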
The trial account allows only up to 20 calls to the PDF-to-Text API, which is intended for learning purposes. Upgrade to a Premium plan for regular or production use.
Want to extract PDF text in another programming language? Check out the PDF-to-Text API page.