Examples of how to convert the pages of a PDF document to images using Python
1. Using the python module pdf2image
The Python module pdf2image is available on GitHub. One way to install it is with pip:
pip install pdf2image
Note: the module needs poppler to run. If you use the Anaconda Python distribution, poppler can be installed, for example, with the following command:
conda install -c conda-forge poppler
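Since pdf2image calls poppler's command-line tools (pdftoppm and pdfinfo), a quick check that they are on the PATH can save a confusing error later. A minimal sketch, where the helper name poppler_available is hypothetical and not part of pdf2image:

```python
import shutil

def poppler_available():
    """Return True if the poppler tools used by pdf2image are found on the PATH."""
    # pdf2image relies on these two poppler executables
    return all(shutil.which(tool) is not None for tool in ("pdftoppm", "pdfinfo"))

print("poppler found:", poppler_available())
```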
The module can then be imported:
>>> from pdf2image import convert_from_path
The convert_from_path() function can then be used. It returns a list of PIL images, one per page:
>>> pages = convert_from_path('document.pdf', dpi=200)
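The dpi argument controls the pixel size of the rendered images. PDF page dimensions are expressed in points (1/72 inch), so the expected size is roughly points / 72 * dpi. The helper below is only an illustrative sketch (expected_pixel_size is a hypothetical name, not a pdf2image function), and the renderer may round by a pixel differently:

```python
def expected_pixel_size(width_pt, height_pt, dpi=200):
    """Approximate pixel dimensions of a page rendered at the given dpi."""
    # 72 PDF points make one inch
    return round(width_pt / 72 * dpi), round(height_pt / 72 * dpi)

# US Letter page: 612 x 792 points (8.5 x 11 inches)
print(expected_pixel_size(612, 792, dpi=200))  # (1700, 2200)
```

Doubling the dpi doubles both dimensions, so the memory used per page grows by about a factor of four.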
1.1 Convert all pdf document pages to images
To convert all pages of the PDF document to images, one solution is to loop over the list pages and save each image:
>>> for idx, page in enumerate(pages):
...     page.save('page' + str(idx) + '.jpg', 'JPEG')
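For documents with many pages, zero-padded file names keep the images in the right order when sorted alphabetically. The sketch below wraps the loop in a function; save_pages and convert_pdf are hypothetical helper names, and the output naming scheme (page_01.jpg, page_02.jpg, ...) is an assumption rather than anything pdf2image prescribes:

```python
import os

def save_pages(pages, out_dir, prefix="page"):
    """Save each page image as a JPEG in out_dir and return the file paths."""
    os.makedirs(out_dir, exist_ok=True)
    pages = list(pages)
    width = len(str(len(pages)))  # number of digits used for zero-padding
    paths = []
    for number, page in enumerate(pages, start=1):
        path = os.path.join(out_dir, f"{prefix}_{number:0{width}d}.jpg")
        page.save(path, "JPEG")
        paths.append(path)
    return paths

def convert_pdf(pdf_path, out_dir, dpi=200):
    """Render every page of pdf_path and save the images in out_dir."""
    # Imported here so that save_pages can be used without pdf2image installed
    from pdf2image import convert_from_path
    return save_pages(convert_from_path(pdf_path, dpi=dpi), out_dir)
```

With this helper, convert_pdf('document.pdf', 'images') would write one JPEG per page into the images directory.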
1.2 Convert a given page
To convert a given page, select it from pages by its index (indices start at 0, so 0 is the first page):
>>> page_idx = 0
>>> page = pages[page_idx]
>>> page.save('image.jpg', 'JPEG')
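Calling convert_from_path() without further arguments renders every page, which can be slow and memory-hungry for a large document when only one page is needed. The first_page and last_page arguments (both 1-based) restrict rendering to a range of pages. A minimal sketch, where single_page_kwargs and convert_single_page are hypothetical helper names:

```python
def single_page_kwargs(page_idx):
    """Translate a 0-based page index into pdf2image's 1-based page range."""
    if page_idx < 0:
        raise ValueError("page_idx must be >= 0")
    return {"first_page": page_idx + 1, "last_page": page_idx + 1}

def convert_single_page(pdf_path, page_idx, out_path, dpi=200):
    """Render only the requested page and save it as a JPEG."""
    # Imported here so the index helper works without pdf2image installed
    from pdf2image import convert_from_path
    pages = convert_from_path(pdf_path, dpi=dpi, **single_page_kwargs(page_idx))
    pages[0].save(out_path, "JPEG")
    return out_path
```

For example, convert_single_page('document.pdf', 0, 'image.jpg') would render and save only the first page.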
2. Using imagemagick
Another solution is to use ImageMagick. For example, to create a preview of the first page of a PDF document (the [0] suffix selects the first page):
>>> import os
>>> os.system('convert document.pdf[0] image.jpg')
To change the image size and the resolution (here rendering at 144 dpi and scaling the result to 50%):
convert -density 144 document.pdf[0] -resize 50% image.jpg
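os.system() passes the whole string through a shell, where the brackets in document.pdf[0] may be interpreted as a glob pattern by some shells, and a failed conversion is easy to miss. A possible alternative is subprocess.run() with an argument list. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the ImageMagick command is available as convert (some ImageMagick 7 installations provide it as magick instead, which can be passed via the executable argument):

```python
import subprocess

def build_convert_command(pdf_path, page_idx, out_path, density=None,
                          resize=None, executable="convert"):
    """Build the ImageMagick argument list for converting one PDF page."""
    cmd = [executable]
    if density is not None:
        # -density goes before the input file so it applies to rasterization
        cmd += ["-density", str(density)]
    cmd.append(f"{pdf_path}[{page_idx}]")
    if resize is not None:
        cmd += ["-resize", resize]
    cmd.append(out_path)
    return cmd

def run_convert(*args, **kwargs):
    """Run the conversion and raise CalledProcessError if ImageMagick fails."""
    subprocess.run(build_convert_command(*args, **kwargs), check=True)

print(build_convert_command("document.pdf", 0, "image.jpg", density=144, resize="50%"))
# ['convert', '-density', '144', 'document.pdf[0]', '-resize', '50%', 'image.jpg']
```

Because the arguments are passed as a list, no shell quoting is needed for file names containing spaces or brackets.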
References
Links | Site |
---|---|
Extract a page from a pdf as a jpeg | stackoverflow |
pdf2image | github |
imagemagick | imagemagick |
converting 1 page of a pdf to jpg | imagemagick |
Converting a PDF to a series of images with Python | stackoverflow |
Image preview with Reportlab? | stackoverflow |
How to Convert PDF to Image Files | wikihow |