Converting PDF files to HTML on Ubuntu

It’s very easy to convert PDFs to HTML on Ubuntu.

The command is pdftohtml, which is part of the poppler-utils package.

If it’s not installed, you can install it with:

sudo apt-get install poppler-utils
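
If you'd rather check for the command before installing anything, something like the following should work. This is only a sketch, assuming a POSIX shell; the function name ensure_pdftohtml is made up for illustration and is not part of poppler-utils.

```shell
# Hypothetical helper: install poppler-utils only when pdftohtml is missing.
ensure_pdftohtml() {
    # command -v succeeds if the shell can find something called pdftohtml
    if command -v pdftohtml >/dev/null 2>&1; then
        echo "pdftohtml is already installed"
    else
        sudo apt-get install poppler-utils
    fi
}
```

Call ensure_pdftohtml once before converting anything.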

Options:

-f <int>      : first page to convert
-l <int>      : last page to convert
-q            : don't print any messages or errors
-h            : print usage information
-help         : print usage information
-p            : exchange .pdf links by .html
-c            : generate complex document
-i            : ignore images
-noframes     : generate no frames
-stdout       : use standard output
-zoom <fp>    : zoom the pdf document (default 1.5)
-xml          : output for XML post-processing
-hidden       : output hidden text
-nomerge      : do not merge paragraphs
-enc <string> : output text encoding name
-dev <string> : output device name for Ghostscript (png16m, jpeg etc)
-v            : print copyright and version info
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
-nodrm        : override document DRM settings

Here’s a usage example:

pdftohtml infile.pdf outfile.html
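
The options can be combined with the basic command. Here is a rough sketch that converts only a range of pages into a single HTML file without frames. The function name pdf_to_single_html and the file name and page numbers in the example call are placeholders chosen for illustration, and the options go before the file names.

```shell
# Hypothetical helper: convert a page range of a PDF to one HTML file.
# Usage: pdf_to_single_html <pdf-file> <first-page> <last-page>
pdf_to_single_html() {
    infile=$1
    first=$2
    last=$3
    # Name the output after the input: report.pdf -> report.html
    outfile="${infile%.pdf}.html"
    # -q        : don't print any messages or errors
    # -noframes : generate no frames
    # -f / -l   : first and last page to convert
    pdftohtml -q -noframes -f "$first" -l "$last" "$infile" "$outfile"
}

# Example call (placeholder file name and pages):
# pdf_to_single_html infile.pdf 1 3
```

Drop -q while you are testing so that any error messages stay visible.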