PDFSyntax

A Python library & tool to inspect and transform the internal structure of PDF files

Free and open-source, written from scratch in a functional style, self-contained and hosted on GitHub

API

Inspect and transform the details of a PDF file's inner objects programmatically, using the API from your application or the Python REPL
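
To give a feel for what "inner objects" means, here is a minimal sketch that uses only the Python standard library and deliberately does not call PDFSyntax (its actual API is not shown on this page). It scans a small, hypothetical PDF fragment for indirect objects, the numbered `N G obj ... endobj` blocks that make up the body of a PDF file. The names `SAMPLE`, `OBJ_RE` and `list_indirect_objects` are illustrative assumptions, and the regex scan is naive: it ignores cross-reference tables, object streams and incremental updates, and it could mis-match text inside binary streams.

```python
import re

# Hypothetical, minimal PDF fragment for illustration only
# (not a complete, valid PDF file: no xref table or trailer).
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
)

# An indirect object starts with "<number> <generation> obj"
# and ends with the keyword "endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def list_indirect_objects(data: bytes) -> list[tuple[int, int, bytes]]:
    """Return (object number, generation, raw body) for each naive match."""
    return [
        (int(m.group(1)), int(m.group(2)), m.group(3).strip())
        for m in OBJ_RE.finditer(data)
    ]


if __name__ == "__main__":
    for num, gen, body in list_indirect_objects(SAMPLE):
        print(num, gen, body.decode("latin-1"))
```

A real parser would start from the cross-reference data at the end of the file rather than scanning bytes, which is the kind of detail a dedicated library takes care of.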

CLI

Quickly get insights about a PDF file from the command line, such as its metadata, the fonts it uses, or the text of the document

HTML

Browse the internal structure of a PDF file through a generated HTML file that overlays logical navigation on top of the physical structure

About the author

Martin loves problem solving. He builds PDFSyntax in his spare time.