CLI

Usage

The general form of the CLI usage is:

    python3 -m pdfsyntax COMMAND FILE
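To call the CLI from a script rather than from a shell, the general form above can be wrapped with the standard `subprocess` module. This is a minimal sketch: the helper names `build_command` and `run_pdfsyntax` are illustrative and not part of pdfsyntax, and it assumes the package is installed for the interpreter that runs the script.

```python
import subprocess
import sys


def build_command(command, pdf_path):
    """Return the argument list for `python3 -m pdfsyntax COMMAND FILE`."""
    # sys.executable replaces a hard-coded "python3" so that the module is
    # run by the same interpreter (and with the same installed packages).
    return [sys.executable, "-m", "pdfsyntax", command, str(pdf_path)]


def run_pdfsyntax(command, pdf_path):
    """Run one CLI command and return its standard output as text."""
    result = subprocess.run(
        build_command(command, pdf_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, `print(run_pdfsyntax("overview", "file.pdf"))` would print the same text as running the `overview` command in a terminal.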

You can get quick insights on a PDF file with these commands:

overview

The output shows information about:

disasm

The output shows a terse and greppable view of the file's internal structure. Please refer to the Disassembler article for details.

text

The output shows a full extract of the text content with spatial awareness: the algorithm tries to respect the original layout, as if characters of all sizes were approximately rendered on a fixed-size grid.
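The fixed-grid idea can be illustrated with a small sketch. This is not pdfsyntax's actual algorithm: the `(x, y, text)` fragment format, the cell sizes, and the function name `render_on_grid` are all assumptions made for illustration. Each fragment's page coordinates are divided by a fixed cell size to obtain a row and a column, and the text is written into a character grid at that position.

```python
def render_on_grid(fragments, cell_width=6.0, cell_height=12.0):
    """Lay out (x, y, text) fragments on a fixed-size character grid.

    x and y are page coordinates in points with the origin at the
    bottom-left corner, as in PDF user space, so fragments with larger
    y values end up on earlier output lines.
    """
    rows = {}
    for x, y, text in fragments:
        # Snap the position to a grid cell, whatever the font size.
        row = int(y // cell_height)
        col = int(x // cell_width)
        line = rows.setdefault(row, [])
        # Pad the line with spaces up to the target column.
        if len(line) < col:
            line.extend(" " * (col - len(line)))
        # Write the characters, overwriting anything already in place.
        for offset, char in enumerate(text):
            if col + offset < len(line):
                line[col + offset] = char
            else:
                line.append(char)
    if not rows:
        return ""
    # Emit rows from the top of the page down, keeping empty rows so
    # that vertical spacing is preserved too.
    top, bottom = max(rows), min(rows)
    lines = ["".join(rows.get(r, [])).rstrip() for r in range(top, bottom - 1, -1)]
    return "\n".join(lines)
```

With the default cell sizes, a fragment at x = 60 starts in column 10, so two fragments on the same baseline stay aligned as columns in the plain-text output.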

fonts

The output shows a list of fonts used in the file, with the following tabular data:

browse

This command generates HTML output that looks like the raw PDF file, with additional hyperlinks and information that expose its internal structure and the relations between its objects. Redirect the standard output to a file that you can open in your browser:

    python3 -m pdfsyntax browse file.pdf > inspection_file.html

Please refer to the Browse article for details.
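When several files need inspecting, the redirection can be scripted. The following is a minimal sketch that assumes pdfsyntax is installed for the current interpreter; `inspection_path` and `browse_to_html` are hypothetical helper names, and the `inspection_` prefix simply follows the example file name above.

```python
import subprocess
import sys
from pathlib import Path


def inspection_path(pdf_path):
    """Map `file.pdf` to `inspection_file.html` in the same directory."""
    pdf_path = Path(pdf_path)
    return pdf_path.with_name(f"inspection_{pdf_path.stem}.html")


def browse_to_html(pdf_path):
    """Save the output of the browse command next to the PDF."""
    html_path = inspection_path(pdf_path)
    with open(html_path, "wb") as html_file:
        # Equivalent to the shell redirection `> inspection_file.html`.
        subprocess.run(
            [sys.executable, "-m", "pdfsyntax", "browse", str(pdf_path)],
            stdout=html_file,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
    return html_path
```

The returned path can then be opened with the standard `webbrowser` module, for instance `webbrowser.open(browse_to_html("file.pdf").resolve().as_uri())`.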

TO BE CONTINUED