Actually get a CMap out of a PDF with pdftour (e91c364e) · Commits · xentrac / izodparse

Commit e91c364e authored 3 years ago by Kragen Javier Sitaker

Actually get a CMap out of a PDF with pdftour

I renamed `parsecmaps.py` to `pdftour.py`.  Now it's possible to use
it to navigate the structure of at least one PDF file well enough to
parse a CMap out of it.

This involved adding some stream support, which involved tweaking the
parsing engine a bit.  In keeping with the fast-and-loose, exploratory
nature of the rest of the program, the stream parsing doesn't even
check for `endstream` after the end of the stream data, much less
`endobj`.
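
To illustrate what that kind of unchecked stream reading looks like, here is
a minimal sketch.  It is not the code from `pdftour.py`: `read_stream` and
its parameters are hypothetical names, and it assumes the stream dictionary
has already been parsed and that its `/Length` is a direct integer that can
be trusted.

```python
import re
import zlib

# Hypothetical helper for illustration only; not pdftour's actual API.
STREAM_KEYWORD = re.compile(rb'\s*stream\r?\n')

def read_stream(data: bytes, dict_end: int, length: int) -> bytes:
    """Return the raw stream bytes that follow a stream dictionary.

    dict_end is the offset just past the dictionary's closing `>>`, and
    length is the value of its /Length entry.  Nothing after the data is
    examined, so a missing `endstream` or `endobj` goes unnoticed.
    """
    m = STREAM_KEYWORD.match(data, dict_end)
    if m is None:
        raise ValueError('no stream keyword after dictionary')
    start = m.end()
    return data[start:start + length]

# A tiny hand-built object with a FlateDecode stream.
payload = zlib.compress(b'hello CMap')
obj = (b'4 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n' % len(payload)
       + payload + b'\nendstream\nendobj\n')
raw = read_stream(obj, obj.index(b'>>') + 2, len(payload))
decoded = zlib.decompress(raw)
```

Trusting `/Length` keeps the reader short, at the cost of silently accepting
files whose length entry is wrong.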

With that, and a bit of tweaking, my `cmaps_for_pages` code from last
week runs now!
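
For readers unfamiliar with CMaps, here is a rough sketch of what parsing one
can involve once the stream has been decompressed.  It assumes the CMap is a
ToUnicode CMap and handles only `beginbfchar` sections with even-length hex
strings; `parse_bfchar` is a hypothetical name and has nothing to do with how
`cmaps_for_pages` is written.

```python
import re

def parse_bfchar(cmap_text: str) -> dict:
    """Map character codes to Unicode strings from beginbfchar sections.

    Illustrative only: ignores beginbfrange sections and codespace ranges.
    """
    mapping = {}
    for section in re.findall(r'beginbfchar(.*?)endbfchar', cmap_text, re.S):
        pairs = re.findall(r'<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>', section)
        for src, dst in pairs:
            # Destination strings in a ToUnicode CMap are UTF-16BE.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode('utf-16-be')
    return mapping

sample = """
2 beginbfchar
<0003> <0020>
<0024> <0041>
endbfchar
"""
table = parse_bfchar(sample)
```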

parent d7c5547f


Showing 75 additions and 24 deletions
