- Jul 08, 2021
Kragen Javier Sitaker authored
I renamed `parsecmaps.py` to `pdftour.py`. It can now navigate the structure of at least one PDF file well enough to parse a CMap out of it. This required adding some stream support, which in turn required tweaking the parsing engine a bit. In keeping with the fast-and-loose, exploratory nature of the rest of the program, it doesn't even check for `endstream` after the end of the stream, much less `endobj`. With that, and a few other adjustments, my `cmaps_for_pages` code from last week runs now!
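As a rough illustration of that kind of loose stream handling (not the actual code in `pdftour.py`), the sketch below trusts the `/Length` value and slices that many bytes after the `stream` keyword, never looking for `endstream` or `endobj`. The names `read_stream`, `dict_end`, and `length` are hypothetical, and the sketch assumes the length is a direct integer rather than an indirect reference.

```python
import re
import zlib

# Hypothetical helper, not taken from pdftour.py.
STREAM_KEYWORD = re.compile(rb'\s*stream\r?\n')

def read_stream(data, dict_end, length):
    """Return the raw bytes of a stream whose dictionary ends at dict_end.

    Assumes `length` is the integer value of /Length from that dictionary.
    """
    m = STREAM_KEYWORD.match(data, dict_end)
    if m is None:
        raise ValueError('no stream keyword at offset %d' % dict_end)
    start = m.end()
    # Fast and loose: trust /Length and do not verify that
    # `endstream` or `endobj` follows the data.
    return data[start:start + length]

# Usage: a made-up object holding a Flate-compressed payload.
payload = zlib.compress(b'begincmap endcmap')
obj = (b'4 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n'
       % len(payload)) + payload + b'\nendstream\nendobj\n'
dict_end = obj.index(b'>>') + 2
raw = read_stream(obj, dict_end, len(payload))
text = zlib.decompress(raw)
```

Because nothing after the payload is inspected, a truncated or malformed object would be accepted silently, which is fine for exploration but not for a robust reader.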
-
Kragen Javier Sitaker authored
-
Kragen Javier Sitaker authored
Now I can open a PDF file and parse some objects out of it. Soon I'll be able to traverse the object graph.
-
- Jun 28, 2021
Kragen Javier Sitaker authored
Now we can see what grammar `csranges_to_grammar` has constructed for us from the CMap file and whether it makes sense.
-
- Jun 25, 2021
Kragen Javier Sitaker authored
-
Kragen Javier Sitaker authored
-