Random-access parse for PDF objects
Implement a random-access ("island") parser that walks objects starting from /Root. I'm not sure how much we need to know about the "DOM" for this. Maybe nothing, since everything is built out of basic objects and we can just blindly follow references?
- Note: if I recall correctly, text extraction relies on the page catalog to some extent for finding text objects - Pomp
The most relevant part of the spec is probably 7.7.2 Document catalog dictionary, which describes the "DOM" referred to above.
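
A rough sketch of the "blindly follow references" idea, to show that the walk itself needs no knowledge of the catalog layout. Everything here is an assumption for illustration: `Ref`, `iter_refs`, `walk_from_root`, and the `load_object` callback (standing in for whatever resolves a reference through the xref table) are hypothetical names, and parsed objects are modelled as plain Python dicts, lists, and scalars rather than our real types.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator


@dataclass(frozen=True)
class Ref:
    """Indirect reference such as `12 0 R` (hypothetical stand-in type)."""
    num: int
    gen: int = 0


def iter_refs(obj: object) -> Iterator[Ref]:
    """Yield every indirect reference nested anywhere inside a parsed object."""
    if isinstance(obj, Ref):
        yield obj
    elif isinstance(obj, dict):
        for value in obj.values():
            yield from iter_refs(value)
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from iter_refs(item)
    # Numbers, strings, names, booleans, and null hold no references.


def walk_from_root(trailer: dict, load_object: Callable[[Ref], object]) -> Dict[Ref, object]:
    """Return {Ref: object} for everything reachable from trailer['/Root'].

    `load_object` is an assumed callback that seeks to the object's offset
    and parses it; it is not an existing API.
    """
    seen: Dict[Ref, object] = {}
    stack = list(iter_refs(trailer["/Root"]))
    while stack:
        ref = stack.pop()
        if ref in seen:
            continue  # cycles are normal, e.g. /Parent pointing back up the page tree
        obj = load_object(ref)
        seen[ref] = obj
        stack.extend(iter_refs(obj))
    return seen


# Toy object table: object 5 exists in the file but is not reachable from /Root.
objects = {
    Ref(1): {"/Type": "/Catalog", "/Pages": Ref(2)},
    Ref(2): {"/Type": "/Pages", "/Kids": [Ref(3)], "/Count": 1},
    Ref(3): {"/Type": "/Page", "/Parent": Ref(2), "/Contents": Ref(4)},
    Ref(4): {"/Length": 0},
    Ref(5): {"/Unused": True},
}
reachable = walk_from_root({"/Root": Ref(1)}, objects.__getitem__)
print(sorted(ref.num for ref in reachable))  # [1, 2, 3, 4]
```

The visited set is the only part that really matters: the page tree has /Parent back-references, so a naive recursive walk would never terminate. What this sketch cannot tell us is what each object means, which is where the catalog structure would come in if text extraction needs it.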