From ccb489cdc9f8427d4c9927ca224f90e777de11bc Mon Sep 17 00:00:00 2001
From: "Sven M. Hallberg" <pesco@khjk.org>
Date: Mon, 3 Feb 2020 11:27:25 +0100
Subject: [PATCH] add TODO

---
 TODO | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 TODO

diff --git a/TODO b/TODO
new file mode 100644
index 0000000..eae25e8
--- /dev/null
+++ b/TODO
@@ -0,0 +1,29 @@
+ - move main routine(s) into separate source file.
+ - move filter implementation(s) into separate source file.
+
+ - investigate memory use on big documents (millions of objects).
+
+ - replace disparate parsing routines (applied to different pieces of input)
+   with one big HParser that uses h_seek() to move around. this will enable
+   packrat to cache, for instance, the xref tables instead of us parsing them
+   once to resolve references and again as part of the linear parse.
+
+ - parse stream objects without reference to their /Length entry by simply
+   trying all possible ways and consistency-checking them against the xref
+   table at the end, via h_attr_bool().
+
+ - include position information, at least for objects, in the (JSON) output.
+ - format warnings/errors (stderr) as JSON, too.
+
+ - make custom token types for all appropriate parts of the parse result.
+
+ - parse content streams.
+
+ - implement random-access parser (walking objects from /Root).
+ - check linear and random-access parses for consistency.
+
+ - handle garbage before %PDF- and after %%EOF.
+ - handle garbage at other points in the input?
+
+ - add ASCII filter types.
+ - add LZW filter.
-- 
GitLab