- Feb 28, 2023
-
-
Sven M. Hallberg authored
Instead of throwing an assertion failure. Fixes #46.
-
- Feb 27, 2023
-
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
The validation compares the number of elements in the object and index sequences. The number of index entries is fixed to the number N given in the stream dictionary, cf. p_objstm__m(). Fixes #48.
-
Sven M. Hallberg authored
Fixes #47.
-
- Feb 17, 2023
-
-
Sven M. Hallberg authored
-
- Jan 13, 2023
-
-
Sven M. Hallberg authored
Since we can parse incrementally with packrat now, we can do what is already suggested in the TODO file and abandon this item. Closes #13.
-
Sven M. Hallberg authored
If we are using packrat, we can use objs = MANY_WS(obj) for the object streams as well.
-
Sven M. Hallberg authored
This reverts the part of commit f7dbb2ac that reworked the definition of arrays into explicit grammar recursion in order to make 'obj' compile with LALR. That project never came to fruition and with packrat it causes a recursive function call for every array element, exhausting the stack with large arrays. Fixes #26. This does not yet remove the explicitly recursive rules elemd and elemr because the latter is still used by the object stream parser.
-
Sven M. Hallberg authored
-
- Jan 06, 2023
-
-
Sven M. Hallberg authored
Moves program invocation details from README to pdf.1.mdoc. Includes the generated ASCII output for convenience. Make sure to regenerate with 'make doc' after changing the mdoc source.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
- Jan 05, 2023
-
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
- Dec 21, 2022
-
-
Sven M. Hallberg authored
The code in pdf.c actually does this already, but there is no reason not to be defensive here. Just for completeness' sake: There is nothing theoretically wrong with having even "earlier changes" (earlychange > 1), but we don't want that.
-
Sven M. Hallberg authored
Returning an empty HBytes was an artefact to satisfy the earlier structure of the grammar and is no longer necessary.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
Oh. My. God.
-
- Dec 20, 2022
-
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This replaces the validations on code9 etc. with one continuation that picks the appropriate parser. Also relaxes the parser to allow further output codes after the table is full. Looking at the spec, it seems to me at this time that the requirement for a clear code when the table is full is a requirement on producers of PDF files, but not on the file format itself. As far as I understand, conforming files can be created by a non-conforming process. Note: The implementation uses a slight trick to handle the last code (4095) correctly. Quoting the comment in act_output(): Rather than going through the effort of ensuring that the last code is only updated once, we simply assign one more code as a dummy. So, the table is now 4097 entries in actual size. The last one will receive a bogus update every cycle, so that the last real code does not. This is less work than actually detecting and avoiding the bogus updates.
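The dummy-entry trick quoted above can be illustrated with a minimal sketch. This is not the code from act_output(); the names prefix_of, next_code and assign_next are hypothetical, and the table payload is reduced to a single integer per entry. It only shows how one extra slot can absorb every update once the table is full, under the assumption that the code counter simply saturates at the dummy index.

```c
#include <assert.h>
#include <stdint.h>

#define LAST_CODE  4095              /* largest real 12-bit code */
#define TABLE_SIZE (LAST_CODE + 2)   /* 4097 entries: one extra dummy slot */

/* Hypothetical stand-in for the real table payload. */
static uint16_t prefix_of[TABLE_SIZE];
/* First free code after clear (256) and eod (257). */
static int next_code = 258;

/* Record that next_code extends 'prev'; meant to run once per output code. */
static void assign_next(uint16_t prev)
{
    assert(next_code < TABLE_SIZE);
    prefix_of[next_code] = prev;     /* unconditional write, no "table full" test */
    if (next_code <= LAST_CODE)
        next_code++;                 /* saturates at the dummy index 4096 */
}
```

Once next_code has reached 4096, every further call overwrites only the dummy slot, so entry 4095 keeps the value from its single real assignment.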
-
- Dec 19, 2022
-
-
Sven M. Hallberg authored
Since we don't expose the struct (any more), we might as well pick a simpler name for it.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
Also removes an unneeded memset.
-
Sven M. Hallberg authored
This avoids creating an HBytes for each and every code word. Instead, the code words are collected into blocks behind each clear code and translated together into a single HBytes per block.
-
Sven M. Hallberg authored
This saves us from allocating and freeing the HBytes that were stored in the table. It should also save memory since it essentially shares common prefixes between codes. The only remaining call to malloc() is the one allocating the global context object itself.
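One common way to get this kind of prefix sharing in an LZW table is to store each code as a reference to its prefix code plus one final byte. The sketch below assumes that layout; struct lzw_entry, table_init, table_add and table_expand are hypothetical names and may not match the actual data structure in the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZW_MAX_CODES 4096

/* Hypothetical entry: a code's string is its prefix's string plus one byte. */
struct lzw_entry {
    int16_t  prefix;  /* code of the prefix string, or -1 for a single byte */
    uint8_t  last;    /* final byte of the string */
    uint16_t len;     /* total length of the string in bytes */
};

static struct lzw_entry table[LZW_MAX_CODES];

/* Initialise the 256 single-byte codes. */
static void table_init(void)
{
    for (int i = 0; i < 256; i++) {
        table[i].prefix = -1;
        table[i].last = (uint8_t)i;
        table[i].len = 1;
    }
}

/* Define 'code' as the string of 'prefix' extended by 'byte'; no allocation. */
static void table_add(int code, int prefix, uint8_t byte)
{
    assert(code >= 0 && code < LZW_MAX_CODES);
    assert(prefix >= 0 && prefix < code);
    table[code].prefix = (int16_t)prefix;
    table[code].last = byte;
    table[code].len = (uint16_t)(table[prefix].len + 1);
}

/* Write the string for 'code' into buf, which must hold table[code].len
 * bytes. Walks the prefix chain backwards and returns the length. */
static size_t table_expand(int code, uint8_t *buf)
{
    size_t n = table[code].len;
    for (size_t i = n; i > 0; i--) {
        buf[i - 1] = table[code].last;
        code = table[code].prefix;
    }
    return n;
}
```

Each entry has a fixed size regardless of string length, so adding a code needs no malloc() or free(), and two codes with a common prefix point to the same chain of entries.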
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
Remember that HBytes itself just wraps a pointer and a size, so this does not significantly enlarge the struct, but it saves a whole bunch of allocation.
-
Sven M. Hallberg authored
No need for it to be part of the exposed interface.
-
Sven M. Hallberg authored
Commit 970f23cf already removed the use of h_butnot(), so there is no need anymore for act_output to handle code = 257 (eod).
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This includes the global context variable and all semantic actions and validations. Besides being good practice, this makes the "LZW" in their names unnecessary.
-
Sven M. Hallberg authored
The only difference between the codeword and the litspec rules was that the latter validated that code < 258. This has become redundant because they were only still used for eod and clear, both of which have their own specific validation for the code value. Thus the litspec rules and their validations can go.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This frees up the more generic name.
-
Sven M. Hallberg authored
This makes LZW_literal redundant and removes the need to use h_butnot() to detect eod.
-
Sven M. Hallberg authored
-