- Dec 19, 2022
-
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
Also removes an unneeded memset.
-
Sven M. Hallberg authored
This avoids creating an HBytes for each and every code word. Instead, the code words are collected into blocks behind each clear code and translated together into a single HBytes per block.
-
Sven M. Hallberg authored
This saves us from allocating and freeing the HBytes that were stored in the table. It should also save memory since it essentially shares common prefixes between codes. The only remaining call to malloc() is the one allocating the global context object itself.
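A minimal sketch of what a prefix-sharing table can look like, assuming a 12-bit code space with 256 = clear and 257 = eod. The names `lzw_entry`, `lzw_table`, `lzw_table_add` and `lzw_table_expand` are hypothetical and not taken from the actual code. Each entry stores only a prefix code and a final byte, so no per-entry buffer needs to be allocated:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZW_MAX_CODES  4096  /* assumption: 12-bit code space */
#define LZW_FIRST_CODE 258   /* 256 = clear, 257 = eod */

/* Hypothetical entry: a string is its prefix string plus one final byte. */
struct lzw_entry {
    uint16_t prefix;  /* code of the prefix string (unused for literals) */
    uint16_t len;     /* length of the full string in bytes */
    uint8_t  last;    /* final byte of the string */
};

struct lzw_table {
    struct lzw_entry e[LZW_MAX_CODES];
    int next;         /* next code to be assigned */
};

/* Reset the table to the 256 single-byte literals. */
void lzw_table_clear(struct lzw_table *t)
{
    for (int i = 0; i < 256; i++) {
        t->e[i].prefix = 0;
        t->e[i].len = 1;
        t->e[i].last = (uint8_t)i;
    }
    t->next = LZW_FIRST_CODE;
}

/* Define the next code as string(prefix) + last.
 * Returns the new code, or -1 if the table is full. */
int lzw_table_add(struct lzw_table *t, int prefix, uint8_t last)
{
    if (t->next >= LZW_MAX_CODES)
        return -1;
    assert(prefix >= 0 && prefix < t->next);
    assert(prefix != 256 && prefix != 257);
    t->e[t->next].prefix = (uint16_t)prefix;
    t->e[t->next].len = (uint16_t)(t->e[prefix].len + 1);
    t->e[t->next].last = last;
    return t->next++;
}

/* Write the string for `code` into out[0..cap).
 * Returns its length, or 0 if the code is undefined or out is too small. */
size_t lzw_table_expand(const struct lzw_table *t, int code,
                        uint8_t *out, size_t cap)
{
    if (code < 0 || code >= t->next || code == 256 || code == 257)
        return 0;
    size_t len = t->e[code].len;
    if (len > cap)
        return 0;
    /* Walk the prefix chain backwards, filling the output from the end. */
    for (size_t i = len; i > 0; i--) {
        out[i - 1] = t->e[code].last;
        code = t->e[code].prefix;
    }
    return len;
}
```

The table itself is a fixed-size array, so it can live inside a single context object. Strings that share a prefix share the corresponding chain of entries.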
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
Remember that HBytes itself just wraps a pointer and a size, so this does not significantly enlarge the struct, but it saves a whole bunch of allocation.
-
Sven M. Hallberg authored
No need for it to be part of the exposed interface.
-
Sven M. Hallberg authored
Commit 970f23cf already removed the use of h_butnot(), so there is no need anymore for act_output to handle code = 257 (eod).
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This includes the global context variable and all semantic actions and validations. Besides being good practice, this makes the "LZW" in their names unnecessary.
-
Sven M. Hallberg authored
The only difference between the codeword and the litspec rules was that the latter validated that code < 258. This has become redundant because they were only still used for eod and clear, both of which have their own specific validation for the code value. Thus the litspec rules and their validations can go.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This frees up the more generic name.
-
Sven M. Hallberg authored
This makes LZW_literal redundant and removes the need to use h_butnot() to detect eod.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This makes test/valid/lzw.pdf report decoder failure. Is that file actually valid?
-
Sven M. Hallberg authored
After the previous commit, we no longer need to know the last seen code. The only remaining use was the test whether we have already assigned a code (after clear). We can just as well detect that by inspecting the number of defined codes.
-
Sven M. Hallberg authored
This changes the logic of act_LZW_codeword such that it creates a new table entry after processing each code word, even though it does not know the last character yet. We know that we will discover the last character on the next round, before we need it for any output. In return we can remove all the fumbling around with prev_string. A tiny gripe remains in the fact that HBytes declares its token member const, so technically we are forbidden from filling in the last character after the fact. But also technically, we can sledgehammer-cast the const away, thanks. Also slightly extends coverage of the defensive asserts and exposes a bug (test/valid/lzw.pdf crashes) that I think must have been there before: It seems that we never validate that code words are actually in the defined range!?
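The deferred-last-character idea can be illustrated with a small standalone decoder step. This is a sketch under assumptions, not the code from this commit: `struct decoder`, `decoder_clear` and `decoder_step` are hypothetical names, and a plain mutable struct stands in for HBytes, so no const cast is needed. It also includes the range check on code words that the message notes was missing:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_CODES = 4096, FIRST_CODE = 258 };  /* 256 = clear, 257 = eod */

struct entry { uint16_t prefix, len; uint8_t first, last; };

struct decoder {
    struct entry e[MAX_CODES];
    int next;   /* next code to assign */
    int open;   /* nonzero if e[next-1] still lacks its last byte */
};

void decoder_clear(struct decoder *d)
{
    for (int i = 0; i < 256; i++) {
        d->e[i].prefix = 0;
        d->e[i].len = 1;
        d->e[i].first = d->e[i].last = (uint8_t)i;
    }
    d->next = FIRST_CODE;
    d->open = 0;
}

/* Decode one code word into out[0..cap).
 * Returns the number of bytes written, or -1 on an undefined code
 * or a too-small buffer. */
long decoder_step(struct decoder *d, int code, uint8_t *out, size_t cap)
{
    /* Validate that the code word is in the defined range. */
    if (code < 0 || code >= d->next || code == 256 || code == 257)
        return -1;
    size_t len = d->e[code].len;
    if (len > cap)
        return -1;

    /* Complete the entry left open by the previous step: its last byte
     * is the first byte of the current string. This also covers the case
     * code == next-1, because first and len were known at creation. */
    if (d->open) {
        d->e[d->next - 1].last = d->e[code].first;
        d->open = 0;
    }

    /* Emit the string by walking the prefix chain backwards. */
    int c = code;
    for (size_t i = len; i > 0; i--) {
        out[i - 1] = d->e[c].last;
        c = d->e[c].prefix;
    }

    /* Create the next entry now; its last byte is filled in next round. */
    if (d->next < MAX_CODES) {
        struct entry *n = &d->e[d->next++];
        n->prefix = (uint16_t)code;
        n->len = (uint16_t)(len + 1);
        n->first = d->e[code].first;
        n->last = 0;  /* placeholder until the next code word arrives */
        d->open = 1;
    }
    return (long)len;
}
```

Because the new entry records its prefix, length and first byte immediately, the decoder needs no separate record of the previous string.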
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This saves a tiny bit of code dup in updating ctx->old and building the return value.
-
Sven M. Hallberg authored
The only thing missing from act_LZW_codeword is to skip the table update on the first code after a clear. The rest of the relevant code path is virtually identical.
-
Sven M. Hallberg authored
This logically matches the H_ALLOC in act_LZW_literal. NB: We can drop the multiplication by sizeof(uint8_t) because the latter is guaranteed to be 1. If uint8_t exists, CHAR_BIT must equal 8.
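The reasoning about sizeof(uint8_t) can be checked at compile time. A small sketch, assuming a C11 compiler; `alloc_bytes` is a hypothetical helper and not part of the code base:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* uint8_t has exactly 8 value bits and no padding, and no object can be
 * smaller than one char. So wherever uint8_t exists, a char has 8 bits
 * and uint8_t occupies exactly one of them. */
_Static_assert(CHAR_BIT == 8, "uint8_t implies 8-bit chars");
_Static_assert(sizeof(uint8_t) == 1, "uint8_t is one byte");

/* Hypothetical helper: allocate n bytes without multiplying by
 * sizeof(uint8_t). Returns NULL on allocation failure. */
uint8_t *alloc_bytes(size_t n)
{
    return malloc(n);
}
```

On a platform where chars are wider than 8 bits, <stdint.h> does not define uint8_t at all, so the code would fail to compile before either assertion is reached.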
-
Meredith L. Patterson authored
Fix recent instigator crashes. Closes #25, #31, #35, #36, and #37. See merge request !35.
-
- Dec 18, 2022
-
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This removes the requirement to have at least one code word after a clear code and fixes test/valid/lzw-clear2.pdf.
-
Sven M. Hallberg authored
This "fixes" the assertion failure on test/valid/lzw-clear2.pdf but exposes another issue: Codes 256 (clear) and 257 (eod) are actually allowed to follow a clear code, yet our grammar says otherwise.
-
Sven M. Hallberg authored
Includes a failing test. The decoder currently mishandles the code word following a clear code. It expects a literal (codes 0-255) but accepts the range 0-257. This appears to be the root cause of issues #26 and #37.
-
- Dec 16, 2022
-
-
Sven M. Hallberg authored
Fixes test/valid/lzw-clear.pdf
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
This makes the fall-through handling of TW_Tqq (reusing the TW_Tj path) match the struct layout. Fixes #36.
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
Fix several erroneous uses of dictentry/resolve. In particular, it is fine to call resolve() on NULL (it is a no-op in that case), but its result must be tested. It returns NULL, for instance, if a referenced object does not exist. The correct idiom is therefore val = dictentry(dict, "key"); val = resolve(aux, val); or val = resolve(aux, dictentry(dict, "key")); depending on taste. I have preferred the former (two-line) variant. The idiom should be encapsulated in a function. There are also several occurrences of dictentry() in the code that are not followed by resolve() but probably should be. Fixes #31, #35.
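One way the idiom might be encapsulated, shown as a self-contained sketch. The types `struct value`, `struct dict` and `struct env`, the simplified bodies of `dictentry` and `resolve`, and the wrapper name `dictentry_resolved` are all stand-ins invented for illustration; the real functions have their own types and also handle cases this mock ignores (for example, it follows only one level of reference):

```c
#include <stddef.h>
#include <string.h>

/* Stand-in types, assumptions for this sketch only. */
struct value { int is_ref; int target; int payload; };
struct entry { const char *key; const struct value *val; };
struct dict  { const struct entry *entries; size_t n; };
struct env   { const struct value *objects; size_t n; };

/* Mock lookup: returns NULL if the key is absent. */
const struct value *dictentry(const struct dict *d, const char *key)
{
    if (d == NULL)
        return NULL;
    for (size_t i = 0; i < d->n; i++)
        if (strcmp(d->entries[i].key, key) == 0)
            return d->entries[i].val;
    return NULL;
}

/* Mock resolve: a no-op on NULL and on direct values; returns NULL if
 * the referenced object does not exist. */
const struct value *resolve(const struct env *aux, const struct value *v)
{
    if (v == NULL || !v->is_ref)
        return v;
    if (v->target < 0 || (size_t)v->target >= aux->n)
        return NULL;
    return &aux->objects[v->target];
}

/* Hypothetical wrapper for the idiom: look up, then resolve.
 * The caller must still test the result for NULL. */
const struct value *dictentry_resolved(const struct env *aux,
                                       const struct dict *d,
                                       const char *key)
{
    return resolve(aux, dictentry(d, key));
}
```

With a wrapper like this, a missing key and a dangling reference both surface as a single NULL check at the call site.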
-
Sven M. Hallberg authored
-
Sven M. Hallberg authored
-