From a5abf1e2d9cdc9bbb71f02f8555d2055309541c5 Mon Sep 17 00:00:00 2001
From: xentrac <xentrac@special-circumstanc.es>
Date: Fri, 26 Feb 2021 01:05:57 -0300
Subject: [PATCH] Fix segfault when `decode_stream` fails in xrefs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In instigator-crashes/aux-xrefs-segfault an invalid flate-encoded stream
was producing this behavior:

    inflate: invalid distance too far back (-3)
    parse error in stream (XRef)
    ../instigator-crashes/aux-xrefs-segfault: error parsing xref section at position 249939 (0x3d053)

    Program received signal SIGSEGV, Segmentation fault.
    0x000055555555d91f in lookup_xref (aux=0x7fffffffdf60, nr=4, gen=0) at pdf.c:1249
    1249                    HCountedArray *subs = H_INDEX_SEQ(aux->xrefs[i], 0);

What was happening was that `act_ks_value`, indirectly invoked by
`parse_xrefs`, invoked `decode_stream`, which produced the "inflate:"
message and returned NULL; so `act_ks_value` produced the "parse error
in stream" message and returned an HParseResult of that NULL pointer.
Higher up the stack `act_xrstm` packs this NULL pointer into element 0
of a new `h_sequence`.  `parse_xrefs` was happily storing this
`h_sequence` into `aux->xrefs[0]`, then blithely continuing to the next
loop iteration, at which point it would report "error parsing xref
section" and return to main().

However, this did not abort parsing the file!  main() was continuing on
to attempt to parse the PDF file as a whole, but the first time the
resulting parse tried to `lookup_xref`, that lookup would attempt to
iterate over the xref sections in the file, checking to see if the xref
number belonged to any of them.  The line of code above then segfaulted
while attempting to assert that the NULL was actually a valid
`h_sequence` pointer.

So this patch simply prevents `parse_xrefs` from treating the failed
xrefs section as valid.  The result is that, as before, the parse exits
shortly afterwards because it can't follow any xrefs, but now without
segfaulting:

    inflate: invalid distance too far back (-3)
    parse error in stream (XRef)
    ../instigator-crashes/aux-xrefs-segfault: error parsing xref section at position 255242 (0x3e50a)
    VIOLATION[1]@433 (0x1b1): Missing endobj token (severity=1)
    ../instigator-crashes/aux-xrefs-segfault: no parse
    VIOLATION[1]@433 (0x1b1): Missing endobj token (severity=1)
    ../instigator-crashes/aux-xrefs-segfault: error after position 433 (0x1b1)
    [Inferior 1 (process 626584) exited with code 01]
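
For illustration only, the shape of the check can be sketched outside
Hammer.  This is a minimal sketch, not the code in pdf.c: `Token`,
`Result`, `xref_result_usable` and `collect_xrefs` are hypothetical
stand-ins for `HParsedToken`, `HParseResult` and the loop in
`parse_xrefs`, and the sketch assumes that a failed stream decode shows
up as a NULL element 0 inside an otherwise non-NULL sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, not Hammer's real types. */
typedef struct Token {
    size_t used;              /* number of child elements */
    struct Token **elements;  /* child tokens; an entry may be NULL */
} Token;

typedef struct {
    Token *ast;               /* NULL when the parse yields no value */
} Result;

/* Returns 1 only if the result carries a sequence whose element 0 exists. */
static int xref_result_usable(const Result *res)
{
    if (res == NULL || res->ast == NULL)
        return 0;
    if (res->ast->used == 0 || res->ast->elements == NULL)
        return 0;
    return res->ast->elements[0] != NULL;
}

/* Stores usable sections only and returns how many were kept. */
static size_t collect_xrefs(Result **results, size_t n, const Token **out)
{
    size_t kept = 0;

    assert(results != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!xref_result_usable(results[i])) {
            fprintf(stderr, "error parsing xref section %zu\n", i);
            break;  /* stop at the first bad section */
        }
        out[kept++] = results[i]->ast;
    }
    return kept;
}
```

The point of checking before storing is that later lookups can then
index element 0 of every saved section without re-validating it.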
---
 pdf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/pdf.c b/pdf.c
index c2d370e..6782e47 100644
--- a/pdf.c
+++ b/pdf.c
@@ -2356,12 +2356,11 @@ parse_xrefs(const uint8_t *input, size_t sz, size_t *nxrefs)
 		//res = h_parse(p_xref, input + offset, sz - offset);
 		HParser *p = h_right(h_seek(offset * 8, SEEK_SET), p_xref);	// XXX
 		res = h_parse(p, input, sz);
-		if (res == NULL) {
+		if (res == NULL || res->ast == NULL || H_INDEX_TOKEN(res->ast, 0) == NULL) {
 			fprintf(stderr, "%s: error parsing xref section at "
 			    "position %zu (%#zx)\n", infile, offset, offset);
 			break;
 		}
-		assert(res->ast != NULL);
 
 		/* save this section in xrefs */
 		if (n >= SIZE_MAX / sizeof(HParsedToken *))
-- 
GitLab