Refactor: Move content stream parsing, pagetree and text extraction into content.c

This is part of #8.

We can decide later whether pagetree or text extraction deserve their own files.

Relevant to this are commit 6e5955c4 ("Most of the code folded in") and its descendants, which seriously messed up the order of things in pdf.c. That file used to be split into relatively logical, self-contained sections; this is no longer the case. The changes that introduced content stream handling should be carefully reviewed to restore the original order.