Commit 92544e8c authored by Kragen Javier Sitaker

Make izodparse a Python package

parent 590ea9d8
/*~ *~
/build
/dist
__pycache__
\ No newline at end of file
izodparse: tool for exploring file formats and testbed for Hammer ideas
=======================================================================
This is a prototype PEG parsing engine in Python 3 intended as a testbed for
low-cost experimentation, initially with CMap and PDF files and later
for features for the Hammer parsing engine.
Basically I wanted a more interactive way to
explore PDF files than recompiling batch-mode C programs and looking
at the results in a text editor.
Quick start
-----------
$ python ./setup.py install
...
$ python
Python 3.8.10 (default, Jun 2 2021, 10:49:15)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from izodparse import pdftour
>>> p = pdftour.read_pdf('../Descargas/1.2754649.pdf')
>>> p.catalog
<< /Type /Catalog /Pages 13 0 R >>
>>> _['Pages']
<< /Count 4 /Type /Pages /ITXT b'5.1.2' /Kids [11 0 R 38 0 R 74 0 R 86 0 R] >>
History
-------
......
__version__ = '0.0.1'
#!/usr/bin/python3
r"""Explore PDF file structure, at least enough to parse a CMap file, hopefully.
For example, we can navigate the PDF structure.
>>> from izodparse import pdftour
>>> p = pdftour.read_pdf('../Descargas/1.2754649.pdf')
>>> p.catalog
<< /Type /Catalog /Pages 13 0 R >>
......
@@ -6,7 +6,8 @@ popular) martel? (could be, but common surname) maillet? (could be,
but common surname) otsuchi? (would be fine) totokia? (would be
fine) mere? (too common) patu? (too common)
For now it's 1zodparse. No, izodparse, so it's a valid Python module
name.
How do I git filter-branch? I want pdftour.py and parsecmaps.py. or
maybe git-filter-repo? no, don't have it. --prune-empty? --all?
@@ -66,6 +67,8 @@ we find this suggestion:
This apparently changes GIT_COMMITTER_DATE but I don't care.
* DONE make izodparse an installable Python package
Man, I forgot all about the distutils/setuptools mess.
* TODO examine example PDF file with compressed object streams and no fonts in page resource dictionaries
* TODO fix nested parentheses parsing
* TODO make xrefs, etc., lazy properties
......
setup.py 0 → 100644
from distutils.core import setup # setuptools may be nicer but it's not in the stdlib
with open("README.md", "r", encoding="utf-8") as f:
long_description = f.read()
setup(
name='izodparse',
version='0.0.1',
author='Kragen Javier Sitaker',
license='GPL3+',
description='tool for exploring file formats and testbed for Hammer ideas',
long_description=long_description,
long_description_content_type="text/markdown",
platforms='any',
classifiers=[
"Development Status :: 2 - Pre-Alpha",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
"Operating System :: OS Independent",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Topic :: Software Development :: Libraries :: Python Modules",
],
python_requires=">=3.7",
packages=["izodparse"],
)
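
The setup.py above imports `setup` from `distutils.core`, but `long_description_content_type` and `python_requires` are setuptools keywords that distutils reports as unknown options, and distutils itself was removed from the standard library in Python 3.12 (PEP 632). The following is a minimal sketch of a setuptools-based variant, not part of this commit. The helper names `find_version`, `build_setup_kwargs`, and `main` are hypothetical, and the sketch assumes that `__version__` lives in `izodparse/__init__.py`, which the diff does not confirm.

```python
# Hypothetical setuptools-based variant of setup.py; not part of this commit.
import re


def find_version(init_source):
    """Extract the value of __version__ from the text of a Python module."""
    match = re.search(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", init_source, re.M)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def build_setup_kwargs(init_source, long_description):
    """Collect the keyword arguments that would be passed to setuptools.setup()."""
    return dict(
        name="izodparse",
        version=find_version(init_source),  # version is stated in one place only
        description="tool for exploring file formats and testbed for Hammer ideas",
        long_description=long_description,
        long_description_content_type="text/markdown",  # setuptools keyword
        python_requires=">=3.7",  # setuptools keyword
        packages=["izodparse"],
    )


def main():
    # Imported lazily so the helpers above work without setuptools installed.
    from setuptools import setup

    # Assumption: __version__ is defined in izodparse/__init__.py.
    with open("izodparse/__init__.py", encoding="utf-8") as f:
        init_source = f.read()
    with open("README.md", encoding="utf-8") as f:
        long_description = f.read()
    setup(**build_setup_kwargs(init_source, long_description))

# A real setup.py would end with a call to main().
```

Reading the version out of the package source keeps `version=` in setup.py from drifting away from `__version__`; the trade-off is a small regular expression instead of a literal string.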