  1. Feb 13, 2023
  2. Sep 28, 2022
  3. Sep 22, 2022
    • fail packrat parsers if they need more input · 866fd4d6
      Sven M. Hallberg authored
      This commit changes the contract for the combinator parse functions:
      
       (1) The input state on failure must retain valid overrun and last_chunk
           fields. The latter is never changed, but overrun would be cleared by
           various combinators that backtrack in case of failure. All other
           fields of the input stream are still considered indeterminate after a
           failed parse.
      
       (2) If an overrun condition is encountered before the final chunk
           (last_chunk is false), the parse *must* fail. A helper want_suspend()
           is introduced as a shorthand for this check.
      
      Fixes the packrat/iterative/dummy test.
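
The two rules above can be illustrated with a minimal sketch. The `InputStream` struct, `read_byte`, and `parse_optional_byte` are hypothetical stand-ins written for this example, not the library's own definitions; only the field names `overrun` and `last_chunk` and the helper name `want_suspend` come from the commit message. The sketch assumes that backtracking is done by restoring a saved copy of the stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down input stream (assumption, not the real struct). */
typedef struct {
    const unsigned char *input;
    size_t index;      /* next byte to read */
    size_t length;     /* bytes available in this chunk */
    bool overrun;      /* set when a read ran past the end of the chunk */
    bool last_chunk;   /* true once no further input will arrive */
} InputStream;

/* Rule (2) as a shorthand: overrun before the final chunk. */
static bool want_suspend(const InputStream *s) {
    return s->overrun && !s->last_chunk;
}

/* Read one byte, flagging an overrun instead of reading past the end. */
static int read_byte(InputStream *s) {
    if (s->index >= s->length) {
        s->overrun = true;
        return -1;
    }
    return s->input[s->index++];
}

/* A backtracking "optional byte" combinator (hypothetical). */
static bool parse_optional_byte(InputStream *s, unsigned char expected) {
    InputStream saved = *s;   /* snapshot for backtracking */
    int c = read_byte(s);
    if (c == expected)
        return true;          /* matched and consumed one byte */
    if (want_suspend(s))
        return false;         /* more input may arrive: fail, overrun stays set */
    *s = saved;               /* backtrack; this also restores the old overrun flag */
    return true;              /* succeed without consuming anything */
}
```

The point of the ordering is that the `want_suspend` check happens before the saved state is restored, so a failing parse never loses the overrun flag.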
    • don't restore input state on failure · 0d7f1091
      Sven M. Hallberg authored
      There's no need.
    • improve a label name · f8e0dffb
      Sven M. Hallberg authored
      This is the case where parsing stops, which may be a parse error or not,
      depending on how many elements were read.
    • remove an unreachable case · a30adad4
      Sven M. Hallberg authored
      Replace it with an assert. This case could never occur because it tests
      precisely the loop condition and there are no break statements in the loop.
      
      This was the only use of the 'err' label, so that can go. The code under it
      remains the fall-through case for 'err0', i.e. the actual error (parse failure)
      case.
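
A minimal sketch of the control-flow shape described here, under stated assumptions: `parse_repeat`, its parameters, and the label name `stop` are hypothetical and chosen only for illustration. The loop has no `break`, so the code after it runs only when the loop condition has become false, which is what the assert records.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical repeat parser: counts leading bytes equal to `want`,
 * requiring at least `min` and accepting at most `max` of them.
 * Returns the count, or -1 on parse failure. */
static long parse_repeat(const char *in, size_t len, char want,
                         size_t min, size_t max) {
    size_t count = 0;
    assert(min <= max);          /* assumption of this sketch */

    while (count < max) {
        if (count >= len || in[count] != want)
            goto stop;           /* parsing stops; error or not depends on count */
        count++;
    }
    /* No break in the loop, so we get here only when count < max is false. */
    assert(count == max);
    return (long)count;

stop:
    if (count < min)
        return -1;               /* the actual parse-failure case */
    return (long)count;
}
```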
    • fit comment to 80 columns · 3b22f3e7
      Sven M. Hallberg authored
    • make header guard match file name · 1c83beb7
      Sven M. Hallberg authored
      Also includes a very important cosmetic whitespace change.
  4. Jun 04, 2021
  5. May 08, 2021
    • Parsers use backend-vtable pointers without needing the enum. TODO: · 5c30b9c7
      picomeg authored
      right now there is duplication as the enum backend value is still
      present; need to find out if it's OK to make a breaking change for
      anyone who is, for reasons known only to themselves, checking which
      backend enum value is on a parser they may have created and compiled...
  6. Feb 09, 2020
  7. Jan 12, 2020
  8. Dec 05, 2019
  9. Dec 03, 2019
  10. Dec 02, 2019
  11. Nov 28, 2019
  12. Nov 26, 2019
    • Fix bug #19 in permutations and sequences too · 032430e3
      xentrac authored
      In `h_sequence__ma` the same bug occurs, and it manifests as a crash
      in the same way, so I've added a test for it.  In `h_permutation__ma`
      it evidently exists in the same form, but I haven't figured out how to
      reproduce it; in that case I added a fix to the implementation, but no
      test.
  13. Nov 23, 2019
    • Fix #19, GLR backend reaches unreachable code · 9e662b68
      xentrac authored
      Original behavior:
      
          hammer/build/debug/src/bindings/python$ LD_LIBRARY_PATH=. gdb python
          ...
          (gdb) r
          ...
          Python 2.7.6 (default, Nov 12 2018, 20:00:40)
          [GCC 4.8.4] on linux2
          Type "help", "copyright", "credits" or "license" for more information.
          >>> if 1:
          ...     import hammer as h
          ...     h.choice(h.ch_range('0', '9'), h.ch_range('A', 'Z'), h.ch_range('a', 'z')).compile(h._PB_GLR)
          ...
      
          Program received signal SIGSEGV, Segmentation fault.
          0xb79ecab2 in collect_nts (grammar=0x836abe0, symbol=0xb7d550a4)
              at build/debug/src/cfgrammar.c:120
          120	      for(x = (*s)->items; *x != NULL; x++) {
          (gdb) bt
          #0  0xb79ecab2 in collect_nts (grammar=0x836abe0, symbol=0xb7d550a4)
              at build/debug/src/cfgrammar.c:120
          #1  0xb79ecacd in collect_nts (grammar=0x836abe0, symbol=0x836ab58)
              at build/debug/src/cfgrammar.c:121
          #2  0xb79ec90a in h_cfgrammar_ (mm__=0xb79ff4b4 <system_allocator>,
              desugared=0x836ab58) at build/debug/src/cfgrammar.c:66
          #3  0xb79e8207 in h_lalr_compile (mm__=0xb79ff4b4 <system_allocator>,
              parser=0x836ab40, params=0x0) at build/debug/src/backends/lalr.c:280
          #4  0xb79e634a in h_glr_compile (mm__=0xb79ff4b4 <system_allocator>,
              parser=0x836ab40, params=0x0) at build/debug/src/backends/glr.c:15
          #5  0xb79f0eef in h_compile__m (mm__=0xb79ff4b4 <system_allocator>,
              parser=0x836ab40, backend=PB_GLR, params=0x0)
              at build/debug/src/hammer.c:97
          #6  0xb79f0e9d in h_compile (parser=parser@entry=0x836ab40, backend=PB_GLR,
              params=params@entry=0x0) at build/debug/src/hammer.c:92
          #7  0xb7a54ca4 in HParser__compile (backend=<optimized out>, self=0x836ab40)
              at hammer_wrap.c:3567
      
      New behavior:
      
          hammer/build/debug/src/bindings/python$ LD_LIBRARY_PATH=. gdb python
          ...
          (gdb) r
          ...
          Python 2.7.6 (default, Nov 12 2018, 20:00:40)
          [GCC 4.8.4] on linux2
          Type "help", "copyright", "credits" or "license" for more information.
          >>> if 1:
          ...     import hammer as h
          ...     h.choice(h.ch_range('0', '9'), h.ch_range('A', 'Z'), h.ch_range('a', 'z')).compile(h._PB_GLR)
          ...
          True
          >>> ^D
          [Inferior 1 (process 19621) exited normally]
          (gdb) quit
      
      After thrashing about for a few hours, this was the crucial clue:
      
          >>> import hammer as h
          >>> h.choice(h.ch('0')).compile(h._PB_GLR)
          ==18856== Conditional jump or move depends on uninitialised value(s)
          ==18856==    at 0x4A34FE9: h_desugar (desugar.c:7)
          ==18856==    by 0x4A2D150: h_desugar_augmented (lalr.c:261)
          ==18856==    by 0x4A2D1F7: h_lalr_compile (lalr.c:280)
          ==18856==    by 0x4A2B349: h_glr_compile (glr.c:15)
          ==18856==    by 0x4A35EEE: h_compile__m (hammer.c:97)
          ==18856==    by 0x4A35E9C: h_compile (hammer.c:92)
          ==18856==    by 0x49B0CA3: HParser__compile (hammer_wrap.c:3567)
      
      The particular thing that it's saying is uninitialized seems to be
      
            if(parser->desugared == NULL) {
      
      The `parser` in question is the `HParser` we're trying to desugar,
      which is presumably the choice object created by `h.choice`, which
      seems to be invoking `h_choice__a`:
      
          def choice(*args): return _h_choice__a(list(args))
      
      That was implemented as follows:
      
          HParser* h_choice__a(void *args[]) {
            return h_choice__ma(&system_allocator, args);
          }
      
          HParser* h_choice__ma(HAllocator* mm__, void *args[]) {
            size_t len = -1; // because do...while
            const HParser *arg;
      
            do {
              arg=((HParser **)args)[++len];
            } while(arg);
      
            HSequence *s = h_new(HSequence, 1);
            s->p_array = h_new(HParser *, len);
      
            for (size_t i = 0; i < len; i++) {
              s->p_array[i] = ((HParser **)args)[i];
            }
      
            s->len = len;
            HParser *ret = h_new(HParser, 1);
            ret->vtable = &choice_vt;
            ret->env = (void*)s;
            ret->backend = PB_MIN;
            return ret;
          }
      
      Indeed it does not seem to have been initializing `desugared`.  Fixing
      this cures this symptom.
      
      Other things it's probably worth checking out:
      
      - Are there other places where we create HParser objects where one or
        more fields may be uninitialized?
      - Perhaps `h_new` should zero the memory it returns, since it's only
        used for fixed-size objects and not things like variable-size
        character buffers?
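
Both remedies mentioned above can be sketched in a few lines. The `Parser` struct and the two helper functions are hypothetical stand-ins for illustration, not the library's definitions; only the field names `vtable`, `env`, `backend` and `desugared` are taken from the code quoted in this message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parser object. */
typedef struct Parser {
    const void *vtable;
    void *env;
    int backend;
    void *desugared;   /* cache; NULL is read as "not yet desugared" */
} Parser;

/* Remedy 1: set every field explicitly at construction time. */
static Parser *new_parser_explicit(const void *vtable, void *env) {
    Parser *p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->vtable = vtable;
    p->env = env;
    p->backend = 0;
    p->desugared = NULL;   /* the field that was left uninitialized */
    return p;
}

/* Remedy 2: a zeroing allocation helper for fixed-size objects.
 * Note: all-bits-zero is a null pointer on common platforms, but the C
 * standard does not guarantee it, so remedy 1 is the portable one. */
static void *new_zeroed(size_t count, size_t size) {
    return calloc(count, size);
}
```

With either approach, a check such as `parser->desugared == NULL` no longer reads indeterminate memory.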
  14. Dec 20, 2015
  15. Nov 30, 2015
    • don't allocate a new arena in h_bind, use the existing one · ca1d8df0
      Sven M. Hallberg authored
      Rationale: If memory allocation fails in the inner parse and we
      longjmp up the stack, the temporary arena will be missed and leak.
      
      NB: This change means that any allocations done by the continuation
      (in the form of new parsers, probably) will persist for the
      lifetime of the parse result. Beware of wasting too much memory
      this way! The bind continuation should generally keep dynamic
      allocations to a minimum.
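
The leak described in the rationale can be reproduced in miniature. Everything in this sketch is hypothetical: arenas are modelled as a simple counter, and the function names are invented for the example. It only shows why cleanup code placed after a call is skipped when that call leaves via `longjmp`.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf on_alloc_failure;   /* jump target for a failed allocation */
static int live_arenas = 0;        /* arenas created but not yet freed */

static void arena_new(void)  { live_arenas++; }
static void arena_free(void) { live_arenas--; }

/* Simulates the inner parse; `fail` stands for an allocation failure. */
static void inner_parse(int fail) {
    if (fail)
        longjmp(on_alloc_failure, 1);
}

/* Old shape: a temporary arena around the inner parse. */
static void bind_with_temp_arena(int fail) {
    arena_new();
    inner_parse(fail);   /* if this jumps away, the next line never runs */
    arena_free();
}

/* New shape: no arena of its own, so nothing to miss on the way out. */
static void bind_with_existing_arena(int fail) {
    inner_parse(fail);
}

/* Returns the number of arenas still alive after the call unwinds. */
static int leaked_after_temp(int fail) {
    live_arenas = 0;
    if (setjmp(on_alloc_failure) == 0)
        bind_with_temp_arena(fail);
    return live_arenas;
}

static int leaked_after_existing(int fail) {
    live_arenas = 0;
    arena_new();                          /* the caller's long-lived arena */
    if (setjmp(on_alloc_failure) == 0)
        bind_with_existing_arena(fail);
    arena_free();                         /* the owner still frees it */
    return live_arenas;
}
```

The trade-off noted in the message follows from the second shape: whatever the continuation allocates now lives in the caller's arena and is released only when that arena is.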
  16. Nov 01, 2015
  17. Oct 29, 2015
  18. Oct 03, 2015
  19. Sep 16, 2015
  20. Aug 31, 2015
  21. Aug 09, 2015
  22. Mar 04, 2015
  23. Feb 23, 2015
  24. Jan 23, 2015
  25. Jan 04, 2015
    • Fix #118 · af73181c
      TQ Hirsch authored
      NEWS:
      * Switching endianness mid-byte no longer potentially re-reads bytes.
      * bit_offset now consistently refers to the number of bits already
        read.
      * HParsedTokens now have a bit_length field; this is a size_t.  This
        may be removed for memory reasons.
      
      The bit writer has not yet been updated to match; the result of
      switching bit writer endianness in the middle of a byte remains
      undefined.
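
The `bit_offset` convention in the second NEWS item can be illustrated with a small sketch. The `BitReader` struct, the `BitOrder` enum and `read_bit` are hypothetical and cover only the "number of bits already read" convention for a fixed bit order; they do not model what happens when the order is switched in the middle of a byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { BIT_MSB_FIRST, BIT_LSB_FIRST } BitOrder;   /* hypothetical */

typedef struct {
    const uint8_t *data;
    size_t index;          /* current byte */
    size_t length;         /* total bytes */
    unsigned bit_offset;   /* bits of data[index] already read, 0..7 */
    BitOrder order;
} BitReader;

/* Reads one bit; returns -1 at end of input. In both bit orders,
 * bit_offset counts consumed bits; only the shift differs. */
static int read_bit(BitReader *r) {
    if (r->index >= r->length)
        return -1;
    unsigned shift = (r->order == BIT_MSB_FIRST) ? 7u - r->bit_offset
                                                 : r->bit_offset;
    int bit = (r->data[r->index] >> shift) & 1;
    if (++r->bit_offset == 8) {   /* byte exhausted: move to the next one */
        r->bit_offset = 0;
        r->index++;
    }
    return bit;
}
```

Because the offset has the same meaning in both orders, code that inspects it does not need to know which order the reader is in.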
  26. Jun 18, 2014
  27. May 12, 2014
  28. May 11, 2014
  29. May 07, 2014
  30. Apr 20, 2014