Paul McGuire, author of pyparsing, was nice enough to write up a TAP parser using it for comparison to my yeanpypa attempt. A few days earlier I had hacked up my own pyparsing version too, as pyparsing has a few features that seemed like they would make for a more forgiving parser. Here’s the pyparsing version I wrote:
import re
from pyparsing import *
non_zero_digits = '123456789'
digits = '0' + non_zero_digits
non_zero_number = Combine(Word(non_zero_digits, digits))
plan = Combine(Suppress(Literal('1..')) + non_zero_number)
test_num = non_zero_number('num')
description = Combine(NotAny(digits) + OneOrMore(CharsNotIn('#\n') | Literal('\\#')))('description')
directive = Suppress(Literal('#')) + Optional(CaselessLiteral('skip')('skip') | CaselessLiteral('todo')('todo'))
result = Literal('not ok')('notok') | Literal('ok')('ok')
preamble = ZeroOrMore(~result + SkipTo(OneOrMore(LineEnd()), include=True))
test_line = result + Optional(test_num + Optional(description) + Optional(directive))
test = Group(Combine(preamble)('preamble') + test_line)
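# Parse action: strip the optional leading dash from the description.
# (The function name strip_leading_dash is an assumed placeholder.)
def strip_leading_dash(s, loc, tokens):
    tokens['description'] = re.sub(r'^-\s*', '', tokens['description'])
    return tokens
description.setParseAction(strip_leading_dash)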
tap = (plan('plan') & Group(OneOrMore(test))('tests')) | ('1..0' + directive)
Some interesting features:
- I used the pydoc and the examples instead of the actual manual, so I didn’t see setDefaultWhitespaceChars that Paul used. I had to use […] directive regexp so the dot wouldn’t match the […].
- I used a parse action to strip the optional leading dash from description, rather than try to match it. Looking back now I don’t see why I couldn’t include the Optional('-') on the front, but it seemed to make sense when I wrote it.
- Combine and the naming feature mean I get immediately useful results with this parser, unlike the yeanpypa parser, which ends up being more of a tokenizer.
>>> from tap_parser import tap
>>> test1 = """\
... 1..4
... ok 1 - Input file opened
... not ok 2 - First line of the input valid
... ok 3 - Read the rest of the file
... not ok 4 - Summarized correctly # TODO Not written yet
... """
>>> res = tap.parseString(test1)
>>> res.plan
'4'
>>> res.tests.ok
'ok'
>>> res.tests.ok
''
>>> res.tests.notok
'not ok'
>>> res.tests.description
'First line of the input valid'
>>> [res.tests[f] for f in ('num', 'todo', 'description')]
['4', 'todo', 'Summarized correctly ']
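For comparison, here is a minimal sketch of the alternative mentioned in the list above: matching the optional leading dash in the grammar with Suppress(Optional('-')) instead of stripping it in a parse action. The names leading_dash and numbered_line are made up for illustration, and the description is simplified to a single Regex rather than the Combine expression used in the parser.

```python
from pyparsing import Optional, Regex, Suppress, Word, nums

# Simplified stand-ins for the grammar pieces; not the parser from the post.
test_num = Word(nums)('num')
leading_dash = Suppress(Optional('-'))          # matched if present, never reported
description = Regex(r'[^#\n]+')('description')  # everything up to '#' or end of line
numbered_line = test_num + leading_dash + description

with_dash = numbered_line.parseString('1 - Input file opened')
without_dash = numbered_line.parseString('2 Input file opened')

print(with_dash['description'])
print(without_dash['description'])
```

Both calls should report the same description text, since the dash is consumed by the grammar and dropped by Suppress before the description is matched.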
Paul still uses a function in his example to massage the results into output, but my results map almost exactly into the Django model objects I planned to make.
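To illustrate how named results can feed a model object, here is a rough sketch that turns one parsed line into keyword arguments with asDict(). ResultRecord is a hypothetical dataclass standing in for a Django model, and the small grammar is a simplified stand-in for the parser above, not a copy of it.

```python
from dataclasses import dataclass

from pyparsing import Combine, Literal, Optional, Regex, Word, nums


@dataclass
class ResultRecord:
    """Hypothetical stand-in for a Django model with matching field names."""
    status: str
    num: str = ''
    description: str = ''


# Each piece carries a results name that matches a ResultRecord field.
status = (Literal('not ok') | Literal('ok'))('status')
num = Combine(Word('123456789', nums))('num')
description = Regex(r'[^#\n]+')('description')
line = status + Optional(num) + Optional(description)

parsed = line.parseString('not ok 2 First line of the input valid')

# asDict() returns only the named results, so it can be splatted directly.
record = ResultRecord(**parsed.asDict())
print(record)
```

The point of the design is that the results names double as field names, so no separate massaging function is needed between the parser and the object.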
On the other hand, mine fails Paul’s test3 example, which is certainly proper TAP. It seems not to be handling post-planned results due to my attempted use of Each (the & operator). Hmm.
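As a sanity check on the & operator itself, here is a small sketch with a cut-down grammar (not the parser above) showing that an expression built with & accepts the plan either before or after the tests. The names stream, plan_first and plan_last are made up for this example.

```python
from pyparsing import Combine, Group, Literal, OneOrMore, Suppress, Word, nums

# Cut-down grammar: a plan line and bare "ok N" results only.
plan = Combine(Suppress(Literal('1..')) + Word('123456789', nums))('plan')
test = Group(Literal('ok') + Word(nums))

# '&' requires both operands to match, in whichever order they appear.
stream = plan & Group(OneOrMore(test))('tests')

plan_first = stream.parseString('1..2 ok 1 ok 2')
plan_last = stream.parseString('ok 1 ok 2 1..2')

print(plan_first['plan'], plan_first['tests'].asList())
print(plan_last['plan'], plan_last['tests'].asList())
```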
Update: Oh, no, it was the missing test numbers that made it barf. Somehow I’d read into the TAP spec that a test number was required if there was a description… or at least accidentally written that into the parser. Changing test_line’s definition thusly made the tests work (though I’m only testing one stream for correctness so far):
test_line = result + Optional(test_num) + Optional(description) + Optional(directive)
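Here is a quick sketch of what that change buys, using simplified stand-ins for test_num and description (a plain Word and a Regex, which are my assumptions for the example, not the definitions from the parser above). With the number and the description each optional on their own, a line with a description but no number still parses.

```python
from pyparsing import Literal, Optional, Regex, Suppress, Word, nums

result = Literal('not ok')('notok') | Literal('ok')('ok')
test_num = Word('123456789', nums)('num')
# Simplified description: optional dash, then text up to '#' or end of line.
description = Suppress(Optional('-')) + Regex(r'[^#\n]+')('description')

# Number and description are independently optional.
test_line = result + Optional(test_num) + Optional(description)

numbered = test_line.parseString('ok 1 - Input file opened')
unnumbered = test_line.parseString('not ok - First line of the input valid')
bare = test_line.parseString('ok')

print(numbered.asList())
print(unnumbered.asList())
print(bare.asList())
```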