# -*- coding: utf-8 -*-
# module pyparsing.py
#
# Copyright (c) 2003-2019 Paul T. McGuire
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#

__doc__ = \
"""
pyparsing module - Classes and methods to define and execute parsing grammars
=============================================================================

The pyparsing module is an alternative approach to creating and
executing simple grammars, vs. the traditional lex/yacc approach, or the
use of regular expressions. With pyparsing, you don't need to learn
a new syntax for defining grammars or matching expressions - the parsing
module provides a library of classes that you use to construct the
grammar directly in Python.

Here is a program to parse "Hello, World!"
(or any greeting of the form
``"<salutation>, <addressee>!"``), built up using :class:`Word`,
:class:`Literal`, and :class:`And` elements
(the :class:`'+'<ParserElement.__add__>` operators create :class:`And` expressions,
and the strings are auto-converted to :class:`Literal` expressions)::

    from pip._vendor.pyparsing import Word, alphas

    # define grammar of a greeting
    greet = Word(alphas) + "," + Word(alphas) + "!"

    hello = "Hello, World!"
    print (hello, "->", greet.parseString(hello))

The program outputs the following::

    Hello, World! -> ['Hello', ',', 'World', '!']

The Python representation of the grammar is quite readable, owing to the
self-explanatory class names, and the use of '+', '|' and '^' operators.

The :class:`ParseResults` object returned from
:class:`ParserElement.parseString` can be
accessed as a nested list, a dictionary, or an object with named
attributes.

The pyparsing module handles some of the problems that are typically
vexing when writing text parsers:

  - extra or missing whitespace (the above program will also handle
    "Hello,World!", "Hello , World !", etc.)
  - quoted strings
  - embedded comments


Getting Started -
-----------------
Visit the classes :class:`ParserElement` and :class:`ParseResults` to
see the base classes that most other pyparsing
classes inherit from.
Use the docstrings for examples of how to:

 - construct literal match expressions from :class:`Literal` and
   :class:`CaselessLiteral` classes
 - construct character word-group expressions using the :class:`Word`
   class
 - see how to create repetitive expressions using :class:`ZeroOrMore`
   and :class:`OneOrMore` classes
 - use :class:`'+'<And>`, :class:`'|'<MatchFirst>`, :class:`'^'<Or>`,
   and :class:`'&'<Each>` operators to combine simple expressions into
   more complex ones
 - associate names with your parsed results using
   :class:`ParserElement.setResultsName`
 - access the parsed data, which is returned as a :class:`ParseResults`
   object
 - find some helpful expression short-cuts like :class:`delimitedList`
   and :class:`oneOf`
 - find more useful common expressions in the :class:`pyparsing_common`
   namespace class
"""

__version__ = "2.4.7"
__versionTime__ = "30 Mar 2020 00:43 UTC"
__author__ = "Paul McGuire <[email protected]>"

import string
from weakref import ref as wkref
import copy
import sys
import warnings
import re
import sre_constants
import collections
import pprint
import traceback
import types
from datetime import datetime
from operator import itemgetter
import itertools
from functools import wraps
from contextlib import contextmanager

try:
    # Python 3
    from itertools import filterfalse
except ImportError:
    from itertools import ifilterfalse as filterfalse

try:
    from _thread import RLock
except ImportError:
    from threading import RLock

try:
    # Python 3
    from collections.abc import Iterable
    from collections.abc import MutableMapping, Mapping
except ImportError:
    # Python 2.7
    from collections import Iterable
    from collections import MutableMapping, Mapping

try:
    from collections import OrderedDict as _OrderedDict
except ImportError:
    try:
        from ordereddict import OrderedDict as _OrderedDict
    except ImportError:
        _OrderedDict = None

try:
    from types import SimpleNamespace
except ImportError:
    class SimpleNamespace: pass

# version compatibility configuration
__compat__ = SimpleNamespace()
__compat__.__doc__ = """
    A cross-version compatibility configuration for pyparsing features that will be
    released in a future version. By setting values in this configuration to True,
    those features can be enabled in prior versions for compatibility development
    and testing.

     - collect_all_And_tokens - flag to enable fix for Issue #63 that fixes erroneous grouping
       of results names when an And expression is nested within an Or or MatchFirst; set to
       True to enable bugfix released in pyparsing 2.3.0, or False to preserve
       pre-2.3.0 handling of named results
"""
__compat__.collect_all_And_tokens = True

__diag__ = SimpleNamespace()
__diag__.__doc__ = """
Diagnostic configuration (all default to False)
     - warn_multiple_tokens_in_named_alternation - flag to enable warnings when a results
       name is defined on a MatchFirst or Or expression with one or more And subexpressions
       (only warns if __compat__.collect_all_And_tokens is False)
     - warn_ungrouped_named_tokens_in_collection - flag to enable warnings when a results
       name is defined on a containing expression with ungrouped subexpressions that also
       have results names
     - warn_name_set_on_empty_Forward - flag to enable warnings when a Forward is defined
       with a results name, but has no contents defined
     - warn_on_multiple_string_args_to_oneof - flag to enable warnings when oneOf is
       incorrectly called with multiple str arguments
     - enable_debug_on_named_expressions - flag to auto-enable debug on all subsequent
       calls to ParserElement.setName()
"""
__diag__.warn_multiple_tokens_in_named_alternation = False
__diag__.warn_ungrouped_named_tokens_in_collection = False
__diag__.warn_name_set_on_empty_Forward = False
__diag__.warn_on_multiple_string_args_to_oneof = False
__diag__.enable_debug_on_named_expressions = False
__diag__._all_names = [nm for nm in vars(__diag__) if nm.startswith("enable_") or nm.startswith("warn_")]

def _enable_all_warnings():
    __diag__.warn_multiple_tokens_in_named_alternation = True
    __diag__.warn_ungrouped_named_tokens_in_collection = True
    __diag__.warn_name_set_on_empty_Forward = True
    __diag__.warn_on_multiple_string_args_to_oneof = True
__diag__.enable_all_warnings = _enable_all_warnings


__all__ = ['__version__', '__versionTime__', '__author__', '__compat__', '__diag__',
           'And', 'CaselessKeyword', 'CaselessLiteral', 'CharsNotIn', 'Combine', 'Dict', 'Each', 'Empty',
           'FollowedBy', 'Forward', 'GoToColumn', 'Group', 'Keyword', 'LineEnd', 'LineStart', 'Literal',
           'PrecededBy', 'MatchFirst', 'NoMatch', 'NotAny', 'OneOrMore', 'OnlyOnce', 'Optional', 'Or',
           'ParseBaseException', 'ParseElementEnhance', 'ParseException', 'ParseExpression', 'ParseFatalException',
           'ParseResults', 'ParseSyntaxException', 'ParserElement', 'QuotedString', 'RecursiveGrammarException',
           'Regex', 'SkipTo', 'StringEnd', 'StringStart', 'Suppress', 'Token', 'TokenConverter',
           'White', 'Word', 'WordEnd', 'WordStart', 'ZeroOrMore', 'Char',
           'alphanums', 'alphas', 'alphas8bit', 'anyCloseTag', 'anyOpenTag', 'cStyleComment', 'col',
           'commaSeparatedList', 'commonHTMLEntity', 'countedArray', 'cppStyleComment', 'dblQuotedString',
           'dblSlashComment', 'delimitedList', 'dictOf', 'downcaseTokens', 'empty', 'hexnums',
           'htmlComment', 'javaStyleComment', 'line', 'lineEnd', 'lineStart', 'lineno',
           'makeHTMLTags', 'makeXMLTags', 'matchOnlyAtCol', 'matchPreviousExpr', 'matchPreviousLiteral',
           'nestedExpr', 'nullDebugAction', 'nums', 'oneOf', 'opAssoc', 'operatorPrecedence', 'printables',
           'punc8bit', 'pythonStyleComment', 'quotedString', 'removeQuotes', 'replaceHTMLEntity',
           'replaceWith', 'restOfLine', 'sglQuotedString', 'srange', 'stringEnd',
           'stringStart', 'traceParseAction', 'unicodeString', 'upcaseTokens', 'withAttribute',
           'indentedBlock',
           'originalTextFor', 'ungroup', 'infixNotation', 'locatedExpr', 'withClass',
           'CloseMatch', 'tokenMap', 'pyparsing_common', 'pyparsing_unicode', 'unicode_set',
           'conditionAsParseAction', 're',
           ]

system_version = tuple(sys.version_info)[:3]
PY_3 = system_version[0] == 3
if PY_3:
    _MAX_INT = sys.maxsize
    basestring = str
    unichr = chr
    unicode = str
    _ustr = str

    # build list of single arg builtins, that can be used as parse actions
    singleArgBuiltins = [sum, len, sorted, reversed, list, tuple, set, any, all, min, max]

else:
    _MAX_INT = sys.maxint
    range = xrange

    def _ustr(obj):
        """Drop-in replacement for str(obj) that tries to be Unicode
        friendly. It first tries str(obj). If that fails with
        a UnicodeEncodeError, then it tries unicode(obj). It then
        < returns the unicode object | encodes it with the default
        encoding | ... >.
        """
        if isinstance(obj, unicode):
            return obj

        try:
            # If this works, then _ustr(obj) has the same behaviour as str(obj), so
            # it won't break any existing code.
            return str(obj)

        except UnicodeEncodeError:
            # Else encode it
            ret = unicode(obj).encode(sys.getdefaultencoding(), 'xmlcharrefreplace')
            xmlcharref = Regex(r'&#\d+;')
            xmlcharref.setParseAction(lambda t: '\\u' + hex(int(t[0][2:-1]))[2:])
            return xmlcharref.transformString(ret)

    # build list of single arg builtins, tolerant of Python version, that can be used as parse actions
    singleArgBuiltins = []
    import __builtin__

    for fname in "sum len sorted reversed list tuple set any all min max".split():
        try:
            singleArgBuiltins.append(getattr(__builtin__, fname))
        except AttributeError:
            continue

_generatorType = type((y for y in range(1)))

def _xml_escape(data):
    """Escape &, <, >, ", ', etc. in a string of data."""

    # ampersand must be replaced first
    from_symbols = '&><"\''
    to_symbols = ('&' + s + ';' for s in "amp gt lt quot apos".split())
    for from_, to_ in zip(from_symbols, to_symbols):
        data = data.replace(from_, to_)
    return data

alphas = string.ascii_uppercase + string.ascii_lowercase
nums = "0123456789"
hexnums = nums + "ABCDEFabcdef"
alphanums = alphas + nums
_bslash = chr(92)
printables = "".join(c for c in string.printable if c not in string.whitespace)


def conditionAsParseAction(fn, message=None, fatal=False):
    msg = message if message is not None else "failed user-defined condition"
    exc_type = ParseFatalException if fatal else ParseException
    fn = _trim_arity(fn)

    @wraps(fn)
    def pa(s, l, t):
        if not bool(fn(s, l, t)):
            raise exc_type(s, l, msg)

    return pa

class ParseBaseException(Exception):
    """base exception class for all parsing runtime exceptions"""
    # Performance tuning: we construct a *lot* of these, so keep this
    # constructor as small and fast as possible
    def __init__(self, pstr, loc=0, msg=None, elem=None):
        self.loc = loc
        if msg is None:
            self.msg = pstr
            self.pstr = ""
        else:
            self.msg = msg
            self.pstr = pstr
        self.parserElement = elem
        self.args = (pstr, loc, msg)

    @classmethod
    def _from_exception(cls, pe):
        """
        internal factory method to simplify creating one type of ParseException
        from another - avoids having __init__ signature conflicts among subclasses
        """
        return cls(pe.pstr, pe.loc, pe.msg, pe.parserElement)

    def __getattr__(self, aname):
        """supported attributes by name are:
           - lineno - returns the line number of the exception text
           - col - returns the column number of the exception text
           - line - returns the line containing the exception text
        """
        if aname == "lineno":
            return lineno(self.loc, self.pstr)
        elif aname in ("col", "column"):
            return col(self.loc, self.pstr)
        elif aname == "line":
            return line(self.loc, self.pstr)
        else:
            raise AttributeError(aname)

    def __str__(self):
        if self.pstr:
            if self.loc >= len(self.pstr):
                foundstr = ', found end of text'
            else:
                foundstr = (', found %r' % self.pstr[self.loc:self.loc + 1]).replace(r'\\', '\\')
        else:
            foundstr = ''
        return ("%s%s (at char %d), (line:%d, col:%d)" %
                (self.msg, foundstr, self.loc, self.lineno, self.column))
    def __repr__(self):
        return _ustr(self)
    def markInputline(self, markerString=">!<"):
        """Extracts the exception line from the input string, and marks
           the location of the exception with a special symbol.
        """
        line_str = self.line
        line_column = self.column - 1
        if markerString:
            line_str = "".join((line_str[:line_column],
                                markerString, line_str[line_column:]))
        return line_str.strip()
    def __dir__(self):
        return "lineno col line".split() + dir(type(self))

class ParseException(ParseBaseException):
    """
    Exception thrown when parse expressions don't match class;
    supported attributes by name are:
    - lineno - returns the line number of the exception text
    - col - returns the column number of the exception text
    - line - returns the line containing the exception text

    Example::

        try:
            Word(nums).setName("integer").parseString("ABC")
        except ParseException as pe:
            print(pe)
            print("column: {}".format(pe.col))

    prints::

        Expected integer (at char 0), (line:1, col:1)
        column: 1

    """

    @staticmethod
    def explain(exc, depth=16):
        """
        Method to take an exception and translate the Python internal traceback into a list
        of the pyparsing expressions that caused the exception to be raised.

        Parameters:

         - exc - exception raised during parsing (need not be a ParseException, in support
           of Python exceptions that might be raised in a parse action)
         - depth (default=16) - number of levels back in the stack trace to list expression
           and function names; if None, the full stack trace names will be listed; if 0, only
           the failing input
           line, marker, and exception string will be shown

        Returns a multi-line string listing the ParserElements and/or function names in the
        exception's stack trace.

        Note: the diagnostic output will include string representations of the expressions
        that failed to parse. These representations will be more helpful if you use `setName` to
        give identifiable names to your expressions. Otherwise they will use the default string
        forms, which may be cryptic to read.

        explain() is only supported under Python 3.
        """
        import inspect

        if depth is None:
            depth = sys.getrecursionlimit()
        ret = []
        if isinstance(exc, ParseBaseException):
            ret.append(exc.line)
            ret.append(' ' * (exc.col - 1) + '^')
        ret.append("{0}: {1}".format(type(exc).__name__, exc))

        if depth > 0:
            callers = inspect.getinnerframes(exc.__traceback__, context=depth)
            seen = set()
            for i, ff in enumerate(callers[-depth:]):
                frm = ff[0]

                f_self = frm.f_locals.get('self', None)
                if isinstance(f_self, ParserElement):
                    if frm.f_code.co_name not in ('parseImpl', '_parseNoCache'):
                        continue
                    if f_self in seen:
                        continue
                    seen.add(f_self)

                    self_type = type(f_self)
                    ret.append("{0}.{1} - {2}".format(self_type.__module__,
                                                       self_type.__name__,
                                                       f_self))
                elif f_self is not None:
                    self_type = type(f_self)
                    ret.append("{0}.{1}".format(self_type.__module__,
                                                self_type.__name__))
                else:
                    code = frm.f_code
                    if code.co_name in ('wrapper', '<module>'):
                        continue

                    ret.append("{0}".format(code.co_name))

                depth -= 1
                if not depth:
                    break

        return '\n'.join(ret)


class ParseFatalException(ParseBaseException):
    """user-throwable exception thrown when inconsistent parse content
       is found; stops all parsing immediately"""
    pass

class ParseSyntaxException(ParseFatalException):
    """just like :class:`ParseFatalException`, but thrown internally
    when an :class:`ErrorStop<And._ErrorStop>` ('-' operator) indicates
    that parsing is to stop immediately
    because an unbacktrackable
    syntax error has been found.
    """
    pass

#~ class ReparseException(ParseBaseException):
    #~ """Experimental class - parse actions can raise this exception to cause
    #~ pyparsing to reparse the input string:
    #~ - with a modified input string, and/or
    #~ - with a modified start location
    #~ Set the values of the ReparseException in the constructor, and raise the
    #~ exception in a parse action to cause pyparsing to use the new string/location.
    #~ Setting the values as None causes no change to be made.
    #~ """
    #~ def __init_( self, newstring, restartLoc ):
        #~ self.newParseText = newstring
        #~ self.reparseLoc = restartLoc

class RecursiveGrammarException(Exception):
    """exception thrown by :class:`ParserElement.validate` if the
    grammar could be improperly recursive
    """
    def __init__(self, parseElementList):
        self.parseElementTrace = parseElementList

    def __str__(self):
        return "RecursiveGrammarException: %s" % self.parseElementTrace

class _ParseResultsWithOffset(object):
    def __init__(self, p1, p2):
        self.tup = (p1, p2)
    def __getitem__(self, i):
        return self.tup[i]
    def __repr__(self):
        return repr(self.tup[0])
    def setOffset(self, i):
        self.tup = (self.tup[0], i)

class ParseResults(object):
    """Structured parse results, to provide multiple means of access to
    the parsed data:

       - as a list (``len(results)``)
       - by list index (``results[0], results[1]``, etc.)
       - by attribute (``results.<resultsName>`` - see :class:`ParserElement.setResultsName`)

    Example::

        integer = Word(nums)
        date_str = (integer.setResultsName("year") + '/'
                    + integer.setResultsName("month") + '/'
                    + integer.setResultsName("day"))
        # equivalent form:
        # date_str = integer("year") + '/' + integer("month") + '/' + integer("day")

        # parseString returns a ParseResults object
        result = date_str.parseString("1999/12/31")

        def test(s, fn=repr):
            print("%s -> %s" % (s, fn(eval(s))))
        test("list(result)")
        test("result[0]")
        test("result['month']")
        test("result.day")
        test("'month' in result")
        test("'minutes' in result")
        test("result.dump()", str)

    prints::

        list(result) -> ['1999', '/', '12', '/', '31']
        result[0] -> '1999'
        result['month'] -> '12'
        result.day -> '31'
        'month' in result -> True
        'minutes' in result -> False
        result.dump() -> ['1999', '/', '12', '/', '31']
        - day: 31
        - month: 12
        - year: 1999
    """
    def __new__(cls, toklist=None, name=None, asList=True, modal=True):
        if isinstance(toklist, cls):
            return toklist
        retobj = object.__new__(cls)
        retobj.__doinit = True
        return retobj

    # Performance tuning: we construct a *lot* of these, so keep this
    # constructor as small and fast as possible
    def __init__(self, toklist=None, name=None, asList=True, modal=True, isinstance=isinstance):
        if self.__doinit:
            self.__doinit = False
            self.__name = None
            self.__parent = None
            self.__accumNames = {}
            self.__asList = asList
            self.__modal = modal
            if toklist is None:
                toklist = []
            if isinstance(toklist, list):
                self.__toklist = toklist[:]
            elif isinstance(toklist, _generatorType):
                self.__toklist = list(toklist)
            else:
                self.__toklist = [toklist]
            self.__tokdict = dict()

        if name is not None and name:
            if not modal:
                self.__accumNames[name] = 0
            if isinstance(name, int):
                name = _ustr(name) # will always return a str, but use _ustr for consistency
            self.__name = name
            if not (isinstance(toklist, (type(None), basestring, list)) and toklist in (None, '', [])):
                if isinstance(toklist, basestring):
                    toklist = [toklist]
                if asList:
                    if isinstance(toklist, ParseResults):
                        self[name] = _ParseResultsWithOffset(ParseResults(toklist.__toklist), 0)
                    else:
                        self[name] = _ParseResultsWithOffset(ParseResults(toklist[0]), 0)
                    self[name].__name = name
                else:
                    try:
                        self[name] = toklist[0]
                    except (KeyError, TypeError, IndexError):
                        self[name] = toklist

    def __getitem__(self, i):
        if isinstance(i, (int, slice)):
            return self.__toklist[i]
        else:
            if i not in self.__accumNames:
                return self.__tokdict[i][-1][0]
            else:
                return ParseResults([v[0] for v in self.__tokdict[i]])

    def __setitem__(self, k, v, isinstance=isinstance):
        if isinstance(v, _ParseResultsWithOffset):
            self.__tokdict[k] = self.__tokdict.get(k, list()) + [v]
            sub = v[0]
        elif isinstance(k, (int, slice)):
            self.__toklist[k] = v
            sub = v
        else:
            self.__tokdict[k] = self.__tokdict.get(k, list()) + [_ParseResultsWithOffset(v, 0)]
            sub = v
        if isinstance(sub, ParseResults):
            sub.__parent = wkref(self)

    def __delitem__(self, i):
        if isinstance(i, (int, slice)):
            mylen = len(self.__toklist)
            del self.__toklist[i]

            # convert int to slice
            if isinstance(i, int):
                if i < 0:
                    i += mylen
                i = slice(i, i + 1)
            # get removed indices
            removed = list(range(*i.indices(mylen)))
            removed.reverse()
            # fixup indices in token dictionary
            for name, occurrences in self.__tokdict.items():
                for j in removed:
                    for k, (value, position) in enumerate(occurrences):
                        occurrences[k] = _ParseResultsWithOffset(value, position - (position > j))
        else:
            del self.__tokdict[i]

    def __contains__(self, k):
        return k in self.__tokdict

    def __len__(self):
        return len(self.__toklist)

    def __bool__(self):
        return (not not self.__toklist)
    __nonzero__ = __bool__

    def __iter__(self):
        return iter(self.__toklist)

    def __reversed__(self):
        return iter(self.__toklist[::-1])

    def _iterkeys(self):
        if hasattr(self.__tokdict, "iterkeys"):
            return self.__tokdict.iterkeys()
        else:
            return iter(self.__tokdict)

    def _itervalues(self):
        return (self[k] for k in self._iterkeys())

    def _iteritems(self):
        return ((k, self[k]) for k in self._iterkeys())

    if PY_3:
        keys = _iterkeys
        """Returns an iterator of all named result keys."""

        values = _itervalues
        """Returns an iterator of all named result values."""

        items = _iteritems
        """Returns an iterator of all named result key-value tuples."""

    else:
        iterkeys = _iterkeys
        """Returns an iterator of all named result keys (Python 2.x only)."""

        itervalues = _itervalues
        """Returns an iterator of all named result values (Python 2.x only)."""

        iteritems = _iteritems
        """Returns an iterator of all named result key-value tuples (Python 2.x only)."""

        def keys(self):
            """Returns all named result keys (as a list in Python 2.x, as an iterator in Python 3.x)."""
            return list(self.iterkeys())

        def values(self):
            """Returns all named result values (as a list in Python 2.x, as an iterator in Python 3.x)."""
            return list(self.itervalues())

        def items(self):
            """Returns all named result key-values (as a list of tuples in Python 2.x, as an iterator in Python 3.x)."""
            return list(self.iteritems())

    def haskeys(self):
        """Since keys() returns an iterator, this method is helpful in bypassing
           code that looks for the existence of any defined results names."""
        return bool(self.__tokdict)

    def pop(self, *args, **kwargs):
        """
        Removes and returns item at specified index (default= ``last``).
        Supports both ``list`` and ``dict`` semantics for ``pop()``. If
        passed no argument or an integer argument, it will use ``list``
        semantics and pop tokens from the list of parsed tokens. If passed
        a non-integer argument (most likely a string), it will use ``dict``
        semantics and pop the corresponding value from any defined results
        names.
        A second default return value argument is supported, just as in
        ``dict.pop()``.

        Example::

            def remove_first(tokens):
                tokens.pop(0)
            print(OneOrMore(Word(nums)).parseString("0 123 321")) # -> ['0', '123', '321']
            print(OneOrMore(Word(nums)).addParseAction(remove_first).parseString("0 123 321")) # -> ['123', '321']

            label = Word(alphas)
            patt = label("LABEL") + OneOrMore(Word(nums))
            print(patt.parseString("AAB 123 321").dump())

            # Use pop() in a parse action to remove named result (note that corresponding value is not
            # removed from list form of results)
            def remove_LABEL(tokens):
                tokens.pop("LABEL")
                return tokens
            patt.addParseAction(remove_LABEL)
            print(patt.parseString("AAB 123 321").dump())

        prints::

            ['AAB', '123', '321']
            - LABEL: AAB

            ['AAB', '123', '321']
        """
        if not args:
            args = [-1]
        for k, v in kwargs.items():
            if k == 'default':
                args = (args[0], v)
            else:
                raise TypeError("pop() got an unexpected keyword argument '%s'" % k)
        if (isinstance(args[0], int)
                or len(args) == 1
                or args[0] in self):
            index = args[0]
            ret = self[index]
            del self[index]
            return ret
        else:
            defaultvalue = args[1]
            return defaultvalue

    def get(self, key, defaultValue=None):
        """
        Returns named result matching the given key, or if there is no
        such name, then returns the given ``defaultValue`` or ``None`` if no
        ``defaultValue`` is specified.

        Similar to ``dict.get()``.

        Example::

            integer = Word(nums)
            date_str = integer("year") + '/' + integer("month") + '/' + integer("day")

            result = date_str.parseString("1999/12/31")
            print(result.get("year")) # -> '1999'
            print(result.get("hour", "not specified")) # -> 'not specified'
            print(result.get("hour")) # -> None
        """
        if key in self:
            return self[key]
        else:
            return defaultValue

    def insert(self, index, insStr):
        """
        Inserts new element at location index in the list of parsed tokens.

        Similar to
        ``list.insert()``.

        Example::

            print(OneOrMore(Word(nums)).parseString("0 123 321")) # -> ['0', '123', '321']

            # use a parse action to insert the parse location in the front of the parsed results
            def insert_locn(locn, tokens):
                tokens.insert(0, locn)
            print(OneOrMore(Word(nums)).addParseAction(insert_locn).parseString("0 123 321")) # -> [0, '0', '123', '321']
        """
        self.__toklist.insert(index, insStr)
        # fixup indices in token dictionary
        for name, occurrences in self.__tokdict.items():
            for k, (value, position) in enumerate(occurrences):
                occurrences[k] = _ParseResultsWithOffset(value, position + (position > index))

    def append(self, item):
        """
        Add single element to end of ParseResults list of elements.

        Example::

            print(OneOrMore(Word(nums)).parseString("0 123 321")) # -> ['0', '123', '321']

            # use a parse action to compute the sum of the parsed integers, and add it to the end
            def append_sum(tokens):
                tokens.append(sum(map(int, tokens)))
            print(OneOrMore(Word(nums)).addParseAction(append_sum).parseString("0 123 321")) # -> ['0', '123', '321', 444]
        """
        self.__toklist.append(item)

    def extend(self, itemseq):
        """
        Add sequence of elements to end of ParseResults list of elements.

        Example::

            patt = OneOrMore(Word(alphas))

            # use a parse action to append the reverse of the matched strings, to make a palindrome
            def make_palindrome(tokens):
                tokens.extend(reversed([t[::-1] for t in tokens]))
                return ''.join(tokens)
            print(patt.addParseAction(make_palindrome).parseString("lskdj sdlkjf lksd")) # -> 'lskdjsdlkjflksddsklfjkldsjdksl'
        """
        if isinstance(itemseq, ParseResults):
            self.__iadd__(itemseq)
        else:
            self.__toklist.extend(itemseq)

    def clear(self):
        """
        Clear all elements and results names.
        """
        del self.__toklist[:]
        self.__tokdict.clear()

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            return ""

    def __add__(self, other):
        ret = self.copy()
        ret += other
        return ret

    def __iadd__(self, other):
        if other.__tokdict:
            offset = len(self.__toklist)
            addoffset = lambda a: offset if a < 0 else a + offset
            otheritems = other.__tokdict.items()
            otherdictitems = [(k, _ParseResultsWithOffset(v[0], addoffset(v[1])))
                              for k, vlist in otheritems for v in vlist]
            for k, v in otherdictitems:
                self[k] = v
                if isinstance(v[0], ParseResults):
                    v[0].__parent = wkref(self)

        self.__toklist += other.__toklist
        self.__accumNames.update(other.__accumNames)
        return self

    def __radd__(self, other):
        if isinstance(other, int) and other == 0:
            # useful for merging many ParseResults using sum() builtin
            return self.copy()
        else:
            # this may raise a TypeError - so be it
            return other + self

    def __repr__(self):
        return "(%s, %s)" % (repr(self.__toklist), repr(self.__tokdict))

    def __str__(self):
        return '[' + ', '.join(_ustr(i) if isinstance(i, ParseResults) else repr(i) for i in self.__toklist) + ']'

    def _asStringList(self, sep=''):
        out = []
        for item in self.__toklist:
            if out and sep:
                out.append(sep)
            if isinstance(item, ParseResults):
                out += item._asStringList()
            else:
                out.append(_ustr(item))
        return out

    def asList(self):
        """
        Returns the parse results as a nested list of matching tokens, all converted to strings.

        Example::

            patt = OneOrMore(Word(alphas))
            result = patt.parseString("sldkj lsdkj sldkj")
            # even though the result prints in string-like form, it is actually a pyparsing ParseResults
            print(type(result), result) # -> <class 'pyparsing.ParseResults'> ['sldkj', 'lsdkj', 'sldkj']

            # Use asList() to create an actual list
            result_list = result.asList()
            print(type(result_list), result_list) # -> <class 'list'> ['sldkj', 'lsdkj', 'sldkj']
        """
        return [res.asList() if isinstance(res, ParseResults) else res for res in self.__toklist]

    def asDict(self):
        """
        Returns the named parse results as a nested
        dictionary.

        Example::

            integer = Word(nums)
            date_str = integer("year") + '/' + integer("month") + '/' + integer("day")

            result = date_str.parseString('12/31/1999')
            print(type(result), repr(result)) # -> <class 'pyparsing.ParseResults'> (['12', '/', '31', '/', '1999'], {'day': [('1999', 4)], 'year': [('12', 0)], 'month': [('31', 2)]})

            result_dict = result.asDict()
            print(type(result_dict), repr(result_dict)) # -> <class 'dict'> {'day': '1999', 'year': '12', 'month': '31'}

            # even though a ParseResults supports dict-like access, sometime you just need to have a dict
            import json
            print(json.dumps(result)) # -> Exception: TypeError: ... is not JSON serializable
            print(json.dumps(result.asDict())) # -> {"month": "31", "day": "1999", "year": "12"}
        """
        if PY_3:
            item_fn = self.items
        else:
            item_fn = self.iteritems

        def toItem(obj):
            if isinstance(obj, ParseResults):
                if obj.haskeys():
                    return obj.asDict()
                else:
                    return [toItem(v) for v in obj]
            else:
                return obj

        return dict((k, toItem(v)) for k, v in item_fn())

    def copy(self):
        """
        Returns a new copy of a :class:`ParseResults` object.
        """
        ret = ParseResults(self.__toklist)
        ret.__tokdict = dict(self.__tokdict.items())
        ret.__parent = self.__parent
        ret.__accumNames.update(self.__accumNames)
        ret.__name = self.__name
        return ret

    def asXML(self, doctag=None, namedItemsOnly=False, indent="", formatted=True):
        """
        (Deprecated) Returns the parse results as XML.
        Tags are created for tokens and lists that have defined results names.
        """
        nl = "\n"
        out = []
        namedItems = dict((v[1], k) for (k, vlist) in self.__tokdict.items()
                          for v in vlist)
        nextLevelIndent = indent + " "

        # collapse out indents if formatting is not desired
        if not formatted:
            indent = ""
            nextLevelIndent = ""
            nl = ""

        selfTag = None
        if doctag is not None:
            selfTag = doctag
        else:
            if self.__name:
                selfTag = self.__name

        if not selfTag:
            if namedItemsOnly:
                return ""
            else:
                selfTag = "ITEM"

        out += [nl, indent, "<", selfTag, ">"]

        for i, res in enumerate(self.__toklist):
            if isinstance(res, ParseResults):
                if i in namedItems:
                    out += [res.asXML(namedItems[i],
                                      namedItemsOnly and doctag is None,
                                      nextLevelIndent,
                                      formatted)]
                else:
                    out += [res.asXML(None,
                                      namedItemsOnly and doctag is None,
                                      nextLevelIndent,
                                      formatted)]
            else:
                # individual token, see if there is a name for it
                resTag = None
                if i in namedItems:
                    resTag = namedItems[i]
                if not resTag:
                    if namedItemsOnly:
                        continue
                    else:
                        resTag = "ITEM"
                xmlBodyText = _xml_escape(_ustr(res))
                out += [nl, nextLevelIndent, "<", resTag, ">",
                        xmlBodyText,
                        "</", resTag, ">"]

        out += [nl, indent, "</", selfTag, ">"]
        return "".join(out)

    def __lookup(self, sub):
        for k, vlist in self.__tokdict.items():
            for v, loc in vlist:
                if sub is v:
                    return k
        return None

    def getName(self):
        r"""
        Returns the results name for this token expression.
        Useful when several
        different expressions might match at a particular location.

        Example::

            integer = Word(nums)
            ssn_expr = Regex(r"\d\d\d-\d\d-\d\d\d\d")
            house_number_expr = Suppress('#') + Word(nums, alphanums)
            user_data = (Group(house_number_expr)("house_number")
                        | Group(ssn_expr)("ssn")
                        | Group(integer)("age"))
            user_info = OneOrMore(user_data)

            result = user_info.parseString("22 111-22-3333 #221B")
            for item in result:
                print(item.getName(), ':', item[0])

        prints::

            age : 22
            ssn : 111-22-3333
            house_number : 221B
        """
        if self.__name:
            return self.__name
        elif self.__parent:
            par = self.__parent()
            if par:
                return par.__lookup(self)
            else:
                return None
        elif (len(self) == 1
              and len(self.__tokdict) == 1
              and next(iter(self.__tokdict.values()))[0][1] in (0, -1)):
            return next(iter(self.__tokdict.keys()))
        else:
            return None

    def dump(self, indent='', full=True, include_list=True, _depth=0):
        """
        Diagnostic method for listing out the contents of
        a :class:`ParseResults`.
        Accepts an optional ``indent`` argument so
        that this string can be embedded in a nested display of other data.

        Example::

            integer = Word(nums)
            date_str = integer("year") + '/' + integer("month") + '/' + integer("day")

            result = date_str.parseString('12/31/1999')
            print(result.dump())

        prints::

            ['12', '/', '31', '/', '1999']
            - day: 1999
            - month: 31
            - year: 12
        """
        out = []
        NL = '\n'
        if include_list:
            out.append(indent + _ustr(self.asList()))
        else:
            out.append('')

        if full:
            if self.haskeys():
                items = sorted((str(k), v) for k, v in self.items())
                for k, v in items:
                    if out:
                        out.append(NL)
                    out.append("%s%s- %s: " % (indent, (' ' * _depth), k))
                    if isinstance(v, ParseResults):
                        if v:
                            out.append(v.dump(indent=indent, full=full, include_list=include_list, _depth=_depth + 1))
                        else:
                            out.append(_ustr(v))
                    else:
                        out.append(repr(v))
            elif any(isinstance(vv, ParseResults) for vv in self):
                v = self
                for i, vv in enumerate(v):
                    if isinstance(vv, ParseResults):
                        out.append("\n%s%s[%d]:\n%s%s%s" % (indent,
                                                            (' ' * (_depth)),
                                                            i,
                                                            indent,
                                                            (' ' * (_depth + 1)),
                                                            vv.dump(indent=indent,
                                                                    full=full,
                                                                    include_list=include_list,
                                                                    _depth=_depth + 1)))
                    else:
                        out.append("\n%s%s[%d]:\n%s%s%s" % (indent,
                                                            (' ' * (_depth)),
                                                            i,
                                                            indent,
                                                            (' ' * (_depth + 1)),
                                                            _ustr(vv)))

        return "".join(out)

    def pprint(self, *args, **kwargs):
        """
        Pretty-printer for parsed results as a list, using the
        `pprint <https://docs.python.org/3/library/pprint.html>`_ module.
        Accepts additional positional or keyword args as defined for
        `pprint.pprint <https://docs.python.org/3/library/pprint.html#pprint.pprint>`_ .

        Example::

            ident = Word(alphas, alphanums)
            num = Word(nums)
            func = Forward()
            term = ident | num | Group('(' + func + ')')
            func <<= ident + Group(Optional(delimitedList(term)))
            result = func.parseString("fna a,b,(fnb c,d,200),100")
            result.pprint(width=40)

        prints::

            ['fna',
             ['a',
              'b',
              ['(', 'fnb', ['c', 'd', '200'], ')'],
              '100']]
        """
        pprint.pprint(self.asList(), *args, **kwargs)

    # add support for pickle protocol
    def __getstate__(self):
        return (self.__toklist,
                (self.__tokdict.copy(),
                 self.__parent is not None and self.__parent() or None,
                 self.__accumNames,
                 self.__name))

    def __setstate__(self, state):
        self.__toklist = state[0]
        self.__tokdict, par, inAccumNames, self.__name = state[1]
        self.__accumNames = {}
        self.__accumNames.update(inAccumNames)
        if par is not None:
            self.__parent = wkref(par)
        else:
            self.__parent = None

    def __getnewargs__(self):
        return self.__toklist, self.__name, self.__asList, self.__modal

    def __dir__(self):
        return dir(type(self)) + list(self.keys())

    @classmethod
    def from_dict(cls, other, name=None):
        """
        Helper classmethod to construct a ParseResults from a dict, preserving the
        name-value relations as results names. If an optional 'name' argument is
        given, a nested ParseResults will be returned
        """
        def is_iterable(obj):
            try:
                iter(obj)
            except Exception:
                return False
            else:
                if PY_3:
                    return not isinstance(obj, (str, bytes))
                else:
                    return not isinstance(obj, basestring)

        ret = cls([])
        for k, v in other.items():
            if isinstance(v, Mapping):
                ret += cls.from_dict(v, name=k)
            else:
                ret += cls([v], name=k, asList=is_iterable(v))
        if name is not None:
            ret = cls([ret], name=name)
        return ret

MutableMapping.register(ParseResults)

def col (loc, strg):
    """Returns current column within a string, counting newlines as line separators.
    The first column is number 1.

    Note: the default parsing behavior is to expand tabs in the input string
    before starting the parsing process.
    See
    :class:`ParserElement.parseString` for more
    information on parsing strings containing ``<TAB>`` s, and suggested
    methods to maintain a consistent view of the parsed string, the parse
    location, and line and column positions within the parsed string.
    """
    s = strg
    return 1 if 0 < loc < len(s) and s[loc-1] == '\n' else loc - s.rfind("\n", 0, loc)

def lineno(loc, strg):
    """Returns current line number within a string, counting newlines as line separators.
    The first line is number 1.

    Note - the default parsing behavior is to expand tabs in the input string
    before starting the parsing process. See :class:`ParserElement.parseString`
    for more information on parsing strings containing ``<TAB>`` s, and
    suggested methods to maintain a consistent view of the parsed string, the
    parse location, and line and column positions within the parsed string.
    """
    return strg.count("\n", 0, loc) + 1

def line(loc, strg):
    """Returns the line of text containing loc within a string, counting newlines as line separators.
    """
    lastCR = strg.rfind("\n", 0, loc)
    nextCR = strg.find("\n", loc)
    if nextCR >= 0:
        return strg[lastCR + 1:nextCR]
    else:
        return strg[lastCR + 1:]

def _defaultStartDebugAction(instring, loc, expr):
    print(("Match " + _ustr(expr) + " at loc " + _ustr(loc) + "(%d,%d)" % (lineno(loc, instring), col(loc, instring))))

def _defaultSuccessDebugAction(instring, startloc, endloc, expr, toks):
    print("Matched " + _ustr(expr) + " -> " + str(toks.asList()))

def _defaultExceptionDebugAction(instring, loc, expr, exc):
    print("Exception raised:" + _ustr(exc))

def nullDebugAction(*args):
    """'Do-nothing' debug action, to suppress debugging output during parsing."""
    pass

# Only works on Python 3.x - nonlocal is toxic to Python 2 installs
#~ 'decorator to trim function calls to match the arity of the target'
#~ def _trim_arity(func, maxargs=3):
    #~ if func in singleArgBuiltins:
        #~ return lambda s,l,t: func(t)
    #~ limit = 0
    #~ foundArity = False
    #~ def wrapper(*args):
        #~ nonlocal limit,foundArity
        #~ while 1:
            #~ try:
                #~ ret = func(*args[limit:])
                #~ foundArity = True
                #~ return ret
            #~ except TypeError:
                #~ if limit == maxargs or foundArity:
                    #~ raise
                #~ limit += 1
                #~ continue
    #~ return wrapper

# this version is Python 2.x-3.x cross-compatible
'decorator to trim function calls to match the arity of the target'
def _trim_arity(func, maxargs=2):
    if func in singleArgBuiltins:
        return lambda s, l, t: func(t)
    limit = [0]
    foundArity = [False]

    # traceback return data structure changed in Py3.5 - normalize back to plain tuples
    if system_version[:2] >= (3, 5):
        def extract_stack(limit=0):
            # special handling for Python 3.5.0 - extra deep call stack by 1
            offset = -3 if system_version == (3, 5, 0) else -2
            frame_summary = traceback.extract_stack(limit=-offset + limit - 1)[offset]
            return [frame_summary[:2]]
        def extract_tb(tb, limit=0):
            frames = traceback.extract_tb(tb, limit=limit)
            frame_summary = frames[-1]
            return [frame_summary[:2]]
    else:
        extract_stack = traceback.extract_stack
        extract_tb = traceback.extract_tb

    # synthesize what would be returned by traceback.extract_stack at the call to
    # user's parse action 'func', so that we don't incur call penalty at parse time

    LINE_DIFF = 6
    # IF ANY CODE CHANGES, EVEN JUST COMMENTS OR BLANK LINES, BETWEEN THE NEXT LINE AND
    # THE CALL TO FUNC INSIDE WRAPPER, LINE_DIFF MUST BE MODIFIED!!!!
    this_line = extract_stack(limit=2)[-1]
    pa_call_line_synth = (this_line[0], this_line[1] + LINE_DIFF)

    def wrapper(*args):
        while 1:
            try:
                ret = func(*args[limit[0]:])
                foundArity[0] = True
                return ret
            except TypeError:
                # re-raise TypeErrors if they did not come from our arity testing
                if foundArity[0]:
                    raise
                else:
                    try:
                        tb = sys.exc_info()[-1]
                        if not extract_tb(tb, limit=2)[-1][:2] == pa_call_line_synth:
                            raise
                    finally:
                        try:
                            del tb
                        except NameError:
                            pass

                if limit[0] <= maxargs:
                    limit[0] += 1
                    continue
                raise

    # copy func name to wrapper for sensible debug output
    func_name = "<parse action>"
    try:
        func_name = getattr(func, '__name__',
                            getattr(func, '__class__').__name__)
    except Exception:
        func_name = str(func)
    wrapper.__name__ = func_name

    return wrapper


class ParserElement(object):
    """Abstract base level parser element class."""
    DEFAULT_WHITE_CHARS = " \n\t\r"
    verbose_stacktrace = False

    @staticmethod
    def setDefaultWhitespaceChars(chars):
        r"""
        Overrides the default whitespace chars

        Example::

            # default whitespace chars are space, <TAB> and newline
            OneOrMore(Word(alphas)).parseString("abc def\nghi jkl")  # -> ['abc', 'def', 'ghi', 'jkl']

            # change to just treat newline as significant
            ParserElement.setDefaultWhitespaceChars(" \t")
            OneOrMore(Word(alphas)).parseString("abc def\nghi jkl")  # -> ['abc', 'def']
        """
        ParserElement.DEFAULT_WHITE_CHARS = chars

    @staticmethod
    def inlineLiteralsUsing(cls):
        """
        Set class to be used for inclusion of string literals into a parser.

        Example::

            # default literal class used is Literal
            integer = Word(nums)
            date_str = integer("year") + '/' + integer("month") + '/' + integer("day")

            date_str.parseString("1999/12/31")  # -> ['1999', '/', '12', '/', '31']


            # change to Suppress
            ParserElement.inlineLiteralsUsing(Suppress)
            date_str = integer("year") + '/' + integer("month") + '/' + integer("day")

            date_str.parseString("1999/12/31")  # -> ['1999', '12', '31']
        """
        ParserElement._literalStringClass = cls

    @classmethod
    def _trim_traceback(cls, tb):
        while tb.tb_next:
            tb = tb.tb_next
        return tb

    def __init__(self, savelist=False):
        self.parseAction = list()
        self.failAction = None
        # ~ self.name = "<unknown>"  # don't define self.name, let subclasses try/except upcall
        self.strRepr = None
        self.resultsName = None
        self.saveAsList = savelist
        self.skipWhitespace = True
        self.whiteChars = set(ParserElement.DEFAULT_WHITE_CHARS)
        self.copyDefaultWhiteChars = True
        self.mayReturnEmpty = False # used when checking for left-recursion
        self.keepTabs = False
        self.ignoreExprs = list()
        self.debug = False
        self.streamlined = False
        self.mayIndexError = True # used to optimize exception handling for subclasses that don't advance parse index
        self.errmsg = ""
        self.modalResults = True # used to mark results names as modal (report only last) or cumulative (list all)
        self.debugActions = (None, None, None) # custom debug actions
        self.re = None
        self.callPreparse = True # used to avoid redundant calls to preParse
        self.callDuringTry = False

    def copy(self):
        """
        Make a copy of this :class:`ParserElement`.
        Useful for defining
        different parse actions for the same parsing pattern, using copies of
        the original parse element.

        Example::

            integer = Word(nums).setParseAction(lambda toks: int(toks[0]))
            integerK = integer.copy().addParseAction(lambda toks: toks[0] * 1024) + Suppress("K")
            integerM = integer.copy().addParseAction(lambda toks: toks[0] * 1024 * 1024) + Suppress("M")

            print(OneOrMore(integerK | integerM | integer).parseString("5K 100 640K 256M"))

        prints::

            [5120, 100, 655360, 268435456]

        Equivalent form of ``expr.copy()`` is just ``expr()``::

            integerM = integer().addParseAction(lambda toks: toks[0] * 1024 * 1024) + Suppress("M")
        """
        cpy = copy.copy(self)
        cpy.parseAction = self.parseAction[:]
        cpy.ignoreExprs = self.ignoreExprs[:]
        if self.copyDefaultWhiteChars:
            cpy.whiteChars = ParserElement.DEFAULT_WHITE_CHARS
        return cpy

    def setName(self, name):
        """
        Define name for this expression, makes debugging and exception messages clearer.

        Example::

            Word(nums).parseString("ABC")  # -> Exception: Expected W:(0123...) (at char 0), (line:1, col:1)
            Word(nums).setName("integer").parseString("ABC")  # -> Exception: Expected integer (at char 0), (line:1, col:1)
        """
        self.name = name
        self.errmsg = "Expected " + self.name
        if __diag__.enable_debug_on_named_expressions:
            self.setDebug()
        return self

    def setResultsName(self, name, listAllMatches=False):
        """
        Define name for referencing matching tokens as a nested attribute
        of the returned parse results.
        NOTE: this returns a *copy* of the original :class:`ParserElement` object;
        this is so that the client can define a basic element, such as an
        integer, and reference it in multiple places with different names.

        You can also set results names using the abbreviated syntax,
        ``expr("name")`` in place of ``expr.setResultsName("name")``
        - see :class:`__call__`.

        Example::

            date_str = (integer.setResultsName("year") + '/'
                        + integer.setResultsName("month") + '/'
                        + integer.setResultsName("day"))

            # equivalent form:
            date_str = integer("year") + '/' + integer("month") + '/' + integer("day")
        """
        return self._setResultsName(name, listAllMatches)

    def _setResultsName(self, name, listAllMatches=False):
        newself = self.copy()
        if name.endswith("*"):
            name = name[:-1]
            listAllMatches = True
        newself.resultsName = name
        newself.modalResults = not listAllMatches
        return newself

    def setBreak(self, breakFlag=True):
        """Method to invoke the Python pdb debugger when this element is
        about to be parsed.
        Set ``breakFlag`` to True to enable, False to
        disable.
        """
        if breakFlag:
            _parseMethod = self._parse
            def breaker(instring, loc, doActions=True, callPreParse=True):
                import pdb
                # this call to pdb.set_trace() is intentional, not a checkin error
                pdb.set_trace()
                return _parseMethod(instring, loc, doActions, callPreParse)
            breaker._originalParseMethod = _parseMethod
            self._parse = breaker
        else:
            if hasattr(self._parse, "_originalParseMethod"):
                self._parse = self._parse._originalParseMethod
        return self

    def setParseAction(self, *fns, **kwargs):
        """
        Define one or more actions to perform when successfully matching parse element definition.
        Parse action fn is a callable method with 0-3 arguments, called as ``fn(s, loc, toks)`` ,
        ``fn(loc, toks)`` , ``fn(toks)`` , or just ``fn()`` , where:

        - s = the original string being parsed (see note below)
        - loc = the location of the matching substring
        - toks = a list of the matched tokens, packaged as a :class:`ParseResults` object

        If the functions in fns modify the tokens, they can return them as the return
        value from fn, and the modified list of tokens will replace the original.
        Otherwise, fn does not need to return any value.

        If None is passed as the parse action, all previously added parse actions for this
        expression are cleared.

        Optional keyword arguments:
        - callDuringTry = (default= ``False``) indicate if parse action should be run during lookaheads and alternate testing

        Note: the default parsing behavior is to expand tabs in the input string
        before starting the parsing process.
        See :class:`parseString` for more
        information on parsing strings containing ``<TAB>`` s, and suggested
        methods to maintain a consistent view of the parsed string, the parse
        location, and line and column positions within the parsed string.

        Example::

            integer = Word(nums)
            date_str = integer + '/' + integer + '/' + integer

            date_str.parseString("1999/12/31")  # -> ['1999', '/', '12', '/', '31']

            # use parse action to convert to ints at parse time
            integer = Word(nums).setParseAction(lambda toks: int(toks[0]))
            date_str = integer + '/' + integer + '/' + integer

            # note that integer fields are now ints, not strings
            date_str.parseString("1999/12/31")  # -> [1999, '/', 12, '/', 31]
        """
        if list(fns) == [None,]:
            self.parseAction = []
        else:
            if not all(callable(fn) for fn in fns):
                raise TypeError("parse actions must be callable")
            self.parseAction = list(map(_trim_arity, list(fns)))
            self.callDuringTry = kwargs.get("callDuringTry", False)
        return self

    def addParseAction(self, *fns, **kwargs):
        """
        Add one or more parse actions to expression's list of parse actions. See :class:`setParseAction`.

        See examples in :class:`copy`.
        """
        self.parseAction += list(map(_trim_arity, list(fns)))
        self.callDuringTry = self.callDuringTry or kwargs.get("callDuringTry", False)
        return self

    def addCondition(self, *fns, **kwargs):
        """Add a boolean predicate function to expression's list of parse actions. See
        :class:`setParseAction` for function call signatures.
        Unlike ``setParseAction``,
        functions passed to ``addCondition`` need to return boolean success/fail of the condition.

        Optional keyword arguments:
        - message = define a custom message to be used in the raised exception
        - fatal = if True, will raise ParseFatalException to stop parsing immediately; otherwise will raise ParseException

        Example::

            integer = Word(nums).setParseAction(lambda toks: int(toks[0]))
            year_int = integer.copy()
            year_int.addCondition(lambda toks: toks[0] >= 2000, message="Only support years 2000 and later")
            date_str = year_int + '/' + integer + '/' + integer

            result = date_str.parseString("1999/12/31")  # -> Exception: Only support years 2000 and later (at char 0), (line:1, col:1)
        """
        for fn in fns:
            self.parseAction.append(conditionAsParseAction(fn, message=kwargs.get('message'),
                                                           fatal=kwargs.get('fatal', False)))

        self.callDuringTry = self.callDuringTry or kwargs.get("callDuringTry", False)
        return self

    def setFailAction(self, fn):
        """Define action to perform if parsing fails at this expression.
        Fail action fn is a callable function that takes the arguments
        ``fn(s, loc, expr, err)`` where:
        - s = string being parsed
        - loc = location where expression match was attempted and failed
        - expr = the parse expression that failed
        - err = the exception thrown
        The function returns no value.
        It may throw :class:`ParseFatalException`
        if it is desired to stop parsing immediately."""
        self.failAction = fn
        return self

    def _skipIgnorables(self, instring, loc):
        exprsFound = True
        while exprsFound:
            exprsFound = False
            for e in self.ignoreExprs:
                try:
                    while 1:
                        loc, dummy = e._parse(instring, loc)
                        exprsFound = True
                except ParseException:
                    pass
        return loc

    def preParse(self, instring, loc):
        if self.ignoreExprs:
            loc = self._skipIgnorables(instring, loc)

        if self.skipWhitespace:
            wt = self.whiteChars
            instrlen = len(instring)
            while loc < instrlen and instring[loc] in wt:
                loc += 1

        return loc

    def parseImpl(self, instring, loc, doActions=True):
        return loc, []

    def postParse(self, instring, loc, tokenlist):
        return tokenlist

    # ~ @profile
    def _parseNoCache(self, instring, loc, doActions=True, callPreParse=True):
        TRY, MATCH, FAIL = 0, 1, 2
        debugging = (self.debug)  # and doActions)

        if debugging or self.failAction:
            # ~ print ("Match", self, "at loc", loc, "(%d, %d)" % (lineno(loc, instring), col(loc, instring)))
            if self.debugActions[TRY]:
                self.debugActions[TRY](instring, loc, self)
            try:
                if callPreParse and self.callPreparse:
                    preloc = self.preParse(instring, loc)
                else:
                    preloc = loc
                tokensStart = preloc
                if self.mayIndexError or preloc >= len(instring):
                    try:
                        loc, tokens = self.parseImpl(instring, preloc, doActions)
                    except IndexError:
                        raise ParseException(instring, len(instring), self.errmsg, self)
                else:
                    loc, tokens = self.parseImpl(instring, preloc, doActions)
            except Exception as err:
                # ~ print ("Exception raised:", err)
                if self.debugActions[FAIL]:
                    self.debugActions[FAIL](instring, tokensStart, self, err)
                if self.failAction:
                    self.failAction(instring, tokensStart, self, err)
                raise
        else:
            if callPreParse and self.callPreparse:
                preloc = self.preParse(instring, loc)
            else:
                preloc = loc
            tokensStart = preloc
            if self.mayIndexError or preloc >= len(instring):
                try:
                    loc, tokens = self.parseImpl(instring, preloc, doActions)
                except IndexError:
                    raise ParseException(instring, len(instring), self.errmsg, self)
            else:
                loc, tokens = self.parseImpl(instring, preloc, doActions)

        tokens = self.postParse(instring, loc, tokens)

        retTokens = ParseResults(tokens, self.resultsName, asList=self.saveAsList, modal=self.modalResults)
        if self.parseAction and (doActions or self.callDuringTry):
            if debugging:
                try:
                    for fn in self.parseAction:
                        try:
                            tokens = fn(instring, tokensStart, retTokens)
                        except IndexError as parse_action_exc:
                            exc = ParseException("exception raised in parse action")
                            exc.__cause__ = parse_action_exc
                            raise exc

                        if tokens is not None and tokens is not retTokens:
                            retTokens = ParseResults(tokens,
                                                      self.resultsName,
                                                      asList=self.saveAsList and isinstance(tokens, (ParseResults, list)),
                                                      modal=self.modalResults)
                except Exception as err:
                    # ~ print "Exception raised in user parse action:", err
                    if self.debugActions[FAIL]:
                        self.debugActions[FAIL](instring, tokensStart, self, err)
                    raise
            else:
                for fn in self.parseAction:
                    try:
                        tokens = fn(instring, tokensStart, retTokens)
                    except IndexError as parse_action_exc:
                        exc = ParseException("exception raised in parse action")
                        exc.__cause__ = parse_action_exc
                        raise exc

                    if tokens is not None and tokens is not retTokens:
                        retTokens = ParseResults(tokens,
                                                  self.resultsName,
                                                  asList=self.saveAsList and isinstance(tokens, (ParseResults, list)),
                                                  modal=self.modalResults)
        if debugging:
            # ~ print ("Matched", self, "->", retTokens.asList())
            if self.debugActions[MATCH]:
                self.debugActions[MATCH](instring, tokensStart, loc, self, retTokens)

        return loc, retTokens

    def tryParse(self, instring, loc):
        try:
            return self._parse(instring, loc, doActions=False)[0]
        except ParseFatalException:
            raise ParseException(instring, loc, self.errmsg, self)

    def canParseNext(self, instring, loc):
        try:
            self.tryParse(instring, loc)
        except (ParseException, IndexError):
            return False
        else:
            return True

    class _UnboundedCache(object):
        def __init__(self):
            cache = {}
            self.not_in_cache = not_in_cache = object()

            def get(self, key):
                return cache.get(key, not_in_cache)

            def set(self, key, value):
                cache[key] = value

            def clear(self):
                cache.clear()

            def cache_len(self):
                return len(cache)

            self.get = types.MethodType(get, self)
            self.set = types.MethodType(set, self)
            self.clear = types.MethodType(clear, self)
            self.__len__ = types.MethodType(cache_len, self)

    if _OrderedDict is not None:
        class _FifoCache(object):
            def __init__(self, size):
                self.not_in_cache = not_in_cache = object()

                cache = _OrderedDict()

                def get(self, key):
                    return cache.get(key, not_in_cache)

                def set(self, key, value):
                    cache[key] = value
                    while len(cache) > size:
                        try:
                            cache.popitem(False)
                        except KeyError:
                            pass

                def clear(self):
                    cache.clear()

                def cache_len(self):
                    return len(cache)

                self.get = types.MethodType(get, self)
                self.set = types.MethodType(set, self)
                self.clear = types.MethodType(clear, self)
                self.__len__ = types.MethodType(cache_len, self)

    else:
        class _FifoCache(object):
            def __init__(self, size):
                self.not_in_cache = not_in_cache = object()

                cache = {}
                key_fifo = collections.deque([], size)

                def get(self, key):
                    return cache.get(key, not_in_cache)

                def set(self, key, value):
                    cache[key] = value
                    while len(key_fifo) > size:
                        cache.pop(key_fifo.popleft(), None)
                    key_fifo.append(key)

                def clear(self):
                    cache.clear()
                    key_fifo.clear()

                def cache_len(self):
                    return len(cache)

                self.get = types.MethodType(get, self)
                self.set = types.MethodType(set, self)
                self.clear = types.MethodType(clear, self)
                self.__len__ = types.MethodType(cache_len, self)

    # argument cache for optimizing repeated calls when backtracking through recursive expressions
    packrat_cache = {} # this is set later by enablePackrat(); this is here so that resetCache() doesn't fail
    packrat_cache_lock = RLock()
    packrat_cache_stats = [0, 0]

    # this method gets repeatedly called during backtracking with the same arguments -
    # we can cache these arguments and save ourselves the trouble of re-parsing the contained expression
    def _parseCache(self, instring, loc, doActions=True, callPreParse=True):
        HIT, MISS = 0, 1
        lookup = (self, instring, loc, callPreParse, doActions)
        with ParserElement.packrat_cache_lock:
            cache = ParserElement.packrat_cache
            value = cache.get(lookup)
            if value is cache.not_in_cache:
                ParserElement.packrat_cache_stats[MISS] += 1
                try:
                    value = self._parseNoCache(instring, loc, doActions, callPreParse)
                except ParseBaseException as pe:
                    # cache a copy of the exception, without the traceback
                    cache.set(lookup, pe.__class__(*pe.args))
                    raise
                else:
                    cache.set(lookup, (value[0], value[1].copy()))
                    return value
            else:
                ParserElement.packrat_cache_stats[HIT] += 1
                if isinstance(value, Exception):
                    raise value
                return value[0], value[1].copy()

    _parse = _parseNoCache

    @staticmethod
    def resetCache():
        ParserElement.packrat_cache.clear()
        ParserElement.packrat_cache_stats[:] = [0] * len(ParserElement.packrat_cache_stats)

    _packratEnabled = False
    @staticmethod
    def enablePackrat(cache_size_limit=128):
        """Enables "packrat" parsing, which adds memoizing to the parsing logic.
        Repeated parse attempts at the same string location (which happens
        often in many complex grammars) can immediately return a cached value,
        instead of re-executing parsing/validating code.
        Memoizing is done of
        both valid results and parsing exceptions.

        Parameters:

        - cache_size_limit - (default= ``128``) - if an integer value is provided
          will limit the size of the packrat cache; if None is passed, then
          the cache size will be unbounded; if 0 is passed, the cache will
          be effectively disabled.

        This speedup may break existing programs that use parse actions that
        have side-effects. For this reason, packrat parsing is disabled when
        you first import pyparsing. To activate the packrat feature, your
        program must call the class method :class:`ParserElement.enablePackrat`.
        For best results, call ``enablePackrat()`` immediately after
        importing pyparsing.

        Example::

            from pip._vendor import pyparsing
            pyparsing.ParserElement.enablePackrat()
        """
        if not ParserElement._packratEnabled:
            ParserElement._packratEnabled = True
            if cache_size_limit is None:
                ParserElement.packrat_cache = ParserElement._UnboundedCache()
            else:
                ParserElement.packrat_cache = ParserElement._FifoCache(cache_size_limit)
            ParserElement._parse = ParserElement._parseCache

    def parseString(self, instring, parseAll=False):
        """
        Execute the parse expression with the given string.
        This is the main interface to the client code, once the complete
        expression has been built.

        Returns the parsed data as a :class:`ParseResults` object, which may be
        accessed as a list, or as a dict or object with attributes if the given parser
        includes results names.

        If you want the grammar to require that the entire input string be
        successfully parsed, then set ``parseAll`` to True (equivalent to ending
        the grammar with ``StringEnd()``).

        Note: ``parseString`` implicitly calls ``expandtabs()`` on the input string,
        in order to report proper column numbers in parse actions.
        If the input string contains tabs and
        the grammar uses parse actions that use the ``loc`` argument to index into the
        string being parsed, you can ensure you have a consistent view of the input
        string by:

        - calling ``parseWithTabs`` on your grammar before calling ``parseString``
          (see :class:`parseWithTabs`)
        - define your parse action using the full ``(s, loc, toks)`` signature, and
          reference the input string using the parse action's ``s`` argument
        - explicitly expand the tabs in your input string before calling
          ``parseString``

        Example::

            Word('a').parseString('aaaaabaaa')  # -> ['aaaaa']
            Word('a').parseString('aaaaabaaa', parseAll=True)  # -> Exception: Expected end of text
        """
        ParserElement.resetCache()
        if not self.streamlined:
            self.streamline()
            # ~ self.saveAsList = True
        for e in self.ignoreExprs:
            e.streamline()
        if not self.keepTabs:
            instring = instring.expandtabs()
        try:
            loc, tokens = self._parse(instring, 0)
            if parseAll:
                loc = self.preParse(instring, loc)
                se = Empty() + StringEnd()
                se._parse(instring, loc)
        except ParseBaseException as exc:
            if ParserElement.verbose_stacktrace:
                raise
            else:
                # catch and re-raise exception from here, clearing out pyparsing internal stack trace
                if getattr(exc, '__traceback__', None) is not None:
                    exc.__traceback__ = self._trim_traceback(exc.__traceback__)
                raise exc
        else:
            return tokens

    def scanString(self, instring, maxMatches=_MAX_INT, overlap=False):
        """
        Scan the input string for expression matches. Each match will return the
        matching tokens, start location, and end location. May be called with optional
        ``maxMatches`` argument, to clip scanning after 'n' matches are found. If
        ``overlap`` is specified, then overlapping matches will be reported.

        Note that the start and end locations are reported relative to the string
        being parsed.
        See :class:`parseString` for more information on parsing
        strings with embedded tabs.

        Example::

            source = "sldjf123lsdjjkf345sldkjf879lkjsfd987"
            print(source)
            for tokens, start, end in Word(alphas).scanString(source):
                print(' '*start + '^'*(end-start))
                print(' '*start + tokens[0])

        prints::

            sldjf123lsdjjkf345sldkjf879lkjsfd987
            ^^^^^
            sldjf
                    ^^^^^^^
                    lsdjjkf
                              ^^^^^^
                              sldkjf
                                       ^^^^^^
                                       lkjsfd
        """
        if not self.streamlined:
            self.streamline()
        for e in self.ignoreExprs:
            e.streamline()

        if not self.keepTabs:
            instring = _ustr(instring).expandtabs()
        instrlen = len(instring)
        loc = 0
        preparseFn = self.preParse
        parseFn = self._parse
        ParserElement.resetCache()
        matches = 0
        try:
            while loc <= instrlen and matches < maxMatches:
                try:
                    preloc = preparseFn(instring, loc)
                    nextLoc, tokens = parseFn(instring, preloc, callPreParse=False)
                except ParseException:
                    loc = preloc + 1
                else:
                    if nextLoc > loc:
                        matches += 1
                        yield tokens, preloc, nextLoc
                        if overlap:
                            nextloc = preparseFn(instring, loc)
                            if nextloc > loc:
                                loc = nextLoc
                            else:
                                loc += 1
                        else:
                            loc = nextLoc
                    else:
                        loc = preloc + 1
        except ParseBaseException as exc:
            if ParserElement.verbose_stacktrace:
                raise
            else:
                # catch and re-raise exception from here, clearing out pyparsing internal stack trace
                if getattr(exc, '__traceback__', None) is not None:
                    exc.__traceback__ = self._trim_traceback(exc.__traceback__)
                raise exc

    def transformString(self, instring):
        """
        Extension to :class:`scanString`, to modify matching text with modified tokens that may
        be returned from a parse action.
        To use ``transformString``, define a grammar and
        attach a parse action to it that modifies the returned token list.
        Invoking ``transformString()`` on a target string will then scan for matches,
        and replace the matched text patterns according to the logic in the parse
        action. ``transformString()`` returns the resulting transformed string.

        Example::

            wd = Word(alphas)
            wd.setParseAction(lambda toks: toks[0].title())

            print(wd.transformString("now is the winter of our discontent made glorious summer by this sun of york."))

        prints::

            Now Is The Winter Of Our Discontent Made Glorious Summer By This Sun Of York.
        """
        out = []
        lastE = 0
        # force preservation of <TAB>s, to minimize unwanted transformation of string, and to
        # keep string locs straight between transformString and scanString
        self.keepTabs = True
        try:
            for t, s, e in self.scanString(instring):
                out.append(instring[lastE:s])
                if t:
                    if isinstance(t, ParseResults):
                        out += t.asList()
                    elif isinstance(t, list):
                        out += t
                    else:
                        out.append(t)
                lastE = e
            out.append(instring[lastE:])
            out = [o for o in out if o]
            return "".join(map(_ustr, _flatten(out)))
        except ParseBaseException as exc:
            if ParserElement.verbose_stacktrace:
                raise
            else:
                # catch and re-raise exception from here, clearing out pyparsing internal stack trace
                if getattr(exc, '__traceback__', None) is not None:
                    exc.__traceback__ = self._trim_traceback(exc.__traceback__)
                raise exc

    def searchString(self, instring, maxMatches=_MAX_INT):
        """
        Another extension to :class:`scanString`, simplifying the access to the tokens found
        to match the given parse expression.
        May be called with optional
        ``maxMatches`` argument, to clip searching after 'n' matches are found.

        Example::

            # a capitalized word starts with an uppercase letter, followed by zero or more lowercase letters
            cap_word = Word(alphas.upper(), alphas.lower())

            print(cap_word.searchString("More than Iron, more than Lead, more than Gold I need Electricity"))

            # the sum() builtin can be used to merge results into a single ParseResults object
            print(sum(cap_word.searchString("More than Iron, more than Lead, more than Gold I need Electricity")))

        prints::

            [['More'], ['Iron'], ['Lead'], ['Gold'], ['I'], ['Electricity']]
            ['More', 'Iron', 'Lead', 'Gold', 'I', 'Electricity']
        """
        try:
            return ParseResults([t for t, s, e in self.scanString(instring, maxMatches)])
        except ParseBaseException as exc:
            if ParserElement.verbose_stacktrace:
                raise
            else:
                # catch and re-raise exception from here, clearing out pyparsing internal stack trace
                if getattr(exc, '__traceback__', None) is not None:
                    exc.__traceback__ = self._trim_traceback(exc.__traceback__)
                raise exc

    def split(self, instring, maxsplit=_MAX_INT, includeSeparators=False):
        """
        Generator method to split a string using the given expression as a separator.
        May be called with optional ``maxsplit`` argument, to limit the number of splits;
        and the optional ``includeSeparators`` argument (default= ``False``), if the separating
        matching text should be included in the split results.

        Example::

            punc = oneOf(list(".,;:/-!?"))
            print(list(punc.split("This, this?, this sentence, is badly punctuated!")))

        prints::

            ['This', ' this', '', ' this sentence', ' is badly punctuated', '']
        """
        splits = 0
        last = 0
        for t, s, e in self.scanString(instring, maxMatches=maxsplit):
            yield instring[last:s]
            if includeSeparators:
                yield t[0]
            last = e
        yield instring[last:]

    def __add__(self, other):
        """
        Implementation of + operator - returns :class:`And`. Adding strings to a ParserElement
        converts them to :class:`Literal`s by default.

        Example::

            greet = Word(alphas) + "," + Word(alphas) + "!"
            hello = "Hello, World!"
            print(hello, "->", greet.parseString(hello))

        prints::

            Hello, World! -> ['Hello', ',', 'World', '!']

        ``...`` may be used as a parse expression as a short form of :class:`SkipTo`.

            Literal('start') + ... + Literal('end')

        is equivalent to:

            Literal('start') + SkipTo('end')("_skipped*") + Literal('end')

        Note that the skipped text is returned with '_skipped' as a results name,
        and to support having multiple skips in the same parser, the value returned is
        a list of all skipped text.
        """
        if other is Ellipsis:
            return _PendingSkip(self)

        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return And([self, other])

    def __radd__(self, other):
        """
        Implementation of + operator when left operand is not a :class:`ParserElement`
        """
        if other is Ellipsis:
            return SkipTo(self)("_skipped*") + self

        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return other + self

    def __sub__(self, other):
        """
        Implementation of - operator, returns :class:`And` with error stop
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return self + And._ErrorStop() + other

    def __rsub__(self, other):
        """
        Implementation of - operator when left operand is not a :class:`ParserElement`
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return other - self

    def __mul__(self, other):
        """
        Implementation of * operator, allows use of ``expr * 3`` in place of
        ``expr + expr + expr``. Expressions may also be multiplied by a 2-integer
        tuple, similar to ``{min, max}`` multipliers in regular expressions. Tuples
        may also include ``None`` as in:
         - ``expr*(n, None)`` or ``expr*(n, )`` is equivalent
              to ``expr*n + ZeroOrMore(expr)``
              (read as "at least n instances of ``expr``")
         - ``expr*(None, n)`` is equivalent to ``expr*(0, n)``
              (read as "0 to n instances of ``expr``")
         - ``expr*(None, None)`` is equivalent to ``ZeroOrMore(expr)``
         - ``expr*(1, None)`` is equivalent to ``OneOrMore(expr)``

        Note that ``expr*(None, n)`` does not raise an exception if
        more than n exprs exist in the input stream; that is,
        ``expr*(None, n)`` does not enforce a maximum number of expr
        occurrences.
        If this behavior is desired, then write
        ``expr*(None, n) + ~expr``
        """
        if other is Ellipsis:
            other = (0, None)
        elif isinstance(other, tuple) and other[:1] == (Ellipsis,):
            other = ((0, ) + other[1:] + (None,))[:2]

        if isinstance(other, int):
            minElements, optElements = other, 0
        elif isinstance(other, tuple):
            other = tuple(o if o is not Ellipsis else None for o in other)
            other = (other + (None, None))[:2]
            if other[0] is None:
                other = (0, other[1])
            if isinstance(other[0], int) and other[1] is None:
                if other[0] == 0:
                    return ZeroOrMore(self)
                if other[0] == 1:
                    return OneOrMore(self)
                else:
                    return self * other[0] + ZeroOrMore(self)
            elif isinstance(other[0], int) and isinstance(other[1], int):
                minElements, optElements = other
                optElements -= minElements
            else:
                # interpolate the operand types into the message (they were previously
                # passed as extra exception arguments and never formatted)
                raise TypeError("cannot multiply 'ParserElement' and ('%s', '%s') objects"
                                % (type(other[0]), type(other[1])))
        else:
            raise TypeError("cannot multiply 'ParserElement' and '%s' objects" % (type(other),))

        if minElements < 0:
            raise ValueError("cannot multiply ParserElement by negative value")
        if optElements < 0:
            raise ValueError("second tuple value must be greater or equal to first tuple value")
        if minElements == optElements == 0:
            raise ValueError("cannot multiply ParserElement by 0 or (0, 0)")

        if optElements:
            def makeOptionalList(n):
                if n > 1:
                    return Optional(self + makeOptionalList(n - 1))
                else:
                    return Optional(self)
            if minElements:
                if minElements == 1:
                    ret = self + makeOptionalList(optElements)
                else:
                    ret = And([self] * minElements) + makeOptionalList(optElements)
            else:
                ret = makeOptionalList(optElements)
        else:
            if minElements == 1:
                ret = self
            else:
                ret = And([self] * minElements)
        return ret

    def __rmul__(self, other):
        return self.__mul__(other)

    def __or__(self, other):
        """
        Implementation of | operator - returns :class:`MatchFirst`
        """
        if other is Ellipsis:
            return _PendingSkip(self, must_skip=True)

        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return MatchFirst([self, other])

    def __ror__(self, other):
        """
        Implementation of | operator when left operand is not a :class:`ParserElement`
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return other | self

    def __xor__(self, other):
        """
        Implementation of ^ operator - returns :class:`Or`
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return Or([self, other])

    def __rxor__(self, other):
        """
        Implementation of ^ operator when left operand is not a :class:`ParserElement`
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return other ^ self

    def __and__(self, other):
        """
        Implementation of & operator - returns :class:`Each`
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return Each([self, other])

    def __rand__(self, other):
        """
        Implementation of & operator when left operand is not a :class:`ParserElement`
        """
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        if not isinstance(other, ParserElement):
            warnings.warn("Cannot combine element of type %s with ParserElement" % type(other),
                          SyntaxWarning, stacklevel=2)
            return None
        return other & self

    def __invert__(self):
        """
        Implementation of ~ operator - returns :class:`NotAny`
        """
        return NotAny(self)

    def __iter__(self):
        # must implement __iter__ to override legacy use of sequential access to __getitem__ to
        # iterate over a sequence
        raise TypeError('%r object is not iterable' % self.__class__.__name__)

    def __getitem__(self, key):
        """
        use ``[]`` indexing notation as a short form for expression repetition:
         - ``expr[n]`` is equivalent to ``expr*n``
         - ``expr[m, n]`` is equivalent to ``expr*(m, n)``
         - ``expr[n, ...]`` or ``expr[n,]`` is equivalent
              to ``expr*n + ZeroOrMore(expr)``
              (read as "at least n instances of ``expr``")
         - ``expr[..., n]`` is equivalent to ``expr*(0, n)``
              (read as "0 to n instances of ``expr``")
         - ``expr[...]`` and ``expr[0, ...]`` are equivalent to ``ZeroOrMore(expr)``
         - ``expr[1, ...]`` is equivalent to ``OneOrMore(expr)``
         ``None`` may be used in place of ``...``.

        Note that ``expr[..., n]`` and ``expr[m, n]`` do not raise an exception
        if more than ``n`` ``expr``s exist in the input stream. If this behavior is
        desired, then write ``expr[..., n] + ~expr``.
        """

        # convert single arg keys to tuples
        try:
            if isinstance(key, str):
                key = (key,)
            iter(key)
        except TypeError:
            key = (key, key)

        if len(key) > 2:
            warnings.warn("only 1 or 2 index arguments supported ({0}{1})".format(key[:5],
                                                                                '... [{0}]'.format(len(key))
                                                                                if len(key) > 5 else ''))

        # clip to 2 elements
        ret = self * tuple(key[:2])
        return ret

    def __call__(self, name=None):
        """
        Shortcut for :class:`setResultsName`, with ``listAllMatches=False``.

        If ``name`` is given with a trailing ``'*'`` character, then ``listAllMatches`` will be
        passed as ``True``.

        If ``name`` is omitted, same as calling :class:`copy`.

        Example::

            # these are equivalent
            userdata = Word(alphas).setResultsName("name") + Word(nums + "-").setResultsName("socsecno")
            userdata = Word(alphas)("name") + Word(nums + "-")("socsecno")
        """
        if name is not None:
            return self._setResultsName(name)
        else:
            return self.copy()

    def suppress(self):
        """
        Suppresses the output of this :class:`ParserElement`; useful to keep punctuation from
        cluttering up returned output.
        """
        return Suppress(self)

    def leaveWhitespace(self):
        """
        Disables the skipping of whitespace before matching the characters in the
        :class:`ParserElement`'s defined pattern.
        This is normally only used internally by
        the pyparsing module, but may be needed in some whitespace-sensitive grammars.
        """
        self.skipWhitespace = False
        return self

    def setWhitespaceChars(self, chars):
        """
        Overrides the default whitespace chars
        """
        self.skipWhitespace = True
        self.whiteChars = chars
        self.copyDefaultWhiteChars = False
        return self

    def parseWithTabs(self):
        """
        Overrides the default behavior, which expands ``<TAB>`` characters to spaces
        before parsing the input string, so that tabs are kept as-is.
        Must be called before ``parseString`` when the input grammar contains elements that
        match ``<TAB>`` characters.
        """
        self.keepTabs = True
        return self

    def ignore(self, other):
        """
        Define expression to be ignored (e.g., comments) while doing pattern
        matching; may be called repeatedly, to define multiple comment or other
        ignorable patterns.

        Example::

            patt = OneOrMore(Word(alphas))
            patt.parseString('ablaj /* comment */ lskjd')  # -> ['ablaj']

            patt.ignore(cStyleComment)
            patt.parseString('ablaj /* comment */ lskjd')  # -> ['ablaj', 'lskjd']
        """
        if isinstance(other, basestring):
            other = Suppress(other)

        if isinstance(other, Suppress):
            if other not in self.ignoreExprs:
                self.ignoreExprs.append(other)
        else:
            self.ignoreExprs.append(Suppress(other.copy()))
        return self

    def setDebugActions(self, startAction, successAction, exceptionAction):
        """
        Enable display of debugging messages while doing pattern matching.
        """
        self.debugActions = (startAction or _defaultStartDebugAction,
                             successAction or _defaultSuccessDebugAction,
                             exceptionAction or _defaultExceptionDebugAction)
        self.debug = True
        return self

    def setDebug(self, flag=True):
        """
        Enable display of debugging messages while doing pattern matching.
        Set ``flag`` to True to enable, False to disable.

        Example::

            wd = Word(alphas).setName("alphaword")
            integer = Word(nums).setName("numword")
            term = wd | integer

            # turn on debugging for wd
            wd.setDebug()

            OneOrMore(term).parseString("abc 123 xyz 890")

        prints::

            Match alphaword at loc 0(1,1)
            Matched alphaword -> ['abc']
            Match alphaword at loc 3(1,4)
            Exception raised:Expected alphaword (at char 4), (line:1, col:5)
            Match alphaword at loc 7(1,8)
            Matched alphaword -> ['xyz']
            Match alphaword at loc 11(1,12)
            Exception raised:Expected alphaword (at char 12), (line:1, col:13)
            Match alphaword at loc 15(1,16)
            Exception raised:Expected alphaword (at char 15), (line:1, col:16)

        The output shown is that produced by the default debug actions - custom debug actions can be
        specified using :class:`setDebugActions`. Prior to attempting
        to match the ``wd`` expression, the debugging message ``"Match <exprname> at loc <n>(<line>,<col>)"``
        is shown. Then if the parse succeeds, a ``"Matched"`` message is shown, or an ``"Exception raised"``
        message is shown.
        Also note the use of :class:`setName` to assign a human-readable name to the expression,
        which makes debugging and exception messages easier to understand - for instance, the default
        name created for the :class:`Word` expression without calling ``setName`` is ``"W:(ABCD...)"``.
        """
        if flag:
            self.setDebugActions(_defaultStartDebugAction, _defaultSuccessDebugAction, _defaultExceptionDebugAction)
        else:
            self.debug = False
        return self

    def __str__(self):
        return self.name

    def __repr__(self):
        return _ustr(self)

    def streamline(self):
        self.streamlined = True
        self.strRepr = None
        return self

    def checkRecursion(self, parseElementList):
        pass

    def validate(self, validateTrace=None):
        """
        Check defined expressions for valid structure, check for infinite recursive definitions.
        """
        self.checkRecursion([])

    def parseFile(self, file_or_filename, parseAll=False):
        """
        Execute the parse expression on the given file or filename.
        If a filename is specified (instead of a file object),
        the entire file is opened, read, and closed before parsing.
        """
        try:
            file_contents = file_or_filename.read()
        except AttributeError:
            with open(file_or_filename, "r") as f:
                file_contents = f.read()
        try:
            return self.parseString(file_contents, parseAll)
        except ParseBaseException as exc:
            if ParserElement.verbose_stacktrace:
                raise
            else:
                # catch and re-raise exception from here, clearing out pyparsing internal stack trace
                if getattr(exc, '__traceback__', None) is not None:
                    exc.__traceback__ = self._trim_traceback(exc.__traceback__)
                raise exc

    def __eq__(self, other):
        if self is other:
            return True
        elif isinstance(other, basestring):
            return self.matches(other)
        elif isinstance(other, ParserElement):
            return vars(self) == vars(other)
        return False

    def __ne__(self, other):
        return not (self == other)

    def __hash__(self):
        return id(self)

    def __req__(self, other):
        return self == other

    def __rne__(self, other):
        return not (self == other)

    def matches(self, testString, parseAll=True):
        """
        Method for quick testing of a parser against a test string. Good for simple
        inline microtests of sub expressions while building up a larger parser.

        Parameters:
         - testString - to test against this expression for a match
         - parseAll - (default= ``True``) - flag to pass to :class:`parseString` when running tests

        Example::

            expr = Word(nums)
            assert expr.matches("100")
        """
        try:
            self.parseString(_ustr(testString), parseAll=parseAll)
            return True
        except ParseBaseException:
            return False

    def runTests(self, tests, parseAll=True, comment='#',
                 fullDump=True, printResults=True, failureTests=False, postParse=None,
                 file=None):
        """
        Execute the parse expression on a series of test strings, showing each
        test, the parsed results or where the parse failed.
        Quick and easy way to
        run a parse expression against a list of sample strings.

        Parameters:
         - tests - a list of separate test strings, or a multiline string of test strings
         - parseAll - (default= ``True``) - flag to pass to :class:`parseString` when running tests
         - comment - (default= ``'#'``) - expression for indicating embedded comments in the test
              string; pass None to disable comment filtering
         - fullDump - (default= ``True``) - dump results as list followed by results names in nested outline;
              if False, only dump nested list
         - printResults - (default= ``True``) prints test output to stdout
         - failureTests - (default= ``False``) indicates if these tests are expected to fail parsing
         - postParse - (default= ``None``) optional callback for successful parse results; called as
              `fn(test_string, parse_results)` and returns a string to be added to the test output
         - file - (default=``None``) optional file-like object to which test output will be written;
              if None, will default to ``sys.stdout``

        Returns: a (success, results) tuple, where success indicates that all tests succeeded
        (or failed if ``failureTests`` is True), and the results contain a list of lines of each
        test's output

        Example::

            number_expr = pyparsing_common.number.copy()

            result = number_expr.runTests('''
                # unsigned integer
                100
                # negative integer
                -100
                # float with scientific notation
                6.02e23
                # integer with scientific notation
                1e-12
                ''')
            print("Success" if result[0] else "Failed!")

            result = number_expr.runTests('''
                # stray character
                100Z
                # missing leading digit before '.'
                -.100
                # too many '.'
                3.14.159
                ''', failureTests=True)
            print("Success" if result[0] else "Failed!")

        prints::

            # unsigned integer
            100
            [100]

            # negative integer
            -100
            [-100]

            # float with scientific notation
            6.02e23
            [6.02e+23]

            # integer with scientific notation
            1e-12
            [1e-12]

            Success

            # stray character
            100Z
               ^
            FAIL: Expected end of text (at char 3), (line:1, col:4)

            # missing leading digit before '.'
            -.100
            ^
            FAIL: Expected {real number with scientific notation | real number | signed integer} (at char 0), (line:1, col:1)

            # too many '.'
            3.14.159
                ^
            FAIL: Expected end of text (at char 4), (line:1, col:5)

            Success

        Each test string must be on a single line. If you want to test a string that spans multiple
        lines, create a test like this::

            expr.runTests(r"this is a test\\n of strings that spans \\n 3 lines")

        (Note that this is a raw string literal, you must include the leading 'r'.)
        """
        if isinstance(tests, basestring):
            tests = list(map(str.strip, tests.rstrip().splitlines()))
        if isinstance(comment, basestring):
            comment = Literal(comment)
        if file is None:
            file = sys.stdout
        print_ = file.write

        allResults = []
        comments = []
        success = True
        NL = Literal(r'\n').addParseAction(replaceWith('\n')).ignore(quotedString)
        BOM = u'\ufeff'
        for t in tests:
            if comment is not None and comment.matches(t, False) or comments and not t:
                comments.append(t)
                continue
            if not t:
                continue
            out = ['\n' + '\n'.join(comments) if comments else '', t]
            comments = []
            try:
                # convert newline marks to actual newlines, and strip leading BOM if present
                t = NL.transformString(t.lstrip(BOM))
                result = self.parseString(t, parseAll=parseAll)
            except ParseBaseException as pe:
                fatal = "(FATAL)" if isinstance(pe, ParseFatalException) else ""
                if '\n' in t:
                    out.append(line(pe.loc, t))
                    out.append(' ' * (col(pe.loc, t) - 1) + '^' + fatal)
                else:
                    out.append(' ' * pe.loc + '^' + fatal)
                out.append("FAIL: " + str(pe))
                success = success and failureTests
                result = pe
            except Exception as exc:
                out.append("FAIL-EXCEPTION: " + str(exc))
                success = success and failureTests
                result = exc
            else:
                success = success and not failureTests
                if postParse is not None:
                    try:
                        pp_value = postParse(t, result)
                        if pp_value is not None:
                            if isinstance(pp_value, ParseResults):
                                out.append(pp_value.dump())
                            else:
                                out.append(str(pp_value))
                        else:
                            out.append(result.dump())
                    except Exception as e:
                        out.append(result.dump(full=fullDump))
                        out.append("{0} failed: {1}: {2}".format(postParse.__name__, type(e).__name__, e))
                else:
                    out.append(result.dump(full=fullDump))

            if printResults:
                if fullDump:
                    out.append('')
                print_('\n'.join(out))

            allResults.append((t, result))

        return success, allResults


class _PendingSkip(ParserElement):
    # internal placeholder class to hold a place where '...' is added to a parser element,
    # once another ParserElement is added, this placeholder will be replaced with a SkipTo
    def __init__(self, expr, must_skip=False):
        super(_PendingSkip, self).__init__()
        self.strRepr = str(expr + Empty()).replace('Empty', '...')
        self.name = self.strRepr
        self.anchor = expr
        self.must_skip = must_skip

    def __add__(self, other):
        skipper = SkipTo(other).setName("...")("_skipped*")
        if self.must_skip:
            def must_skip(t):
                if not t._skipped or t._skipped.asList() == ['']:
                    del t[0]
                    t.pop("_skipped", None)
            def show_skip(t):
                if t._skipped.asList()[-1:] == ['']:
                    skipped = t.pop('_skipped')
                    t['_skipped'] = 'missing <' + repr(self.anchor) + '>'
            return (self.anchor + skipper().addParseAction(must_skip)
                    | skipper().addParseAction(show_skip)) + other

        return self.anchor + skipper + other

    def __repr__(self):
        return self.strRepr

    def parseImpl(self, *args):
        raise Exception("use of `...` expression without following SkipTo target expression")


class Token(ParserElement):
    """Abstract :class:`ParserElement` subclass, for defining atomic
    matching patterns.
    """
    def __init__(self):
        super(Token,
self).__init__(savelist=False)282928302831class Empty(Token):2832"""An empty token, will always match.2833"""2834def __init__(self):2835super(Empty, self).__init__()2836self.name = "Empty"2837self.mayReturnEmpty = True2838self.mayIndexError = False283928402841class NoMatch(Token):2842"""A token that will never match.2843"""2844def __init__(self):2845super(NoMatch, self).__init__()2846self.name = "NoMatch"2847self.mayReturnEmpty = True2848self.mayIndexError = False2849self.errmsg = "Unmatchable token"28502851def parseImpl(self, instring, loc, doActions=True):2852raise ParseException(instring, loc, self.errmsg, self)285328542855class Literal(Token):2856"""Token to exactly match a specified string.28572858Example::28592860Literal('blah').parseString('blah') # -> ['blah']2861Literal('blah').parseString('blahfooblah') # -> ['blah']2862Literal('blah').parseString('bla') # -> Exception: Expected "blah"28632864For case-insensitive matching, use :class:`CaselessLiteral`.28652866For keyword matching (force word break before and after the matched string),2867use :class:`Keyword` or :class:`CaselessKeyword`.2868"""2869def __init__(self, matchString):2870super(Literal, self).__init__()2871self.match = matchString2872self.matchLen = len(matchString)2873try:2874self.firstMatchChar = matchString[0]2875except IndexError:2876warnings.warn("null string passed to Literal; use Empty() instead",2877SyntaxWarning, stacklevel=2)2878self.__class__ = Empty2879self.name = '"%s"' % _ustr(self.match)2880self.errmsg = "Expected " + self.name2881self.mayReturnEmpty = False2882self.mayIndexError = False28832884# Performance tuning: modify __class__ to select2885# a parseImpl optimized for single-character check2886if self.matchLen == 1 and type(self) is Literal:2887self.__class__ = _SingleCharLiteral28882889def parseImpl(self, instring, loc, doActions=True):2890if instring[loc] == self.firstMatchChar and instring.startswith(self.match, loc):2891return loc + self.matchLen, self.match2892raise 
ParseException(instring, loc, self.errmsg, self)28932894class _SingleCharLiteral(Literal):2895def parseImpl(self, instring, loc, doActions=True):2896if instring[loc] == self.firstMatchChar:2897return loc + 1, self.match2898raise ParseException(instring, loc, self.errmsg, self)28992900_L = Literal2901ParserElement._literalStringClass = Literal29022903class Keyword(Token):2904"""Token to exactly match a specified string as a keyword, that is,2905it must be immediately followed by a non-keyword character. Compare2906with :class:`Literal`:29072908- ``Literal("if")`` will match the leading ``'if'`` in2909``'ifAndOnlyIf'``.2910- ``Keyword("if")`` will not; it will only match the leading2911``'if'`` in ``'if x=1'``, or ``'if(y==2)'``29122913Accepts two optional constructor arguments in addition to the2914keyword string:29152916- ``identChars`` is a string of characters that would be valid2917identifier characters, defaulting to all alphanumerics + "_" and2918"$"2919- ``caseless`` allows case-insensitive matching, default is ``False``.29202921Example::29222923Keyword("start").parseString("start") # -> ['start']2924Keyword("start").parseString("starting") # -> Exception29252926For case-insensitive matching, use :class:`CaselessKeyword`.2927"""2928DEFAULT_KEYWORD_CHARS = alphanums + "_$"29292930def __init__(self, matchString, identChars=None, caseless=False):2931super(Keyword, self).__init__()2932if identChars is None:2933identChars = Keyword.DEFAULT_KEYWORD_CHARS2934self.match = matchString2935self.matchLen = len(matchString)2936try:2937self.firstMatchChar = matchString[0]2938except IndexError:2939warnings.warn("null string passed to Keyword; use Empty() instead",2940SyntaxWarning, stacklevel=2)2941self.name = '"%s"' % self.match2942self.errmsg = "Expected " + self.name2943self.mayReturnEmpty = False2944self.mayIndexError = False2945self.caseless = caseless2946if caseless:2947self.caselessmatch = matchString.upper()2948identChars = identChars.upper()2949self.identChars = 
set(identChars)29502951def parseImpl(self, instring, loc, doActions=True):2952if self.caseless:2953if ((instring[loc:loc + self.matchLen].upper() == self.caselessmatch)2954and (loc >= len(instring) - self.matchLen2955or instring[loc + self.matchLen].upper() not in self.identChars)2956and (loc == 02957or instring[loc - 1].upper() not in self.identChars)):2958return loc + self.matchLen, self.match29592960else:2961if instring[loc] == self.firstMatchChar:2962if ((self.matchLen == 1 or instring.startswith(self.match, loc))2963and (loc >= len(instring) - self.matchLen2964or instring[loc + self.matchLen] not in self.identChars)2965and (loc == 0 or instring[loc - 1] not in self.identChars)):2966return loc + self.matchLen, self.match29672968raise ParseException(instring, loc, self.errmsg, self)29692970def copy(self):2971c = super(Keyword, self).copy()2972c.identChars = Keyword.DEFAULT_KEYWORD_CHARS2973return c29742975@staticmethod2976def setDefaultKeywordChars(chars):2977"""Overrides the default Keyword chars2978"""2979Keyword.DEFAULT_KEYWORD_CHARS = chars29802981class CaselessLiteral(Literal):2982"""Token to match a specified string, ignoring case of letters.2983Note: the matched results will always be in the case of the given2984match string, NOT the case of the input text.29852986Example::29872988OneOrMore(CaselessLiteral("CMD")).parseString("cmd CMD Cmd10") # -> ['CMD', 'CMD', 'CMD']29892990(Contrast with example for :class:`CaselessKeyword`.)2991"""2992def __init__(self, matchString):2993super(CaselessLiteral, self).__init__(matchString.upper())2994# Preserve the defining literal.2995self.returnString = matchString2996self.name = "'%s'" % self.returnString2997self.errmsg = "Expected " + self.name29982999def parseImpl(self, instring, loc, doActions=True):3000if instring[loc:loc + self.matchLen].upper() == self.match:3001return loc + self.matchLen, self.returnString3002raise ParseException(instring, loc, self.errmsg, self)30033004class 
CaselessKeyword(Keyword):3005"""3006Caseless version of :class:`Keyword`.30073008Example::30093010OneOrMore(CaselessKeyword("CMD")).parseString("cmd CMD Cmd10") # -> ['CMD', 'CMD']30113012(Contrast with example for :class:`CaselessLiteral`.)3013"""3014def __init__(self, matchString, identChars=None):3015super(CaselessKeyword, self).__init__(matchString, identChars, caseless=True)30163017class CloseMatch(Token):3018"""A variation on :class:`Literal` which matches "close" matches,3019that is, strings with at most 'n' mismatching characters.3020:class:`CloseMatch` takes parameters:30213022- ``match_string`` - string to be matched3023- ``maxMismatches`` - (``default=1``) maximum number of3024mismatches allowed to count as a match30253026The results from a successful parse will contain the matched text3027from the input string and the following named results:30283029- ``mismatches`` - a list of the positions within the3030match_string where mismatches were found3031- ``original`` - the original match_string used to compare3032against the input string30333034If ``mismatches`` is an empty list, then the match was an exact3035match.30363037Example::30383039patt = CloseMatch("ATCATCGAATGGA")3040patt.parseString("ATCATCGAAXGGA") # -> (['ATCATCGAAXGGA'], {'mismatches': [[9]], 'original': ['ATCATCGAATGGA']})3041patt.parseString("ATCAXCGAAXGGA") # -> Exception: Expected 'ATCATCGAATGGA' (with up to 1 mismatches) (at char 0), (line:1, col:1)30423043# exact match3044patt.parseString("ATCATCGAATGGA") # -> (['ATCATCGAATGGA'], {'mismatches': [[]], 'original': ['ATCATCGAATGGA']})30453046# close match allowing up to 2 mismatches3047patt = CloseMatch("ATCATCGAATGGA", maxMismatches=2)3048patt.parseString("ATCAXCGAAXGGA") # -> (['ATCAXCGAAXGGA'], {'mismatches': [[4, 9]], 'original': ['ATCATCGAATGGA']})3049"""3050def __init__(self, match_string, maxMismatches=1):3051super(CloseMatch, self).__init__()3052self.name = match_string3053self.match_string = match_string3054self.maxMismatches = 
            maxMismatches
        self.errmsg = "Expected %r (with up to %d mismatches)" % (self.match_string, self.maxMismatches)
        self.mayIndexError = False
        self.mayReturnEmpty = False

    def parseImpl(self, instring, loc, doActions=True):
        start = loc
        instrlen = len(instring)
        maxloc = start + len(self.match_string)

        if maxloc <= instrlen:
            match_string = self.match_string
            match_stringloc = 0
            mismatches = []
            maxMismatches = self.maxMismatches

            for match_stringloc, s_m in enumerate(zip(instring[loc:maxloc], match_string)):
                src, mat = s_m
                if src != mat:
                    mismatches.append(match_stringloc)
                    if len(mismatches) > maxMismatches:
                        break
            else:
                loc = match_stringloc + 1
                results = ParseResults([instring[start:loc]])
                results['original'] = match_string
                results['mismatches'] = mismatches
                return loc, results

        raise ParseException(instring, loc, self.errmsg, self)


class Word(Token):
    """Token for matching words composed of allowed character sets.
    Defined with string containing all allowed initial characters, an
    optional string containing allowed body characters (if omitted,
    defaults to the initial character set), and an optional minimum,
    maximum, and/or exact length. The default value for ``min`` is
    1 (a minimum value < 1 is not valid); the default values for
    ``max`` and ``exact`` are 0, meaning no maximum or exact
    length restriction. An optional ``excludeChars`` parameter can
    list characters that might be found in the input ``bodyChars``
    string; useful to define a word of all printables except for one or
    two characters, for instance.

    :class:`srange` is useful for defining custom character set strings
    for defining ``Word`` expressions, using range notation from
    regular expression character sets.

    A common mistake is to use :class:`Word` to match a specific literal
    string, as in ``Word("Address")``.
    Remember that :class:`Word`
    uses the string argument to define *sets* of matchable characters.
    This expression would match "Add", "AAA", "dAred", or any other word
    made up of the characters 'A', 'd', 'r', 'e', and 's'. To match an
    exact literal string, use :class:`Literal` or :class:`Keyword`.

    pyparsing includes helper strings for building Words:

    - :class:`alphas`
    - :class:`nums`
    - :class:`alphanums`
    - :class:`hexnums`
    - :class:`alphas8bit` (alphabetic characters in ASCII range 128-255
      - accented, tilded, umlauted, etc.)
    - :class:`punc8bit` (non-alphabetic characters in ASCII range
      128-255 - currency, symbols, superscripts, diacriticals, etc.)
    - :class:`printables` (any non-whitespace character)

    Example::

        # a word composed of digits
        integer = Word(nums) # equivalent to Word("0123456789") or Word(srange("0-9"))

        # a word with a leading capital, and zero or more lowercase
        capital_word = Word(alphas.upper(), alphas.lower())

        # hostnames are alphanumeric, with leading alpha, and '-'
        hostname = Word(alphas, alphanums + '-')

        # roman numeral (not a strict parser, accepts invalid mix of characters)
        roman = Word("IVXLCDM")

        # any string of non-whitespace characters, except for ','
        csv_value = Word(printables, excludeChars=",")
    """
    def __init__(self, initChars, bodyChars=None, min=1, max=0, exact=0, asKeyword=False, excludeChars=None):
        super(Word, self).__init__()
        if excludeChars:
            excludeChars = set(excludeChars)
            initChars = ''.join(c for c in initChars if c not in excludeChars)
            if bodyChars:
                bodyChars = ''.join(c for c in bodyChars if c not in excludeChars)
        self.initCharsOrig = initChars
        self.initChars = set(initChars)
        if bodyChars:
            self.bodyCharsOrig = bodyChars
            self.bodyChars = set(bodyChars)
        else:
            self.bodyCharsOrig = initChars
            self.bodyChars = set(initChars)

        self.maxSpecified = max > 0

        if min < 1:
            raise
                ValueError("cannot specify a minimum length < 1; use Optional(Word()) if zero-length word is permitted")

        self.minLen = min

        if max > 0:
            self.maxLen = max
        else:
            self.maxLen = _MAX_INT

        if exact > 0:
            self.maxLen = exact
            self.minLen = exact

        self.name = _ustr(self)
        self.errmsg = "Expected " + self.name
        self.mayIndexError = False
        self.asKeyword = asKeyword

        if ' ' not in self.initCharsOrig + self.bodyCharsOrig and (min == 1 and max == 0 and exact == 0):
            if self.bodyCharsOrig == self.initCharsOrig:
                self.reString = "[%s]+" % _escapeRegexRangeChars(self.initCharsOrig)
            elif len(self.initCharsOrig) == 1:
                self.reString = "%s[%s]*" % (re.escape(self.initCharsOrig),
                                             _escapeRegexRangeChars(self.bodyCharsOrig),)
            else:
                self.reString = "[%s][%s]*" % (_escapeRegexRangeChars(self.initCharsOrig),
                                               _escapeRegexRangeChars(self.bodyCharsOrig),)
            if self.asKeyword:
                self.reString = r"\b" + self.reString + r"\b"

            try:
                self.re = re.compile(self.reString)
            except Exception:
                self.re = None
            else:
                self.re_match = self.re.match
                self.__class__ = _WordRegex

    def parseImpl(self, instring, loc, doActions=True):
        if instring[loc] not in self.initChars:
            raise ParseException(instring, loc, self.errmsg, self)

        start = loc
        loc += 1
        instrlen = len(instring)
        bodychars = self.bodyChars
        maxloc = start + self.maxLen
        maxloc = min(maxloc, instrlen)
        while loc < maxloc and instring[loc] in bodychars:
            loc += 1

        throwException = False
        if loc - start < self.minLen:
            throwException = True
        elif self.maxSpecified and loc < instrlen and instring[loc] in bodychars:
            throwException = True
        elif self.asKeyword:
            if (start > 0 and instring[start - 1] in bodychars
                    or loc < instrlen and instring[loc] in bodychars):
                throwException = True

        if throwException:
            raise ParseException(instring, loc, self.errmsg, self)

        return loc,
            instring[start:loc]

    def __str__(self):
        try:
            return super(Word, self).__str__()
        except Exception:
            pass

        if self.strRepr is None:

            def charsAsStr(s):
                if len(s) > 4:
                    return s[:4] + "..."
                else:
                    return s

            if self.initCharsOrig != self.bodyCharsOrig:
                self.strRepr = "W:(%s, %s)" % (charsAsStr(self.initCharsOrig), charsAsStr(self.bodyCharsOrig))
            else:
                self.strRepr = "W:(%s)" % charsAsStr(self.initCharsOrig)

        return self.strRepr

class _WordRegex(Word):
    def parseImpl(self, instring, loc, doActions=True):
        result = self.re_match(instring, loc)
        if not result:
            raise ParseException(instring, loc, self.errmsg, self)

        loc = result.end()
        return loc, result.group()


class Char(_WordRegex):
    """A short-cut class for defining ``Word(characters, exact=1)``,
    when defining a match of any single character in a string of
    characters.
    """
    def __init__(self, charset, asKeyword=False, excludeChars=None):
        super(Char, self).__init__(charset, exact=1, asKeyword=asKeyword, excludeChars=excludeChars)
        self.reString = "[%s]" % _escapeRegexRangeChars(''.join(self.initChars))
        if asKeyword:
            self.reString = r"\b%s\b" % self.reString
        self.re = re.compile(self.reString)
        self.re_match = self.re.match


class Regex(Token):
    r"""Token for matching strings that match a given regular
    expression.
    Defined with string specifying the regular expression in
    a form recognized by the stdlib Python `re module <https://docs.python.org/3/library/re.html>`_.
    If the given regex contains named groups (defined using ``(?P<name>...)``),
    these will be preserved as named parse results.

    If instead of the Python stdlib re module you wish to use a different RE module
    (such as the `regex` module), you can do so by building your
    Regex object with a compiled RE that was compiled using regex.

    Example::

        realnum = Regex(r"[+-]?\d+\.\d*")
        date = Regex(r'(?P<year>\d{4})-(?P<month>\d\d?)-(?P<day>\d\d?)')
        # ref: https://stackoverflow.com/questions/267399/how-do-you-match-only-valid-roman-numerals-with-a-regular-expression
        roman = Regex(r"M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")

        # use regex module instead of stdlib re module to construct a Regex using
        # a compiled regular expression
        import regex
        parser = Regex(regex.compile(r'[0-9]'))

    """
    def __init__(self, pattern, flags=0, asGroupList=False, asMatch=False):
        """The parameters ``pattern`` and ``flags`` are passed
        to the ``re.compile()`` function as-is.
        See the Python
        `re module <https://docs.python.org/3/library/re.html>`_ documentation for an
        explanation of the acceptable patterns and flags.
        """
        super(Regex, self).__init__()

        if isinstance(pattern, basestring):
            if not pattern:
                warnings.warn("null string passed to Regex; use Empty() instead",
                              SyntaxWarning, stacklevel=2)

            self.pattern = pattern
            self.flags = flags

            try:
                self.re = re.compile(self.pattern, self.flags)
                self.reString = self.pattern
            except sre_constants.error:
                warnings.warn("invalid pattern (%s) passed to Regex" % pattern,
                              SyntaxWarning, stacklevel=2)
                raise

        elif hasattr(pattern, 'pattern') and hasattr(pattern, 'match'):
            self.re = pattern
            self.pattern = self.reString = pattern.pattern
            self.flags = flags

        else:
            raise TypeError("Regex may only be constructed with a string or a compiled RE object")

        self.re_match = self.re.match

        self.name = _ustr(self)
        self.errmsg = "Expected " + self.name
        self.mayIndexError = False
        self.mayReturnEmpty = self.re_match("") is not None
        self.asGroupList = asGroupList
        self.asMatch = asMatch
        if self.asGroupList:
            self.parseImpl = self.parseImplAsGroupList
        if self.asMatch:
            self.parseImpl = self.parseImplAsMatch

    def parseImpl(self, instring, loc, doActions=True):
        result = self.re_match(instring, loc)
        if not result:
            raise ParseException(instring, loc, self.errmsg, self)

        loc = result.end()
        ret = ParseResults(result.group())
        d = result.groupdict()
        if d:
            for k, v in d.items():
                ret[k] = v
        return loc, ret

    def parseImplAsGroupList(self, instring, loc, doActions=True):
        result = self.re_match(instring, loc)
        if not result:
            raise ParseException(instring, loc, self.errmsg, self)

        loc = result.end()
        ret = result.groups()
        return loc, ret

    def parseImplAsMatch(self, instring, loc, doActions=True):
        result = self.re_match(instring, loc)
        if not
                result:
            raise ParseException(instring, loc, self.errmsg, self)

        loc = result.end()
        ret = result
        return loc, ret

    def __str__(self):
        try:
            return super(Regex, self).__str__()
        except Exception:
            pass

        if self.strRepr is None:
            self.strRepr = "Re:(%s)" % repr(self.pattern)

        return self.strRepr

    def sub(self, repl):
        r"""
        Return Regex with an attached parse action to transform the parsed
        result as if called using `re.sub(expr, repl, string) <https://docs.python.org/3/library/re.html#re.sub>`_.

        Example::

            make_html = Regex(r"(\w+):(.*?):").sub(r"<\1>\2</\1>")
            print(make_html.transformString("h1:main title:"))
            # prints "<h1>main title</h1>"
        """
        if self.asGroupList:
            warnings.warn("cannot use sub() with Regex(asGroupList=True)",
                          SyntaxWarning, stacklevel=2)
            raise SyntaxError()

        if self.asMatch and callable(repl):
            warnings.warn("cannot use sub() with a callable with Regex(asMatch=True)",
                          SyntaxWarning, stacklevel=2)
            raise SyntaxError()

        if self.asMatch:
            def pa(tokens):
                return tokens[0].expand(repl)
        else:
            def pa(tokens):
                return self.re.sub(repl, tokens[0])
        return self.addParseAction(pa)

class QuotedString(Token):
    r"""
    Token for matching strings that are delimited by quoting characters.

    Defined with the following parameters:

    - quoteChar - string of one or more characters defining the
      quote delimiting string
    - escChar - character to escape quotes, typically backslash
      (default= ``None``)
    - escQuote - special quote sequence to escape an embedded quote
      string (such as SQL's ``""`` to escape an embedded ``"``)
      (default= ``None``)
    - multiline - boolean indicating whether quotes can span
      multiple lines (default= ``False``)
    - unquoteResults - boolean indicating whether the matched text
      should be unquoted (default= ``True``)
    - endQuoteChar - string of one or more characters defining the
      end of
      the quote delimited string (default= ``None`` => same as
      quoteChar)
    - convertWhitespaceEscapes - convert escaped whitespace
      (``'\t'``, ``'\n'``, etc.) to actual whitespace
      (default= ``True``)

    Example::

        qs = QuotedString('"')
        print(qs.searchString('lsjdf "This is the quote" sldjf'))
        complex_qs = QuotedString('{{', endQuoteChar='}}')
        print(complex_qs.searchString('lsjdf {{This is the "quote"}} sldjf'))
        sql_qs = QuotedString('"', escQuote='""')
        print(sql_qs.searchString('lsjdf "This is the quote with ""embedded"" quotes" sldjf'))

    prints::

        [['This is the quote']]
        [['This is the "quote"']]
        [['This is the quote with "embedded" quotes']]
    """
    def __init__(self, quoteChar, escChar=None, escQuote=None, multiline=False,
                 unquoteResults=True, endQuoteChar=None, convertWhitespaceEscapes=True):
        super(QuotedString, self).__init__()

        # remove white space from quote chars - won't work anyway
        quoteChar = quoteChar.strip()
        if not quoteChar:
            warnings.warn("quoteChar cannot be the empty string", SyntaxWarning, stacklevel=2)
            raise SyntaxError()

        if endQuoteChar is None:
            endQuoteChar = quoteChar
        else:
            endQuoteChar = endQuoteChar.strip()
            if not endQuoteChar:
                warnings.warn("endQuoteChar cannot be the empty string", SyntaxWarning, stacklevel=2)
                raise SyntaxError()

        self.quoteChar = quoteChar
        self.quoteCharLen = len(quoteChar)
        self.firstQuoteChar = quoteChar[0]
        self.endQuoteChar = endQuoteChar
        self.endQuoteCharLen = len(endQuoteChar)
        self.escChar = escChar
        self.escQuote = escQuote
        self.unquoteResults = unquoteResults
        self.convertWhitespaceEscapes = convertWhitespaceEscapes

        if multiline:
            self.flags = re.MULTILINE | re.DOTALL
            self.pattern = r'%s(?:[^%s%s]' % (re.escape(self.quoteChar),
                                              _escapeRegexRangeChars(self.endQuoteChar[0]),
                                              (escChar is not None and _escapeRegexRangeChars(escChar) or ''))
        else:
            self.flags =
                0
            self.pattern = r'%s(?:[^%s\n\r%s]' % (re.escape(self.quoteChar),
                                                  _escapeRegexRangeChars(self.endQuoteChar[0]),
                                                  (escChar is not None and _escapeRegexRangeChars(escChar) or ''))
        if len(self.endQuoteChar) > 1:
            self.pattern += (
                '|(?:' + ')|(?:'.join("%s[^%s]" % (re.escape(self.endQuoteChar[:i]),
                                                   _escapeRegexRangeChars(self.endQuoteChar[i]))
                                      for i in range(len(self.endQuoteChar) - 1, 0, -1)) + ')')

        if escQuote:
            self.pattern += (r'|(?:%s)' % re.escape(escQuote))
        if escChar:
            self.pattern += (r'|(?:%s.)' % re.escape(escChar))
            self.escCharReplacePattern = re.escape(self.escChar) + "(.)"
        self.pattern += (r')*%s' % re.escape(self.endQuoteChar))

        try:
            self.re = re.compile(self.pattern, self.flags)
            self.reString = self.pattern
            self.re_match = self.re.match
        except sre_constants.error:
            warnings.warn("invalid pattern (%s) passed to Regex" % self.pattern,
                          SyntaxWarning, stacklevel=2)
            raise

        self.name = _ustr(self)
        self.errmsg = "Expected " + self.name
        self.mayIndexError = False
        self.mayReturnEmpty = True

    def parseImpl(self, instring, loc, doActions=True):
        result = instring[loc] == self.firstQuoteChar and self.re_match(instring, loc) or None
        if not result:
            raise ParseException(instring, loc, self.errmsg, self)

        loc = result.end()
        ret = result.group()

        if self.unquoteResults:

            # strip off quotes
            ret = ret[self.quoteCharLen: -self.endQuoteCharLen]

            if isinstance(ret, basestring):
                # replace escaped whitespace
                if '\\' in ret and self.convertWhitespaceEscapes:
                    ws_map = {
                        r'\t': '\t',
                        r'\n': '\n',
                        r'\f': '\f',
                        r'\r': '\r',
                    }
                    for wslit, wschar in ws_map.items():
                        ret = ret.replace(wslit, wschar)

                # replace escaped characters
                if self.escChar:
                    ret = re.sub(self.escCharReplacePattern, r"\g<1>", ret)

                # replace escaped quotes
                if self.escQuote:
                    ret = ret.replace(self.escQuote,
                                      self.endQuoteChar)

        return loc, ret

    def __str__(self):
        try:
            return super(QuotedString, self).__str__()
        except Exception:
            pass

        if self.strRepr is None:
            self.strRepr = "quoted string, starting with %s ending with %s" % (self.quoteChar, self.endQuoteChar)

        return self.strRepr


class CharsNotIn(Token):
    """Token for matching words composed of characters *not* in a given
    set (will include whitespace in matched characters if not listed in
    the provided exclusion set - see example). Defined with string
    containing all disallowed characters, and an optional minimum,
    maximum, and/or exact length. The default value for ``min`` is
    1 (a minimum value < 1 is not valid); the default values for
    ``max`` and ``exact`` are 0, meaning no maximum or exact
    length restriction.

    Example::

        # define a comma-separated-value as anything that is not a ','
        csv_value = CharsNotIn(',')
        print(delimitedList(csv_value).parseString("dkls,lsdkjf,s12 34,@!#,213"))

    prints::

        ['dkls', 'lsdkjf', 's12 34', '@!#', '213']
    """
    def __init__(self, notChars, min=1, max=0, exact=0):
        super(CharsNotIn, self).__init__()
        self.skipWhitespace = False
        self.notChars = notChars

        if min < 1:
            raise ValueError("cannot specify a minimum length < 1; use "
                             "Optional(CharsNotIn()) if zero-length char group is permitted")

        self.minLen = min

        if max > 0:
            self.maxLen = max
        else:
            self.maxLen = _MAX_INT

        if exact > 0:
            self.maxLen = exact
            self.minLen = exact

        self.name = _ustr(self)
        self.errmsg = "Expected " + self.name
        self.mayReturnEmpty = (self.minLen == 0)
        self.mayIndexError = False

    def parseImpl(self, instring, loc, doActions=True):
        if instring[loc] in self.notChars:
            raise ParseException(instring, loc, self.errmsg, self)

        start = loc
        loc += 1
        notchars = self.notChars
        maxlen = min(start + self.maxLen, len(instring))
        while
              loc < maxlen and instring[loc] not in notchars:
            loc += 1

        if loc - start < self.minLen:
            raise ParseException(instring, loc, self.errmsg, self)

        return loc, instring[start:loc]

    def __str__(self):
        try:
            return super(CharsNotIn, self).__str__()
        except Exception:
            pass

        if self.strRepr is None:
            if len(self.notChars) > 4:
                self.strRepr = "!W:(%s...)" % self.notChars[:4]
            else:
                self.strRepr = "!W:(%s)" % self.notChars

        return self.strRepr

class White(Token):
    """Special matching class for matching whitespace. Normally,
    whitespace is ignored by pyparsing grammars. This class is included
    when some whitespace structures are significant. Define with
    a string containing the whitespace characters to be matched; default
    is ``" \\t\\r\\n"``. Also takes optional ``min``,
    ``max``, and ``exact`` arguments, as defined for the
    :class:`Word` class.
    """
    whiteStrs = {
        ' ' : '<SP>',
        '\t': '<TAB>',
        '\n': '<LF>',
        '\r': '<CR>',
        '\f': '<FF>',
        u'\u00A0': '<NBSP>',
        u'\u1680': '<OGHAM_SPACE_MARK>',
        u'\u180E': '<MONGOLIAN_VOWEL_SEPARATOR>',
        u'\u2000': '<EN_QUAD>',
        u'\u2001': '<EM_QUAD>',
        u'\u2002': '<EN_SPACE>',
        u'\u2003': '<EM_SPACE>',
        u'\u2004': '<THREE-PER-EM_SPACE>',
        u'\u2005': '<FOUR-PER-EM_SPACE>',
        u'\u2006': '<SIX-PER-EM_SPACE>',
        u'\u2007': '<FIGURE_SPACE>',
        u'\u2008': '<PUNCTUATION_SPACE>',
        u'\u2009': '<THIN_SPACE>',
        u'\u200A': '<HAIR_SPACE>',
        u'\u200B': '<ZERO_WIDTH_SPACE>',
        u'\u202F': '<NNBSP>',
        u'\u205F': '<MMSP>',
        u'\u3000': '<IDEOGRAPHIC_SPACE>',
    }
    def __init__(self, ws=" \t\r\n", min=1, max=0, exact=0):
        super(White, self).__init__()
        self.matchWhite = ws
        self.setWhitespaceChars("".join(c for c in self.whiteChars if c not in self.matchWhite))
        # ~ self.leaveWhitespace()
        self.name = ("".join(White.whiteStrs[c] for c in self.matchWhite))
        self.mayReturnEmpty = True
        self.errmsg = "Expected " +
                      self.name

        self.minLen = min

        if max > 0:
            self.maxLen = max
        else:
            self.maxLen = _MAX_INT

        if exact > 0:
            self.maxLen = exact
            self.minLen = exact

    def parseImpl(self, instring, loc, doActions=True):
        if instring[loc] not in self.matchWhite:
            raise ParseException(instring, loc, self.errmsg, self)
        start = loc
        loc += 1
        maxloc = start + self.maxLen
        maxloc = min(maxloc, len(instring))
        while loc < maxloc and instring[loc] in self.matchWhite:
            loc += 1

        if loc - start < self.minLen:
            raise ParseException(instring, loc, self.errmsg, self)

        return loc, instring[start:loc]


class _PositionToken(Token):
    def __init__(self):
        super(_PositionToken, self).__init__()
        self.name = self.__class__.__name__
        self.mayReturnEmpty = True
        self.mayIndexError = False

class GoToColumn(_PositionToken):
    """Token to advance to a specific column of input text; useful for
    tabular report scraping.
    """
    def __init__(self, colno):
        super(GoToColumn, self).__init__()
        self.col = colno

    def preParse(self, instring, loc):
        if col(loc, instring) != self.col:
            instrlen = len(instring)
            if self.ignoreExprs:
                loc = self._skipIgnorables(instring, loc)
            while loc < instrlen and instring[loc].isspace() and col(loc, instring) != self.col:
                loc += 1
        return loc

    def parseImpl(self, instring, loc, doActions=True):
        thiscol = col(loc, instring)
        if thiscol > self.col:
            raise ParseException(instring, loc, "Text not in expected column", self)
        newloc = loc + self.col - thiscol
        ret = instring[loc: newloc]
        return newloc, ret


class LineStart(_PositionToken):
    r"""Matches if current position is at the beginning of a line within
    the parse string

    Example::

        test = '''\
        AAA this line
        AAA and this line
          AAA but not this one
        B AAA and definitely not this one
        '''

        for t in (LineStart() + 'AAA' +
                  restOfLine).searchString(test):
            print(t)

    prints::

        ['AAA', ' this line']
        ['AAA', ' and this line']

    """
    def __init__(self):
        super(LineStart, self).__init__()
        self.errmsg = "Expected start of line"

    def parseImpl(self, instring, loc, doActions=True):
        if col(loc, instring) == 1:
            return loc, []
        raise ParseException(instring, loc, self.errmsg, self)

class LineEnd(_PositionToken):
    """Matches if current position is at the end of a line within the
    parse string
    """
    def __init__(self):
        super(LineEnd, self).__init__()
        self.setWhitespaceChars(ParserElement.DEFAULT_WHITE_CHARS.replace("\n", ""))
        self.errmsg = "Expected end of line"

    def parseImpl(self, instring, loc, doActions=True):
        if loc < len(instring):
            if instring[loc] == "\n":
                return loc + 1, "\n"
            else:
                raise ParseException(instring, loc, self.errmsg, self)
        elif loc == len(instring):
            return loc + 1, []
        else:
            raise ParseException(instring, loc, self.errmsg, self)

class StringStart(_PositionToken):
    """Matches if current position is at the beginning of the parse
    string
    """
    def __init__(self):
        super(StringStart, self).__init__()
        self.errmsg = "Expected start of text"

    def parseImpl(self, instring, loc, doActions=True):
        if loc != 0:
            # see if entire string up to here is just whitespace and ignoreables
            if loc != self.preParse(instring, 0):
                raise ParseException(instring, loc, self.errmsg, self)
        return loc, []

class StringEnd(_PositionToken):
    """Matches if current position is at the end of the parse string
    """
    def __init__(self):
        super(StringEnd, self).__init__()
        self.errmsg = "Expected end of text"

    def parseImpl(self, instring, loc, doActions=True):
        if loc < len(instring):
            raise ParseException(instring, loc, self.errmsg, self)
        elif loc == len(instring):
            return loc + 1, []
        elif loc > len(instring):
            return loc, []
        else:
            raise
                ParseException(instring, loc, self.errmsg, self)

class WordStart(_PositionToken):
    """Matches if the current position is at the beginning of a Word,
    and is not preceded by any character in a given set of
    ``wordChars`` (default= ``printables``). To emulate the
    ``\b`` behavior of regular expressions, use
    ``WordStart(alphanums)``. ``WordStart`` will also match at
    the beginning of the string being parsed, or at the beginning of
    a line.
    """
    def __init__(self, wordChars=printables):
        super(WordStart, self).__init__()
        self.wordChars = set(wordChars)
        self.errmsg = "Not at the start of a word"

    def parseImpl(self, instring, loc, doActions=True):
        if loc != 0:
            if (instring[loc - 1] in self.wordChars
                    or instring[loc] not in self.wordChars):
                raise ParseException(instring, loc, self.errmsg, self)
        return loc, []

class WordEnd(_PositionToken):
    """Matches if the current position is at the end of a Word, and is
    not followed by any character in a given set of ``wordChars``
    (default= ``printables``). To emulate the ``\b`` behavior of
    regular expressions, use ``WordEnd(alphanums)``.
    ``WordEnd``
    will also match at the end of the string being parsed, or at the end
    of a line.
    """
    def __init__(self, wordChars=printables):
        super(WordEnd, self).__init__()
        self.wordChars = set(wordChars)
        self.skipWhitespace = False
        self.errmsg = "Not at the end of a word"

    def parseImpl(self, instring, loc, doActions=True):
        instrlen = len(instring)
        if instrlen > 0 and loc < instrlen:
            if (instring[loc] in self.wordChars or
                    instring[loc - 1] not in self.wordChars):
                raise ParseException(instring, loc, self.errmsg, self)
        return loc, []


class ParseExpression(ParserElement):
    """Abstract subclass of ParserElement, for combining and
    post-processing parsed tokens.
    """
    def __init__(self, exprs, savelist=False):
        super(ParseExpression, self).__init__(savelist)
        if isinstance(exprs, _generatorType):
            exprs = list(exprs)

        if isinstance(exprs, basestring):
            self.exprs = [self._literalStringClass(exprs)]
        elif isinstance(exprs, ParserElement):
            self.exprs = [exprs]
        elif isinstance(exprs, Iterable):
            exprs = list(exprs)
            # if sequence of strings provided, wrap with Literal
            if any(isinstance(expr, basestring) for expr in exprs):
                exprs = (self._literalStringClass(e) if isinstance(e, basestring) else e for e in exprs)
            self.exprs = list(exprs)
        else:
            try:
                self.exprs = list(exprs)
            except TypeError:
                self.exprs = [exprs]
        self.callPreparse = False

    def append(self, other):
        self.exprs.append(other)
        self.strRepr = None
        return self

    def leaveWhitespace(self):
        """Extends ``leaveWhitespace`` defined in base class, and also invokes ``leaveWhitespace`` on
           all contained expressions."""
        self.skipWhitespace = False
        self.exprs = [e.copy() for e in self.exprs]
        for e in self.exprs:
            e.leaveWhitespace()
        return self

    def ignore(self, other):
        if isinstance(other, Suppress):
            if other not in self.ignoreExprs:
                super(ParseExpression,
                      self).ignore(other)
                for e in self.exprs:
                    e.ignore(self.ignoreExprs[-1])
        else:
            super(ParseExpression, self).ignore(other)
            for e in self.exprs:
                e.ignore(self.ignoreExprs[-1])
        return self

    def __str__(self):
        try:
            return super(ParseExpression, self).__str__()
        except Exception:
            pass

        if self.strRepr is None:
            self.strRepr = "%s:(%s)" % (self.__class__.__name__, _ustr(self.exprs))
        return self.strRepr

    def streamline(self):
        super(ParseExpression, self).streamline()

        for e in self.exprs:
            e.streamline()

        # collapse nested And's of the form And(And(And(a, b), c), d) to And(a, b, c, d)
        # but only if there are no parse actions or resultsNames on the nested And's
        # (likewise for Or's and MatchFirst's)
        if len(self.exprs) == 2:
            other = self.exprs[0]
            if (isinstance(other, self.__class__)
                    and not other.parseAction
                    and other.resultsName is None
                    and not other.debug):
                self.exprs = other.exprs[:] + [self.exprs[1]]
                self.strRepr = None
                self.mayReturnEmpty |= other.mayReturnEmpty
                self.mayIndexError |= other.mayIndexError

            other = self.exprs[-1]
            if (isinstance(other, self.__class__)
                    and not other.parseAction
                    and other.resultsName is None
                    and not other.debug):
                self.exprs = self.exprs[:-1] + other.exprs[:]
                self.strRepr = None
                self.mayReturnEmpty |= other.mayReturnEmpty
                self.mayIndexError |= other.mayIndexError

        self.errmsg = "Expected " + _ustr(self)

        return self

    def validate(self, validateTrace=None):
        tmp = (validateTrace if validateTrace is not None else [])[:] + [self]
        for e in self.exprs:
            e.validate(tmp)
        self.checkRecursion([])

    def copy(self):
        ret = super(ParseExpression, self).copy()
        ret.exprs = [e.copy() for e in self.exprs]
        return ret

    def _setResultsName(self, name, listAllMatches=False):
        if __diag__.warn_ungrouped_named_tokens_in_collection:
            for e in self.exprs:
                if isinstance(e,
                              ParserElement) and e.resultsName:
                    warnings.warn("{0}: setting results name {1!r} on {2} expression "
                                  "collides with {3!r} on contained expression".format("warn_ungrouped_named_tokens_in_collection",
                                                                                       name,
                                                                                       type(self).__name__,
                                                                                       e.resultsName),
                                  stacklevel=3)

        return super(ParseExpression, self)._setResultsName(name, listAllMatches)


class And(ParseExpression):
    """
    Requires all given :class:`ParseExpression` s to be found in the given order.
    Expressions may be separated by whitespace.
    May be constructed using the ``'+'`` operator.
    May also be constructed using the ``'-'`` operator, which will
    suppress backtracking.

    Example::

        integer = Word(nums)
        name_expr = OneOrMore(Word(alphas))

        expr = And([integer("id"), name_expr("name"), integer("age")])
        # more easily written as:
        expr = integer("id") + name_expr("name") + integer("age")
    """

    class _ErrorStop(Empty):
        def __init__(self, *args, **kwargs):
            super(And._ErrorStop, self).__init__(*args, **kwargs)
            self.name = '-'
            self.leaveWhitespace()

    def __init__(self, exprs, savelist=True):
        exprs = list(exprs)
        if exprs and Ellipsis in exprs:
            tmp = []
            for i, expr in enumerate(exprs):
                if expr is Ellipsis:
                    if i < len(exprs) - 1:
                        skipto_arg = (Empty() + exprs[i + 1]).exprs[-1]
                        tmp.append(SkipTo(skipto_arg)("_skipped*"))
                    else:
                        raise Exception("cannot construct And with sequence ending in ...")
                else:
                    tmp.append(expr)
            exprs[:] = tmp
        super(And, self).__init__(exprs, savelist)
        self.mayReturnEmpty = all(e.mayReturnEmpty for e in self.exprs)
        self.setWhitespaceChars(self.exprs[0].whiteChars)
        self.skipWhitespace = self.exprs[0].skipWhitespace
        self.callPreparse = True

    def streamline(self):
        # collapse any _PendingSkip's
        if self.exprs:
            if any(isinstance(e, ParseExpression) and e.exprs and isinstance(e.exprs[-1], _PendingSkip)
                   for e in self.exprs[:-1]):
                for i, e in
                        enumerate(self.exprs[:-1]):
                    if e is None:
                        continue
                    if (isinstance(e, ParseExpression)
                            and e.exprs and isinstance(e.exprs[-1], _PendingSkip)):
                        e.exprs[-1] = e.exprs[-1] + self.exprs[i + 1]
                        self.exprs[i + 1] = None
                self.exprs = [e for e in self.exprs if e is not None]

        super(And, self).streamline()
        self.mayReturnEmpty = all(e.mayReturnEmpty for e in self.exprs)
        return self

    def parseImpl(self, instring, loc, doActions=True):
        # pass False as last arg to _parse for first element, since we already
        # pre-parsed the string as part of our And pre-parsing
        loc, resultlist = self.exprs[0]._parse(instring, loc, doActions, callPreParse=False)
        errorStop = False
        for e in self.exprs[1:]:
            if isinstance(e, And._ErrorStop):
                errorStop = True
                continue
            if errorStop:
                try:
                    loc, exprtokens = e._parse(instring, loc, doActions)
                except ParseSyntaxException:
                    raise
                except ParseBaseException as pe:
                    pe.__traceback__ = None
                    raise ParseSyntaxException._from_exception(pe)
                except IndexError:
                    raise ParseSyntaxException(instring, len(instring), self.errmsg, self)
            else:
                loc, exprtokens = e._parse(instring, loc, doActions)
            if exprtokens or exprtokens.haskeys():
                resultlist += exprtokens
        return loc, resultlist

    def __iadd__(self, other):
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        return self.append(other)  # And([self, other])

    def checkRecursion(self, parseElementList):
        subRecCheckList = parseElementList[:] + [self]
        for e in self.exprs:
            e.checkRecursion(subRecCheckList)
            if not e.mayReturnEmpty:
                break

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "{" + " ".join(_ustr(e) for e in self.exprs) + "}"

        return self.strRepr


class Or(ParseExpression):
    """Requires that at least one :class:`ParseExpression` is found.
    If two expressions match, the expression that matches the longest
    string will be used. May be constructed using the ``'^'``
    operator.

    Example::

        # construct Or using '^' operator

        number = Word(nums) ^ Combine(Word(nums) + '.' + Word(nums))
        print(number.searchString("123 3.1416 789"))

    prints::

        [['123'], ['3.1416'], ['789']]
    """
    def __init__(self, exprs, savelist=False):
        super(Or, self).__init__(exprs, savelist)
        if self.exprs:
            self.mayReturnEmpty = any(e.mayReturnEmpty for e in self.exprs)
        else:
            self.mayReturnEmpty = True

    def streamline(self):
        super(Or, self).streamline()
        if __compat__.collect_all_And_tokens:
            self.saveAsList = any(e.saveAsList for e in self.exprs)
        return self

    def parseImpl(self, instring, loc, doActions=True):
        maxExcLoc = -1
        maxException = None
        matches = []
        for e in self.exprs:
            try:
                loc2 = e.tryParse(instring, loc)
            except ParseException as err:
                err.__traceback__ = None
                if err.loc > maxExcLoc:
                    maxException = err
                    maxExcLoc = err.loc
            except IndexError:
                if len(instring) > maxExcLoc:
                    maxException = ParseException(instring, len(instring), e.errmsg, self)
                    maxExcLoc = len(instring)
            else:
                # save match among all matches, to retry longest to shortest
                matches.append((loc2, e))

        if matches:
            # re-evaluate all matches in descending order of length of match, in case attached actions
            # might change whether or how much they match of the input.
            matches.sort(key=itemgetter(0), reverse=True)

            if not doActions:
                # no further conditions or parse actions to change the selection of
                # alternative, so the first match will be the best match
                best_expr = matches[0][1]
                return best_expr._parse(instring, loc, doActions)

            longest = -1, None
            for loc1, expr1 in matches:
                if loc1 <= longest[0]:
                    # already have a longer match than this one will deliver, we are done
                    return longest

                try:
                    loc2, toks = expr1._parse(instring, loc, doActions)
                except ParseException as err:
                    err.__traceback__ = None
                    if err.loc > maxExcLoc:
                        maxException = err
                        maxExcLoc = err.loc
                else:
                    if loc2 >= loc1:
                        return loc2, toks
                    # didn't match as much as before
                    elif loc2 > longest[0]:
                        longest = loc2, toks

            if longest != (-1, None):
                return longest

        if maxException is not None:
            maxException.msg = self.errmsg
            raise maxException
        else:
            raise ParseException(instring, loc, "no defined alternatives to match", self)

    def __ixor__(self, other):
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        return self.append(other)  # Or([self, other])

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "{" + " ^ ".join(_ustr(e) for e in self.exprs) + "}"

        return self.strRepr

    def checkRecursion(self, parseElementList):
        subRecCheckList = parseElementList[:] + [self]
        for e in self.exprs:
            e.checkRecursion(subRecCheckList)

    def _setResultsName(self, name, listAllMatches=False):
        if (not __compat__.collect_all_And_tokens
                and __diag__.warn_multiple_tokens_in_named_alternation):
            if any(isinstance(e, And) for e in self.exprs):
                warnings.warn("{0}: setting results name {1!r} on {2} expression "
                              "may only return a single token for an And alternative, "
                              "in future will return the full list of tokens".format(
                    "warn_multiple_tokens_in_named_alternation", name, type(self).__name__),
                    stacklevel=3)

        return super(Or, self)._setResultsName(name, listAllMatches)


class MatchFirst(ParseExpression):
    """Requires that at least one :class:`ParseExpression` is found. If
    two expressions match, the first one listed is the one that will
    match.
    May be constructed using the ``'|'`` operator.

    Example::

        # construct MatchFirst using '|' operator

        # watch the order of expressions to match
        number = Word(nums) | Combine(Word(nums) + '.' + Word(nums))
        print(number.searchString("123 3.1416 789")) #  Fail! -> [['123'], ['3'], ['1416'], ['789']]

        # put more selective expression first
        number = Combine(Word(nums) + '.' + Word(nums)) | Word(nums)
        print(number.searchString("123 3.1416 789")) #  Better -> [['123'], ['3.1416'], ['789']]
    """
    def __init__(self, exprs, savelist=False):
        super(MatchFirst, self).__init__(exprs, savelist)
        if self.exprs:
            self.mayReturnEmpty = any(e.mayReturnEmpty for e in self.exprs)
        else:
            self.mayReturnEmpty = True

    def streamline(self):
        super(MatchFirst, self).streamline()
        if __compat__.collect_all_And_tokens:
            self.saveAsList = any(e.saveAsList for e in self.exprs)
        return self

    def parseImpl(self, instring, loc, doActions=True):
        maxExcLoc = -1
        maxException = None
        for e in self.exprs:
            try:
                ret = e._parse(instring, loc, doActions)
                return ret
            except ParseException as err:
                if err.loc > maxExcLoc:
                    maxException = err
                    maxExcLoc = err.loc
            except IndexError:
                if len(instring) > maxExcLoc:
                    maxException = ParseException(instring, len(instring), e.errmsg, self)
                    maxExcLoc = len(instring)

        # only got here if no expression matched, raise exception for match that made it the furthest
        else:
            if maxException is not None:
                maxException.msg = self.errmsg
                raise maxException
            else:
                raise ParseException(instring, loc, "no defined alternatives to match", self)

    def __ior__(self, other):
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        return self.append(other)  # MatchFirst([self, other])

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "{" + " | ".join(_ustr(e) for e in self.exprs) + "}"

        return self.strRepr

    def checkRecursion(self, parseElementList):
        subRecCheckList = parseElementList[:] + [self]
        for e in self.exprs:
            e.checkRecursion(subRecCheckList)

    def _setResultsName(self, name, listAllMatches=False):
        if (not __compat__.collect_all_And_tokens
                and __diag__.warn_multiple_tokens_in_named_alternation):
            if any(isinstance(e, And) for e in self.exprs):
                warnings.warn("{0}: setting results name {1!r} on {2} expression "
                              "may only return a single token for an And alternative, "
                              "in future will return the full list of tokens".format(
                    "warn_multiple_tokens_in_named_alternation", name, type(self).__name__),
                    stacklevel=3)

        return super(MatchFirst, self)._setResultsName(name, listAllMatches)


class Each(ParseExpression):
    """Requires all given :class:`ParseExpression` s to be found, but in
    any order. Expressions may be separated by whitespace.

    May be constructed using the ``'&'`` operator.

    Example::

        color = oneOf("RED ORANGE YELLOW GREEN BLUE PURPLE BLACK WHITE BROWN")
        shape_type = oneOf("SQUARE CIRCLE TRIANGLE STAR HEXAGON OCTAGON")
        integer = Word(nums)
        shape_attr = "shape:" + shape_type("shape")
        posn_attr = "posn:" + Group(integer("x") + ',' + integer("y"))("posn")
        color_attr = "color:" + color("color")
        size_attr = "size:" + integer("size")

        # use Each (using operator '&') to accept attributes in any order
        # (shape and posn are required, color and size are optional)
        shape_spec = shape_attr & posn_attr & Optional(color_attr) & Optional(size_attr)

        shape_spec.runTests('''
            shape: SQUARE color: BLACK posn: 100, 120
            shape: CIRCLE size: 50 color: BLUE posn: 50,80
            color:GREEN size:20 shape:TRIANGLE posn:20,40
            '''
            )

    prints::

        shape: SQUARE color: BLACK posn: 100, 120
        ['shape:', 'SQUARE', 'color:', 'BLACK', 'posn:', ['100', ',', '120']]
        - color: BLACK
        - posn: ['100', ',', '120']
          - x: 100
          - y: 120
        - shape: SQUARE


        shape: CIRCLE size: 50 color: BLUE posn: 50,80
        ['shape:', 'CIRCLE', 'size:', '50', 'color:', 'BLUE', 'posn:', ['50', ',', '80']]
        - color: BLUE
        - posn: ['50', ',', '80']
          - x: 50
          - y: 80
        - shape: CIRCLE
        - size: 50


        color: GREEN size: 20 shape: TRIANGLE posn: 20,40
        ['color:', 'GREEN', 'size:', '20', 'shape:', 'TRIANGLE', 'posn:', ['20', ',', '40']]
        - color: GREEN
        - posn: ['20', ',', '40']
          - x: 20
          - y: 40
        - shape: TRIANGLE
        - size: 20
    """
    def __init__(self, exprs, savelist=True):
        super(Each, self).__init__(exprs, savelist)
        self.mayReturnEmpty = all(e.mayReturnEmpty for e in self.exprs)
        self.skipWhitespace = True
        self.initExprGroups = True
        self.saveAsList = True

    def streamline(self):
        super(Each, self).streamline()
        self.mayReturnEmpty = all(e.mayReturnEmpty for e in self.exprs)
        return self

    def parseImpl(self, instring, loc, doActions=True):
        if self.initExprGroups:
            self.opt1map = dict((id(e.expr), e) for e in self.exprs if isinstance(e, Optional))
            opt1 = [e.expr for e in self.exprs if isinstance(e, Optional)]
            opt2 = [e for e in self.exprs if e.mayReturnEmpty and not isinstance(e, (Optional, Regex))]
            self.optionals = opt1 + opt2
            self.multioptionals = [e.expr for e in self.exprs if isinstance(e, ZeroOrMore)]
            self.multirequired = [e.expr for e in self.exprs if isinstance(e, OneOrMore)]
            self.required = [e for e in self.exprs if not isinstance(e, (Optional, ZeroOrMore, OneOrMore))]
            self.required += self.multirequired
            self.initExprGroups = False
        tmpLoc = loc
        tmpReqd = self.required[:]
        tmpOpt = self.optionals[:]
        matchOrder = []

        keepMatching = True
        while keepMatching:
            tmpExprs = tmpReqd + tmpOpt + self.multioptionals + self.multirequired
            failed = []
            for e in tmpExprs:
                try:
                    tmpLoc = e.tryParse(instring, tmpLoc)
                except ParseException:
                    failed.append(e)
                else:
                    matchOrder.append(self.opt1map.get(id(e), e))
                    if e in tmpReqd:
                        tmpReqd.remove(e)
                    elif e in tmpOpt:
                        tmpOpt.remove(e)
            if len(failed) == len(tmpExprs):
                keepMatching = False

        if tmpReqd:
            missing = ", ".join(_ustr(e) for e in tmpReqd)
            raise ParseException(instring, loc, "Missing one or more required elements (%s)" % missing)

        # add any unmatched Optionals, in case they have default values defined
        matchOrder += [e for e in self.exprs if isinstance(e, Optional) and e.expr in tmpOpt]

        resultlist = []
        for e in matchOrder:
            loc, results = e._parse(instring, loc, doActions)
            resultlist.append(results)

        finalResults = sum(resultlist, ParseResults([]))
        return loc, finalResults

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "{" + " & ".join(_ustr(e) for e in self.exprs) + "}"

        return self.strRepr

    def checkRecursion(self, parseElementList):
        subRecCheckList = parseElementList[:] + [self]
        for e in self.exprs:
            e.checkRecursion(subRecCheckList)


class ParseElementEnhance(ParserElement):
    """Abstract subclass of :class:`ParserElement`, for combining and
    post-processing parsed tokens.
    """
    def __init__(self, expr, savelist=False):
        super(ParseElementEnhance, self).__init__(savelist)
        if isinstance(expr, basestring):
            if issubclass(self._literalStringClass, Token):
                expr = self._literalStringClass(expr)
            else:
                expr = self._literalStringClass(Literal(expr))
        self.expr = expr
        self.strRepr = None
        if expr is not None:
            self.mayIndexError = expr.mayIndexError
            self.mayReturnEmpty = expr.mayReturnEmpty
            self.setWhitespaceChars(expr.whiteChars)
            self.skipWhitespace = expr.skipWhitespace
            self.saveAsList = expr.saveAsList
            self.callPreparse = expr.callPreparse
            self.ignoreExprs.extend(expr.ignoreExprs)

    def parseImpl(self, instring, loc, doActions=True):
        if self.expr is not None:
            return self.expr._parse(instring, loc, doActions, callPreParse=False)
        else:
            raise ParseException("", loc, self.errmsg, self)

    def leaveWhitespace(self):
        self.skipWhitespace = False
        self.expr = self.expr.copy()
        if self.expr is not None:
            self.expr.leaveWhitespace()
        return self

    def ignore(self, other):
        if isinstance(other, Suppress):
            if other not in self.ignoreExprs:
                super(ParseElementEnhance, self).ignore(other)
                if self.expr is not None:
                    self.expr.ignore(self.ignoreExprs[-1])
        else:
            super(ParseElementEnhance, self).ignore(other)
            if self.expr is not None:
                self.expr.ignore(self.ignoreExprs[-1])
        return self

    def streamline(self):
        super(ParseElementEnhance, self).streamline()
        if self.expr is not None:
            self.expr.streamline()
        return self

    def checkRecursion(self, parseElementList):
        if self in parseElementList:
            raise RecursiveGrammarException(parseElementList + [self])
        subRecCheckList = parseElementList[:] + [self]
        if self.expr is not None:
            self.expr.checkRecursion(subRecCheckList)

    def validate(self, validateTrace=None):
        if validateTrace is None:
            validateTrace = []
        tmp = validateTrace[:] + [self]
        if self.expr is not None:
            self.expr.validate(tmp)
        self.checkRecursion([])

    def __str__(self):
        try:
            return super(ParseElementEnhance, self).__str__()
        except Exception:
            pass

        if self.strRepr is None and self.expr is not None:
            self.strRepr = "%s:(%s)" % (self.__class__.__name__, _ustr(self.expr))
        return self.strRepr


class FollowedBy(ParseElementEnhance):
    """Lookahead matching of the given parse expression.
    ``FollowedBy`` does *not* advance the parsing position within
    the input string, it only verifies that the specified parse
    expression matches at the current position. ``FollowedBy``
    always returns a null token list.
    If any results names are defined
    in the lookahead expression, those *will* be returned for access by
    name.

    Example::

        # use FollowedBy to match a label only if it is followed by a ':'
        data_word = Word(alphas)
        label = data_word + FollowedBy(':')
        attr_expr = Group(label + Suppress(':') + OneOrMore(data_word, stopOn=label).setParseAction(' '.join))

        OneOrMore(attr_expr).parseString("shape: SQUARE color: BLACK posn: upper left").pprint()

    prints::

        [['shape', 'SQUARE'], ['color', 'BLACK'], ['posn', 'upper left']]
    """
    def __init__(self, expr):
        super(FollowedBy, self).__init__(expr)
        self.mayReturnEmpty = True

    def parseImpl(self, instring, loc, doActions=True):
        # by using self._expr.parse and deleting the contents of the returned ParseResults list
        # we keep any named results that were defined in the FollowedBy expression
        _, ret = self.expr._parse(instring, loc, doActions=doActions)
        del ret[:]

        return loc, ret


class PrecededBy(ParseElementEnhance):
    """Lookbehind matching of the given parse expression.
    ``PrecededBy`` does not advance the parsing position within the
    input string, it only verifies that the specified parse expression
    matches prior to the current position. ``PrecededBy`` always
    returns a null token list, but if a results name is defined on the
    given expression, it is returned.

    Parameters:

     - expr - expression that must match prior to the current parse
       location
     - retreat - (default= ``None``) - (int) maximum number of characters
       to lookbehind prior to the current parse location

    If the lookbehind expression is a string, Literal, Keyword, or
    a Word or CharsNotIn with a specified exact or maximum length, then
    the retreat parameter is not required.
    Otherwise, retreat must be
    specified to give a maximum number of characters to look back from
    the current parse position for a lookbehind match.

    Example::

        # VB-style variable names with type prefixes
        int_var = PrecededBy("#") + pyparsing_common.identifier
        str_var = PrecededBy("$") + pyparsing_common.identifier

    """
    def __init__(self, expr, retreat=None):
        super(PrecededBy, self).__init__(expr)
        self.expr = self.expr().leaveWhitespace()
        self.mayReturnEmpty = True
        self.mayIndexError = False
        self.exact = False
        if isinstance(expr, str):
            retreat = len(expr)
            self.exact = True
        elif isinstance(expr, (Literal, Keyword)):
            retreat = expr.matchLen
            self.exact = True
        elif isinstance(expr, (Word, CharsNotIn)) and expr.maxLen != _MAX_INT:
            retreat = expr.maxLen
            self.exact = True
        elif isinstance(expr, _PositionToken):
            retreat = 0
            self.exact = True
        self.retreat = retreat
        self.errmsg = "not preceded by " + str(expr)
        self.skipWhitespace = False
        self.parseAction.append(lambda s, l, t: t.__delitem__(slice(None, None)))

    def parseImpl(self, instring, loc=0, doActions=True):
        if self.exact:
            if loc < self.retreat:
                raise ParseException(instring, loc, self.errmsg)
            start = loc - self.retreat
            _, ret = self.expr._parse(instring, start)
        else:
            # retreat specified a maximum lookbehind window, iterate
            test_expr = self.expr + StringEnd()
            instring_slice = instring[max(0, loc - self.retreat):loc]
            last_expr = ParseException(instring, loc, self.errmsg)
            for offset in range(1, min(loc, self.retreat + 1)+1):
                try:
                    # print('trying', offset, instring_slice, repr(instring_slice[loc - offset:]))
                    _, ret = test_expr._parse(instring_slice, len(instring_slice) - offset)
                except ParseBaseException as pbe:
                    last_expr = pbe
                else:
                    break
            else:
                raise last_expr
        return loc, ret


class NotAny(ParseElementEnhance):
    """Lookahead to disallow matching with
    the given parse expression.
    ``NotAny`` does *not* advance the parsing position within the
    input string, it only verifies that the specified parse expression
    does *not* match at the current position.  Also, ``NotAny`` does
    *not* skip over leading whitespace. ``NotAny`` always returns
    a null token list.  May be constructed using the '~' operator.

    Example::

        AND, OR, NOT = map(CaselessKeyword, "AND OR NOT".split())

        # take care not to mistake keywords for identifiers
        ident = ~(AND | OR | NOT) + Word(alphas)
        boolean_term = Optional(NOT) + ident

        # very crude boolean expression - to support parenthesis groups and
        # operation hierarchy, use infixNotation
        boolean_expr = boolean_term + ZeroOrMore((AND | OR) + boolean_term)

        # integers that are followed by "." are actually floats
        integer = Word(nums) + ~Char(".")
    """
    def __init__(self, expr):
        super(NotAny, self).__init__(expr)
        # ~ self.leaveWhitespace()
        self.skipWhitespace = False  # do NOT use self.leaveWhitespace(), don't want to propagate to exprs
        self.mayReturnEmpty = True
        self.errmsg = "Found unwanted token, " + _ustr(self.expr)

    def parseImpl(self, instring, loc, doActions=True):
        if self.expr.canParseNext(instring, loc):
            raise ParseException(instring, loc, self.errmsg, self)
        return loc, []

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "~{" + _ustr(self.expr) + "}"

        return self.strRepr

class _MultipleMatch(ParseElementEnhance):
    def __init__(self, expr, stopOn=None):
        super(_MultipleMatch, self).__init__(expr)
        self.saveAsList = True
        ender = stopOn
        if isinstance(ender, basestring):
            ender = self._literalStringClass(ender)
        self.stopOn(ender)

    def stopOn(self, ender):
        if isinstance(ender, basestring):
            ender = self._literalStringClass(ender)
        self.not_ender = ~ender if ender is not None else None
        return self

    def parseImpl(self, instring, loc, doActions=True):
        self_expr_parse = self.expr._parse
        self_skip_ignorables = self._skipIgnorables
        check_ender = self.not_ender is not None
        if check_ender:
            try_not_ender = self.not_ender.tryParse

        # must be at least one (but first see if we are the stopOn sentinel;
        # if so, fail)
        if check_ender:
            try_not_ender(instring, loc)
        loc, tokens = self_expr_parse(instring, loc, doActions, callPreParse=False)
        try:
            hasIgnoreExprs = (not not self.ignoreExprs)
            while 1:
                if check_ender:
                    try_not_ender(instring, loc)
                if hasIgnoreExprs:
                    preloc = self_skip_ignorables(instring, loc)
                else:
                    preloc = loc
                loc, tmptokens = self_expr_parse(instring, preloc, doActions)
                if tmptokens or tmptokens.haskeys():
                    tokens += tmptokens
        except (ParseException, IndexError):
            pass

        return loc, tokens

    def _setResultsName(self, name, listAllMatches=False):
        if __diag__.warn_ungrouped_named_tokens_in_collection:
            for e in [self.expr] + getattr(self.expr, 'exprs', []):
                if isinstance(e, ParserElement) and e.resultsName:
                    warnings.warn("{0}: setting results name {1!r} on {2} expression "
                                  "collides with {3!r} on contained expression".format("warn_ungrouped_named_tokens_in_collection",
                                                                                       name,
                                                                                       type(self).__name__,
                                                                                       e.resultsName),
                                  stacklevel=3)

        return super(_MultipleMatch, self)._setResultsName(name, listAllMatches)


class OneOrMore(_MultipleMatch):
    """Repetition of one or more of the given expression.

    Parameters:
     - expr - expression that must match one or more times
     - stopOn - (default= ``None``) - expression for a terminating sentinel
          (only required if the sentinel would ordinarily match the repetition
          expression)

    Example::

        data_word = Word(alphas)
        label = data_word + FollowedBy(':')
        attr_expr = Group(label + Suppress(':') + OneOrMore(data_word).setParseAction(' '.join))

        text = "shape: SQUARE posn: upper left color: BLACK"
        OneOrMore(attr_expr).parseString(text).pprint()  # Fail! read 'color' as data instead of next label -> [['shape', 'SQUARE color']]

        # use stopOn attribute for OneOrMore to avoid reading label string as part of the data
        attr_expr = Group(label + Suppress(':') + OneOrMore(data_word, stopOn=label).setParseAction(' '.join))
        OneOrMore(attr_expr).parseString(text).pprint() # Better -> [['shape', 'SQUARE'], ['posn', 'upper left'], ['color', 'BLACK']]

        # could also be written as
        (attr_expr * (1,)).parseString(text).pprint()
    """

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "{" + _ustr(self.expr) + "}..."

        return self.strRepr

class ZeroOrMore(_MultipleMatch):
    """Optional repetition of zero or more of the given expression.

    Parameters:
     - expr - expression that must match zero or more times
     - stopOn - (default= ``None``) - expression for a terminating sentinel
          (only required if the sentinel would ordinarily match the repetition
          expression)

    Example: similar to :class:`OneOrMore`
    """
    def __init__(self, expr, stopOn=None):
        super(ZeroOrMore, self).__init__(expr, stopOn=stopOn)
        self.mayReturnEmpty = True

    def parseImpl(self, instring, loc, doActions=True):
        try:
            return super(ZeroOrMore, self).parseImpl(instring, loc, doActions)
        except (ParseException, IndexError):
            return loc, []

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "[" + _ustr(self.expr) + "]..."

        return self.strRepr


class _NullToken(object):
    def __bool__(self):
        return False
    __nonzero__ = __bool__
    def __str__(self):
        return ""

class Optional(ParseElementEnhance):
    """Optional matching of the given expression.

    Parameters:
     - expr - expression that may match zero times or one time
     - default (optional) - value to be returned if the optional expression is not found.

    Example::

        # US postal code can be a 5-digit zip, plus optional 4-digit qualifier
        zip = Combine(Word(nums, exact=5) + Optional('-' + Word(nums, exact=4)))
        zip.runTests('''
            # traditional ZIP code
            12345

            # ZIP+4 form
            12101-0001

            # invalid ZIP
            98765-
            ''')

    prints::

        # traditional ZIP code
        12345
        ['12345']

        # ZIP+4 form
        12101-0001
        ['12101-0001']

        # invalid ZIP
        98765-
             ^
        FAIL: Expected end of text (at char 5), (line:1, col:6)
    """
    __optionalNotMatched = _NullToken()

    def __init__(self, expr, default=__optionalNotMatched):
        super(Optional, self).__init__(expr, savelist=False)
        self.saveAsList = self.expr.saveAsList
        self.defaultValue = default
        self.mayReturnEmpty = True

    def parseImpl(self, instring, loc, doActions=True):
        try:
            loc, tokens = self.expr._parse(instring, loc, doActions, callPreParse=False)
        except (ParseException, IndexError):
            if self.defaultValue is not self.__optionalNotMatched:
                if self.expr.resultsName:
                    tokens = ParseResults([self.defaultValue])
                    tokens[self.expr.resultsName] = self.defaultValue
                else:
                    tokens = [self.defaultValue]
            else:
                tokens = []
        return loc, tokens

    def __str__(self):
        if hasattr(self, "name"):
            return self.name

        if self.strRepr is None:
            self.strRepr = "[" + _ustr(self.expr) + "]"

        return self.strRepr

class SkipTo(ParseElementEnhance):
    """Token for skipping over all undefined text until the matched
    expression is found.

    Parameters:
     - expr - target expression marking the end of the data to be skipped
     - include - (default= ``False``) if True, the target expression is also parsed
          (the skipped text and target expression are returned as a 2-element list).
     - ignore - (default= ``None``) used to define grammars (typically quoted strings and
          comments) that might contain false
          matches to the target expression
     - failOn - (default= ``None``) define expressions that are not allowed to be
          included in the skipped text; if found before the target expression is found,
          the SkipTo is not a match

    Example::

        report = '''
            Outstanding Issues Report - 1 Jan 2000

               # | Severity | Description                               |  Days Open
            -----+----------+-------------------------------------------+-----------
             101 | Critical | Intermittent system crash                 |          6
              94 | Cosmetic | Spelling error on Login ('log|n')         |         14
              79 | Minor    | System slow when running too many reports |         47
            '''
        integer = Word(nums)
        SEP = Suppress('|')
        # use SkipTo to simply match everything up until the next SEP
        # - ignore quoted strings, so that a '|' character inside a quoted string does not match
        # - parse action will call token.strip() for each matched token, i.e., the description body
        string_data = SkipTo(SEP, ignore=quotedString)
        string_data.setParseAction(tokenMap(str.strip))
        ticket_expr = (integer("issue_num") + SEP
                      + string_data("sev") + SEP
                      + string_data("desc") + SEP
                      + integer("days_open"))

        for tkt in ticket_expr.searchString(report):
            print(tkt.dump())

    prints::

        ['101', 'Critical', 'Intermittent system crash', '6']
        - days_open: 6
        - desc: Intermittent system crash
        - issue_num: 101
        - sev: Critical
        ['94', 'Cosmetic', "Spelling error on Login ('log|n')", '14']
        - days_open: 14
        - desc: Spelling error on Login ('log|n')
        - issue_num: 94
        - sev: Cosmetic
        ['79', 'Minor', 'System slow when running too many reports', '47']
        - days_open: 47
        - desc: System slow when running too many reports
        - issue_num: 79
        - sev: Minor
    """
    def __init__(self, other, include=False, ignore=None, failOn=None):
        super(SkipTo, self).__init__(other)
        self.ignoreExpr = ignore
        self.mayReturnEmpty = True
        self.mayIndexError = False
        self.includeMatch = include
        self.saveAsList = False
        if isinstance(failOn, basestring):
            self.failOn = self._literalStringClass(failOn)
        else:
            self.failOn = failOn
        self.errmsg = "No match found for " + _ustr(self.expr)

    def parseImpl(self, instring, loc, doActions=True):
        startloc = loc
        instrlen = len(instring)
        expr = self.expr
        expr_parse = self.expr._parse
        self_failOn_canParseNext = self.failOn.canParseNext if self.failOn is not None else None
        self_ignoreExpr_tryParse = self.ignoreExpr.tryParse if self.ignoreExpr is not None else None

        tmploc = loc
        while tmploc <= instrlen:
            if self_failOn_canParseNext is not None:
                # break if failOn expression matches
                if self_failOn_canParseNext(instring, tmploc):
                    break

            if self_ignoreExpr_tryParse is not None:
                # advance past ignore expressions
                while 1:
                    try:
                        tmploc = self_ignoreExpr_tryParse(instring, tmploc)
                    except ParseBaseException:
                        break

            try:
                expr_parse(instring, tmploc, doActions=False, callPreParse=False)
            except (ParseException, IndexError):
                # no match, advance loc in string
                tmploc += 1
            else:
                # matched skipto expr, done
                break

        else:
            # ran off the end of the input string without matching skipto expr, fail
            raise ParseException(instring, loc, self.errmsg, self)

        # build up return values
        loc = tmploc
        skiptext = instring[startloc:loc]
        skipresult = ParseResults(skiptext)

        if self.includeMatch:
            loc, mat = expr_parse(instring, loc, doActions, callPreParse=False)
            skipresult += mat

        return loc, skipresult

class Forward(ParseElementEnhance):
    """Forward declaration of an expression to be defined later -
    used for recursive grammars, such as algebraic infix notation.
    When the expression is known, it is assigned to the ``Forward``
    variable using the '<<' operator.

    Note: take care when assigning to ``Forward`` not to overlook
    precedence of operators.

    Specifically, '|' has a lower precedence than '<<', so
    that::

        fwdExpr << a | b | c

    will actually be evaluated as::

        (fwdExpr << a) | b | c

    thereby leaving b and c out as parseable alternatives.  It is recommended that you
    explicitly group the values inserted into the ``Forward``::

        fwdExpr << (a | b | c)

    Converting to use the '<<=' operator instead will avoid this problem.

    See :class:`ParseResults.pprint` for an example of a recursive
    parser created using ``Forward``.
    """
    def __init__(self, other=None):
        super(Forward, self).__init__(other, savelist=False)

    def __lshift__(self, other):
        if isinstance(other, basestring):
            other = self._literalStringClass(other)
        self.expr = other
        self.strRepr = None
        self.mayIndexError = self.expr.mayIndexError
        self.mayReturnEmpty = self.expr.mayReturnEmpty
        self.setWhitespaceChars(self.expr.whiteChars)
        self.skipWhitespace = self.expr.skipWhitespace
        self.saveAsList = self.expr.saveAsList
        self.ignoreExprs.extend(self.expr.ignoreExprs)
        return self

    def __ilshift__(self, other):
        return self << other

    def leaveWhitespace(self):
        self.skipWhitespace = False
        return self

    def streamline(self):
        if not self.streamlined:
            self.streamlined = True
            if self.expr is not None:
                self.expr.streamline()
        return self

    def validate(self, validateTrace=None):
        if validateTrace is None:
            validateTrace = []

        if self not in validateTrace:
            tmp = validateTrace[:] + [self]
            if self.expr is not None:
                self.expr.validate(tmp)
        self.checkRecursion([])

    def __str__(self):
        if hasattr(self, "name"):
            return self.name
        if self.strRepr is not None:
            return self.strRepr

        # Avoid infinite recursion by setting a temporary strRepr
        self.strRepr = ": ..."

        # Use the string representation of main expression.
        retString = '...'
        try:
            if self.expr is not None:
                retString = _ustr(self.expr)[:1000]
            else:
                retString = "None"
        finally:
            self.strRepr = self.__class__.__name__ + ": " + retString
        return self.strRepr

    def copy(self):
        if self.expr is not None:
            return super(Forward, self).copy()
        else:
            ret = Forward()
            ret <<= self
            return ret

    def _setResultsName(self, name, listAllMatches=False):
        if __diag__.warn_name_set_on_empty_Forward:
            if self.expr is None:
                # format arguments: {0} = diagnostic name, {1} = results name, {2} = class name
                warnings.warn("{0}: setting results name {1!r} on {2} expression "
                              "that has no contained expression".format("warn_name_set_on_empty_Forward",
                                                                        name,
                                                                        type(self).__name__),
                              stacklevel=3)

        return super(Forward, self)._setResultsName(name, listAllMatches)

class TokenConverter(ParseElementEnhance):
    """
    Abstract subclass of :class:`ParseElementEnhance`, for converting parsed results.
    """
    def __init__(self, expr, savelist=False):
        super(TokenConverter, self).__init__(expr)  # , savelist)
        self.saveAsList = False

class Combine(TokenConverter):
    """Converter to concatenate all matching tokens to a single string.
    By default, the matching patterns must also be contiguous in the
    input string; this can be disabled by specifying
    ``'adjacent=False'`` in the constructor.

    Example::

        real = Word(nums) + '.' + Word(nums)
        print(real.parseString('3.1416')) # -> ['3', '.', '1416']
        # will also erroneously match the following
        print(real.parseString('3. 1416')) # -> ['3', '.', '1416']

        real = Combine(Word(nums) + '.' + Word(nums))
        print(real.parseString('3.1416')) # -> ['3.1416']
        # no match when there are internal spaces
        print(real.parseString('3. 1416')) # -> Exception: Expected W:(0123...)
    """
    def __init__(self, expr, joinString="", adjacent=True):
        super(Combine, self).__init__(expr)
        # suppress whitespace-stripping in contained parse expressions, but re-enable it on the Combine itself
        if adjacent:
            self.leaveWhitespace()
        self.adjacent = adjacent
        self.skipWhitespace = True
        self.joinString = joinString
        self.callPreparse = True

    def ignore(self, other):
        if self.adjacent:
            ParserElement.ignore(self, other)
        else:
            super(Combine, self).ignore(other)
        return self

    def postParse(self, instring, loc, tokenlist):
        retToks = tokenlist.copy()
        del retToks[:]
        retToks += ParseResults(["".join(tokenlist._asStringList(self.joinString))], modal=self.modalResults)

        if self.resultsName and retToks.haskeys():
            return [retToks]
        else:
            return retToks

class Group(TokenConverter):
    """Converter to return the matched tokens as a list - useful for
    returning tokens of :class:`ZeroOrMore` and :class:`OneOrMore` expressions.

    Example::

        ident = Word(alphas)
        num = Word(nums)
        term = ident | num
        func = ident + Optional(delimitedList(term))
        print(func.parseString("fn a, b, 100"))  # -> ['fn', 'a', 'b', '100']

        func = ident + Group(Optional(delimitedList(term)))
        print(func.parseString("fn a, b, 100"))  # -> ['fn', ['a', 'b', '100']]
    """
    def __init__(self, expr):
        super(Group, self).__init__(expr)
        self.saveAsList = True

    def postParse(self, instring, loc, tokenlist):
        return [tokenlist]

class Dict(TokenConverter):
    """Converter to return a repetitive expression as a list, but also
    as a dictionary. Each element can also be referenced using the first
    token in the expression as its key.
    Useful for tabular report scraping when the first column can be
    used as an item key.

    Example::

        data_word = Word(alphas)
        label = data_word + FollowedBy(':')
        attr_expr = Group(label + Suppress(':') + OneOrMore(data_word).setParseAction(' '.join))

        text = "shape: SQUARE posn: upper left color: light blue texture: burlap"
        attr_expr = (label + Suppress(':') + OneOrMore(data_word, stopOn=label).setParseAction(' '.join))

        # print attributes as plain groups
        print(OneOrMore(attr_expr).parseString(text).dump())

        # instead of OneOrMore(expr), parse using Dict(OneOrMore(Group(expr))) - Dict will auto-assign names
        result = Dict(OneOrMore(Group(attr_expr))).parseString(text)
        print(result.dump())

        # access named fields as dict entries, or output as dict
        print(result['shape'])
        print(result.asDict())

    prints::

        ['shape', 'SQUARE', 'posn', 'upper left', 'color', 'light blue', 'texture', 'burlap']
        [['shape', 'SQUARE'], ['posn', 'upper left'], ['color', 'light blue'], ['texture', 'burlap']]
        - color: light blue
        - posn: upper left
        - shape: SQUARE
        - texture: burlap
        SQUARE
        {'color': 'light blue', 'posn': 'upper left', 'texture': 'burlap', 'shape': 'SQUARE'}

    See more examples at :class:`ParseResults` of accessing fields by results name.
    """
    def __init__(self, expr):
        super(Dict, self).__init__(expr)
        self.saveAsList = True

    def postParse(self, instring, loc, tokenlist):
        for i, tok in enumerate(tokenlist):
            if len(tok) == 0:
                continue
            ikey = tok[0]
            if isinstance(ikey, int):
                ikey = _ustr(tok[0]).strip()
            if len(tok) == 1:
                tokenlist[ikey] = _ParseResultsWithOffset("", i)
            elif len(tok) == 2 and not isinstance(tok[1], ParseResults):
                tokenlist[ikey] = _ParseResultsWithOffset(tok[1], i)
            else:
                dictvalue = tok.copy()  # ParseResults(i)
                del dictvalue[0]
                if len(dictvalue) != 1 or (isinstance(dictvalue, ParseResults) and dictvalue.haskeys()):
                    tokenlist[ikey] = _ParseResultsWithOffset(dictvalue, i)
                else:
                    tokenlist[ikey] = _ParseResultsWithOffset(dictvalue[0], i)

        if self.resultsName:
            return [tokenlist]
        else:
            return tokenlist


class Suppress(TokenConverter):
    """Converter for ignoring the results of a parsed expression.

    Example::

        source = "a, b, c,d"
        wd = Word(alphas)
        wd_list1 = wd + ZeroOrMore(',' + wd)
        print(wd_list1.parseString(source))

        # often, delimiters that are useful during parsing are just in the
        # way afterward - use Suppress to keep them out of the parsed output
        wd_list2 = wd + ZeroOrMore(Suppress(',') + wd)
        print(wd_list2.parseString(source))

    prints::

        ['a', ',', 'b', ',', 'c', ',', 'd']
        ['a', 'b', 'c', 'd']

    (See also :class:`delimitedList`.)
    """
    def postParse(self, instring, loc, tokenlist):
        return []

    def suppress(self):
        return self


class OnlyOnce(object):
    """Wrapper for parse actions, to ensure they are only called once.
    """
    def __init__(self, methodCall):
        self.callable = _trim_arity(methodCall)
        self.called = False
    def __call__(self, s, l, t):
        if not self.called:
            results = self.callable(s, l, t)
            self.called = True
            return results
        raise ParseException(s, l, "")
    def reset(self):
        self.called = False

def traceParseAction(f):
    """Decorator for debugging parse actions.

    When the parse action is called, this decorator will print
    ``">> entering method-name(line:<current_source_line>, <parse_location>, <matched_tokens>)"``.
    When the parse action completes, the decorator will print
    ``"<<"`` followed by the returned value, or any exception that the parse action raised.

    Example::

        wd = Word(alphas)

        @traceParseAction
        def remove_duplicate_chars(tokens):
            return ''.join(sorted(set(''.join(tokens))))

        wds = OneOrMore(wd).setParseAction(remove_duplicate_chars)
        print(wds.parseString("slkdjs sld sldd sdlf sdljf"))

    prints::

        >>entering remove_duplicate_chars(line: 'slkdjs sld sldd sdlf sdljf', 0, (['slkdjs', 'sld', 'sldd', 'sdlf', 'sdljf'], {}))
        <<leaving remove_duplicate_chars (ret: 'dfjkls')
        ['dfjkls']
    """
    f = _trim_arity(f)
    def z(*paArgs):
        thisFunc = f.__name__
        s, l, t = paArgs[-3:]
        if len(paArgs) > 3:
            thisFunc = paArgs[0].__class__.__name__ + '.' + thisFunc
        sys.stderr.write(">>entering %s(line: '%s', %d, %r)\n" % (thisFunc, line(l, s), l, t))
        try:
            ret = f(*paArgs)
        except Exception as exc:
            sys.stderr.write("<<leaving %s (exception: %s)\n" % (thisFunc, exc))
            raise
        sys.stderr.write("<<leaving %s (ret: %r)\n" % (thisFunc, ret))
        return ret
    try:
        z.__name__ = f.__name__
    except AttributeError:
        pass
    return z

#
# global helpers
#
def delimitedList(expr, delim=",", combine=False):
    """Helper to define a delimited list of expressions - the delimiter
    defaults to ','. By default, the list elements and delimiters can
    have intervening whitespace, and comments, but this can be
    overridden by passing ``combine=True`` in the constructor.
    If ``combine`` is set to ``True``, the matching tokens are
    returned as a single token string, with the delimiters included;
    otherwise, the matching tokens are returned as a list of tokens,
    with the delimiters suppressed.

    Example::

        delimitedList(Word(alphas)).parseString("aa,bb,cc") # -> ['aa', 'bb', 'cc']
        delimitedList(Word(hexnums), delim=':', combine=True).parseString("AA:BB:CC:DD:EE") # -> ['AA:BB:CC:DD:EE']
    """
    dlName = _ustr(expr) + " [" + _ustr(delim) + " " + _ustr(expr) + "]..."
    if combine:
        return Combine(expr + ZeroOrMore(delim + expr)).setName(dlName)
    else:
        return (expr + ZeroOrMore(Suppress(delim) + expr)).setName(dlName)

def countedArray(expr, intExpr=None):
    """Helper to define a counted list of expressions.

    This helper defines a pattern of the form::

        integer expr expr expr...

    where the leading integer tells how many expr expressions follow.
    The match returns the array of expr tokens as a list - the
    leading count token is suppressed.

    If ``intExpr`` is specified, it should be a pyparsing expression
    that produces an integer value.

    Example::

        countedArray(Word(alphas)).parseString('2 ab cd ef') # -> ['ab', 'cd']

        # in this parser, the leading integer value is given in binary,
        # '10' indicating that 2 values are in the array
        binaryConstant = Word('01').setParseAction(lambda t: int(t[0], 2))
        countedArray(Word(alphas), intExpr=binaryConstant).parseString('10 ab cd ef') # -> ['ab', 'cd']
    """
    arrayExpr = Forward()
    def countFieldParseAction(s, l, t):
        n = t[0]
        arrayExpr << (n and Group(And([expr] * n)) or Group(empty))
        return []
    if intExpr is None:
        intExpr = Word(nums).setParseAction(lambda t: int(t[0]))
    else:
        intExpr = intExpr.copy()
    intExpr.setName("arrayLen")
    intExpr.addParseAction(countFieldParseAction, callDuringTry=True)
    return (intExpr + arrayExpr).setName('(len) ' + _ustr(expr) + '...')

def _flatten(L):
    ret = []
    for i in L:
        if isinstance(i, list):
            ret.extend(_flatten(i))
        else:
            ret.append(i)
    return ret

def matchPreviousLiteral(expr):
    """Helper to define an expression that is indirectly defined from
    the tokens matched in a previous expression, that is, it looks for
    a 'repeat' of a previous expression. For example::

        first = Word(nums)
        second = matchPreviousLiteral(first)
        matchExpr = first + ":" + second

    will match ``"1:1"``, but not ``"1:2"``. Because this
    matches a previous literal, it will also match the leading
    ``"1:1"`` in ``"1:10"``. If this is not desired, use
    :class:`matchPreviousExpr`. Do *not* use with packrat parsing
    enabled.
    """
    rep = Forward()
    def copyTokenToRepeater(s, l, t):
        if t:
            if len(t) == 1:
                rep << t[0]
            else:
                # flatten t tokens
                tflat = _flatten(t.asList())
                rep << And(Literal(tt) for tt in tflat)
        else:
            rep << Empty()
    expr.addParseAction(copyTokenToRepeater, callDuringTry=True)
    rep.setName('(prev) ' + _ustr(expr))
    return rep

def matchPreviousExpr(expr):
    """Helper to define an expression that is indirectly defined from
    the tokens matched in a previous expression, that is, it looks for
    a 'repeat' of a previous expression. For example::

        first = Word(nums)
        second = matchPreviousExpr(first)
        matchExpr = first + ":" + second

    will match ``"1:1"``, but not ``"1:2"``. Because this
    matches by expressions, it will *not* match the leading ``"1:1"``
    in ``"1:10"``; the expressions are evaluated first, and then
    compared, so ``"1"`` is compared with ``"10"``.
    Do *not* use with packrat parsing enabled.
    """
    rep = Forward()
    e2 = expr.copy()
    rep <<= e2
    def copyTokenToRepeater(s, l, t):
        matchTokens = _flatten(t.asList())
        def mustMatchTheseTokens(s, l, t):
            theseTokens = _flatten(t.asList())
            if theseTokens != matchTokens:
                raise ParseException('', 0, '')
        rep.setParseAction(mustMatchTheseTokens, callDuringTry=True)
    expr.addParseAction(copyTokenToRepeater, callDuringTry=True)
    rep.setName('(prev) ' + _ustr(expr))
    return rep

def _escapeRegexRangeChars(s):
    # ~ escape these chars: ^-[]
    for c in r"\^-[]":
        s = s.replace(c, _bslash + c)
    s = s.replace("\n", r"\n")
    s = s.replace("\t", r"\t")
    return _ustr(s)

def oneOf(strs, caseless=False, useRegex=True, asKeyword=False):
    """Helper to quickly define a set of alternative Literals, and makes
    sure to do longest-first testing when there is a conflict,
    regardless of the input order, but returns
    a :class:`MatchFirst` for best performance.

    Parameters:

     - strs - a string of space-delimited literals, or a collection of
       string literals
     - caseless - (default= ``False``) - treat all literals as
       caseless
     - useRegex - (default= ``True``) - as an optimization, will
       generate a Regex object; otherwise, will generate
       a :class:`MatchFirst` object (if ``caseless=True`` or ``asKeyword=True``, or if
       creating a :class:`Regex` raises an exception)
     - asKeyword - (default=``False``) - enforce Keyword-style matching on the
       generated expressions

    Example::

        comp_oper = oneOf("< = > <= >= !=")
        var = Word(alphas)
        number = Word(nums)
        term = var | number
        comparison_expr = term + comp_oper + term
        print(comparison_expr.searchString("B = 12 AA=23 B<=AA AA>12"))

    prints::

        [['B', '=', '12'], ['AA', '=', '23'], ['B', '<=', 'AA'], ['AA', '>', '12']]
    """
    if isinstance(caseless, basestring):
        warnings.warn("More than one string argument passed to oneOf, pass "
                      "choices as a list or space-delimited string", stacklevel=2)

    if caseless:
        isequal = (lambda a, b: a.upper() == b.upper())
        masks = (lambda a, b: b.upper().startswith(a.upper()))
        parseElementClass = CaselessKeyword if asKeyword else CaselessLiteral
    else:
        isequal = (lambda a, b: a == b)
        masks = (lambda a, b: b.startswith(a))
        parseElementClass = Keyword if asKeyword else Literal

    symbols = []
    if isinstance(strs, basestring):
        symbols = strs.split()
    elif isinstance(strs, Iterable):
        symbols = list(strs)
    else:
        warnings.warn("Invalid argument to oneOf, expected string or iterable",
                      SyntaxWarning, stacklevel=2)
    if not symbols:
        return NoMatch()

    if not asKeyword:
        # if not producing keywords, need to reorder to take care to avoid masking
        # longer choices with shorter ones
        i = 0
        while i < len(symbols) - 1:
            cur = symbols[i]
            for j, other in enumerate(symbols[i + 1:]):
                if isequal(other, cur):
                    del symbols[i + j + 1]
                    break
                elif masks(cur, other):
                    del symbols[i + j + 1]
                    symbols.insert(i, other)
                    break
            else:
                i += 1

    if not (caseless or asKeyword) and useRegex:
        # ~ print (strs, "->", "|".join([_escapeRegexChars(sym) for sym in symbols]))
        try:
            if len(symbols) == len("".join(symbols)):
                return Regex("[%s]" % "".join(_escapeRegexRangeChars(sym) for sym in symbols)).setName(' | '.join(symbols))
            else:
                return Regex("|".join(re.escape(sym) for sym in symbols)).setName(' | '.join(symbols))
        except Exception:
            warnings.warn("Exception creating Regex for oneOf, building MatchFirst",
                          SyntaxWarning, stacklevel=2)

    # last resort, just use MatchFirst
    return MatchFirst(parseElementClass(sym) for sym in symbols).setName(' | '.join(symbols))

def dictOf(key, value):
    """Helper to easily and clearly define a dictionary by specifying
    the respective patterns for the key and value.
    Takes care of defining the :class:`Dict`, :class:`OneOrMore`, and
    :class:`Group` tokens in the proper order. The key pattern
    can include delimiting markers or punctuation, as long as they are
    suppressed, thereby leaving the significant key text. The value
    pattern can include named results, so that the :class:`Dict` results
    can include named token fields.

    Example::

        text = "shape: SQUARE posn: upper left color: light blue texture: burlap"
        attr_expr = (label + Suppress(':') + OneOrMore(data_word, stopOn=label).setParseAction(' '.join))
        print(OneOrMore(attr_expr).parseString(text).dump())

        attr_label = label
        attr_value = Suppress(':') + OneOrMore(data_word, stopOn=label).setParseAction(' '.join)

        # similar to Dict, but simpler call format
        result = dictOf(attr_label, attr_value).parseString(text)
        print(result.dump())
        print(result['shape'])
        print(result.shape)  # object attribute access works too
        print(result.asDict())

    prints::

        [['shape', 'SQUARE'], ['posn', 'upper left'], ['color', 'light blue'], ['texture', 'burlap']]
        - color: light blue
        - posn: upper left
        - shape: SQUARE
        - texture: burlap
        SQUARE
        SQUARE
        {'color': 'light blue', 'shape': 'SQUARE', 'posn': 'upper left', 'texture': 'burlap'}
    """
    return Dict(OneOrMore(Group(key + value)))

def originalTextFor(expr, asString=True):
    """Helper to return the original, untokenized text for a given
    expression. Useful to restore the parsed fields of an HTML start
    tag into the raw tag text itself, or to revert separate tokens with
    intervening whitespace back to the original matching input text.
    By default, returns a string containing the original parsed text.

    If the optional ``asString`` argument is passed as
    ``False``, then the return value is
    a :class:`ParseResults` containing any results names that
    were originally matched, and a single token containing the original
    matched text from the input string. So if the expression passed to
    :class:`originalTextFor` contains expressions with defined
    results names, you must set ``asString`` to ``False`` if you
    want to preserve those results name values.

    Example::

        src = "this is test <b> bold <i>text</i> </b> normal text "
        for tag in ("b", "i"):
            opener, closer = makeHTMLTags(tag)
            patt = originalTextFor(opener + SkipTo(closer) + closer)
            print(patt.searchString(src)[0])

    prints::

        ['<b> bold <i>text</i> </b>']
        ['<i>text</i>']
    """
    locMarker = Empty().setParseAction(lambda s, loc, t: loc)
    endlocMarker = locMarker.copy()
    endlocMarker.callPreparse = False
    matchExpr = locMarker("_original_start") + expr + endlocMarker("_original_end")
    if asString:
        extractText = lambda s, l, t: s[t._original_start: t._original_end]
    else:
        def extractText(s, l, t):
            t[:] = [s[t.pop('_original_start'):t.pop('_original_end')]]
    matchExpr.setParseAction(extractText)
    matchExpr.ignoreExprs = expr.ignoreExprs
    return matchExpr

def ungroup(expr):
    """Helper to undo pyparsing's default grouping of And expressions,
    even if all but one are non-empty.
    """
    return TokenConverter(expr).addParseAction(lambda t: t[0])

def locatedExpr(expr):
    """Helper to decorate a returned token with its starting and ending
    locations in the input string.

    This helper adds the following results names:

     - locn_start = location where matched expression begins
     - locn_end = location where matched expression ends
     - value = the actual parsed results

    Be careful if the input text contains ``<TAB>`` characters; you may
    want to call :class:`ParserElement.parseWithTabs`.

    Example::

        wd = Word(alphas)
        for match in locatedExpr(wd).searchString("ljsdf123lksdjjf123lkkjj1222"):
            print(match)

    prints::

        [[0, 'ljsdf', 5]]
        [[8, 'lksdjjf', 15]]
        [[18, 'lkkjj', 23]]
    """
    locator = Empty().setParseAction(lambda s, l, t: l)
    return Group(locator("locn_start") + expr("value") + locator.copy().leaveWhitespace()("locn_end"))


# convenience constants for positional expressions
empty = Empty().setName("empty")
lineStart = LineStart().setName("lineStart")
lineEnd = LineEnd().setName("lineEnd")
stringStart = StringStart().setName("stringStart")
stringEnd = StringEnd().setName("stringEnd")

_escapedPunc = Word(_bslash, r"\[]-*.$+^?()~ ", exact=2).setParseAction(lambda s, l, t: t[0][1])
_escapedHexChar = Regex(r"\\0?[xX][0-9a-fA-F]+").setParseAction(lambda s, l, t: unichr(int(t[0].lstrip(r'\0x'), 16)))
_escapedOctChar = Regex(r"\\0[0-7]+").setParseAction(lambda s, l, t: unichr(int(t[0][1:], 8)))
_singleChar = _escapedPunc | _escapedHexChar | _escapedOctChar | CharsNotIn(r'\]', exact=1)
_charRange = Group(_singleChar + Suppress("-") + _singleChar)
_reBracketExpr = Literal("[") + Optional("^").setResultsName("negate") + Group(OneOrMore(_charRange | _singleChar)).setResultsName("body") + "]"

def srange(s):
    r"""Helper to easily define string ranges for use in Word
    construction. Borrows syntax from regexp '[]' string range
    definitions::

        srange("[0-9]") -> "0123456789"
        srange("[a-z]") -> "abcdefghijklmnopqrstuvwxyz"
        srange("[a-z$_]") -> "abcdefghijklmnopqrstuvwxyz$_"

    The input string must be enclosed in []'s, and the returned string
    is the expanded character set joined into a single string.
    The values enclosed in the []'s may be:

     - a single character
     - an escaped character with a leading backslash (such as ``\-``
       or ``\]``)
     - an escaped hex character with a leading ``'\x'``
       (``\x21``, which is a ``'!'`` character) (``\0x##``
       is also supported for backwards compatibility)
     - an escaped octal character with a leading ``'\0'``
       (``\041``, which is a ``'!'`` character)
     - a range of any of the above, separated by a dash (``'a-z'``,
       etc.)
     - any combination of the above (``'aeiouy'``,
       ``'a-zA-Z0-9_$'``, etc.)
    """
    _expanded = lambda p: p if not isinstance(p, ParseResults) else ''.join(unichr(c) for c in range(ord(p[0]), ord(p[1]) + 1))
    try:
        return "".join(_expanded(part) for part in _reBracketExpr.parseString(s).body)
    except Exception:
        return ""

def matchOnlyAtCol(n):
    """Helper method for defining parse actions that require matching at
    a specific column in the input text.
    """
    def verifyCol(strg, locn, toks):
        if col(locn, strg) != n:
            raise ParseException(strg, locn, "matched token not at column %d" % n)
    return verifyCol

def replaceWith(replStr):
    """Helper method for common parse actions that simply return
    a literal value.
    Especially useful when used with
    :class:`transformString<ParserElement.transformString>`.

    Example::

        num = Word(nums).setParseAction(lambda toks: int(toks[0]))
        na = oneOf("N/A NA").setParseAction(replaceWith(math.nan))
        term = na | num

        OneOrMore(term).parseString("324 234 N/A 234") # -> [324, 234, nan, 234]
    """
    return lambda s, l, t: [replStr]

def removeQuotes(s, l, t):
    """Helper parse action for removing quotation marks from parsed
    quoted strings.

    Example::

        # by default, quotation marks are included in parsed results
        quotedString.parseString("'Now is the Winter of our Discontent'") # -> ["'Now is the Winter of our Discontent'"]

        # use removeQuotes to strip quotation marks from parsed results
        quotedString.setParseAction(removeQuotes)
        quotedString.parseString("'Now is the Winter of our Discontent'") # -> ["Now is the Winter of our Discontent"]
    """
    return t[0][1:-1]

def tokenMap(func, *args):
    """Helper to define a parse action by mapping a function to all
    elements of a ParseResults list.
    If any additional args are passed,
    they are forwarded to the given function as additional arguments
    after the token, as in
    ``hex_integer = Word(hexnums).setParseAction(tokenMap(int, 16))``,
    which will convert the parsed data to an integer using base 16.

    Example (compare the last example to the one in :class:`ParserElement.transformString`)::

        hex_ints = OneOrMore(Word(hexnums)).setParseAction(tokenMap(int, 16))
        hex_ints.runTests('''
            00 11 22 aa FF 0a 0d 1a
            ''')

        upperword = Word(alphas).setParseAction(tokenMap(str.upper))
        OneOrMore(upperword).runTests('''
            my kingdom for a horse
            ''')

        wd = Word(alphas).setParseAction(tokenMap(str.title))
        OneOrMore(wd).setParseAction(' '.join).runTests('''
            now is the winter of our discontent made glorious summer by this sun of york
            ''')

    prints::

        00 11 22 aa FF 0a 0d 1a
        [0, 17, 34, 170, 255, 10, 13, 26]

        my kingdom for a horse
        ['MY', 'KINGDOM', 'FOR', 'A', 'HORSE']

        now is the winter of our discontent made glorious summer by this sun of york
        ['Now Is The Winter Of Our Discontent Made Glorious Summer By This Sun Of York']
    """
    def pa(s, l, t):
        return [func(tokn, *args) for tokn in t]

    try:
        func_name = getattr(func, '__name__',
                            getattr(func, '__class__').__name__)
    except Exception:
        func_name = str(func)
    pa.__name__ = func_name

    return pa

upcaseTokens = tokenMap(lambda t: _ustr(t).upper())
"""(Deprecated) Helper parse action to convert tokens to upper case.
Deprecated in favor of :class:`pyparsing_common.upcaseTokens`"""

downcaseTokens = tokenMap(lambda t: _ustr(t).lower())
"""(Deprecated) Helper parse action to convert tokens to lower case.
Deprecated in favor of :class:`pyparsing_common.downcaseTokens`"""

def _makeTags(tagStr, xml,
              suppress_LT=Suppress("<"),
              suppress_GT=Suppress(">")):
    """Internal helper to construct opening and closing tag expressions, given a tag name"""
    if isinstance(tagStr, basestring):
        resname = tagStr
        tagStr = Keyword(tagStr, caseless=not xml)
    else:
        resname = tagStr.name

    tagAttrName = Word(alphas, alphanums + "_-:")
    if xml:
        tagAttrValue = dblQuotedString.copy().setParseAction(removeQuotes)
        openTag = (suppress_LT
                   + tagStr("tag")
                   + Dict(ZeroOrMore(Group(tagAttrName + Suppress("=") + tagAttrValue)))
                   + Optional("/", default=[False])("empty").setParseAction(lambda s, l, t: t[0] == '/')
                   + suppress_GT)
    else:
        tagAttrValue = quotedString.copy().setParseAction(removeQuotes) | Word(printables, excludeChars=">")
        openTag = (suppress_LT
                   + tagStr("tag")
                   + Dict(ZeroOrMore(Group(tagAttrName.setParseAction(downcaseTokens)
                                           + Optional(Suppress("=") + tagAttrValue))))
                   + Optional("/", default=[False])("empty").setParseAction(lambda s, l, t: t[0] == '/')
                   + suppress_GT)
    closeTag = Combine(_L("</") + tagStr + ">", adjacent=False)

    openTag.setName("<%s>" % resname)
    # add start<tagname> results name in parse action now that ungrouped names are not reported at two levels
    openTag.addParseAction(lambda t: t.__setitem__("start" + "".join(resname.replace(":", " ").title().split()), t.copy()))
    closeTag = closeTag("end" + "".join(resname.replace(":", " ").title().split())).setName("</%s>" % resname)
    openTag.tag = resname
    closeTag.tag = resname
    openTag.tag_body = SkipTo(closeTag())
    return openTag, closeTag

def makeHTMLTags(tagStr):
    """Helper to construct opening and closing tag expressions for HTML,
    given a tag name.
    Matches tags in either upper or lower case,
    attributes with namespaces and with quoted or unquoted values.

    Example::

        text = '<td>More info at the <a href="https://github.com/pyparsing/pyparsing/wiki">pyparsing</a> wiki page</td>'
        # makeHTMLTags returns pyparsing expressions for the opening and
        # closing tags as a 2-tuple
        a, a_end = makeHTMLTags("A")
        link_expr = a + SkipTo(a_end)("link_text") + a_end

        for link in link_expr.searchString(text):
            # attributes in the <A> tag (like "href" shown here) are
            # also accessible as named results
            print(link.link_text, '->', link.href)

    prints::

        pyparsing -> https://github.com/pyparsing/pyparsing/wiki
    """
    return _makeTags(tagStr, False)

def makeXMLTags(tagStr):
    """Helper to construct opening and closing tag expressions for XML,
    given a tag name. Matches tags only in the given upper/lower case.

    Example: similar to :class:`makeHTMLTags`
    """
    return _makeTags(tagStr, True)

def withAttribute(*args, **attrDict):
    """Helper to create a validating parse action to be used with start
    tags created with :class:`makeXMLTags` or
    :class:`makeHTMLTags`. Use ``withAttribute`` to qualify
    a starting tag with a required attribute value, to avoid false
    matches on common tags such as ``<TD>`` or ``<DIV>``.

    Call ``withAttribute`` with a series of attribute names and
    values. Specify the list of filter attribute names and values as:

     - keyword arguments, as in ``(align="right")``, or
     - as an explicit dict with ``**`` operator, when an attribute
       name is also a Python reserved word, as in ``**{"class":"Customer", "align":"right"}``
     - a list of name-value tuples, as in ``(("ns1:class", "Customer"), ("ns2:align", "right"))``

    For attribute names with a namespace prefix, you must use the second
    or third form.
    Attribute names are matched insensitive to upper/lower case.

    If just testing for ``class`` (with or without a namespace), use
    :class:`withClass`.

    To verify that the attribute exists, but without specifying a value,
    pass ``withAttribute.ANY_VALUE`` as the value.

    Example::

        html = '''
            <div>
            Some text
            <div type="grid">1 4 0 1 0</div>
            <div type="graph">1,3 2,3 1,1</div>
            <div>this has no type</div>
            </div>

        '''
        div,div_end = makeHTMLTags("div")

        # only match div tag having a type attribute with value "grid"
        div_grid = div().setParseAction(withAttribute(type="grid"))
        grid_expr = div_grid + SkipTo(div | div_end)("body")
        for grid_header in grid_expr.searchString(html):
            print(grid_header.body)

        # construct a match with any div tag having a type attribute, regardless of the value
        div_any_type = div().setParseAction(withAttribute(type=withAttribute.ANY_VALUE))
        div_expr = div_any_type + SkipTo(div | div_end)("body")
        for div_header in div_expr.searchString(html):
            print(div_header.body)

    prints::

        1 4 0 1 0

        1 4 0 1 0
        1,3 2,3 1,1
    """
    if args:
        attrs = args[:]
    else:
        attrs = attrDict.items()
    attrs = [(k, v) for k, v in attrs]
    def pa(s, l, tokens):
        for attrName, attrValue in attrs:
            if attrName not in tokens:
                raise ParseException(s, l, "no matching attribute " + attrName)
            if attrValue != withAttribute.ANY_VALUE and tokens[attrName] != attrValue:
                raise ParseException(s, l, "attribute '%s' has value '%s', must be '%s'" %
                                     (attrName, tokens[attrName], attrValue))
    return pa
withAttribute.ANY_VALUE = object()

def withClass(classname, namespace=''):
    """Simplified version of :class:`withAttribute` when
    matching on a div class - made difficult because ``class`` is
    a reserved word in Python.

    Example::

        html = '''
            <div>
            Some text
            <div class="grid">1 4 0 1 0</div>
            <div class="graph">1,3 2,3 1,1</div>
            <div>this <div> has no class</div>
            </div>

        '''
        div,div_end = makeHTMLTags("div")
        div_grid = div().setParseAction(withClass("grid"))

        grid_expr = div_grid + SkipTo(div | div_end)("body")
        for grid_header in grid_expr.searchString(html):
            print(grid_header.body)

        div_any_type = div().setParseAction(withClass(withAttribute.ANY_VALUE))
        div_expr = div_any_type + SkipTo(div | div_end)("body")
        for div_header in div_expr.searchString(html):
            print(div_header.body)

    prints::

        1 4 0 1 0

        1 4 0 1 0
        1,3 2,3 1,1
    """
    classattr = "%s:class" % namespace if namespace else "class"
    return withAttribute(**{classattr: classname})

opAssoc = SimpleNamespace()
opAssoc.LEFT = object()
opAssoc.RIGHT = object()

def infixNotation(baseExpr, opList, lpar=Suppress('('), rpar=Suppress(')')):
    """Helper method for constructing grammars of expressions made up of
    operators working in a precedence hierarchy. Operators may be unary,
    binary, or ternary, and left- or right-associative. Parse actions can
    also be attached to operator expressions. The generated parser will also
    recognize the use of parentheses to override operator precedences
    (see example below).

    Note: if you define a deep operator list, you may see performance
    issues when using infixNotation.
See5997:class:`ParserElement.enablePackrat` for a mechanism to potentially5998improve your parser performance.59996000Parameters:6001- baseExpr - expression representing the most basic element for the6002nested6003- opList - list of tuples, one for each operator precedence level6004in the expression grammar; each tuple is of the form ``(opExpr,6005numTerms, rightLeftAssoc, parseAction)``, where:60066007- opExpr is the pyparsing expression for the operator; may also6008be a string, which will be converted to a Literal; if numTerms6009is 3, opExpr is a tuple of two expressions, for the two6010operators separating the 3 terms6011- numTerms is the number of terms for this operator (must be 1,60122, or 3)6013- rightLeftAssoc is the indicator whether the operator is right6014or left associative, using the pyparsing-defined constants6015``opAssoc.RIGHT`` and ``opAssoc.LEFT``.6016- parseAction is the parse action to be associated with6017expressions matching this operator expression (the parse action6018tuple member may be omitted); if the parse action is passed6019a tuple or list of functions, this is equivalent to calling6020``setParseAction(*fn)``6021(:class:`ParserElement.setParseAction`)6022- lpar - expression for matching left-parentheses6023(default= ``Suppress('(')``)6024- rpar - expression for matching right-parentheses6025(default= ``Suppress(')')``)60266027Example::60286029# simple example of four-function arithmetic with ints and6030# variable names6031integer = pyparsing_common.signed_integer6032varname = pyparsing_common.identifier60336034arith_expr = infixNotation(integer | varname,6035[6036('-', 1, opAssoc.RIGHT),6037(oneOf('* /'), 2, opAssoc.LEFT),6038(oneOf('+ -'), 2, opAssoc.LEFT),6039])60406041arith_expr.runTests('''60425+3*66043(5+3)*66044-2--116045''', fullDump=False)60466047prints::604860495+3*66050[[5, '+', [3, '*', 6]]]60516052(5+3)*66053[[[5, '+', 3], '*', 6]]60546055-2--116056[[['-', 2], '-', ['-', 11]]]6057"""6058# captive version of FollowedBy 
    # that does not do parse actions or capture results names
    class _FB(FollowedBy):
        def parseImpl(self, instring, loc, doActions=True):
            self.expr.tryParse(instring, loc)
            return loc, []

    ret = Forward()
    lastExpr = baseExpr | (lpar + ret + rpar)
    for i, operDef in enumerate(opList):
        opExpr, arity, rightLeftAssoc, pa = (operDef + (None, ))[:4]
        termName = "%s term" % opExpr if arity < 3 else "%s%s term" % opExpr
        if arity == 3:
            if opExpr is None or len(opExpr) != 2:
                raise ValueError(
                    "if numterms=3, opExpr must be a tuple or list of two expressions")
            opExpr1, opExpr2 = opExpr
        thisExpr = Forward().setName(termName)
        if rightLeftAssoc == opAssoc.LEFT:
            if arity == 1:
                matchExpr = _FB(lastExpr + opExpr) + Group(lastExpr + OneOrMore(opExpr))
            elif arity == 2:
                if opExpr is not None:
                    matchExpr = _FB(lastExpr + opExpr + lastExpr) + Group(lastExpr + OneOrMore(opExpr + lastExpr))
                else:
                    matchExpr = _FB(lastExpr + lastExpr) + Group(lastExpr + OneOrMore(lastExpr))
            elif arity == 3:
                matchExpr = (_FB(lastExpr + opExpr1 + lastExpr + opExpr2 + lastExpr)
                             + Group(lastExpr + OneOrMore(opExpr1 + lastExpr + opExpr2 + lastExpr)))
            else:
                raise ValueError("operator must be unary (1), binary (2), or ternary (3)")
        elif rightLeftAssoc == opAssoc.RIGHT:
            if arity == 1:
                # try to avoid LR with this extra test
                if not isinstance(opExpr, Optional):
                    opExpr = Optional(opExpr)
                matchExpr = _FB(opExpr.expr + thisExpr) + Group(opExpr + thisExpr)
            elif arity == 2:
                if opExpr is not None:
                    matchExpr = _FB(lastExpr + opExpr + thisExpr) + Group(lastExpr + OneOrMore(opExpr + thisExpr))
                else:
                    matchExpr = _FB(lastExpr + thisExpr) + Group(lastExpr + OneOrMore(thisExpr))
            elif arity == 3:
                matchExpr = (_FB(lastExpr + opExpr1 + thisExpr + opExpr2 + thisExpr)
                             + Group(lastExpr + opExpr1 + thisExpr + opExpr2 + thisExpr))
            else:
                raise ValueError("operator must be unary (1), binary (2), or ternary (3)")
        else:
            raise ValueError("operator must indicate right or left associativity")
        if pa:
            if isinstance(pa, (tuple, list)):
                matchExpr.setParseAction(*pa)
            else:
                matchExpr.setParseAction(pa)
        thisExpr <<= (matchExpr.setName(termName) | lastExpr)
        lastExpr = thisExpr
    ret <<= lastExpr
    return ret

operatorPrecedence = infixNotation
"""(Deprecated) Former name of :class:`infixNotation`, will be
dropped in a future release."""

dblQuotedString = Combine(Regex(r'"(?:[^"\n\r\\]|(?:"")|(?:\\(?:[^x]|x[0-9a-fA-F]+)))*') + '"').setName("string enclosed in double quotes")
sglQuotedString = Combine(Regex(r"'(?:[^'\n\r\\]|(?:'')|(?:\\(?:[^x]|x[0-9a-fA-F]+)))*") + "'").setName("string enclosed in single quotes")
quotedString = Combine(Regex(r'"(?:[^"\n\r\\]|(?:"")|(?:\\(?:[^x]|x[0-9a-fA-F]+)))*') + '"'
                       | Regex(r"'(?:[^'\n\r\\]|(?:'')|(?:\\(?:[^x]|x[0-9a-fA-F]+)))*") + "'").setName("quotedString using single or double quotes")
unicodeString = Combine(_L('u') + quotedString.copy()).setName("unicode string literal")

def nestedExpr(opener="(", closer=")", content=None, ignoreExpr=quotedString.copy()):
    """Helper method for defining nested lists enclosed in opening and
    closing delimiters ("(" and ")" are the default).

    Parameters:
     - opener - opening character for a nested list
       (default= ``"("``); can also be a pyparsing expression
     - closer - closing character for a nested list
       (default= ``")"``); can also be a pyparsing expression
     - content - expression for items within the nested lists
       (default= ``None``)
     - ignoreExpr - expression for ignoring opening and closing
       delimiters (default= :class:`quotedString`)

    If an expression is not provided for the content argument, the
    nested expression will capture all whitespace-delimited content
    between delimiters as a list of separate values.

    Use the ``ignoreExpr`` argument to define expressions that may
    contain opening or
    closing characters that should not be treated as
    opening or closing characters for nesting, such as quotedString or
    a comment expression.  Specify multiple expressions using an
    :class:`Or` or :class:`MatchFirst`. The default is
    :class:`quotedString`, but if no expressions are to be ignored, then
    pass ``None`` for this argument.

    Example::

        data_type = oneOf("void int short long char float double")
        decl_data_type = Combine(data_type + Optional(Word('*')))
        ident = Word(alphas+'_', alphanums+'_')
        number = pyparsing_common.number
        arg = Group(decl_data_type + ident)
        LPAR, RPAR = map(Suppress, "()")

        code_body = nestedExpr('{', '}', ignoreExpr=(quotedString | cStyleComment))

        c_function = (decl_data_type("type")
                      + ident("name")
                      + LPAR + Optional(delimitedList(arg), [])("args") + RPAR
                      + code_body("body"))
        c_function.ignore(cStyleComment)

        source_code = '''
            int is_odd(int x) {
                return (x%2);
            }

            int dec_to_hex(char hchar) {
                if (hchar >= '0' && hchar <= '9') {
                    return (ord(hchar)-ord('0'));
                } else {
                    return (10+ord(hchar)-ord('A'));
                }
            }
        '''
        for func in c_function.searchString(source_code):
            print("%(name)s (%(type)s) args: %(args)s" % func)


    prints::

        is_odd (int) args: [['int', 'x']]
        dec_to_hex (int) args: [['char', 'hchar']]
    """
    if opener == closer:
        raise ValueError("opening and closing strings cannot be the same")
    if content is None:
        if isinstance(opener, basestring) and isinstance(closer, basestring):
            if len(opener) == 1 and len(closer) == 1:
                if ignoreExpr is not None:
                    content = (Combine(OneOrMore(~ignoreExpr
                                                 + CharsNotIn(opener
                                                              + closer
                                                              + ParserElement.DEFAULT_WHITE_CHARS, exact=1)
                                                 )
                                       ).setParseAction(lambda t: t[0].strip()))
                else:
                    content = (empty.copy() + CharsNotIn(opener
                                                         + closer
                                                         + ParserElement.DEFAULT_WHITE_CHARS
                                                         ).setParseAction(lambda t: t[0].strip()))
            else:
                if ignoreExpr is not None:
                    content = (Combine(OneOrMore(~ignoreExpr
                                                 + ~Literal(opener)
                                                 + ~Literal(closer)
                                                 + CharsNotIn(ParserElement.DEFAULT_WHITE_CHARS, exact=1))
                                       ).setParseAction(lambda t: t[0].strip()))
                else:
                    content = (Combine(OneOrMore(~Literal(opener)
                                                 + ~Literal(closer)
                                                 + CharsNotIn(ParserElement.DEFAULT_WHITE_CHARS, exact=1))
                                       ).setParseAction(lambda t: t[0].strip()))
        else:
            raise ValueError("opening and closing arguments must be strings if no content expression is given")
    ret = Forward()
    if ignoreExpr is not None:
        ret <<= Group(Suppress(opener) + ZeroOrMore(ignoreExpr | ret | content) + Suppress(closer))
    else:
        ret <<= Group(Suppress(opener) + ZeroOrMore(ret | content) + Suppress(closer))
    ret.setName('nested %s%s expression' % (opener, closer))
    return ret

def indentedBlock(blockStatementExpr, indentStack, indent=True):
    """Helper method for defining space-delimited indentation blocks,
    such as those used to define block statements in Python source code.

    Parameters:

     - blockStatementExpr - expression defining syntax of statement that
       is repeated within the indented block
     - indentStack - list created by caller to manage indentation stack
       (multiple statementWithIndentedBlock expressions within a single
       grammar should share a common indentStack)
     - indent - boolean indicating whether block must be indented beyond
       the current level; set to False for block of left-most
       statements (default= ``True``)

    A valid block must contain at least one ``blockStatement``.

    Example::

        data = '''
        def A(z):
          A1
          B = 100
          G = A2
          A2
          A3
        B
        def BB(a,b,c):
          BB1
          def BBA():
            bba1
            bba2
            bba3
        C
        D
        def spam(x,y):
             def eggs(z):
                 pass
        '''


        indentStack = [1]
        stmt = Forward()

        identifier = Word(alphas, alphanums)
        funcDecl = ("def" + identifier + Group("(" + Optional(delimitedList(identifier)) + ")") + ":")
        func_body = indentedBlock(stmt, indentStack)
        funcDef = Group(funcDecl + func_body)

        rvalue = Forward()
        funcCall = Group(identifier + "(" + Optional(delimitedList(rvalue)) + ")")
        rvalue << (funcCall | identifier | Word(nums))
        assignment = Group(identifier + "=" + rvalue)
        stmt << (funcDef | assignment | identifier)

        module_body = OneOrMore(stmt)

        parseTree = module_body.parseString(data)
        parseTree.pprint()

    prints::

        [['def',
          'A',
          ['(', 'z', ')'],
          ':',
          [['A1'], [['B', '=', '100']], [['G', '=', 'A2']], ['A2'], ['A3']]],
         'B',
         ['def',
          'BB',
          ['(', 'a', 'b', 'c', ')'],
          ':',
          [['BB1'], [['def', 'BBA', ['(', ')'], ':', [['bba1'], ['bba2'], ['bba3']]]]]],
         'C',
         'D',
         ['def',
          'spam',
          ['(', 'x', 'y', ')'],
          ':',
          [[['def', 'eggs', ['(', 'z', ')'], ':', [['pass']]]]]]]
    """
    backup_stack = indentStack[:]

    def reset_stack():
        indentStack[:] = backup_stack

    def checkPeerIndent(s, l, t):
        if l >= len(s): return
        curCol = col(l, s)
        if curCol != indentStack[-1]:
            if curCol > indentStack[-1]:
                raise ParseException(s, l, "illegal nesting")
            raise ParseException(s, l, "not a peer entry")

    def checkSubIndent(s, l, t):
        curCol = col(l, s)
        if curCol > indentStack[-1]:
            indentStack.append(curCol)
        else:
            raise ParseException(s, l, "not a subentry")

    def checkUnindent(s, l, t):
        if l >= len(s): return
        curCol = col(l, s)
        if not(indentStack and curCol in indentStack):
            raise ParseException(s, l, "not an unindent")
        if curCol < indentStack[-1]:
            indentStack.pop()

    NL = OneOrMore(LineEnd().setWhitespaceChars("\t ").suppress(), stopOn=StringEnd())
    INDENT = (Empty() + Empty().setParseAction(checkSubIndent)).setName('INDENT')
    PEER = Empty().setParseAction(checkPeerIndent).setName('')
    UNDENT = Empty().setParseAction(checkUnindent).setName('UNINDENT')
    if indent:
        smExpr = Group(Optional(NL)
                       + INDENT
                       + OneOrMore(PEER +
                                   Group(blockStatementExpr) + Optional(NL), stopOn=StringEnd())
                       + UNDENT)
    else:
        smExpr = Group(Optional(NL)
                       + OneOrMore(PEER + Group(blockStatementExpr) + Optional(NL), stopOn=StringEnd())
                       + UNDENT)
    smExpr.setFailAction(lambda a, b, c, d: reset_stack())
    blockStatementExpr.ignore(_bslash + LineEnd())
    return smExpr.setName('indented block')

alphas8bit = srange(r"[\0xc0-\0xd6\0xd8-\0xf6\0xf8-\0xff]")
punc8bit = srange(r"[\0xa1-\0xbf\0xd7\0xf7]")

anyOpenTag, anyCloseTag = makeHTMLTags(Word(alphas, alphanums + "_:").setName('any tag'))
_htmlEntityMap = dict(zip("gt lt amp nbsp quot apos".split(), '><& "\''))
commonHTMLEntity = Regex('&(?P<entity>' + '|'.join(_htmlEntityMap.keys()) +");").setName("common HTML entity")
def replaceHTMLEntity(t):
    """Helper parser action to replace common HTML entities with their special characters"""
    return _htmlEntityMap.get(t.entity)

# it's easy to get these comment structures wrong - they're very common, so may as well make them available
cStyleComment = Combine(Regex(r"/\*(?:[^*]|\*(?!/))*") + '*/').setName("C style comment")
"Comment of the form ``/* ... */``"

htmlComment = Regex(r"<!--[\s\S]*?-->").setName("HTML comment")
"Comment of the form ``<!-- ... -->``"

restOfLine = Regex(r".*").leaveWhitespace().setName("rest of line")
dblSlashComment = Regex(r"//(?:\\\n|[^\n])*").setName("// comment")
"Comment of the form ``// ... (to end of line)``"

cppStyleComment = Combine(Regex(r"/\*(?:[^*]|\*(?!/))*") + '*/' | dblSlashComment).setName("C++ style comment")
"Comment of either form :class:`cStyleComment` or :class:`dblSlashComment`"

javaStyleComment = cppStyleComment
"Same as :class:`cppStyleComment`"

pythonStyleComment = Regex(r"#.*").setName("Python style comment")
"Comment of the form ``# ... (to end of line)``"

_commasepitem = Combine(OneOrMore(Word(printables, excludeChars=',')
                                  + Optional(Word(" \t")
                                             + ~Literal(",") + ~LineEnd()))).streamline().setName("commaItem")
commaSeparatedList = delimitedList(Optional(quotedString.copy() | _commasepitem, default="")).setName("commaSeparatedList")
"""(Deprecated) Predefined expression of 1 or more printable words or
quoted strings, separated by commas.

This expression is deprecated in favor of :class:`pyparsing_common.comma_separated_list`.
"""

# some other useful expressions - using lower-case class name since we are really using this as a namespace
class pyparsing_common:
    """Here are some common low-level expressions that may be useful in
    jump-starting parser development:

     - numeric forms (:class:`integers<integer>`, :class:`reals<real>`,
       :class:`scientific notation<sci_real>`)
     - common :class:`programming identifiers<identifier>`
     - network addresses (:class:`MAC<mac_address>`,
       :class:`IPv4<ipv4_address>`, :class:`IPv6<ipv6_address>`)
     - ISO8601 :class:`dates<iso8601_date>` and
       :class:`datetime<iso8601_datetime>`
     - :class:`UUID<uuid>`
     - :class:`comma-separated list<comma_separated_list>`

    Parse actions:

     - :class:`convertToInteger`
     - :class:`convertToFloat`
     - :class:`convertToDate`
     - :class:`convertToDatetime`
     - :class:`stripHTMLTags`
     - :class:`upcaseTokens`
     - :class:`downcaseTokens`

    Example::

        pyparsing_common.number.runTests('''
            # any int or real number, returned as the appropriate type
            100
            -100
            +100
            3.14159
            6.02e23
            1e-12
            ''')

        pyparsing_common.fnumber.runTests('''
            # any int or real number, returned as float
            100
            -100
            +100
            3.14159
            6.02e23
            1e-12
            ''')

        pyparsing_common.hex_integer.runTests('''
            # hex numbers
            100
            FF
            ''')

        pyparsing_common.fraction.runTests('''
            # fractions
            1/2
            -3/4
            ''')

        pyparsing_common.mixed_integer.runTests('''
            # mixed fractions
            1
            1/2
            -3/4
            1-3/4
            ''')

        import uuid
        pyparsing_common.uuid.setParseAction(tokenMap(uuid.UUID))
        pyparsing_common.uuid.runTests('''
            # uuid
            12345678-1234-5678-1234-567812345678
            ''')

    prints::

        # any int or real number, returned as the appropriate type
        100
        [100]

        -100
        [-100]

        +100
        [100]

        3.14159
        [3.14159]

        6.02e23
        [6.02e+23]

        1e-12
        [1e-12]

        # any int or real number, returned as float
        100
        [100.0]

        -100
        [-100.0]

        +100
        [100.0]

        3.14159
        [3.14159]

        6.02e23
        [6.02e+23]

        1e-12
        [1e-12]

        # hex numbers
        100
        [256]

        FF
        [255]

        # fractions
        1/2
        [0.5]

        -3/4
        [-0.75]

        # mixed fractions
        1
        [1]

        1/2
        [0.5]

        -3/4
        [-0.75]

        1-3/4
        [1.75]

        # uuid
        12345678-1234-5678-1234-567812345678
        [UUID('12345678-1234-5678-1234-567812345678')]
    """

    convertToInteger = tokenMap(int)
    """
    Parse action for converting parsed integers to Python int
    """

    convertToFloat = tokenMap(float)
    """
    Parse action for converting parsed numbers to Python float
    """

    integer = Word(nums).setName("integer").setParseAction(convertToInteger)
    """expression that parses an unsigned integer, returns an int"""

    hex_integer = Word(hexnums).setName("hex integer").setParseAction(tokenMap(int, 16))
    """expression that parses a hexadecimal integer, returns an int"""

    signed_integer = Regex(r'[+-]?\d+').setName("signed integer").setParseAction(convertToInteger)
    """expression that parses an integer with optional leading sign, returns an int"""

    fraction = (signed_integer().setParseAction(convertToFloat) + '/' + signed_integer().setParseAction(convertToFloat)).setName("fraction")
    """fractional expression of an integer divided by an
    integer, returns a float"""
    fraction.addParseAction(lambda t: t[0]/t[-1])

    mixed_integer = (fraction | signed_integer + Optional(Optional('-').suppress() + fraction)).setName("fraction or mixed integer-fraction")
    """mixed integer of the form 'integer - fraction', with optional leading integer, returns float"""
    mixed_integer.addParseAction(sum)

    real = Regex(r'[+-]?(?:\d+\.\d*|\.\d+)').setName("real number").setParseAction(convertToFloat)
    """expression that parses a floating point number and returns a float"""

    sci_real = Regex(r'[+-]?(?:\d+(?:[eE][+-]?\d+)|(?:\d+\.\d*|\.\d+)(?:[eE][+-]?\d+)?)').setName("real number with scientific notation").setParseAction(convertToFloat)
    """expression that parses a floating point number with optional
    scientific notation and returns a float"""

    # streamlining this expression makes the docs nicer-looking
    number = (sci_real | real | signed_integer).streamline()
    """any numeric expression, returns the corresponding Python type"""

    fnumber = Regex(r'[+-]?\d+\.?\d*([eE][+-]?\d+)?').setName("fnumber").setParseAction(convertToFloat)
    """any int or real number, returned as float"""

    identifier = Word(alphas + '_', alphanums + '_').setName("identifier")
    """typical code identifier (leading alpha or '_', followed by 0 or more alphas, nums, or '_')"""

    ipv4_address = Regex(r'(25[0-5]|2[0-4][0-9]|1?[0-9]{1,2})(\.(25[0-5]|2[0-4][0-9]|1?[0-9]{1,2})){3}').setName("IPv4 address")
    "IPv4 address (``0.0.0.0 - 255.255.255.255``)"

    _ipv6_part = Regex(r'[0-9a-fA-F]{1,4}').setName("hex_integer")
    _full_ipv6_address = (_ipv6_part + (':' + _ipv6_part) * 7).setName("full IPv6 address")
    _short_ipv6_address = (Optional(_ipv6_part + (':' + _ipv6_part) * (0, 6))
                           + "::"
                           + Optional(_ipv6_part + (':' + _ipv6_part) * (0, 6))
                           ).setName("short IPv6 address")
    _short_ipv6_address.addCondition(lambda t: sum(1 for tt in t if pyparsing_common._ipv6_part.matches(tt)) <
                                     8)
    _mixed_ipv6_address = ("::ffff:" + ipv4_address).setName("mixed IPv6 address")
    ipv6_address = Combine((_full_ipv6_address | _mixed_ipv6_address | _short_ipv6_address).setName("IPv6 address")).setName("IPv6 address")
    "IPv6 address (long, short, or mixed form)"

    mac_address = Regex(r'[0-9a-fA-F]{2}([:.-])[0-9a-fA-F]{2}(?:\1[0-9a-fA-F]{2}){4}').setName("MAC address")
    "MAC address xx:xx:xx:xx:xx:xx (may also have '-' or '.' delimiters)"

    @staticmethod
    def convertToDate(fmt="%Y-%m-%d"):
        """
        Helper to create a parse action for converting parsed date string to Python datetime.date

        Params -
         - fmt - format to be passed to datetime.strptime (default= ``"%Y-%m-%d"``)

        Example::

            date_expr = pyparsing_common.iso8601_date.copy()
            date_expr.setParseAction(pyparsing_common.convertToDate())
            print(date_expr.parseString("1999-12-31"))

        prints::

            [datetime.date(1999, 12, 31)]
        """
        def cvt_fn(s, l, t):
            try:
                return datetime.strptime(t[0], fmt).date()
            except ValueError as ve:
                raise ParseException(s, l, str(ve))
        return cvt_fn

    @staticmethod
    def convertToDatetime(fmt="%Y-%m-%dT%H:%M:%S.%f"):
        """Helper to create a parse action for converting parsed
        datetime string to Python datetime.datetime

        Params -
         - fmt - format to be passed to datetime.strptime (default= ``"%Y-%m-%dT%H:%M:%S.%f"``)

        Example::

            dt_expr = pyparsing_common.iso8601_datetime.copy()
            dt_expr.setParseAction(pyparsing_common.convertToDatetime())
            print(dt_expr.parseString("1999-12-31T23:59:59.999"))

        prints::

            [datetime.datetime(1999, 12, 31, 23, 59, 59, 999000)]
        """
        def cvt_fn(s, l, t):
            try:
                return datetime.strptime(t[0], fmt)
            except ValueError as ve:
                raise ParseException(s, l, str(ve))
        return cvt_fn

    iso8601_date = Regex(r'(?P<year>\d{4})(?:-(?P<month>\d\d)(?:-(?P<day>\d\d))?)?').setName("ISO8601 date")
    "ISO8601 date (``yyyy-mm-dd``)"

    iso8601_datetime = Regex(r'(?P<year>\d{4})-(?P<month>\d\d)-(?P<day>\d\d)[T ](?P<hour>\d\d):(?P<minute>\d\d)(:(?P<second>\d\d(\.\d*)?)?)?(?P<tz>Z|[+-]\d\d:?\d\d)?').setName("ISO8601 datetime")
    "ISO8601 datetime (``yyyy-mm-ddThh:mm:ss.s(Z|+-00:00)``) - trailing seconds, milliseconds, and timezone optional; accepts separating ``'T'`` or ``' '``"

    uuid = Regex(r'[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}').setName("UUID")
    "UUID (``xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx``)"

    _html_stripper = anyOpenTag.suppress() | anyCloseTag.suppress()
    @staticmethod
    def stripHTMLTags(s, l, tokens):
        """Parse action to remove HTML tags from web page HTML source

        Example::

            # strip HTML links from normal text
            text = '<td>More info at the <a href="https://github.com/pyparsing/pyparsing/wiki">pyparsing</a> wiki page</td>'
            td, td_end = makeHTMLTags("TD")
            table_text = td + SkipTo(td_end).setParseAction(pyparsing_common.stripHTMLTags)("body") + td_end
            print(table_text.parseString(text).body)

        Prints::

            More info at the pyparsing wiki page
        """
        return pyparsing_common._html_stripper.transformString(tokens[0])

    _commasepitem = Combine(OneOrMore(~Literal(",")
                                      + ~LineEnd()
                                      + Word(printables, excludeChars=',')
                                      + Optional(White(" \t")))).streamline().setName("commaItem")
    comma_separated_list = delimitedList(Optional(quotedString.copy()
                                                  | _commasepitem, default='')
                                         ).setName("comma separated list")
    """Predefined expression of 1 or more printable words or quoted strings, separated by commas."""

    upcaseTokens = staticmethod(tokenMap(lambda t: _ustr(t).upper()))
    """Parse action to convert tokens to upper case."""

    downcaseTokens = staticmethod(tokenMap(lambda t: _ustr(t).lower()))
    """Parse action to convert tokens to lower case."""


class _lazyclassproperty(object):
    def __init__(self, fn):
        self.fn = fn
        self.__doc__ = fn.__doc__
        self.__name__ = fn.__name__

    def __get__(self, obj, cls):
        if cls is None:
            cls = type(obj)
        if not hasattr(cls, '_intern') or any(cls._intern is getattr(superclass, '_intern', [])
                                              for superclass in cls.__mro__[1:]):
            cls._intern = {}
        attrname = self.fn.__name__
        if attrname not in cls._intern:
            cls._intern[attrname] = self.fn(cls)
        return cls._intern[attrname]


class unicode_set(object):
    """
    A set of Unicode characters, for language-specific strings for
    ``alphas``, ``nums``, ``alphanums``, and ``printables``.
    A unicode_set is defined by a list of ranges in the Unicode character
    set, in a class attribute ``_ranges``, such as::

        _ranges = [(0x0020, 0x007e), (0x00a0, 0x00ff),]

    A unicode set can also be defined using multiple inheritance of other unicode sets::

        class CJK(Chinese, Japanese, Korean):
            pass
    """
    _ranges = []

    @classmethod
    def _get_chars_for_ranges(cls):
        ret = []
        for cc in cls.__mro__:
            if cc is unicode_set:
                break
            for rr in cc._ranges:
                ret.extend(range(rr[0], rr[-1] + 1))
        return [unichr(c) for c in sorted(set(ret))]

    @_lazyclassproperty
    def printables(cls):
        "all non-whitespace characters in this range"
        return u''.join(filterfalse(unicode.isspace, cls._get_chars_for_ranges()))

    @_lazyclassproperty
    def alphas(cls):
        "all alphabetic characters in this range"
        return u''.join(filter(unicode.isalpha, cls._get_chars_for_ranges()))

    @_lazyclassproperty
    def nums(cls):
        "all numeric digit characters in this range"
        return u''.join(filter(unicode.isdigit, cls._get_chars_for_ranges()))

    @_lazyclassproperty
    def alphanums(cls):
        "all alphanumeric characters in this range"
        return cls.alphas + cls.nums


class pyparsing_unicode(unicode_set):
    """
    A namespace class for defining common language unicode_sets.
    """
    _ranges = [(32, sys.maxunicode)]

    class Latin1(unicode_set):
        "Unicode set for Latin-1 Unicode Character Range"
        _ranges = [(0x0020, 0x007e), (0x00a0, 0x00ff),]

    class LatinA(unicode_set):
        "Unicode set for Latin-A Unicode Character Range"
        _ranges = [(0x0100, 0x017f),]

    class LatinB(unicode_set):
        "Unicode set for Latin-B Unicode Character Range"
        _ranges = [(0x0180, 0x024f),]

    class Greek(unicode_set):
        "Unicode set for Greek Unicode Character Ranges"
        _ranges = [
            (0x0370, 0x03ff), (0x1f00, 0x1f15), (0x1f18, 0x1f1d), (0x1f20, 0x1f45), (0x1f48, 0x1f4d),
            (0x1f50, 0x1f57), (0x1f59,), (0x1f5b,), (0x1f5d,), (0x1f5f, 0x1f7d), (0x1f80, 0x1fb4), (0x1fb6, 0x1fc4),
            (0x1fc6, 0x1fd3), (0x1fd6, 0x1fdb), (0x1fdd, 0x1fef), (0x1ff2, 0x1ff4), (0x1ff6, 0x1ffe),
        ]

    class Cyrillic(unicode_set):
        "Unicode set for Cyrillic Unicode Character Range"
        _ranges = [(0x0400, 0x04ff)]

    class Chinese(unicode_set):
        "Unicode set for Chinese Unicode Character Range"
        _ranges = [(0x4e00, 0x9fff), (0x3000, 0x303f),]

    class Japanese(unicode_set):
        "Unicode set for Japanese Unicode Character Range, combining Kanji, Hiragana, and Katakana ranges"
        _ranges = []

        class Kanji(unicode_set):
            "Unicode set for Kanji Unicode Character Range"
            _ranges = [(0x4E00, 0x9Fbf), (0x3000, 0x303f),]

        class Hiragana(unicode_set):
            "Unicode set for Hiragana Unicode Character Range"
            _ranges = [(0x3040, 0x309f),]

        class Katakana(unicode_set):
            "Unicode set for Katakana Unicode Character Range"
            _ranges = [(0x30a0, 0x30ff),]

    class Korean(unicode_set):
        "Unicode set for Korean Unicode Character Range"
        _ranges = [(0xac00, 0xd7af), (0x1100, 0x11ff), (0x3130, 0x318f), (0xa960, 0xa97f), (0xd7b0, 0xd7ff), (0x3000, 0x303f),]

    class CJK(Chinese, Japanese, Korean):
        "Unicode set for combined Chinese, Japanese, and Korean (CJK) Unicode Character Range"
        pass

    class Thai(unicode_set):
        "Unicode set for Thai Unicode Character Range"
        _ranges = [(0x0e01, 0x0e3a), (0x0e3f, 0x0e5b),]

    class Arabic(unicode_set):
        "Unicode set for Arabic Unicode Character Range"
        _ranges = [(0x0600, 0x061b), (0x061e, 0x06ff), (0x0700, 0x077f),]

    class Hebrew(unicode_set):
        "Unicode set for Hebrew Unicode Character Range"
        _ranges = [(0x0590, 0x05ff),]

    class Devanagari(unicode_set):
        "Unicode set for Devanagari Unicode Character Range"
        _ranges = [(0x0900, 0x097f), (0xa8e0, 0xa8ff)]

pyparsing_unicode.Japanese._ranges = (pyparsing_unicode.Japanese.Kanji._ranges
                                      + pyparsing_unicode.Japanese.Hiragana._ranges
                                      + pyparsing_unicode.Japanese.Katakana._ranges)

# define ranges in language character sets
if PY_3:
    setattr(pyparsing_unicode, u"العربية", pyparsing_unicode.Arabic)
    setattr(pyparsing_unicode, u"中文", pyparsing_unicode.Chinese)
    setattr(pyparsing_unicode, u"кириллица", pyparsing_unicode.Cyrillic)
    setattr(pyparsing_unicode, u"Ελληνικά", pyparsing_unicode.Greek)
    setattr(pyparsing_unicode, u"עִברִית", pyparsing_unicode.Hebrew)
    setattr(pyparsing_unicode, u"日本語", pyparsing_unicode.Japanese)
    setattr(pyparsing_unicode.Japanese, u"漢字", pyparsing_unicode.Japanese.Kanji)
    setattr(pyparsing_unicode.Japanese, u"カタカナ", pyparsing_unicode.Japanese.Katakana)
    setattr(pyparsing_unicode.Japanese, u"ひらがな", pyparsing_unicode.Japanese.Hiragana)
    setattr(pyparsing_unicode, u"한국어", pyparsing_unicode.Korean)
    setattr(pyparsing_unicode, u"ไทย", pyparsing_unicode.Thai)
    setattr(pyparsing_unicode, u"देवनागरी", pyparsing_unicode.Devanagari)


class pyparsing_test:
    """
    namespace class for classes useful in writing unit tests
    """

    class reset_pyparsing_context:
        """
        Context manager to be used when writing unit tests that modify pyparsing config values:
         - packrat parsing
         - default whitespace characters
         - default keyword characters
         - literal string auto-conversion class
         - __diag__ settings

        Example:
            with reset_pyparsing_context():
                # test that literals used to construct a grammar are automatically suppressed
                ParserElement.inlineLiteralsUsing(Suppress)

                term = Word(alphas) | Word(nums)
                group = Group('(' + term[...] + ')')

                # assert that the '()' characters are not included in the parsed tokens
                self.assertParseAndCheckList(group, "(abc 123 def)", ['abc', '123', 'def'])

            # after exiting context manager, literals are converted to Literal expressions again
        """

        def __init__(self):
            self._save_context = {}

        def save(self):
            self._save_context["default_whitespace"] = ParserElement.DEFAULT_WHITE_CHARS
            self._save_context["default_keyword_chars"] = Keyword.DEFAULT_KEYWORD_CHARS
            self._save_context[
                "literal_string_class"
            ] = ParserElement._literalStringClass
            self._save_context["packrat_enabled"] = ParserElement._packratEnabled
            self._save_context["packrat_parse"] = ParserElement._parse
            self._save_context["__diag__"] = {
                name: getattr(__diag__, name) for name in __diag__._all_names
            }
            self._save_context["__compat__"] = {
                "collect_all_And_tokens": __compat__.collect_all_And_tokens
            }
            return self

        def restore(self):
            # reset pyparsing global state
            if (
                ParserElement.DEFAULT_WHITE_CHARS
                != self._save_context["default_whitespace"]
            ):
                ParserElement.setDefaultWhitespaceChars(
                    self._save_context["default_whitespace"]
                )
            Keyword.DEFAULT_KEYWORD_CHARS = self._save_context["default_keyword_chars"]
            ParserElement.inlineLiteralsUsing(
                self._save_context["literal_string_class"]
            )
            for name, value in self._save_context["__diag__"].items():
                setattr(__diag__, name, value)
            ParserElement._packratEnabled = self._save_context["packrat_enabled"]
            ParserElement._parse = self._save_context["packrat_parse"]
            # save() stores the __compat__ settings as a dict, so restore each
            # saved value rather than assigning the dict itself to the flag
            for name, value in self._save_context["__compat__"].items():
                setattr(__compat__, name, value)

        def __enter__(self):
            return self.save()

        def __exit__(self,
                     *args):
            return self.restore()

    class TestParseResultsAsserts:
        """
        A mixin class to add parse results assertion methods to normal unittest.TestCase classes.
        """
        def assertParseResultsEquals(
            self, result, expected_list=None, expected_dict=None, msg=None
        ):
            """
            Unit test assertion to compare a ParseResults object with an optional expected_list,
            and compare any defined results names with an optional expected_dict.
            """
            if expected_list is not None:
                self.assertEqual(expected_list, result.asList(), msg=msg)
            if expected_dict is not None:
                self.assertEqual(expected_dict, result.asDict(), msg=msg)

        def assertParseAndCheckList(
            self, expr, test_string, expected_list, msg=None, verbose=True
        ):
            """
            Convenience wrapper assert to test a parser element and input string, and assert that
            the resulting ParseResults.asList() is equal to the expected_list.
            """
            result = expr.parseString(test_string, parseAll=True)
            if verbose:
                print(result.dump())
            self.assertParseResultsEquals(result, expected_list=expected_list, msg=msg)

        def assertParseAndCheckDict(
            self, expr, test_string, expected_dict, msg=None, verbose=True
        ):
            """
            Convenience wrapper assert to test a parser element and input string, and assert that
            the resulting ParseResults.asDict() is equal to the expected_dict.
            """
            result = expr.parseString(test_string, parseAll=True)
            if verbose:
                print(result.dump())
            self.assertParseResultsEquals(result, expected_dict=expected_dict, msg=msg)

        def assertRunTestResults(
            self, run_tests_report, expected_parse_results=None, msg=None
        ):
            """
            Unit test assertion to evaluate output of ParserElement.runTests().
            If a list of list-dict tuples is given as the expected_parse_results argument, then these are zipped
            with the report tuples returned by runTests and evaluated using assertParseResultsEquals.
            Finally, asserts that the overall runTests() success value is True.

            :param run_tests_report: tuple(bool, [tuple(str, ParseResults or Exception)]) returned from runTests
            :param expected_parse_results (optional): [tuple(str, list, dict, Exception)]
            """
            run_test_success, run_test_results = run_tests_report

            if expected_parse_results is not None:
                merged = [
                    (rpt[0], rpt[1], expected)
                    for rpt, expected in zip(run_test_results, expected_parse_results)
                ]
                for test_string, result, expected in merged:
                    # expected should be a tuple containing a list and/or a dict or an exception,
                    # and optional failure message string
                    # an empty tuple will skip any result validation
                    fail_msg = next(
                        (exp for exp in expected if isinstance(exp, str)), None
                    )
                    expected_exception = next(
                        (
                            exp
                            for exp in expected
                            if isinstance(exp, type) and issubclass(exp, Exception)
                        ),
                        None,
                    )
                    if expected_exception is not None:
                        with self.assertRaises(
                            expected_exception=expected_exception, msg=fail_msg or msg
                        ):
                            if isinstance(result, Exception):
                                raise result
                    else:
                        expected_list = next(
                            (exp for exp in expected if isinstance(exp, list)), None
                        )
                        expected_dict = next(
                            (exp for exp in expected if isinstance(exp, dict)), None
                        )
                        if (expected_list, expected_dict) != (None, None):
                            self.assertParseResultsEquals(
                                result,
                                expected_list=expected_list,
                                expected_dict=expected_dict,
                                msg=fail_msg or msg,
                            )
                        else:
                            # warning here maybe?
                            print("no validation for {!r}".format(test_string))

            # do this last, in case some specific test results can be reported instead
            self.assertTrue(
                run_test_success, msg=msg if msg is not None else "failed runTests"
            )

        @contextmanager
        def assertRaisesParseException(self, exc_type=ParseException, msg=None):
            with self.assertRaises(exc_type, msg=msg):
                yield


if __name__ == "__main__":

    selectToken = CaselessLiteral("select")
    fromToken = CaselessLiteral("from")

    ident = Word(alphas, alphanums + "_$")

    columnName = delimitedList(ident, ".", combine=True).setParseAction(upcaseTokens)
    columnNameList = Group(delimitedList(columnName)).setName("columns")
    columnSpec = ('*' | columnNameList)

    tableName = delimitedList(ident, ".", combine=True).setParseAction(upcaseTokens)
    tableNameList = Group(delimitedList(tableName)).setName("tables")

    simpleSQL = selectToken("command") + columnSpec("columns") + fromToken + tableNameList("tables")

    # demo runTests method, including embedded comments in test string
    simpleSQL.runTests("""
        # '*' as column list and dotted table name
        select * from SYS.XYZZY

        # caseless match on "SELECT", and casts back to "select"
        SELECT * from XYZZY, ABC

        # list of column names, and mixed case SELECT keyword
        Select AA,BB,CC from Sys.dual

        # multiple tables
        Select A, B, C from Sys.dual, Table2

        # invalid SELECT keyword - should fail
        Xelect A, B, C from Sys.dual

        # incomplete command - should fail
        Select

        # invalid column name - should fail
        Select ^^^ frox Sys.dual

        """)

    pyparsing_common.number.runTests("""
        100
        -100
        +100
        3.14159
        6.02e23
        1e-12
        """)

    # any int or real number, returned as float
    pyparsing_common.fnumber.runTests("""
        100
        -100
        +100
        3.14159
        6.02e23
        1e-12
        """)

    pyparsing_common.hex_integer.runTests("""
        100
        FF
        """)

    import uuid
    pyparsing_common.uuid.setParseAction(tokenMap(uuid.UUID))
    pyparsing_common.uuid.runTests("""
        12345678-1234-5678-1234-567812345678
        """)