Program structure

The compiler should be split into several distinct phases:

  • Lexing (done by PLY)
  • Parsing (done by PLY)
  • AST construction ( and ast/*)
    This should store tokens in a hierarchical structure and nothing else, e.g.
    def __init__(self, param1, param2):
        self.param1 = param1
        self.param2 = param2

And that's it. I know, it's boring :)

  • Pre-processing (ast/*)
    This happens completely on the existing AST. Things to do here:
      • Unpack parameter lists (don't hard-code these in the parser, even for string literals etc. We want "foo" + "bar" to actually work everywhere)
      • Reduce expressions
      • Register identifiers

The whole handling of identifiers could use an overhaul. All identifiers should be in one namespace, with the possible exception of action0props/action2vars.
This paves the way for assigning identifiers to more things, such as parameters (e.g. param foo = 1 + 12).
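
To make the pre-processing idea concrete, here is a minimal sketch of expression reduction against a single shared namespace. Every class and method name in it (Const, Identifier, BinOp, reduce) is made up for illustration and is not taken from ast/*; the namespace is assumed to be a plain dict.

```python
import operator


class Const:
    """Leaf node holding an already-known value (int or str)."""

    def __init__(self, value):
        self.value = value

    def reduce(self, namespace):
        # A constant is already fully reduced.
        return self


class Identifier:
    """Leaf node naming something registered in the shared namespace."""

    def __init__(self, name):
        self.name = name

    def reduce(self, namespace):
        # Assumption: unknown names are left in place instead of raising.
        if self.name in namespace:
            return namespace[self.name].reduce(namespace)
        return self


class BinOp:
    """Binary expression; the operator table is a hypothetical subset."""

    OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right

    def reduce(self, namespace):
        left = self.left.reduce(namespace)
        right = self.right.reduce(namespace)
        # Fold only when both sides are known; otherwise keep the node.
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(self.OPS[self.op](left.value, right.value))
        return BinOp(self.op, left, right)


# One namespace for all identifiers (assumed to be a plain dict here).
namespace = {}

# Equivalent of: param foo = 1 + 12
namespace["foo"] = BinOp("+", Const(1), Const(12)).reduce(namespace)

# String literals go through the same path as numbers.
greeting = BinOp("+", Const("foo"), Const("bar")).reduce(namespace)
```

Because string literals are ordinary Const nodes here, "foo" + "bar" is reduced by the same code as 1 + 12, with no special case in the parser.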

  • Processing (actions/*)
    Here, the actions are generated (get_action_list()). The AST is basically replaced with an action list representing the target actions as defined in actions/...
  • Post-processing (actions/*)
    Anything left to do (such as mapping varact2ids) is done here. This is fully equivalent to the current prepare_output
  • Output (actions/* and output_xxx)
    Generate the actual bytes. This should be the equivalent of a const function in C++, i.e. no changes to the object itself.
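
The last three phases could be sketched roughly as follows. Only get_action_list and prepare_output are names from this note; ExampleStatement, ExampleAction, get_bytes, compile_ast, the id map and the two-byte layout are all hypothetical stand-ins chosen to keep the example small.

```python
class ExampleAction:
    """Hypothetical target action; not one of the real classes in actions/*."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.action_id = None  # filled in during post-processing

    def prepare_output(self, id_map):
        # Post-processing: the last point at which the object may change.
        self.action_id = id_map[self.name]

    def get_bytes(self):
        # Output: only reads state, so it can be called any number of times.
        # The two-byte layout is invented for this sketch.
        return bytes([self.action_id, self.value & 0xFF])


class ExampleStatement:
    """Hypothetical AST node that knows how to turn itself into actions."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def get_action_list(self):
        # Processing: the AST node is replaced by the actions it produces.
        return [ExampleAction(self.name, self.value)]


def compile_ast(ast):
    # Processing
    actions = []
    for node in ast:
        actions.extend(node.get_action_list())

    # Post-processing: assign ids in order of first appearance (a stand-in
    # for work such as mapping ids).
    id_map = {}
    for action in actions:
        id_map.setdefault(action.name, len(id_map))
    for action in actions:
        action.prepare_output(id_map)

    # Output
    return b"".join(action.get_bytes() for action in actions)
```

Keeping all mutation in prepare_output means the output step can be repeated, or pointed at a different output_xxx backend, without the result changing.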