I've been working on Psil, the dialect of Lisp I'm implementing in Python. There is no better test for a programming language than to write actual code with it, so I've simultaneously been working through the Project Euler problem set, implementing a solution for each one in Psil. This has been enlightening because like Scheme, Psil currently has no explicit iteration constructs and relies on tail recursion to get anything useful done.
Until recently, the Psil interpreter did not do anything special with tail calls. The eval function looked something like this (greatly simplified):
def eval(expr): if isinstance(expr, list): f = expr func = self.eval(f) return apply(func, [self.eval(x) for x in expr[1:]]) elif isinstance(expr, Symbol): return self.lookup(expr.name) else: return expr
In the presence of tail recursion, obviously this will fill up the Python stack just as fast as the code can recurse.
I was thinking about different techniques to implement proper tail recursion without completely rewriting my interpreter. I thought about using continuation-passing style with a trampoline function, but the performance implications of that method are significant, and would require an almost complete rewrite. I was thinking about the fundamental idea of "unwind the stack before calling the next function", and it occurred to me that I could encapsulate the information needed to make the next call into a Python exception and raise it. The caller would then catch the exception and call the encapsulated function application inside the exception.
Implementing this idea, the eval function becomes something like (new code highlighted):
def eval(expr, tail=False): if isinstance(expr, list): f = expr func = self.eval(f)if tail: raise TailCall(func, [self.eval(x) for x in expr[1:]]) else: try: return apply(func, [self.eval(x) for x in expr[1:]]) except TailCall, t: a = t while True: try: return a.apply() except TailCall, t: a = telif isinstance(expr, Symbol): return self.lookup(expr.name) else: return expr
The idea here is that if eval is called with tail=True, meaning it is in a tail call position, then instead of applying the function right away the function and its arguments are stuffed inside a TailCall exception which will be executed later, after the current stack frame is unwound.
If eval is called with tail=False, then it must be prepared to catch any TailCall exceptions that come from deeper levels. If a TailCall exception does appear, then eval must catch it and keep calling apply() on the encapsulated function application stored inside the TailCall exception.
The final necessary part is the lambda loop that executes each form in a lambda in turn, it just passes tail=False for all but the last form in a lambda, where it passes tail=True.
After I had thought about this and implemented it, it occurred to me that I probably wasn't the first one. Sure enough, here is a Tail Call Optimization Decorator for Python code. It uses some tricky stack introspection and only handles a single self-tail recursive function, but commenters on that thread have posted improved versions that support mutual recursion. I also just found that Scala appears to implement tail recursion on the JVM using a similar technique.
There is a slight execution cost in my implementation because every function call raises a TailCall exception as its final act. In most cases, the raised TailCall exception will be immediately evaluated by its caller. Fortunately, exceptions are reasonably fast in Python (relative to normal execution speed).
I also had to make some adjustments to handle the cases where Psil code is called from Python code which is called from Psil code. For example, in
(map (lambda (x) (+ x 1)) a)
the "map" function is really the Python built-in map function. So the evaluation of this has Python calling the
(lambda (x) (+ x 1)) directly, which would normally raise a TailCall exception when it's done. I had to make sure Psil functions called in this way did not raise the TailCall exception at the end, or it would unwind all the way back through the map call. In addition to working correctly, this change improved performance a bit because in this case, no TailCall exception is raised at all from the inner lambda.
The other thing that comes out of doing Project Euler problems is I can definitely tell where I need to apply some optimisation. Certain things my interpreter does are really quite expensive in execution time. I plan to continually improve the interpreter as I go.