Cleaning Up in a Python Generator Can Be Dangerous

The code in this post is in Python 3, but aside from “cosmetic” differences, such as next(g) vs. g.next(), it applies to Python 2 as well.

A few days ago a colleague called me over to look at some weird behavior she was seeing in a Python generator. It wasn’t long until the room was packed with Pythonistas trying to understand what the hell was going on. It took us a while, but we managed to come up with an SSCCE - a Short, Self Contained, Correct (Compilable), Example™, which I’ll show you in this post. The thing is, this isn’t a bug in our code or in the Python implementation - it seems to be a genuine consequence of defined Python behavior that we stumbled on.

I want to emphasize something before we dive in - in general, and in this post specifically, I urge you to correct me if I’m wrong.

Consider the following generator:

def gen():
    yield 'so far so good'
    try:
        yield 'yay'
    finally:
        yield 'bye'

We’ll exhaust it the old-fashioned way and see that it behaves as you would expect:

>>> g = gen()
>>> next(g)
'so far so good'
>>> next(g)
'yay'
>>> next(g)
'bye'
>>> next(g)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    next(g)
StopIteration

Now, let’s throw a wrench in the works and raise an exception at yield 'yay':

>>> g = gen()
>>> next(g)
'so far so good'
>>> next(g)
'yay'
>>> g.throw(ValueError('too bad'))
'bye'
>>> next(g)
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    next(g)
  File "<pyshell#0>", line 4, in gen
    yield 'yay'
ValueError: too bad
>>> next(g)
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    next(g)
StopIteration

Right off the bat, this is interesting, because the exception didn’t propagate out of g.throw(ValueError('too bad')); it only surfaced on the following next. Overall, however, this seems logical. The error is raised at yield 'yay'. That yield is inside a try, but there are no except clauses, so the finally block runs. Since that block contains a yield, the generator suspends again and g.throw() returns 'bye'. The exception stays pending until we call next, which lets the finally block finish and the ValueError propagate.
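To make the ordering concrete, here is the same experiment as a small script rather than a shell session. It is only a sketch: the names from_throw and pending are mine, and gen is repeated so the block runs on its own.

```python
def gen():
    yield 'so far so good'
    try:
        yield 'yay'
    finally:
        yield 'bye'


g = gen()
next(g)  # 'so far so good'
next(g)  # 'yay'; the generator is now paused inside the try block

# throw() raises ValueError at the paused yield. The finally block starts
# running, reaches its own yield, and that value is what throw() returns.
from_throw = g.throw(ValueError('too bad'))

# The ValueError is still pending inside the generator. Resuming it lets
# the finally block finish, and only then does the exception reach us.
try:
    next(g)
except ValueError as exc:
    pending = exc

print(from_throw)  # bye
print(pending)     # too bad
```

In other words, the yield inside finally does not swallow the exception; it only postpones it until the generator is resumed.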


Now, this is where the weird behavior comes in. What happens if you close the generator instead of throwing an exception?

>>> g = gen()
>>> next(g)
'so far so good'
>>> next(g)
'yay'
>>> g.close()
Traceback (most recent call last):
  File "<pyshell#25>", line 1, in <module>
    g.close()
RuntimeError: generator ignored GeneratorExit

Let’s try to understand what happens here. Calling g.close() is roughly equivalent to g.throw(GeneratorExit()). So the code flow is the same: the error is raised at yield 'yay', the finally clause starts running, and we reach yield 'bye'. Here, a special behavior is triggered. According to PEP 342:

Add a close() method for generator-iterators, which raises GeneratorExit at the point where the generator was paused. If the generator then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close() returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit.

This means that if you yield in finally (or in an except clause that catches GeneratorExit, such as a bare except:), closing that generator will result in a RuntimeError.
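The rule quoted from the PEP can be written out in a few lines. The helper close_like below is a hypothetical name of mine that mirrors the described behavior; it is a sketch, not CPython’s actual implementation. The rest of the block checks the except variant: a bare except: catches GeneratorExit because GeneratorExit derives from BaseException, while except Exception: does not. The names swallow_bare, swallow_exception and try_close are also made up for this illustration.

```python
def close_like(generator):
    """Rough emulation of generator.close() as PEP 342 describes it."""
    try:
        generator.throw(GeneratorExit())
    except (GeneratorExit, StopIteration):
        return  # the generator exited, which is the well-behaved case
    # throw() returned normally, so the generator yielded another value.
    raise RuntimeError('generator ignored GeneratorExit')


def gen():
    yield 'so far so good'
    try:
        yield 'yay'
    finally:
        yield 'bye'


def swallow_bare():
    try:
        yield 'working'
    except:  # a bare except also catches GeneratorExit
        yield 'cleaning up'


def swallow_exception():
    try:
        yield 'working'
    except Exception:  # GeneratorExit is not a subclass of Exception
        yield 'cleaning up'


def try_close(generator, steps, closer):
    """Advance the generator `steps` times, then report how closing went."""
    for _ in range(steps):
        next(generator)
    try:
        closer(generator)
    except RuntimeError as exc:
        return str(exc)
    return 'closed cleanly'


emulated = try_close(gen(), 2, close_like)
bare = try_close(swallow_bare(), 1, lambda g: g.close())
narrow = try_close(swallow_exception(), 1, lambda g: g.close())

print(emulated)  # generator ignored GeneratorExit
print(bare)      # generator ignored GeneratorExit
print(narrow)    # closed cleanly
```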

This is problematic. Yielding in finally clauses is definitely a useful pattern. Suppose you are implementing a coroutine that yields futures which communicate with a remote server. You might want to make sure you close the session when you’re done. That will inevitably involve yielding a future that sends a message to that server. Yielding in finally is the logical way to go.

Now, you might say that this can be avoided by simply not closing generators if you need to use this finally pattern. Well, you’re out of luck:

>>> g = gen()
>>> next(g)
'so far so good'
>>> next(g)
'yay'
>>> g = 'something else'
Exception ignored in: <generator object gen at 0x0368B720>
RuntimeError: generator ignored GeneratorExit

What the hell is going on? Well, PEP 342 again:

Add support to ensure that close() is called when a generator iterator is garbage-collected.
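The “Exception ignored in” message is printed rather than raised, so it cannot be caught with try/except. One way to observe it from a script is sys.unraisablehook, which exists since Python 3.8. The sketch below assumes CPython, where dropping the last reference finalizes the generator immediately; the gc.collect() call is there for implementations that finalize later. The names captured and record_unraisable are mine.

```python
import gc
import sys


def gen():
    yield 'so far so good'
    try:
        yield 'yay'
    finally:
        yield 'bye'


captured = []


def record_unraisable(unraisable):
    # Keep only the exception type name, not the hook argument itself.
    captured.append(unraisable.exc_type.__name__)


previous_hook = sys.unraisablehook
sys.unraisablehook = record_unraisable
try:
    g = gen()
    next(g)
    next(g)       # paused inside the try block
    del g         # finalization implicitly calls close() on the generator
    gc.collect()  # assumption: needed only where finalization is deferred
finally:
    sys.unraisablehook = previous_hook

print(captured)
```

On CPython, captured should contain 'RuntimeError': the same failure as the explicit close() call, just reported through a different channel.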

If there is any chance that a generator you write might be close()ed or simply not be exhausted, the corollary here is that you must not yield in finally. This is even more perplexing considering the last thing PEP 342 has to say:

Allow “yield” to be used in try/finally blocks, since garbage collection or an explicit close() call would now allow the finally clause to execute.

I understand that this is intended to be useful when fully exhausting the generator, but the (perhaps unforeseen?) consequence of all of these changes is that yields in finally are quite dangerous. This sentence in the PEP is really confusing to me, unless the intention is to use yield only in the try clause.

There is a workaround for avoiding the RuntimeError: you can catch the GeneratorExit exception, set a flag, re-raise, and check the flag in finally:

def safegen():
    yield 'so far so good'
    closed = False
    try:
        yield 'yay'
    except GeneratorExit:
        closed = True
        raise
    finally:
        if not closed:
            yield 'boo'

So, this works. But it’s an ugly workaround that you have to use in every finally clause that attempts to yield.
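As a quick check of that claim, the script below runs safegen through the three paths discussed in this post: exhausting it, throwing into it, and closing it. It is a sketch for illustration; the result variable names are mine.

```python
def safegen():
    yield 'so far so good'
    closed = False
    try:
        yield 'yay'
    except GeneratorExit:
        closed = True
        raise
    finally:
        if not closed:
            yield 'boo'


# Path 1: normal exhaustion still runs the yielding cleanup.
exhausted = list(safegen())

# Path 2: an ordinary exception still gets the yielding cleanup, and the
# exception resurfaces once the generator is resumed.
g = safegen()
next(g)
next(g)
from_throw = g.throw(ValueError('too bad'))
try:
    next(g)
except ValueError as exc:
    resurfaced = str(exc)

# Path 3: close() no longer raises, because the finally block skips
# its yield when the flag is set.
g = safegen()
next(g)
next(g)
close_result = g.close()

print(exhausted)     # ['so far so good', 'yay', 'boo']
print(from_throw)    # boo
print(resurfaced)    # too bad
print(close_result)  # None
```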

The conclusion for me is this:

Logically, it seems that generators have two finally paths: a ‘standard’ one where normal operations are allowed, and a second one for ‘panic mode’, where you can do very little to clean up after yourself. The syntax for this, however, is awkward, involves boilerplate code and is easy to get wrong. It might be fixable with custom syntax such as:

def safegen():
    yield 'so far so good'
    try:
        yield 'yay'
    finally except GeneratorExit:
        yield 'boo'
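Until something like that exists, part of the boilerplate can be moved into a small context manager. This is only one possible sketch, and it still needs an explicit check in finally; CloseGuard and guarded_gen are hypothetical names, not part of any library.

```python
class CloseGuard:
    """Records whether the guarded block was left because of GeneratorExit."""

    def __enter__(self):
        self.closed = False
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None and issubclass(exc_type, GeneratorExit):
            self.closed = True
        return False  # never suppress the exception


def guarded_gen():
    yield 'so far so good'
    guard = CloseGuard()
    try:
        with guard:
            yield 'yay'
    finally:
        if not guard.closed:
            yield 'boo'


# Normal exhaustion still runs the yielding cleanup.
exhausted = list(guarded_gen())

# Closing while paused at 'yay' skips the yield and raises nothing.
g = guarded_gen()
next(g)
next(g)
g.close()
closed_ok = True

print(exhausted)  # ['so far so good', 'yay', 'boo']
```

The guard replaces the except GeneratorExit clause and the manual flag, but the if in finally is still easy to forget, so this reduces the boilerplate rather than removing it.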

Am I missing something? Is there a better workaround for this issue? How do you clean up in generators and coroutines? Please share your thoughts.

Thanks to Hannan Aharonov, Yonatan Nakar, Ram Rachum, Shachar Ohana and Shachar Nudler for reading drafts of this.
