Cleaning Up in a Python Generator Can Be Dangerous
The code in this post is in Python 3, but aside from “cosmetic” differences, such as calling g.next() instead of next(g), it applies to Python 2 as well.
A few days ago a colleague called me over to take a look at some weird behavior she was seeing in a Python generator. It wasn’t long until the room was packed with Pythonistas trying to understand what the hell was going on. It took us a while, but we managed to come up with an SSCCE - a Short, Self Contained, Correct (Compilable), Example™, which I’ll show you in this post. The thing is, this isn’t some bug in our code or in the Python implementation - it seems to be a true consequence of defined Python behavior that we stumbled on.
I want to emphasize something before we dive in - in general, and in this post specifically, I urge you to correct me if I’m wrong.
Consider the following generator:
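A minimal generator consistent with the behavior discussed in this post could look like the sketch below. The name gen and the first yielded string are assumptions; the 'yay' and 'bye' values and the try/finally shape come from the discussion that follows.

```python
def gen():
    # A plain yield before entering the try block (assumed for illustration).
    yield 'so far so good'
    try:
        yield 'yay'
    finally:
        # Yielding during cleanup is the pattern this post is about.
        yield 'bye'
```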
We’ll exhaust it the old fashioned way and see that it behaves as you would expect:
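As a sketch, exhausting such a generator by hand might look like this. The generator is repeated so the snippet runs on its own, and its name and first yielded value are assumptions.

```python
def gen():
    yield 'so far so good'  # assumed first value
    try:
        yield 'yay'
    finally:
        yield 'bye'

g = gen()
first = next(g)    # 'so far so good'
second = next(g)   # 'yay'
third = next(g)    # 'bye', yielded from the finally clause

# One more next() finds nothing left and raises StopIteration.
try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```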
Now, let’s throw a wrench in there by throwing an exception into the generator:
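A sketch of that experiment, using an assumed generator of the same shape (the name gen and its first yielded value are illustrative):

```python
def gen():
    yield 'so far so good'  # assumed first value
    try:
        yield 'yay'
    finally:
        yield 'bye'

g = gen()
next(g)  # 'so far so good'
next(g)  # 'yay': the generator is now paused inside the try block

# throw() raises the exception at the paused yield. The finally clause runs
# and yields, so throw() returns that value instead of raising.
returned = g.throw(ValueError('too bad'))

# Resuming the generator lets the pending ValueError continue propagating.
try:
    next(g)
    raised = None
except ValueError as exc:
    raised = exc

print(returned, repr(raised))
```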
Right off the bat, this is interesting, because the exception wasn’t raised when we called g.throw(ValueError('too bad')), but on the following call to next. Overall, however, this seems logical. The error is raised at yield 'yay'. It is inside a try, but there are no except clauses, so we run the code in finally. Since that involves a yield, the execution of our function halts until we call next, which lets the exception continue propagating.
Now, this is where the weird behavior comes in. What happens if you close the generator instead of throwing an exception?
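A sketch of closing the generator while it is paused inside the try block (again with an assumed generator named gen):

```python
def gen():
    yield 'so far so good'  # assumed first value
    try:
        yield 'yay'
    finally:
        yield 'bye'

g = gen()
next(g)  # 'so far so good'
next(g)  # 'yay': paused inside the try block

# close() raises GeneratorExit at the paused yield. The finally clause then
# yields a value, which close() treats as an error.
try:
    g.close()
    error = None
except RuntimeError as exc:
    error = exc

print(repr(error))
```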
Let’s try to understand what happens here. Calling g.close() is roughly equivalent to g.throw(GeneratorExit()), so the code flow starts out the same: the error is raised at yield 'yay', the generator tries to run the finally clause, and we get to yield 'bye'. Here, a special behavior is triggered. According to PEP 342:
Add a close() method for generator-iterators, which raises GeneratorExit at the point where the generator was paused. If the generator then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close() returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit.
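The rules in that passage can be sketched in pure Python, modeled on the pseudo-code PEP 342 gives for close(). The function close_sketch and the two demo generators are hypothetical names used only for illustration; the real close() is implemented inside the interpreter.

```python
def close_sketch(generator):
    """Approximate generator.close() following the rules quoted from PEP 342."""
    try:
        generator.throw(GeneratorExit)
    except (GeneratorExit, StopIteration):
        return  # the generator exited, so closing succeeded
    # throw() returned normally, which means the generator yielded a value.
    raise RuntimeError('generator ignored GeneratorExit')
    # Any other exception from throw() simply propagates to the caller.

def tidy_gen():
    try:
        yield 'yay'
    finally:
        pass  # cleanup that does not yield

def noisy_gen():
    try:
        yield 'yay'
    finally:
        yield 'bye'  # cleanup that yields

tidy = tidy_gen()
next(tidy)
tidy_result = close_sketch(tidy)  # returns quietly

noisy = noisy_gen()
next(noisy)
try:
    close_sketch(noisy)
    noisy_error = None
except RuntimeError as exc:
    noisy_error = exc
```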
This means that if you yield in a finally clause (or in an except clause that catches GeneratorExit, like a bare except), closing that generator will result in a RuntimeError.
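The except variant can be sketched the same way. The generator below, with the hypothetical name swallowing_gen, uses a bare except that also catches GeneratorExit and then yields:

```python
def swallowing_gen():
    try:
        yield 'working'
    except:  # a bare except also catches GeneratorExit
        yield 'recovering'

g = swallowing_gen()
next(g)  # 'working': paused inside the try block

# close() raises GeneratorExit at the paused yield. The bare except catches
# it and yields, so close() raises a RuntimeError.
try:
    g.close()
    error = None
except RuntimeError as exc:
    error = exc

print(repr(error))
```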
This is problematic. Yielding in finally clauses is definitely a useful pattern. Suppose you are implementing a coroutine in which you yield futures that communicate with a remote server. You might want to make sure you close a session when you’re done. That will inevitably include yielding a future that sends a message to that server, and finally is the logical place to do it.
Now, you might say that this can be avoided by simply not closing generators if you need to use this finally pattern. Well, you’re out of luck:
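A sketch of what this looks like in CPython, where dropping the last reference finalizes the generator right away. By default the interpreter only prints an "Exception ignored in ..." message to stderr. To make the error observable, this example temporarily installs a sys.unraisablehook (available since Python 3.8). The hook, the captured list and the generator name are additions for illustration only.

```python
import gc
import sys

def gen():
    yield 'so far so good'  # assumed first value
    try:
        yield 'yay'
    finally:
        yield 'bye'

captured = []

def record_unraisable(unraisable):
    # Called by the interpreter for exceptions it cannot propagate,
    # such as errors raised while finalizing an object.
    captured.append(unraisable.exc_value)

previous_hook = sys.unraisablehook
sys.unraisablehook = record_unraisable
try:
    g = gen()
    next(g)
    next(g)       # paused inside the try block
    del g         # no explicit close(), just drop the last reference
    gc.collect()  # for implementations that do not finalize immediately
finally:
    sys.unraisablehook = previous_hook

print(captured)
```

Note that nobody called close() here; the interpreter did it on our behalf when the generator was collected.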
What the hell is going on? Well, PEP 342 again:
Add support to ensure that close() is called when a generator iterator is garbage-collected.
If there is any chance that a generator you write might be close()d or simply not exhausted, the corollary is that you must not yield in a finally clause. This is even more perplexing considering the last thing PEP 342 has to say:
Allow “yield” to be used in try/finally blocks, since garbage collection or an explicit close() call would now allow the finally clause to execute.
I understand that this is intended to be useful when fully exhausting the generator, but the (perhaps unforeseen?) consequence of all of these changes is that yields in finally clauses are quite dangerous. This sentence in the PEP is really confusing to me, unless the intention is to use yield only in the try block.
There is a workaround for avoiding the RuntimeError: you can catch the GeneratorExit exception, set a flag and check for it in the finally clause:
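A minimal sketch of that flag-based workaround, assuming a generator of the same shape as before (the names safe_gen and closing are illustrative):

```python
def safe_gen():
    yield 'so far so good'  # assumed first value
    closing = False
    try:
        yield 'yay'
    except GeneratorExit:
        # Remember that we are being closed, then let the exception continue
        # so that close() sees the generator exit.
        closing = True
        raise
    finally:
        if not closing:
            # Only yield during cleanup when we are not being closed.
            yield 'bye'

# Closing no longer raises a RuntimeError.
g = safe_gen()
next(g)
next(g)
close_result = g.close()

# Other exceptions still go through the yielding cleanup path.
h = safe_gen()
next(h)
next(h)
cleanup_value = h.throw(ValueError('too bad'))

# Normal exhaustion is unchanged.
all_values = list(safe_gen())
```

Re-raising GeneratorExit inside the except clause matters here: swallowing it and then reaching another yield would bring the RuntimeError back.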
So, this works. But it’s an ugly workaround that you have to use in every finally clause that attempts to yield.
The conclusion for me is this:
Logically, it seems that generators have two finally paths: a ‘standard’ one where normal operations are allowed, and a second one for ‘panic mode’, where you can do very little to clean up after yourself. The syntax for this, however, is awkward, involves boilerplate code and is easy to get wrong. It might be fixable with custom syntax such as:
Am I missing something? Is there a better workaround for this issue? How do you clean up in generators and coroutines? Please share your thoughts.
Thanks to Hannan Aharonov, Yonatan Nakar, Ram Rachum, Shachar Ohana and Shachar Nudler for reading drafts of this.