Cleaning Up in a Python Generator Can Be Dangerous
The code in this post is in Python 3, but aside from “cosmetic” differences, such as next(g) vs g.next(), it applies to Python 2 as well.
A few days ago a coworker called me over to look at some weird behavior she was seeing with a Python generator. It wasn’t long until the room was packed with Pythonistas trying to understand what the hell was going on. It took us a while, but we managed to come up with an SSCCE - a Short, Self Contained, Correct (Compilable), Example™ - which I’ll show you in this post. The thing is, this isn’t a bug in our code or in the Python implementation - it seems to be a genuine consequence of defined Python behavior that we stumbled on.
I want to emphasize something before we dive in - in general, and in this post specifically, I urge you to correct me if I’m wrong.
Consider the following generator:
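A minimal sketch of such a generator, reconstructed from the behavior discussed in the rest of this post. The name gen and the first yielded string are assumptions; 'yay' and 'bye' are the values the text refers to.

```python
def gen():
    # First value is illustrative only (an assumption, not from the text).
    yield 'so far so good'
    try:
        yield 'yay'
    finally:
        # Cleanup step that itself yields - the pattern this post is about.
        yield 'bye'
```

The only unusual thing here is the yield inside the finally clause.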
We’ll exhaust it the old fashioned way and see that it behaves as you would expect:
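A sketch of what exhausting it by hand might look like, assuming a generator named gen (a hypothetical name) with a try around yield 'yay' and a finally that yields 'bye':

```python
def gen():
    yield 'so far so good'  # illustrative first value (assumption)
    try:
        yield 'yay'
    finally:
        yield 'bye'


g = gen()
values = [next(g), next(g), next(g)]
print(values)  # ['so far so good', 'yay', 'bye']

# A fourth next() finds nothing left to run and raises StopIteration.
try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True
```

On the normal path the finally clause simply runs after the try block completes, so 'bye' is just the third value.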
Now, let’s throw a wrench in there and throw an exception into the generator while it is paused at yield 'yay':
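A sketch of that experiment, again assuming a hypothetical generator named gen; the ValueError('too bad') is the exception used in the discussion that follows:

```python
def gen():
    yield 'so far so good'  # illustrative first value (assumption)
    try:
        yield 'yay'
    finally:
        yield 'bye'


g = gen()
next(g)  # 'so far so good'
next(g)  # 'yay' - the generator is now paused inside the try block

# throw() does not raise here: the finally clause runs and yields 'bye',
# which becomes the return value of throw().
thrown_result = g.throw(ValueError('too bad'))
print(thrown_result)  # bye

# Resuming lets the finally clause finish; only then does the pending
# ValueError propagate out of the generator.
try:
    next(g)
    raised = None
except ValueError as exc:
    raised = exc
print(type(raised).__name__, raised)  # ValueError too bad
```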
Right off the bat, this is interesting, because the exception wasn’t raised when we called g.throw(ValueError('too bad')), but on the following call to next. Overall, however, this seems logical. The error is raised at yield 'yay'. It is inside a try, but there are no except clauses, so the code in finally runs. Since that involves a yield, execution of our function halts until we call next, which allows the exception to continue propagating.
Now, this is where the weird behavior comes in. What happens if you close the generator instead of throwing an exception?
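A sketch of the close() case, assuming the same hypothetical gen. The RuntimeError is caught here only so that the script keeps running and the message can be inspected:

```python
def gen():
    yield 'so far so good'  # illustrative first value (assumption)
    try:
        yield 'yay'
    finally:
        yield 'bye'


g = gen()
next(g)  # 'so far so good'
next(g)  # 'yay' - paused inside the try block

try:
    g.close()
    error = None
except RuntimeError as exc:
    error = exc
print(error)  # generator ignored GeneratorExit
```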
Let’s try to understand what happens here. Calling g.close() is roughly equivalent to g.throw(GeneratorExit()). So the code flow is the same - the error is raised at yield 'yay'. The generator tries to run the finally clause, and we get to yield 'bye'. Here, a special behavior is triggered. According to PEP 342:
Add a close() method for generator-iterators, which raises GeneratorExit at the point where the generator was paused. If the generator then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close() returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit.
This means that if you yield in finally (or in an except clause that catches GeneratorExit, like a bare except:), closing that generator will result in RuntimeError.
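The bare except: variant can be sketched the same way. The generator name swallowing_gen and the yielded strings are hypothetical, chosen only for illustration:

```python
def swallowing_gen():
    try:
        yield 'working'
    except:  # a bare except also catches GeneratorExit
        yield 'cleaning up'


g = swallowing_gen()
next(g)  # 'working' - paused inside the try block

try:
    g.close()
    error = None
except RuntimeError as exc:
    error = exc
print(error)  # generator ignored GeneratorExit
```

No finally is involved here; yielding while GeneratorExit is being handled is enough to trigger the error.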
This is problematic. Yielding in finally clauses is definitely a useful pattern. Suppose you are implementing a coroutine that yields futures which communicate with a remote server. You might want to make sure you close the session when you’re done. That will inevitably involve yielding a future that sends a message to that server. Yielding in finally is the logical way to go.
Now, you might say that this can be avoided by simply not closing generators if you need to use this finally pattern. Well, you’re out of luck:
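A sketch of how this can bite without any explicit close(), assuming the same hypothetical gen. Normally the interpreter just prints an "Exception ignored in: ..." message to stderr; the sys.unraisablehook override (available from Python 3.8) is my addition, there only to make the error observable from a script.

```python
import gc
import sys


def gen():
    yield 'so far so good'  # illustrative first value (assumption)
    try:
        yield 'yay'
    finally:
        yield 'bye'


captured = []
original_hook = sys.unraisablehook
# Record only the exception type; the default hook would print to stderr.
sys.unraisablehook = lambda unraisable: captured.append(unraisable.exc_type)

g = gen()
next(g)  # 'so far so good'
next(g)  # 'yay' - paused inside the try block
del g    # dropping the last reference finalizes the generator
gc.collect()  # for implementations without immediate reference counting

sys.unraisablehook = original_hook
print(captured)  # [<class 'RuntimeError'>]
```

Nobody called close() here; simply dropping the last reference to an unfinished generator was enough.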
What the hell is going on? Well, PEP 342 again:
Add support to ensure that close() is called when a generator iterator is garbage-collected.
If there is any chance that a generator you write might be closed or simply not be exhausted, the corollary here is that you must not yield in finally. This is even more perplexing considering the last thing PEP 342 has to say:
Allow “yield” to be used in try/finally blocks, since garbage collection or an explicit close() call would now allow the finally clause to execute.
I understand that this is intended to be useful when fully exhausting the generator; but the (perhaps unforeseen?) consequence of all of these changes is that yields in finally are quite dangerous. This sentence in the PEP is really confusing to me, unless the intention is to use yield only in the try clause.
There is a workaround for avoiding the RuntimeError: you can catch the GeneratorExit exception, set a flag and check for it in finally:
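One way the flag-based workaround might look. The names safegen and closed are hypothetical, and re-raising GeneratorExit after setting the flag is my choice rather than a requirement:

```python
def safegen():
    yield 'so far so good'  # illustrative first value (assumption)
    closed = False
    try:
        yield 'yay'
    except GeneratorExit:
        # close() (or garbage collection) is in progress: remember it,
        # then let the exception continue on its way.
        closed = True
        raise
    finally:
        if not closed:
            # Only yield during cleanup when we are not being closed.
            yield 'bye'


g = safegen()
next(g)  # 'so far so good'
next(g)  # 'yay'
close_result = g.close()  # no RuntimeError this time
print(close_result)  # None
```

The price is that the cleanup yield is skipped entirely when the generator is closed, so whatever it was meant to do does not happen on that path.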
So, this works. But it’s an ugly workaround that you have to use in every finally clause that attempts to yield.
The conclusion for me is this:
Logically, it seems that generators have two finally paths: a ‘standard’ one where normal operations are allowed, and a second one for ‘panic mode’, where you can do very little to clean up after yourself. The syntax for this, however, is awkward, involves boilerplate code and is easy to get wrong. It might be fixable with custom syntax such as:
Am I missing something? Is there a better workaround for this issue? How do you clean up in generators and coroutines? Please share your thoughts.
Thanks to Hannan Aharonov, Yonatan Nakar, Ram Rachum, Shachar Ohana and Shachar Nudler for reading drafts of this.