Python: Common Newbie Mistakes, Part 2
I would still love to hear your feedback in the comments below. Enjoy!
Scoping
The focus of this part is an area of problems where scoping in Python is misunderstood. Usually, when we have global variables (okay, I’ll say it because I have to - global variables are bad), Python understands it if we access them within a function:
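A minimal sketch of what this looks like, written in Python 3 syntax (where print is a function). The names foo and bar follow the text, and the value 42 is the one used later in the article:

```python
bar = 42  # module-level (global) variable

def foo():
    # There is no assignment to bar inside this function, so the
    # name is looked up in the global scope when foo is called.
    print(bar)

foo()  # prints 42
```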
Here we’re using, inside foo, a global variable called bar, and it works as expected.
This is pretty cool. Usually we’ll use this feature for constants that we want to use throughout the code. It also works if we use some function on a global like so:
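For instance, calling a method on a global object needs no extra declaration. The list and the append call below are illustrative assumptions, not necessarily the example the author had in mind:

```python
bar = [42]  # a mutable global

def foo():
    # This calls a method on the object bar refers to. It does not
    # rebind the name bar, so Python still treats bar as a global.
    bar.append(0)

foo()
print(bar)  # prints [42, 0]
```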
But what if we want to change bar?
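A sketch of the attempt, assuming the same global bar = 42 and a foo that simply assigns to the name:

```python
bar = 42

def foo():
    bar = 0  # looks like it resets the global

foo()
print(bar)  # prints 42
```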
We can see that foo ran fine and without exceptions, but if we print the value of bar we’ll see that it’s still 42! What happens here is that the line bar = 0, instead of changing bar, created a new, local variable also called bar and set its value to 0. This is a tough bug to find and it causes some grief to newbies (and veterans!) who aren’t really sure of how Python’s scoping works. To understand when and how Python decides to treat variables as global or local, let’s look at a less common, but probably more baffling version of this mistake and add an assignment to bar after we print it:
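The modified function might look like this (a sketch in Python 3 syntax; nothing is called yet):

```python
bar = 42

def foo():
    print(bar)  # the article's Python 2 code would read: print bar
    bar = 0     # the newly added assignment, after the print
```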
This shouldn’t break our code, right? We added an assignment after the print, so there’s no way it should affect it (Python is an interpreted language after all), right? Right??
WRONG.
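Calling the function raises UnboundLocalError. The sketch below repeats the definition so that it runs on its own, and catches the exception so the script keeps going. The exact wording of the message varies between Python versions, so it is not reproduced here:

```python
bar = 42

def foo():
    print(bar)
    bar = 0

caught = None
try:
    foo()
except UnboundLocalError as exc:
    caught = exc
    print(type(exc).__name__ + ":", exc)
```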
How is this possible? Well, there are two parts to this misunderstanding. The first misconception is that Python, being an interpreted language (which is awesome, I think we can all agree), is executed line-by-line. In truth, Python is executed statement-by-statement. To get a feel for what I mean, go to your favorite shell (you aren’t using the default one, I hope) and type the opening line of a function definition, for example: def foo():
Press Enter. As you can see, the shell didn’t offer any output and it’s clearly waiting for you to continue with your function definition. It will continue to do so until you finish declaring your function. This is because a function declaration is a statement. Well, it’s a compound statement that includes within it many other statements, but a statement nonetheless. The content of your function isn’t executed until you actually call it. What is executed is the creation of a function object.
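One way to see this for yourself: a function whose body would fail if it ran can still be defined without trouble. In this sketch, the function name broken and the undefined name it refers to are made up for illustration:

```python
def broken():
    # The def statement does not execute this body.
    return undefined_name  # would raise NameError if it ran

# The def statement itself succeeded and bound a function object.
print(callable(broken))  # prints True

try:
    broken()  # only now does the body run
except NameError as exc:
    print("NameError on call:", exc)
```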
This leads us to the second point. Again, Python’s dynamic and interpreted nature leads us to believe that when the line print bar is executed, Python will look for a variable bar first in the local scope and then in the global scope. What really happens here is that the local scope is in fact not completely dynamic. When the def statement is compiled, before any of the function body runs, Python statically gathers information regarding the local scope of the function. When it reaches the line bar = 0 (not when it executes it, but when it reads the function definition), it adds “bar” to the list of local variables for foo. When foo is executed and Python tries to execute the line print bar, it looks for the variable in the local scope and finds it there, since it was registered statically, but it knows that it wasn’t assigned yet - it has no value. So the exception is raised.
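The result of this static analysis can be inspected without ever calling the function. This sketch relies on the co_varnames attribute of code objects, which the article itself does not mention, and adds a second function, baz, purely for comparison:

```python
bar = 42

def foo():
    print(bar)
    bar = 0

def baz():
    print(bar)  # no assignment to bar anywhere in baz

# co_varnames lists the names the compiler decided are local.
print(foo.__code__.co_varnames)  # prints ('bar',)
print(baz.__code__.co_varnames)  # prints ()
```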
You could ask “why couldn’t an exception be raised when we were declaring the function? Python could have known in advance that bar was referenced before assignment”. The answer to that is that Python can’t know whether the local bar was assigned to or not. Look at the following:
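A function along these lines makes the point. This is a sketch in Python 3 syntax, and the parameter assign_first is a made-up name:

```python
bar = 42

def foo(assign_first):
    if assign_first:
        bar = 0   # only runs on some calls
    print(bar)    # bar is local either way

foo(True)  # prints 0

try:
    foo(False)  # the local bar was never assigned on this path
except UnboundLocalError:
    print("bar was referenced before assignment")
```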
Python is playing a delicate game between static and dynamic. The only thing it knows for sure is that bar is assigned to, but it doesn’t know it’s referenced before assignment until it actually happens. Wait - in fact, it doesn’t even know it was assigned to!
When running foo, we get:
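A sketch of such a function, with an assignment guarded by a condition that is never true, and what happens when it is called (Python 3 syntax; the exception is caught so the snippet runs to completion):

```python
bar = 42

def foo():
    print(bar)
    if False:
        bar = 0  # never executed, but still makes bar local

caught = None
try:
    foo()
except UnboundLocalError as exc:
    caught = exc
    print(type(exc).__name__ + ":", exc)
```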
While we, intelligent beings that we are, can clearly see that the assignment to bar will never happen, Python ignores that fact and still declares bar as statically local.
I’ve been babbling about the problem long enough. We want solutions, baby! I’ll give you two.
The first one is using the global keyword. It’s pretty self-explanatory. It lets Python know that bar is a global variable and not local.
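Applied to the failing function from earlier, a sketch in Python 3 syntax could read:

```python
bar = 42

def foo():
    global bar   # bar now refers to the module-level variable
    print(bar)   # prints 42
    bar = 0      # rebinds the global

foo()
print(bar)  # prints 0
```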
The second, preferred solution is - don’t. In the sense of - don’t use a global that isn’t constant. In my day-to-day I work on a lot of Python code and there isn’t one use of the global keyword. It’s nice to know about it, but in the end it’s avoidable. If you want to keep a value that is used throughout your code, define it as a class attribute of a new class. That way the global keyword is redundant since you need to qualify your variable access with its class name:
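A minimal sketch of the idea, in Python 3 syntax. The class name Settings is a placeholder chosen for this example:

```python
class Settings:
    bar = 42  # shared value stored as a class attribute

def foo():
    print(Settings.bar)  # prints 42
    Settings.bar = 0     # attribute assignment; no global keyword needed

foo()
print(Settings.bar)  # prints 0
```

Because foo only reads the name Settings and assigns to an attribute on it, no name inside foo is rebound, so nothing becomes local by accident.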
Read the rest of the series:
- Part 1: Using a Mutable Value as a Default Value
- Part 2: Scoping (YOU ARE HERE!)