Python: Common Newbie Mistakes, Part 2

This post is a few years old now, so some details (or my opinions) might be out of date.
I would still love to hear your feedback in the comments below. Enjoy!

Scoping

This part focuses on a family of bugs that come from misunderstanding how scoping works in Python. Usually, when we have global variables (okay, I’ll say it because I have to - global variables are bad), Python understands it if we access them within a function:

bar = 42
def foo():
    print bar

Here we’re using, inside foo, a global variable called bar and it works as expected:

>>> foo()
42

This is pretty cool. Usually we’ll use this feature for constants that we want to use throughout the code. It also works if we call a method on a global, since that only modifies the object bar refers to and never rebinds the name:

bar = [42]
def foo():
    bar.append(0)

>>> foo()
>>> print bar
[42, 0]

But what if we want to change bar?

>>> bar = 42
... def foo():
...     bar = 0
... foo()
... print bar
42

We can see that foo ran fine and without exceptions, but if we print the value of bar we’ll see that it’s still 42! What happens here is that the line bar = 0, instead of changing the global bar, creates a new local variable also called bar and sets its value to 0. This is a tough bug to find and it causes some grief to newbies (and veterans!) who aren’t really sure how Python’s scoping works. To understand when and how Python decides to treat variables as global or local, let’s look at a less common, but probably more baffling, version of this mistake and add an assignment to bar after we print it:

bar = 42
def foo():
    print bar
    bar = 0

This shouldn’t break our code, right? We added an assignment after the print, so there’s no way it should affect it (Python is an interpreted language after all), right? Right??

>>> foo()
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    foo()
  File "<pyshell#3>", line 3, in foo
    print bar
UnboundLocalError: local variable 'bar' referenced before assignment

WRONG.

How is this possible? Well, there are two parts to this misunderstanding. The first misconception is that Python, being an interpreted language (which is awesome, I think we can all agree), is executed line-by-line. In truth, Python is executed statement-by-statement. To get a feel for what I mean, go to your favorite shell (you aren’t using the default one, I hope) and type the following:

def foo():

Press Enter. As you can see, the shell didn’t offer any output and it’s clearly waiting for you to continue with your function definition. It will continue to do so until you finish declaring your function. This is because a function declaration is a statement. Well, it’s a compound statement that includes within it many other statements, but a statement nonetheless. The content of your function isn’t executed until you actually call it. What is executed is the creation of a function object.
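To see this for yourself, here is a minimal sketch. It is written in Python 3 syntax (print as a function), unlike the Python 2 listings in this post, and undefined_name is a hypothetical name that I deliberately never define. The def statement runs without complaint, and the broken body only blows up when we call the function:

```python
def foo():
    # This body would fail if it ran, but executing the def statement
    # only creates a function object; it does not run the body.
    return undefined_name  # hypothetical name, deliberately never defined

# The def statement has already executed, so foo exists and is callable.
print(callable(foo))

try:
    foo()  # only now is the body executed
except NameError as exc:
    print("raised only when called:", type(exc).__name__)
```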

This leads us to the second point. Again, Python’s dynamic and interpreted nature leads us to believe that when the line print bar is executed, Python will look for a variable bar first in the local scope and then in the global scope. What really happens is that the local scope is not completely dynamic. When the function definition is compiled, Python statically gathers information about the function’s local scope. When it reaches the line bar = 0 (not when it executes it, but when it reads the function definition), it adds “bar” to the list of local variables for foo. When foo is executed and Python tries to execute the line print bar, it looks for the variable only in the local scope, because it was statically marked as local, and finds that it hasn’t been assigned yet - it has no value. So the exception is raised.
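You can peek at that list of local variables yourself. The sketch below (Python 3 syntax, and relying on the CPython-specific __code__.co_varnames attribute) shows that bar is already registered as a local of foo before foo has ever been called:

```python
bar = 42

def foo():
    print(bar)
    bar = 0

# Nothing has been called yet, but the compiled function already
# lists bar among its local variable names.
print(foo.__code__.co_varnames)

try:
    foo()
except UnboundLocalError as exc:
    # The exact message text differs between Python versions.
    print("UnboundLocalError:", exc)

# The global bar was never touched.
print(bar)
```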

You could ask “why couldn’t an exception be raised when we were declaring the function? Python could have known in advance that bar was referenced before assignment”. The answer to that is that Python can’t know whether the local bar was assigned to or not. Look at the following:

bar = 42
def foo(baz):
    if baz > 0:
        print bar
    bar = 0
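Here is a small sketch of how that function behaves at run time, again in Python 3 syntax. Whether the exception shows up depends entirely on the argument, which is why it can’t be reported when the function is defined:

```python
bar = 42

def foo(baz):
    if baz > 0:
        print(bar)  # reads the local bar, which has no value yet
    bar = 0         # this assignment is what makes bar local to foo

# The read is skipped, so the local bar is simply assigned. No error.
foo(-1)

try:
    foo(1)  # the read happens before the assignment
except UnboundLocalError:
    print("foo(1) read the local bar before it was assigned")
```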

Python is playing a delicate game between static and dynamic. The only thing it knows for sure is that bar is assigned to, but it doesn’t know it’s referenced before assignment until it actually happens. Wait - in fact, it doesn’t even know it was assigned to!

bar = 42
def foo():
    print bar
    if False:
        bar = 0

When running foo, we get:

Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    foo()
  File "<pyshell#16>", line 3, in foo
    print bar
UnboundLocalError: local variable 'bar' referenced before assignment

While we, intelligent beings that we are, can clearly see that the assignment to bar will never happen, Python ignores that fact and still declares bar as statically local.
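If you want to check the static decision without running the function at all, the standard library’s symtable module can analyse source text. This is only a sketch in Python 3 syntax: the function is kept as a string so that it is inspected and never executed, and the variable names (source, module_table, foo_table, bar_symbol) are my own.

```python
import symtable

# The same function as above, kept as source text so that it is only
# analysed and never run.
source = """
def foo():
    print(bar)
    if False:
        bar = 0
"""

module_table = symtable.symtable(source, "<example>", "exec")

# Pick out the symbol table that belongs to the body of foo.
foo_table = next(
    table for table in module_table.get_children()
    if table.get_name() == "foo"
)
bar_symbol = foo_table.lookup("bar")

# The unreachable assignment still counts: bar is treated as a local.
print("local:", bar_symbol.is_local())
print("assigned:", bar_symbol.is_assigned())
```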

I’ve been babbling about the problem long enough. We want solutions, baby! I’ll give you two.

>>> bar = 42
... def foo():
...     global bar
...     print bar
...     bar = 0
... 
... foo()
42
>>> bar
0

The first one is using the global keyword. It’s pretty self-explanatory. It lets Python know that bar is a global variable and not a local one.

The second, preferred solution is - don’t. In the sense of - don’t use a global that isn’t constant. In my day-to-day I work on a lot of Python code and there isn’t one use of the global keyword. It’s nice to know about it, but in the end it’s avoidable. If you want to keep a value that is used throughout your code, define it as a class attribute on a new class. That way the global keyword is redundant, since you need to qualify your variable access with its class name:

>>> class Baz(object):
...     bar = 42
... 
... def foo():
...     print Baz.bar  # global
...     bar = 0  # local
...     Baz.bar = 8  # global
...     print bar
... 
... foo()
... print Baz.bar
42
0
8
