You might already know that instances of most Python classes have an internal dictionary called __dict__, which holds their instance attributes. And what’s amazing about Python is that we can simply inspect even internal implementation details like this one:
>>> foo = Foo()
>>> foo.__dict__
{'bar': 'hello!'}
So we can arrive at the following incomplete hypothesis:
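A minimal sketch of that hypothesis: foo.bar is nothing more than foo.__dict__['bar']. The Foo class below is a reconstruction for illustration, and its __getattr__ fallback returning 'goodbye!' is an assumption, added to see whether anything besides __dict__ gets a say:

```python
class Foo:
    def __init__(self):
        self.bar = 'hello!'  # lands in foo.__dict__

    def __getattr__(self, item):
        # Hypothetical fallback that claims to know every attribute.
        return 'goodbye!'


foo = Foo()

# Hypothesis: foo.bar is just foo.__dict__['bar']
print(foo.__dict__['bar'])    # hello!
print(foo.bar)                # hello!

# ...but 'baz' is not in __dict__, and we still get a value back
print('baz' in foo.__dict__)  # False
print(foo.baz)                # goodbye!
```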
Err… okay. We can see that __getattr__ can “fake” attribute access, but it doesn’t work if we already have that variable defined (meaning, foo.bar returns 'hello!' and not 'goodbye!'). So this mechanism is more complex than it seemed, and there’s actual logic involved when accessing attributes. Indeed, there’s a magic method that’s called whenever we access instance attributes, but it’s clearly not __getattr__ as we can see in the above example. This magic method is called __getattribute__ and we’ll try to reverse engineer it by observing its different behaviors. For now, let’s modify our hypothesis:
foo.bar is equivalent to calling foo.__getattribute__('bar'), which is roughly:
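Something like the following sketch, where my_getattribute is a stand-in name used for illustration (CPython's real __getattribute__ does more than this, as we are about to find out):

```python
class Foo:
    def __init__(self):
        self.bar = 'hello!'

    def __getattr__(self, item):
        return 'goodbye!'

    def my_getattribute(self, item):
        # Illustrative stand-in for __getattribute__:
        # look in the instance dictionary first...
        if item in self.__dict__:
            return self.__dict__[item]
        # ...and only fall back to __getattr__ when that fails.
        return self.__getattr__(item)


foo = Foo()
print(foo.my_getattribute('bar'))  # hello!
print(foo.my_getattribute('baz'))  # goodbye!
```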
Great, so let’s just make sure that we also support setting these variables and we can go home and enjoy the rest of the -
>>> foo.baz = 1337
>>> foo.baz
1337
>>> foo.my_getattribute('baz') = 'h4x0r'
SyntaxError: can't assign to function call
Damn.
In retrospect this seems a bit obvious. my_getattribute returns something that is like a reference1. We can mutate it, but we can’t reassign the original value to a new object. So what the hell is going on here? If foo.baz translates to any function call, how can we ever assign to it?
When we look at a statement like foo.bar = 1, there’s an extra something going on. And it seems like we simply don’t access attributes the same way when we set them, as opposed to get them. Indeed, we can also override __setattr__ in a similar manner:
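Here is a minimal sketch of such an override. It assumes we want to redirect every assignment into a separate dictionary named my_dunder_dict; the class body is a reconstruction for illustration, not the only way to do it:

```python
class Foo:
    def __init__(self):
        # Write straight into __dict__ so this one assignment
        # does not go through our own __setattr__.
        self.__dict__['my_dunder_dict'] = {}
        self.bar = 'hello!'  # this one DOES go through __setattr__

    def __setattr__(self, key, value):
        # Called for every `foo.<key> = <value>` statement.
        self.my_dunder_dict[key] = value

    def __getattr__(self, item):
        # Called only when normal lookup fails.
        try:
            return self.my_dunder_dict[item]
        except KeyError:
            raise AttributeError(item) from None


foo = Foo()
foo.baz = 1337
print(foo.bar)       # hello!
print(foo.baz)       # 1337
print(foo.__dict__)  # {'my_dunder_dict': {'bar': 'hello!', 'baz': 1337}}
```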
There’s intentional asymmetry such that __setattr__ doesn’t have an analogous accompanying method similar to __getattribute__ (i.e., there’s no __setattribute__).
__setattr__ works in __init__ as well - that’s why we do a weird assignment to my_dunder_dict (self.__dict__['my_dunder_dict'] = {}). Otherwise, we’ll get infinite recursion.
And then… there’s property (and friends). Decorators that make methods behave like members. Sigh.
Let’s try to understand how this is happening.
>>> class Foo(object):
...     def __getattribute__(self, item):
...         print('__getattribute__ was called')
...         return super().__getattribute__(item)
...
...     def __getattr__(self, item):
...         print('__getattr__ was called')
...         return super().__getattr__(item)
...
...     @property
...     def bar(self):
...         print('bar property was called')
...         return 100
...
>>> f = Foo()
>>> f.bar
__getattribute__ was called
bar property was called
100
Out of curiosity, what’s in f.__dict__ then?
>>> f.__dict__
__getattribute__ was called
{}
Let me get this straight. bar is not in __dict__, but __getattr__ isn’t called. wat?
Well, bar is a method and it accepts the class instance, but it’s actually a member of the class object, not the instance. Let’s verify:
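One way to check, sketched with a stripped-down Foo (the printing hooks are left out here to keep the output quiet). Dumping Foo.__dict__ in full works too, it is just noisier:

```python
class Foo:
    @property
    def bar(self):
        return 100


f = Foo()

# The instance dictionary knows nothing about bar...
print('bar' in f.__dict__)        # False

# ...because it lives on the class object.
print('bar' in Foo.__dict__)      # True
print(type(Foo.__dict__['bar']))  # <class 'property'>
```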
We can see bar as the last item in that dictionary. In order to reconstruct __getattribute__ we need to answer another question here - who has precedence, the instance, or the class?
>>> f.__dict__['bar'] = 'will we see this printed?'
__getattribute__ was called
>>> f.bar
__getattribute__ was called
bar property was called
100
Alright. We now know that the class’ __dict__ is also checked and that it has priority. So it’s just a minor complicati –
Wait wait wait, when did we even call the bar method? I mean, our pseudo-code for __getattribute__ never calls the object, so what’s going on?
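The answer lies in the descriptor protocol. Python’s Descriptor HowTo Guide lists its three methods, __get__(self, obj, type=None), __set__(self, obj, value) and __delete__(self, obj), and then says:
That is all there is to it. Define any of these methods and an object is considered a descriptor and can override default behavior upon being looked up as an attribute.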
If an object defines both __get__() and __set__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are typically used for methods but other uses are possible).
Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance’s dictionary. If an instance’s dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If an instance’s dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.
To make a read-only data descriptor, define both __get__() and __set__() with the __set__() raising an AttributeError when called. Defining the __set__() method with an exception raising placeholder is enough to make it a data descriptor.
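To see the precedence rules in action, here is a small sketch with one descriptor of each kind. The class names NonData and Data are made up for this example:

```python
class NonData:
    # Only __get__: a non-data descriptor.
    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Data:
    # __get__ plus __set__: a (read-only) data descriptor.
    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class Foo:
    non_data = NonData()
    data = Data()


foo = Foo()
# Plant same-named entries directly in the instance dictionary.
foo.__dict__['non_data'] = 'from instance dict'
foo.__dict__['data'] = 'from instance dict'

print(foo.non_data)  # from instance dict
print(foo.data)      # from data descriptor
```

The instance dictionary beats the non-data descriptor but loses to the data descriptor, which is the same thing we saw when we tried to shadow the bar property.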
TL;DR - if you implement any of __get__, __set__ or __delete__ you have officially, erm… Descripted a Protocol, I guess? Which is exactly what the property decorator is doing. In the case of calling it like we did, it defines a read-only data descriptor, which is then called in __getattribute__.
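We can poke at the property object itself to confirm this. A quick sketch, again with a stripped-down Foo:

```python
class Foo:
    @property
    def bar(self):
        return 100


prop = Foo.__dict__['bar']

# A property object implements the whole descriptor protocol.
print(all(hasattr(prop, name) for name in ('__get__', '__set__', '__delete__')))  # True

# Calling __get__ by hand is what attribute lookup does for us.
print(prop.__get__(Foo(), Foo))  # 100

# No setter was defined, so assignment is refused.
try:
    Foo().bar = 1
except AttributeError as exc:
    print('AttributeError:', exc)
```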
One last refactor:
foo.bar, as a getter, is equivalent to calling foo.__getattribute__('bar'), which is roughly:
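Here is one possible sketch in plain Python. It is a little stricter than the simplified method in the demonstration that follows, because it separates data descriptors from everything else. The helper names (find_in_mro, my_getattribute, _MISSING) are made up, it assumes an ordinary instance that has a __dict__, and in real Python the __getattr__ fallback is triggered by the interpreter rather than by __getattribute__ itself:

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Return the first class-level entry for `name` along the MRO."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def my_getattribute(obj, name):
    cls = type(obj)
    class_attr = find_in_mro(cls, name)
    attr_type = type(class_attr)

    # 1. A data descriptor on the class wins over everything.
    if class_attr is not _MISSING and hasattr(attr_type, '__get__') and (
            hasattr(attr_type, '__set__') or hasattr(attr_type, '__delete__')):
        return attr_type.__get__(class_attr, obj, cls)

    # 2. Then the instance dictionary.
    if name in obj.__dict__:
        return obj.__dict__[name]

    # 3. Then non-data descriptors and plain class attributes.
    if class_attr is not _MISSING:
        if hasattr(attr_type, '__get__'):
            return attr_type.__get__(class_attr, obj, cls)
        return class_attr

    # 4. Finally, the __getattr__ fallback, if the class defines one.
    fallback = find_in_mro(cls, '__getattr__')
    if fallback is not _MISSING:
        return fallback(obj, name)
    raise AttributeError(name)


class Foo:
    class_attr = "I'm a class attribute!"

    def __init__(self):
        self.dict_attr = "I'm in a dict!"

    @property
    def property_attr(self):
        return "I'm a read-only property!"

    def __getattr__(self, item):
        return "I'm dynamically returned!"


foo = Foo()
for name in ('class_attr', 'dict_attr', 'property_attr', 'dynamic_attr'):
    # The sketch should agree with the real lookup.
    print(my_getattribute(foo, name) == getattr(foo, name))  # True
```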
Let’s try to demonstrate all the behaviors we know:
class Foo:
    class_attr = "I'm a class attribute!"

    def __init__(self):
        self.dict_attr = "I'm in a dict!"

    @property
    def property_attr(self):
        return "I'm a read-only property!"

    def __getattr__(self, item):
        return "I'm dynamically returned!"

    def my_getattribute(self, item):
        # Simplified lookup order: class __dict__, instance __dict__, then __getattr__
        if item in self.__class__.__dict__:
            print('Retrieving from self.__class__.__dict__')
            v = self.__class__.__dict__[item]
        elif item in self.__dict__:
            print('Retrieving from self.__dict__')
            v = self.__dict__[item]
        else:
            print('Retrieving from self.__getattr__')
            v = self.__getattr__(item)

        # If what we found is a descriptor, let it compute the value
        if hasattr(v, '__get__'):
            print("Invoking descriptor's __get__")
            v = v.__get__(self, type(self))
        return v
>>> foo = Foo()
...
... print(foo.class_attr)
... print(foo.dict_attr)
... print(foo.property_attr)
... print(foo.dynamic_attr)
...
... print()
...
... print(foo.my_getattribute('class_attr'))
... print(foo.my_getattribute('dict_attr'))
... print(foo.my_getattribute('property_attr'))
... print(foo.my_getattribute('dynamic_attr'))
I'm a class attribute!
I'm in a dict!
I'm a read-only property!
I'm dynamically returned!
Retrieving from self.__class__.__dict__
I'm a class attribute!
Retrieving from self.__dict__
I'm in a dict!
Retrieving from self.__class__.__dict__
Invoking descriptor's __get__
I'm a read-only property!
Retrieving from self.__getattr__
I'm dynamically returned!
There’s always more. I’ve just scratched the surface of Python’s internals, and while the general idea is correct, it’s probable that the small details are implemented differently. Please read the official sources below if you need exact implementation details.
My hope is that aside from demonstrating how attribute access works, I’ve also convinced you of how beautiful Python is - a language you can push and prod and experiment with. Settle some knowledge debt today.