Python Slots Vs Dict
Lists and dictionaries are fundamentally different data structures. A list stores a sequence of objects in a defined order, so you can index into it or iterate over it. A list is also a mutable type: it can be modified after it has been created. A Python dictionary is an implementation of a hash table and works as a key-value store. It is not indexed by position (it preserves insertion order only since Python 3.7), and it requires that keys are hashable. Lookups by key are fast.
Elements in a list have the following characteristics:
- They maintain their ordering unless explicitly re-ordered (for example, by sorting the list).
- They can be of any type, and types can be mixed.
- They are accessed via numeric (zero based) indices.
Elements in a Dictionary have the following characteristics:
- Every entry has a key and a value
- Insertion order is preserved since Python 3.7; earlier versions do not guarantee any ordering
- Elements are accessed using key values
- Keys can be of any hashable type (so not a dict or a list, for example), and types can be mixed
- Values can be of any type (including other dicts), and types can be mixed
Python provides another composite data type called a dictionary, which is similar to a list in that it is a collection of objects.
Usage: use a dictionary when you have a set of unique keys that map to values, and use a list when you have an ordered collection of items.
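To make the contrast concrete, here is a small sketch. The variable names and values (models, kinds) are made up for illustration only:

```python
# A list keeps items in order and is indexed by position
models = ["Corolla", "Transit", "Vespa"]
print(models[0])          # Corolla
models.append("Civic")    # lists are mutable

# A dictionary maps unique, hashable keys to values
kinds = {"Corolla": "car", "Transit": "van"}
print(kinds["Transit"])   # van

# Mutable containers such as lists are not hashable, so they cannot be keys
try:
    kinds[["Vespa"]] = "scooter"
except TypeError as exc:
    print("rejected:", exc)
```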
- A dictionary object can be constructed using the dict function. The function accepts an iterable of key-value pairs, such as a tuple of tuples, where each inner tuple contains one key and one value. t=((1,.
The goal of this series is to describe the internals and general concepts behind the class object in Python 3.6. In this part, I will explain how Python stores and looks up attributes. I assume that you already have a basic understanding of object-oriented concepts in Python.
Let's start with a simple class:
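A minimal sketch of such a class, built around the names the text uses (Vehicle, car, kind, model_name). The manufacturer attribute and all values are assumptions added for illustration:

```python
class Vehicle:
    # Class variable: defined once on the class
    kind = "car"

    def __init__(self, manufacturer, model_name):
        # Instance variables: set separately for every instance
        self.manufacturer = manufacturer
        self.model_name = model_name


car = Vehicle("Toyota", "Corolla")

print(car.kind)        # car
print(car.model_name)  # Corolla
```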
Here Vehicle is a class, and car is an instance of the class.
The dot notation (e.g. car.kind) is called an attribute reference, which usually points either to a variable or a method (function).
Instance and class variables
model_name is called an instance variable, whose value belongs to an instance. On the other hand, kind is a class variable, whose owner is the class.
It is important to understand the difference between them. Changing a class variable affects all instances.
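A short sketch of that effect, assuming a simplified Vehicle class with two hypothetical instances (car and van):

```python
class Vehicle:
    kind = "car"

    def __init__(self, model_name):
        self.model_name = model_name


car = Vehicle("Corolla")
van = Vehicle("Transit")

# Rebinding the variable on the class itself
Vehicle.kind = "scooter"

print(car.kind)  # scooter
print(van.kind)  # scooter
```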
What happens when you change a class variable from the instance?
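The following sketch tries it, again with a simplified Vehicle class and illustrative instance names:

```python
class Vehicle:
    kind = "car"

    def __init__(self, model_name):
        self.model_name = model_name


car = Vehicle("Corolla")
van = Vehicle("Transit")

# Assignment through the instance, not through the class
car.kind = "scooter"

print(car.kind)      # scooter
print(van.kind)      # car
print(Vehicle.kind)  # car
```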
As you can see, the value of the kind variable changes only for one instance. How is that possible?
Instead of changing the class variable, Python creates a new instance variable with the same name. Hence, instance variables take precedence over class variables when searching for an attribute value.
Mutable class variables
You need to be very careful when working with mutable class variables (e.g., list, set, dictionary). Unlike immutable values, they can be modified in place through an instance, and the change is visible to every other instance.
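A sketch of the pitfall. The features list is a hypothetical attribute chosen for illustration:

```python
class Vehicle:
    # Mutable class variable: one list object shared by all instances
    features = []

    def __init__(self, model_name):
        self.model_name = model_name


car = Vehicle("Corolla")
van = Vehicle("Transit")

# append() mutates the shared list in place; no instance variable is created
car.features.append("sunroof")

print(van.features)                      # ['sunroof']
print(car.features is Vehicle.features)  # True
```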
The rule of thumb here is to avoid class variables unless you have a reason to use them.
How Python stores instance attributes
In Python, all instance variables are stored in a regular dictionary. When working with attributes, you are just changing a dictionary.
We can access the instance dictionary through the __dict__ dunder (magic) attribute:
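For instance, with a simplified Vehicle class (the color and year attributes are illustrative):

```python
class Vehicle:
    def __init__(self, model_name):
        self.model_name = model_name


car = Vehicle("Corolla")
print(car.__dict__)  # {'model_name': 'Corolla'}

# Writing to the dictionary creates an attribute...
car.__dict__["color"] = "red"
print(car.color)     # red

# ...and setting an attribute writes to the dictionary
car.year = 2018
print(car.__dict__)  # {'model_name': 'Corolla', 'color': 'red', 'year': 2018}
```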
By knowing this detail, we can save and later restore the state of an arbitrary class:
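One possible way to do this is sketched below. The helper names to_dict and from_dict are assumptions for this example, and the copy is shallow, so nested mutable values would still be shared:

```python
class Vehicle:
    def __init__(self, model_name):
        self.model_name = model_name

    def to_dict(self):
        # Copy the instance dictionary so later changes do not alter the snapshot
        return dict(self.__dict__)

    @classmethod
    def from_dict(cls, dictionary):
        # Create the object without calling __init__, then restore its attributes
        obj = cls.__new__(cls)
        obj.__dict__.update(dictionary)
        return obj


car = Vehicle("Corolla")
state = car.to_dict()

car.model_name = "Camry"             # the saved state is unaffected
restored = Vehicle.from_dict(state)

print(restored.model_name)           # Corolla
```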
How Python stores class attributes
As I said earlier, class attributes are owned by the class itself (i.e., by its definition). As it turns out, classes use a dictionary too.
The class dictionary can also be accessed from an instance, using the __class__ dunder attribute (i.e., car.__class__.__dict__).
Dictionaries of classes are protected by mappingproxy. The proxy prevents direct writes to the dictionary, which guarantees that all attribute names are strings and helps to speed up attribute lookups. As a downside, the dictionary is read-only: class attributes have to be changed through attribute assignment.
Because all methods belong to a class, they are also stored in this dictionary.
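The points above can be checked with a short sketch. The describe method is a hypothetical addition to the simplified Vehicle class:

```python
class Vehicle:
    kind = "car"

    def __init__(self, model_name):
        self.model_name = model_name

    def describe(self):
        return f"{self.kind}: {self.model_name}"


car = Vehicle("Corolla")

print(type(Vehicle.__dict__).__name__)  # mappingproxy
print(car.__class__ is Vehicle)         # True
print(car.__class__.__dict__["kind"])   # car
print("describe" in Vehicle.__dict__)   # True: methods live in the class dictionary
print("describe" in car.__dict__)       # False

# Direct item assignment on the proxy is rejected
try:
    Vehicle.__dict__["kind"] = "scooter"
except TypeError as exc:
    print("read-only:", exc)

# Attribute assignment on the class still works
Vehicle.kind = "scooter"
print(car.kind)                         # scooter
```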
Functions and methods
As you may know, a method is a function that belongs to a specific class. In Python 2, there were two kinds of methods: unbound and bound. Python 3 has only the latter.
Bound methods are associated with the data of the instance they are bound to:
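A small sketch, assuming a hypothetical describe method on a simplified Vehicle class:

```python
class Vehicle:
    def __init__(self, model_name):
        self.model_name = model_name

    def describe(self):
        return f"Vehicle: {self.model_name}"


car = Vehicle("Corolla")
van = Vehicle("Transit")

# Each bound method remembers the instance it was taken from
describe_car = car.describe
describe_van = van.describe

print(describe_car())  # Vehicle: Corolla
print(describe_van())  # Vehicle: Transit
```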
We can access an instance from a bound method:
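Bound methods expose the instance as __self__ and the underlying function as __func__. A sketch with the same hypothetical describe method:

```python
class Vehicle:
    def __init__(self, model_name):
        self.model_name = model_name

    def describe(self):
        return f"Vehicle: {self.model_name}"


car = Vehicle("Corolla")
method = car.describe

print(method.__self__ is car)               # True
print(method.__self__.model_name)           # Corolla
print(method.__func__ is Vehicle.describe)  # True
```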
The class dictionary stores functions, which become methods when they are accessed with attribute syntax (dot notation) on an instance. Thanks to the descriptor protocol, every function has a __get__ method, which binds the function to an object.
Manual function binding:
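A sketch of doing by hand what the dot notation does automatically (the describe method is again a hypothetical example):

```python
class Vehicle:
    def __init__(self, model_name):
        self.model_name = model_name

    def describe(self):
        return f"Vehicle: {self.model_name}"


car = Vehicle("Corolla")

# The class dictionary holds a plain function
function = Vehicle.__dict__["describe"]
print(type(function).__name__)  # function

# Calling __get__ binds the function to the instance
bound = function.__get__(car, Vehicle)
print(type(bound).__name__)     # method
print(bound())                  # Vehicle: Corolla
print(bound == car.describe)    # True
```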
As a result, a bound method omits the first argument (i.e. self) of the function.
There is a well-written explanation of how it works in Python's documentation: Functions and Methods, and Method Objects.
Inheritance and attribute lookup order
Now you know that all variables and methods are stored in two dictionaries. It is time to understand how Python performs attribute lookup in case of inheritance.
Since every Python class implicitly inherits from object, there is always at least one level of inheritance.
mro (Method Resolution Order) is a special method that returns the linearized order of classes.
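For example, with a hypothetical Truck subclass of Vehicle:

```python
class Vehicle:
    kind = "car"


class Truck(Vehicle):
    pass


# Even a class without explicit bases ends with object
print([cls.__name__ for cls in Vehicle.mro()])  # ['Vehicle', 'object']
print([cls.__name__ for cls in Truck.mro()])    # ['Truck', 'Vehicle', 'object']

# The same order is available as the __mro__ tuple
print(Truck.__mro__ == tuple(Truck.mro()))      # True
```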
To fully understand the lookup order you need to be familiar with the Descriptor Protocol. Basically, there are two types of descriptors:
If an object defines both __get__() and __set__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are typically used for methods but other uses are possible).
Thus, because functions only implement __get__, they are non-data descriptors.
Python uses the following order:
- Data descriptors from class dictionary and its parents
- Instance dictionary
- Non-data descriptors from class dictionary and its parents
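The order above can be observed with two toy descriptors. The class names DataDescriptor and NonDataDescriptor are made up for this sketch:

```python
class DataDescriptor:
    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescriptor:
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Vehicle:
    data = DataDescriptor()
    non_data = NonDataDescriptor()


car = Vehicle()
print(car.non_data)  # from non-data descriptor

# Write to the instance dictionary directly to bypass __set__
car.__dict__["data"] = "from instance dict"
car.__dict__["non_data"] = "from instance dict"

print(car.data)      # from data descriptor (wins over the instance dictionary)
print(car.non_data)  # from instance dict (wins over the non-data descriptor)
```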
Keep in mind that no matter how many levels of inheritance you have, there is always one instance dictionary, which stores all instance variables.
Pseudo-code of attribute lookup:
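A simplified sketch in runnable Python. The function name get_attribute follows the one mentioned in the comments below; the body is an approximation that ignores the __getattr__ fallback, metaclasses and other details of the real C implementation:

```python
def get_attribute(obj, name):
    class_definition = type(obj)

    # Search the class and its parents, in MRO order
    descriptor = None
    for cls in class_definition.__mro__:
        if name in cls.__dict__:
            descriptor = cls.__dict__[name]
            break

    descriptor_type = type(descriptor)
    has_get = descriptor is not None and hasattr(descriptor_type, "__get__")
    has_set = hasattr(descriptor_type, "__set__") or hasattr(descriptor_type, "__delete__")

    # 1. Data descriptors from the class dictionary and its parents
    if has_get and has_set:
        return descriptor_type.__get__(descriptor, obj, class_definition)

    # 2. Instance dictionary
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]

    # 3. Non-data descriptors, then plain class variables
    if has_get:
        return descriptor_type.__get__(descriptor, obj, class_definition)
    if descriptor is not None:
        return descriptor

    raise AttributeError(name)


class Vehicle:
    kind = "car"

    def __init__(self, model_name):
        self.model_name = model_name

    def describe(self):
        return f"{self.kind}: {self.model_name}"


car = Vehicle("Corolla")
print(get_attribute(car, "model_name"))  # Corolla
print(get_attribute(car, "kind"))        # car
print(get_attribute(car, "describe")())  # car: Corolla
```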
__slots__
When dealing with thousands of instances, memory consumption can be a problem. Because of the underlying hash table implementation, creating a dictionary for each instance takes a lot of memory. Fortunately, Python provides a way to disable the per-instance dictionary by defining the __slots__ attribute.
Here is how slots are usually defined:
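A typical definition might look like this. The slot names and values are assumptions for illustration:

```python
class Vehicle:
    # Only these attribute names are allowed on instances
    __slots__ = ("manufacturer", "model_name")

    def __init__(self, manufacturer, model_name):
        self.manufacturer = manufacturer
        self.model_name = model_name


car = Vehicle("Toyota", "Corolla")
print(car.model_name)            # Corolla
print(hasattr(car, "__dict__"))  # False: no per-instance dictionary

# Attributes outside of __slots__ cannot be added
try:
    car.color = "red"
except AttributeError as exc:
    print("rejected:", exc)
```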
When you define slots, instead of creating a dictionary for each instance, attribute values are stored in a fixed-size, list-like structure inside the instance. In turn, attribute names are moved to the class dictionary.
At the class level, each slot has a descriptor that knows its unique position in the instance list. There is a good explanation of how it works by Raymond Hettinger. Although it was written 10 years ago, the concept stays the same.
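The per-slot descriptors can be inspected directly. A sketch, using the same illustrative slot names:

```python
class Vehicle:
    __slots__ = ("manufacturer", "model_name")

    def __init__(self, manufacturer, model_name):
        self.manufacturer = manufacturer
        self.model_name = model_name


car = Vehicle("Toyota", "Corolla")

# Slot names live in the class dictionary as descriptors
slot = Vehicle.__dict__["model_name"]
print(type(slot).__name__)         # member_descriptor

# The descriptor reads and writes the value stored inside the instance
print(slot.__get__(car, Vehicle))  # Corolla
slot.__set__(car, "Camry")
print(car.model_name)              # Camry
```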
Bonus: function attributes
Python's dictionary is so fundamental to Python that many other objects use it too. Since Python 2.1, functions can have arbitrary attributes; that is, you can use a function as key-value storage.
Internally, it's just a dictionary that handles failed attribute lookups (i.e., nondefault attributes). You can access or even replace this dictionary using the already familiar __dict__ attribute. PEP 232 has an extensive description of this feature.
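A brief sketch with a made-up function greet and a made-up attribute language:

```python
def greet():
    return "hello"


# Store an arbitrary attribute on the function object
greet.language = "en"
print(greet.language)  # en
print(greet.__dict__)  # {'language': 'en'}

# The whole dictionary can be replaced as well
greet.__dict__ = {"language": "fr"}
print(greet.language)  # fr
```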
For example, you can track the number of times a function was called:
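One way to write such a counter, assuming a function named square and an attribute named call_count (both chosen for this sketch):

```python
def square(x):
    # The function finds itself by its global name at call time
    square.call_count += 1
    return x * x


# Initialize the counter before the first call
square.call_count = 0

square(2)
square(3)

print(square.call_count)  # 2
print(square.__dict__)    # {'call_count': 2}
```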
The explanation of attributes got a lot longer than I expected and I will split the article about internals into a series of posts. If you don't want to miss the follow-up on this topic, you can subscribe to my RSS.
- Andrew Franklin 2 years, 10 months ago (from disqus) #
double -> square
;)
- 行者酱油君 2 years, 10 months ago (from disqus) #
When i saw the title,i expected it's a post about python implement in C, But it seems it just in python level...
- Artem 2 years, 10 months ago (from disqus) #
Very few people want to read C code, I'm rewriting all C code to Python pseudo-code. The get_attribute function originally written in C.
- Aidas Bendoraitis 2 years, 10 months ago (from disqus) #
Very detailed and comprehensive article. Thanks!
I would just like to have information about __slots__ included. And I would rename the from_dict(dict) to from_dict(dictionary), because dict is a reserved keyword for the dictionary type.
- Artem 2 years, 10 months ago (from disqus) #
Next article starts with slots. I will finish it soon.
Good catch about dict!
upd: Actually, I changed my mind and added description about slots to this article.
- Tony Su 12 months ago #
I happened to read this article and it really solved many puzzles in my mind of how Python does this and that under the table and why.
Really appreciated it!!!
- Sam B 8 months, 2 weeks ago #
Great article, and one of the better articles that describes classes and init. I am new to Python, and one thing I am still unclear about is why I should use classes when I can use a dictionary. They both can store large datasets. And how often are classes used versus dictionaries? Of everything that I have read (keep in mind, at the beginner level), dictionaries seem to be preferred.
- Artem 8 months, 2 weeks ago #
I always prefer dictionaries. I don't need classes to just store some data, but you can use them to add extra input validation for each field.