We love Python, but it can sometimes behave in counterintuitive ways. In this post, we show five cases where Python's interpreter does something you might not expect.
Mutable Default Arguments
If you've been using Python for a while, then this one will not come as a surprise. In Python, default arguments are evaluated only once, and this happens when the function is defined. As a result, mutable objects like lists and dictionaries retain their state between function calls.
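A minimal sketch of the problem described above. The function name add_item is hypothetical and chosen only for illustration; the parameter name items follows the text.

```python
def add_item(item, items=[]):  # hypothetical function; the shared default list is the pitfall
    # The default list is created once, when the function is defined,
    # so every call that omits `items` appends to the same object.
    items.append(item)
    return items


first = add_item(1)
second = add_item(2)

print(first)            # [1, 2]
print(second)           # [1, 2]
print(first is second)  # True: both names refer to the one default list
```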
In order to avoid this, you can use some variation of the following code:
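One common variation, sketched here with a hypothetical function named add_item, uses None as the default and builds the list inside the function body:

```python
def add_item(item, items=None):  # hypothetical function name, for illustration only
    # None is immutable, so it is safe as a default value.
    if items is None:
        items = []  # a fresh list for every call that omits `items`
    items.append(item)
    return items


print(add_item(1))  # [1]
print(add_item(2))  # [2]

existing = [0]
print(add_item(1, existing))  # [0, 1]
```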
The idea is to set the default value of items to None, and then replace it with a new empty list inside the function body. This way, every call that omits the argument gets a fresh list, which avoids the issue of having a mutable default value.
Integer Caching
In CPython, small integers (from -5 to 256) are cached, meaning that they are objects that already exist when the interpreter starts. So, when one assigns the number 10 to the variable a, the interpreter simply reuses that object. This has some interesting implications when using the identity operator, is.
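A short sketch of the cached case. This relies on a CPython implementation detail, and the variable names are illustrative; one value is built at runtime with int() so the result does not depend on how the compiler shares literals.

```python
a = 10
b = int("10")  # built at runtime, yet CPython hands back the cached object

print(a == b)  # True
print(a is b)  # True in CPython, because 10 lies in the cached range
```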
However, if you use the identity operator to compare numbers that are not in the range of -5 to 256, then you will get a different result:
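A sketch of the uncached case, again assuming CPython. The integers are built at runtime with int() because two identical literals in one script can end up sharing a single constant, which would hide the effect.

```python
a = int("257")
b = int("257")

print(a == b)  # True: the values are equal
print(a is b)  # False in CPython: 257 is outside the cached range, so these are two objects
```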
This is why the equality operator == should always be used when comparing numbers. In the current implementation of our analyzer, we detect cases where a variable is compared directly against a literal, and we are working towards support for variable-to-variable comparisons as well.
Late Binding Closures
Consider the following code:
>>> funcs = [lambda x: x * i for i in range(5)]
>>> print([f(3) for f in funcs])
[12, 12, 12, 12, 12]
One might expect the lambdas above to produce 0, 3, 6, 9, 12, not 12, 12, 12, 12, 12. One would hope that we would get functions that each multiply x by a different number, i. However, the lambda function simply captures the variable i, and not its value during that loop iteration. By the time the lambda function is actually called, the value of i is already 4. If we wanted to get our desired output, we would need to change the code like so:
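The usual fix, and presumably the one intended here, is to bind i as a default argument so that its current value is stored when each lambda is created. A sketch:

```python
# i=i evaluates i at definition time and stores it as the lambda's default value
funcs = [lambda x, i=i: x * i for i in range(5)]
results = [f(3) for f in funcs]

print(results)  # [0, 3, 6, 9, 12]
```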
By doing this, we tell the interpreter to capture the current value of i during each iteration of the loop. This kind of issue can be caught in SonarLint.
String Interning
As with integer caching, Python stores some small strings only once, and simply has variables point to the same object.
>>> a = "hello"
>>> b = "hello"
>>> a is b
True
However, if we try this with a larger string, we will get a different result.
>>> c = "hello world"
>>> d = "hello world"
>>> c is d
False
Again, this is why the equality operator == should always be used when making comparisons. As with integer caching, SonarLint will not catch a variable-to-variable comparison, but it can catch a comparison of a variable against a literal. We are working towards adding support for variable-to-variable comparisons as well.
The += Operator on Mutable and Immutable Types
Consider the following example:
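The snippet below is a reconstruction based on the walkthrough that follows, using the variable names a, b, s and t from the text; the exact original code is an assumption.

```python
# Part 1: a list (mutable)
a = [1, 2, 3]
b = a          # b refers to the same list object as a
a += [4]       # modifies that list in place
print(b)       # [1, 2, 3, 4]

# Part 2: a string (immutable)
s = "hello"
t = s          # t refers to the same string object as s
s += " world"  # creates a new string and rebinds s to it
print(t)       # hello
```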
In the first part of the example, a is a list (mutable). When we create b, it points to the same list object in memory as a. Using the += operator with a mutable object like a list modifies the original object in place. In this case, a += [4] appends the element 4 to the original list, which is also referenced by b; no new list is created. Therefore, when we print b, the output is [1, 2, 3, 4].
In the second part of the example, s is a string (immutable). When we create t, it points to the same string object in memory as s. However, using the += operator with an immutable object like a string creates a new object, rather than modifying the original object in place.
When we try to add " world" to the string s, the statement s += " world" creates a new string object with the value "hello world" and assigns it to s. The variable t still points to the original string object (in memory) with the value "hello". This is why when we print t, the output is hello.
This difference in behavior between mutable and immutable objects when using the += operator can lead to unexpected results when you have multiple variables or containers referencing the same object.
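One way to see the difference is to check object identity before and after the operation. The sketch below uses illustrative variable names and also contrasts += with the plain + operator on a list, which always builds a new list.

```python
nums = [1, 2, 3]
alias = nums

nums += [4]                   # in-place: the list object is mutated
in_place = nums is alias      # True

nums = nums + [5]             # plain +: a new list is built and nums is rebound
rebound = nums is alias       # False

text = "hello"
text_alias = text
text += " world"              # strings are immutable: a new string is created
text_rebound = text is text_alias  # False

print(in_place, rebound, text_rebound)  # True False False
print(alias)                            # [1, 2, 3, 4]
```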
The way to avoid this problem is to use object-specific methods. For example, with lists, we can use append or extend. With strings, we can create new strings that are formatted using f-strings like so:
s = f"{t} world"
Currently, this is the only quirk in this list that we don't have a rule for, but we are considering adding one, since many new Pythonistas fall into the trap of assuming that the += operator behaves the same way with different objects.
Conclusion
Python, like any other programming language, has its quirks. However, with practice, we can learn to avoid making some of the most common mistakes, and ensure that our code works the way we intended. Tools like SonarLint and SonarQube help in avoiding such mistakes.