Weird Python: 5 Unexpected Behaviors in the Python Interpreter

Quazi Nafiul Islam

Developer Advocate - Python

May 1, 2023

We love Python, but it sometimes behaves in counterintuitive ways. In this post, we show five behaviors of the Python interpreter that you might not expect.

Mutable Default Arguments

If you've been using Python for a while, then this one will not come as a surprise. In Python, default arguments are evaluated only once, and this happens when the function is defined. As a result, mutable objects like lists and dictionaries retain their state between function calls.

def add_item(item, items=[]):
    items.append(item)
    return items


print(add_item(1))  # Output: [1]
print(add_item(2))  # Output: [1, 2], not [2] as you might expect

In order to avoid this, you can use some variation of the following code:

def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

The idea is to default items to None and to create a fresh empty list inside the function whenever no list is passed in. Because the list is now created on each call rather than once at definition time, no state leaks between calls.
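
To see that the default really is created only once, one can inspect the function's __defaults__ attribute, which holds the evaluated default values. The following is a small sketch; the names shared_default and fresh_default are made up for illustration and simply mirror the two versions of add_item above.

```python
def shared_default(item, items=[]):
    # The list above is created once, when the def statement runs.
    items.append(item)
    return items


def fresh_default(item, items=None):
    # A new list is created on every call that omits items.
    if items is None:
        items = []
    items.append(item)
    return items


shared_default(1)
shared_default(2)
# The default list stored on the function has accumulated both items.
print(shared_default.__defaults__)  # ([1, 2],)

fresh_default(1)
fresh_default(2)
# The stored default is still None, so nothing carries over between calls.
print(fresh_default.__defaults__)  # (None,)
```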

Integer Caching

CPython caches small integers: the objects for -5 through 256 are created when the interpreter starts. So, when you assign the number 10 to the variable a, the interpreter simply reuses the existing object. This has some interesting implications when using the identity operator is.

a = 10
b = 10
print(a is b)  # Output: True

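However, if you use the identity operator to compare numbers outside the range of -5 to 256, you can get a different result. The following output is what you see when entering the lines one by one in the interactive interpreter; within a single script, CPython may reuse one constant object for both literals and print True instead: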

c = 300
d = 300
print(c is d)  # Output: False

This is why the equality operator == should always be used when comparing numbers. Our analyzer currently detects comparisons of a variable directly against a literal, and we are working towards supporting variable-to-variable comparisons as well.
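
Here is a short sketch of the difference between identity and equality. The values are built with int() at runtime so that the compiler cannot merge two identical literals into one object. The results of the is checks are a CPython implementation detail and may differ in other interpreters or versions.

```python
# Build the integers at runtime instead of writing literals.
small_a = int("10")
small_b = int("10")
big_a = int("300")
big_b = int("300")

# Equality compares values and is always reliable.
print(small_a == small_b)  # True
print(big_a == big_b)      # True

# Identity compares objects and depends on the small-integer cache.
print(small_a is small_b)  # True on CPython (cached object is reused)
print(big_a is big_b)      # False on CPython (two separate objects)
```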

Late Binding Closures

Consider the following code:

>>> funcs = [lambda x: x * i for i in range(5)]
>>> print([f(3) for f in funcs])
[12, 12, 12, 12, 12]

You might expect the lambdas above to produce 0, 3, 6, 9, 12 rather than 12, 12, 12, 12, 12, with each function multiplying x by a different value of i. However, each lambda captures the variable i itself, not the value it had during that iteration. By the time any of the lambdas is called, the loop has finished and the value of i is 4. To get the desired output, we need to change the code like so:

funcs = [lambda x, i=i: x * i for i in range(5)]
print([f(3) for f in funcs])  # Output: [0, 3, 6, 9, 12]

By binding i as a default argument, we capture its current value during each iteration of the loop, because default values are evaluated when each lambda is defined. This kind of issue can be caught in SonarLint.
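
An alternative to the default-argument trick, shown here only as a sketch, is a small factory function: each call creates a new scope, so each returned function closes over its own value. The names make_multiplier and factor are hypothetical.

```python
def make_multiplier(factor):
    # factor is a local of this call, so every returned function
    # gets its own independent binding.
    def multiply(x):
        return x * factor
    return multiply


funcs = [make_multiplier(i) for i in range(5)]
print([f(3) for f in funcs])  # [0, 3, 6, 9, 12]
```

This version also avoids adding an extra keyword parameter that a caller could accidentally override.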

String Interning

As with integer caching, Python stores some strings only once and simply has variables point to the same object.

>>> a = "hello"
>>> b = "hello"
>>> a is b
True

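However, if we try this with a string that does not look like an identifier (here, one that contains a space), we can get a different result. As with integers, this is a CPython implementation detail and can vary between versions.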

>>> c = "hello world"
>>> d = "hello world"
>>> c is d
False

Again, this is why the equality operator == should always be used when making comparisons. As with integer caching, SonarLint can catch this type of issue when a variable is compared to a literal, but not yet when two variables are compared. We are working towards adding support for variable-to-variable comparisons as well.
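
If identity comparison of strings is genuinely wanted, the standard library's sys.intern makes interning explicit instead of relying on what the interpreter happens to do. A minimal sketch, building the strings at runtime so that no compile-time interning is involved (the variable names are illustrative):

```python
import sys

parts = ["hello", "world"]
c = " ".join(parts)
d = " ".join(parts)

# Equality compares the contents and is always reliable.
print(c == d)  # True

# sys.intern returns the single canonical copy of an equal string,
# so the interned results are guaranteed to be the same object.
c_interned = sys.intern(c)
d_interned = sys.intern(d)
print(c_interned is d_interned)  # True
```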

The += Operator on Mutable and Immutable Types

Consider the following example:

a = [1, 2, 3]
b = a
a += [4]
print(b)  # Output: [1, 2, 3, 4]


s = "hello"
t = s
s += " world"
print(t)  # Output: "hello"

In the first part of the example, a is a list (mutable). When we create b, it points to the same list object in memory as a. Using the += operator with a mutable object like a list modifies the original object in place. In this case, a += [4] appends the element 4 to the original list, which is also referenced by b; no new list is created. Therefore, when we print b, the output is 1, 2, 3, 4.

In the second part of the example, s is a string (immutable). When we create t, it points to the same string object in memory as s. However, using the += operator with an immutable object like a string creates a new object, rather than modifying the original object in place.

When we append " world" to s, the statement s += " world" creates a new string object with the value "hello world" and assigns it to s. The variable t still points to the original string object (in memory) with the value "hello". This is why the output is hello when we print t.

This difference in behavior between mutable and immutable objects when using the += operator can lead to unexpected results when you have multiple variables or containers referencing the same object.
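
The difference can be made visible with is checks. The sketch below also contrasts a += [4] with c = c + [4], which builds a new list and leaves other references untouched; the extra variables c and d are only there for illustration.

```python
# += on a list mutates the existing object in place.
a = [1, 2, 3]
b = a
a += [4]
print(a is b)  # True
print(b)       # [1, 2, 3, 4]

# + builds a new list and rebinds the name c; d keeps the old list.
c = [1, 2, 3]
d = c
c = c + [4]
print(c is d)  # False
print(d)       # [1, 2, 3]

# += on a string always yields a new object, because str is immutable.
s = "hello"
t = s
s += " world"
print(s is t)  # False
print(t)       # hello
```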

The way to avoid this problem is to use object-specific methods that make the intent explicit. For example, with lists, we can use append or extend, which make it clear that the list is modified in place. With strings, we can create new strings that are formatted using f-strings like so:

s = f"{t} world"

Currently, this is the only weird quirk in Python (mentioned in this list) that we don't have a rule for, but we are considering adding it since many new Pythonistas fall into this trap of assuming that the += operator behaves the same way with different objects.

Conclusion

Python, like any other programming language, has its quirks. However, with practice, we can learn to avoid making some of the most common mistakes, and ensure that our code works the way we intended. Tools like SonarLint and SonarQube help in avoiding such mistakes.
