
Python Coding Guidelines and Idioms

Copyright 2020 Doulos Ltd

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


This document is about recommended coding styles and idioms that are specific to Python. There are many general recommendations about coding style that would apply equally to Python and to other languages (e.g. use meaningful variable names, keep most functions short, insert blank lines between functions and between groups of related statements), but these are not included in this document.

In what follows, recommended Python coding styles and idioms are tagged as GOOD, and counter-examples are tagged as BAD. The counter-examples are not necessarily bad in themselves, but each has been chosen to contrast with a particular GOOD example, just to make a specific point. Some of the points made are mere conventions that reflect common practice in the Python community. Many of the points made are not black-and-white, but require judgement and interpretation. Sometimes you may need to be a bit forgiving because it is not always possible to capture the full spirit of a coding guideline in a short example. Often you will need to "close your eyes" to certain aspects of an example, because the example has been contrived to illustrate a specific point. For example, many of the variable and function names below are single letters rather than meaningful words (because they have no meaning in these contrived examples!). Please don't get hung up or dogmatic about any of this, but try to understand the spirit!


There is an executable version of these coding guidelines available on Google Colab, allowing you to run every Python code fragment from this document, online, without having to install Python on your computer.


Start from the design principles of the Python language, as outlined in the Zen of Python:

import this

Layout and Naming Conventions

White Space

Always use 4 spaces for indentation, as opposed to 1 space, or 2 spaces, or variable spacing, or tabs. Avoid using tabs for indentation because they are sometimes displayed differently in different editors, which can cause confusion and waste time, especially when editors convert spaces to tabs or vice-versa.

GOOD

def f():
    if True:
        a = 0
        while False:
            b = 1
            c = 2
    else:
        d = 3
        
f()

BAD

def f():
 if True:
   a = 0
   while False:
         b = 1
         c = 2
 else:
      d = 3
        
f()

Put one space either side of the assignment symbol and either side of operators:

GOOD

a = 2
if a == 2:
    b = 2 + 2

BAD

a=2
if a==2:
    b= 2+2

Put one space immediately after , and : in all lists, argument lists, and dictionaries:

GOOD

x = [4, 5, 6]

def f(a, b, c):
    return {1: a, 2: b, 3: c}

f(x[1], x[0], x[2])

BAD

x = [4,5,6]

def f(a,b,c):
    return {1:a,2:b,3:c}

f(x[1],x[0],x[2])

But do not put white space immediately inside parentheses, brackets, or braces:

GOOD

x = (2 + (3 * 4))
y = {'a': 1, 'b': 2, 'c': [3, 4]}

BAD

x = ( 2 + ( 3 * 4 ) )
y = { 'a': 1, 'b': 2, 'c': [ 3, 4 ] }

Although white space should be put around = in general, do not put white space around = when setting default values for arguments or when passing keyword arguments:

GOOD

def f(arg1, arg2=0, arg3=0):
    return arg1 + arg2 + arg3

f(1, arg3=3)

BAD

def f(arg1, arg2 = 0, arg3 = 0):
    return arg1 + arg2 + arg3

f(1, arg3 = 3)

In long expressions, consider putting white space only around lower priority operators (or use parentheses to group operations):

GOOD

a = b = c = x = y = 1
a, b, c, x, y = [1] * 5
y = a*x + b*y + c
y

Do not use ; to put more than one statement on the same line:

GOOD

a = 1
b = a + 2

BAD

a = 1; b = a + 2

Avoid using the line continuation character \. Instead, take advantage of the rule that allows line breaks within parentheses. Where line breaks are necessary within long expressions, prefer to insert line breaks immediately before operators.

GOOD

def f(
    a_very_very_long_variable_name,
    another_very_very_long_variable_name,
    yet_another_very_very_long_variable_name):
    x = (a_very_very_long_variable_name
        + another_very_very_long_variable_name
        + yet_another_very_very_long_variable_name)
    return x

f(1, 1, 1)

BAD

def f( \
    a_very_very_long_variable_name, \
    another_very_very_long_variable_name, \
    yet_another_very_very_long_variable_name):
    x = a_very_very_long_variable_name \
        + another_very_very_long_variable_name \
        + yet_another_very_very_long_variable_name
    return x
f(1, 1, 1)

In control statements, always follow : with a line break. Avoid writing statements on the same line as :

condition = True

GOOD

def f():
    print('Hello')

if condition:
    f()

BAD

def f(): print('Hello')
    
if condition: f()

Naming

Use the variable name _ where the value of a variable is not being used:

GOOD

for i in range(8):
    print(i, end='')
for _ in range(8):     # Value of loop variable is unused
    print('~', end='')
def f():
    return (1, 2)

v, _ = f()             # Second value of tuple is unused
v

Avoid reading the value of variable _:

BAD

print(_)

Use snake_case for variable, function, and attribute names.

Use CamelCase for class names.


Consider using the __ prefix for attributes that are internal to a class, particularly where the same attribute name is used in a sub-class:

GOOD

class MyClass:                   # Camel case
    def __init__(self, data1):
        self.__data = data1      # __ prefix
    def get(self):
        return self.__data
        
class MySubClass(MyClass):       # Camel case
    def __init__(self, data1, data2):
        super().__init__(data1)
        self.__data = data2      # __ prefix
    def get(self):
        return self.__data
my_object = MySubClass(1, 2)     # Snake case

assert my_object.get() == 2
assert MyClass.get(my_object) == 1

try:
    my_object.__data             # __ prefix mangles the name so it is not visible outside the class
except AttributeError as details:
    print(details)
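
As a minimal sketch of what the __ prefix actually does (the class names Base and Derived are hypothetical, chosen only for this illustration): inside a class body, Python rewrites an attribute name such as __data to _ClassName__data. The attribute is therefore hidden by convention rather than being truly private, and the two classes end up with two separate attributes.

```python
class Base:
    def __init__(self):
        self.__data = 1          # Stored as _Base__data

class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__data = 2          # Stored as _Derived__data, so no clash with Base

obj = Derived()
mangled_names = sorted(vars(obj))   # The attribute names as they are really stored
print(mangled_names)
print(obj._Base__data, obj._Derived__data)
```

Reaching in through the mangled name, as in the last line, is shown only to explain the mechanism. It is not something to do in normal code.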

Keyword Arguments

Use keyword arguments (as opposed to positional arguments) where they help to improve readability:

def add_and_print(a, b, quiet=False, base8=False, base16=False):
    assert not (base8 and base16)
    if not quiet:
        aplusb = a + b
        if base8:
            print(oct(aplusb))
        elif base16:
            print(hex(aplusb))
        else:
            print(aplusb)

GOOD

add_and_print(2, 2, base16=True)

BAD

add_and_print(2, 2, False, False, True)
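
If you want to go one step further, a bare * in the argument list makes the arguments that follow it keyword-only, so the BAD call above is rejected outright. The following is a sketch only; add_and_show is a hypothetical variant of the function above that returns text instead of printing it.

```python
def add_and_show(a, b, *, base8=False, base16=False):
    """Return a + b as text; the flags after * must be passed by keyword."""
    assert not (base8 and base16)
    total = a + b
    if base8:
        return oct(total)
    if base16:
        return hex(total)
    return str(total)

print(add_and_show(2, 2, base16=True))

try:
    add_and_show(2, 2, False, True)      # Positional flags are rejected
except TypeError as details:
    print(details)
```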

Import

Avoid wildcard imports, and generally prefer not to import individual names from modules. Instead, import the whole module, and prefix the required function or variable or class name with the module name:

GOOD

import math
math.pi, math.sin(math.pi)

NOT ENCOURAGED

from math import pi
from math import sin
pi, sin(pi)

BAD

from math import *
pi, sin(pi)

While importing individual variables or functions is discouraged, it is fine to import individual modules from packages:

GOOD

 
from matplotlib import pyplot
pyplot.show()

Only use import X as Y where Y is a standard or well-known abbreviation for X:

GOOD

import numpy as np
import matplotlib.pyplot as plt

plt.plot(np.array([1, 2, 3]), np.array([30, 0, 20]))
plt.show()

BAD

import numpy as n
import matplotlib.pyplot as p

p.plot(n.array([1, 2, 3]), n.array([30, 0, 20]))
p.show()

Lists, Tuples, Iterators, Generators, and Lambdas

Take advantage of lists, tuples, dictionaries, iterators, and generators. They are very convenient, very Pythonic, and have many "batteries included".

GOOD

Initialize three variables with one assignment:

a, b, c = 1, 2, 3

Swap the values of a and b with one assignment:

a, b = b, a

An example of iterable unpacking, where the starred name b collects the middle items into a list:

a, *b, c = range(1, 6)

a, b, c

To initialize several variables to the same value, take advantage of the fact that you can chain the assignment operator. You don't even need a tuple:

GOOD

a = b = c = 0

To create a fixed-size list populated with a default value, use the * operator:

GOOD

N = 8
list1 = [0] * N
list1
list2 = [None] * N
list2

BAD

list1 = [0 for _ in range(N)]
list2 = [None for _ in range(N)]

This is only bad in the sense that using the * operator to build a list is such a neat trick that avoids the need for a for loop altogether. List comprehensions are good ...
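
One caveat with the * operator is worth a short sketch: it copies references, not objects. That is harmless for immutable values such as 0 or None, but when the default value is itself mutable (an empty list, say), every slot refers to the same object, and in that case the list comprehension is the right tool.

```python
N = 3

shared = [[]] * N                      # N references to the same inner list
shared[0].append(1)
print(shared)                          # All three entries change together

independent = [[] for _ in range(N)]   # N separate inner lists
independent[0].append(1)
print(independent)                     # Only the first entry changes
```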


Use list comprehensions in preference to for loops to populate lists, but only where the list comprehension is easy to understand. A for loop may still be preferred for more complicated or obscure cases.

GOOD

my_list = [i * i for i in range(5)]
my_list

BAD

my_list = []
for i in range(5):
    my_list.append(i * i)
my_list

This is only bad in the sense that a list comprehension is a tidier way to achieve the same thing. There is nothing wrong with using a loop to append to an empty list.


Use generators or generator expressions to generate a series of objects in cases where the overhead of building a fully-populated list in memory is not necessary. That is, in cases where a series of objects is ultimately consumed one-by-one.

GOOD

my_generator = (n * n for n in range(5))
my_generator
for i in my_generator:
    print(i, end=', ')

BAD

my_list = [i * i for i in range(5)]
my_list
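
As a small illustration of the one-by-one consumption described above (a sketch, not the only way to do it): a generator expression can be passed straight to a function such as sum, which pulls the values one at a time. Note that a generator can only be consumed once.

```python
squares = (n * n for n in range(5))   # Nothing is computed yet

total = sum(squares)                  # Values are produced and consumed one by one
print(total)

leftover = list(squares)              # The generator is now exhausted
print(leftover)
```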

Take advantage of functional programming features such as generators, lambda, map, filter, and zip (perhaps to avoid the need to construct lists in memory), but only where the code is easy to understand.

GOOD

my_generator = (n * n for n in range(5))

The generator yields up objects one at a time, and the map applies the given function to each object in turn to create a new iterable.

my_map = map(lambda x: x + 1, my_generator)   # Increment each value from my_generator
for i in my_map:
    print(i, end=', ')

A generator function, that is, a function that uses the yield statement, is a very convenient way for a user to define an iterator.

def my_generator(N):         # Generate a series of integers from 0 to N-1
    for i in range(N):
        #print(f'yield {i}')
        yield i
for j in my_generator(5):
    print(j, end=', ')

A filter applies a given function to each object from the iterable my_generator(10) in turn, and itself only returns those objects for which the lambda returns a true value (here, a non-zero remainder).

for n in filter(lambda arg: arg % 2, my_generator(10)):
    print(n, end=', ')

A zip is like a clothing zipper, only extended to multiple dimensions. The first item from each of the iterables passed to zip is combined into a tuple, then the next item from each iterable, and so on.

z = zip(range(3), [4.0, 5.0, 6.0], ['abc', 'def', 'ghi'])
for i in z:
    print(i)
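
One detail of zip that is easy to overlook, sketched here: zip stops as soon as the shortest iterable runs out. If you need to keep going to the end of the longest iterable, itertools.zip_longest from the Standard Library pads the missing items with None.

```python
import itertools

short = list(zip(range(5), 'ab'))                      # Stops after the shortest iterable
padded = list(itertools.zip_longest(range(3), 'ab'))   # Pads missing items with None

print(short)
print(padded)
```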

BAD

Avoid nesting zip, map, filter, generators, and similar if the code becomes hard to understand.

x = zip(map(lambda x: x * x, 
            (i for i in range(10) if (i % 2) == 0)),
        filter(lambda x: x % 3, (i for i in range(8))))

What the heck?

tuple(x)

GOOD

It is much better to break up the code into readable parts, following the Python language design principle that flat is better than nested. This map generates the squares of the even integers between 0 and 9.

m = map(lambda x: x * x, (i for i in range(10) if (i % 2) == 0))

itertools.tee is being used to replicate the iterator so that we can show the value of m for debug purposes, the point being that the call to tuple(tmp) will exhaust the iterator tmp, so we need a second copy of the iterator to pass forward into the zip.

import itertools
tmp, m = itertools.tee(m)
tuple(tmp)

This filter generates the integers between 0 and 7 that are not divisible by 3.

f = filter(lambda x: x % 3, (i for i in range(8)))

Again, itertools.tee is being used purely so we can show the intermediate value of f before it is consumed by the zip below.

tmp, f = itertools.tee(f)
tuple(tmp)
x = zip(m, f)
tuple(x)

Avoid using lambda where def would do perfectly well:

GOOD

def f(arg1, arg2):
    return arg1 + arg2

f(1, 2)

BAD

f = lambda arg1, arg2: arg1 + arg2

f(1, 2)
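
One practical reason for this guideline, shown as a brief sketch with hypothetical function names: a function created with def knows its own name, which shows up in tracebacks and in help, whereas every lambda is simply called <lambda>.

```python
def add(arg1, arg2):
    return arg1 + arg2

add_lambda = lambda arg1, arg2: arg1 + arg2

print(add.__name__)          # The name given in the def statement
print(add_lambda.__name__)   # Every lambda reports the same placeholder name
```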

Operator Usage

When comparing a value to the special value None, prefer the is operator to the == operator, because the meaning of == can (in theory) be redefined using magic methods.

GOOD

def f(arg1=None):
    if arg1 is None:    # Arg1 is absent
        return 0
    else:
        return arg1
f()

BAD

def f(arg1=None):
    if arg1 == None:    # Might not work if == has been redefined
        return 0
    else:
        return arg1
f()
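
To make the "in theory" concrete, here is a contrived sketch. AlwaysEqual is a hypothetical class, written only for this illustration, whose __eq__ magic method claims equality with everything. The == test is fooled, whereas is compares object identity and cannot be redefined.

```python
class AlwaysEqual:
    """Hypothetical class whose == operator claims equality with anything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()

equal_to_none = (obj == None)        # Calls __eq__, which has been redefined
identical_to_none = (obj is None)    # Compares object identity

print(equal_to_none, identical_to_none)
```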

Similarly, when treating values as truth values, avoid writing == True or > 0.

GOOD

condition = True
if condition:
    pass

BAD

condition = True
if condition == True:
    pass

GOOD

my_list = [1, 2, 3]
if my_list:
    print(my_list)

BAD

my_list = [1, 2, 3]
if len(my_list) > 0:
    print(my_list)
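
A word of caution, sketched below with a hypothetical function: zero, empty containers, and None are all treated as false, so a plain truth test cannot tell "absent" from "present but zero". Where that distinction matters, test for None explicitly first.

```python
def describe(count=None):
    if count is None:          # Argument absent
        return 'no count given'
    if not count:              # Zero is treated as false
        return 'count is zero'
    return f'count is {count}'

print(describe())
print(describe(0))
print(describe(3))
```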

Take advantage of the in operator whenever you need to do an existence check. That is, don't write a for loop or use multiple or operators to test for the existence of an item in a dictionary, set, or sequence.

GOOD

my_list = [1, 2, 3]
item = 2
if item in my_list:
    print("Item found")

BAD

if my_list[0] == item or my_list[1] == item or my_list[2] == item:
    print("Item found")

BAD

for i in my_list:
    if i == item:
        print("Item found")
        break
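
The same operator works for the other kinds of collection mentioned above. A brief sketch, using made-up data: for a dictionary, in tests the keys (use .values() to test the values), and for a string it tests for a substring.

```python
my_dict = {'a': 1, 'b': 2}
my_set = {1, 2, 3}
my_string = "Monty Python"

key_found = 'a' in my_dict                # Tests the keys of the dictionary
value_found = 1 in my_dict.values()       # Tests the values explicitly
member_found = 2 in my_set
substring_found = "Python" in my_string   # Tests for a substring

print(key_found, value_found, member_found, substring_found)
```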

Exceptions

Only use the try...except syntax to handle genuine exceptions, not programming errors and bugs. Genuine exceptions are events, typically unusual events, that are best handled outside of the normal program control flow of functions, loops, and conditional statements. Keep try blocks short, because the longer the try block, the more likely you are to have programming errors (bugs) being caught as exceptions. Avoid catching unnamed exceptions, because every run-time programming error raises an exception.

GOOD

x, y = 12, 4
a = "abc"
try:
    b = a[x // y]   # The try-except handles division-by-zero, but there is an unanticipated bug (index out of range)
except ZeroDivisionError:
    print("Unexpected divide-by-zero")

BAD

try:
    x, y = 12, 4
    a = "abc"
    b = a[x // y]   # Bug - index out of range
except:             # Unnamed exception catches all exceptions, so the out-of-range error is hidden
    print("Exception caught")
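
One way to keep the try block short is the optional else clause, which runs only when no exception was raised. The sketch below uses a hypothetical helper function: only the division sits inside the try block, so an index error in the else clause is not swallowed and shows up as the bug it is.

```python
def char_at_quotient(text, x, y):
    """Hypothetical helper: return text[x // y], or None when y is zero."""
    try:
        index = x // y           # Only the statement that may raise is in the try block
    except ZeroDivisionError:
        return None
    else:
        return text[index]       # An IndexError here propagates to the caller

print(char_at_quotient("abc", 4, 2))
print(char_at_quotient("abc", 4, 0))
```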

Do not use exceptions where normal control flow statements such as if or break would do just as well.

my_list = [1, 2, 3, -1, 4]

GOOD

for i in my_list:
    if i < 0:
        print('Negative value found')
        break               # Jump out of loop

BAD

try:
    for i in my_list:
        if i < 0:
            raise Exception  # Jump out of loop
except:
    print('Negative value found')

Context Managers

Do use context managers whenever appropriate (the with construct), because context managers are the cleanest way to ensure that any resources allocated during execution are tidied up at the right time, an obvious example being opening and closing files:

GOOD

with open('Python style v2.ipynb') as file:
    for line in file:
        print(line, end='')
        if "metadata" in line:
            break

BAD

file = open('Python style v2.ipynb')
for line in file:
    print(line, end='')
    if "metadata" in line:
        break
file.close()   # Without a context manager, need to remember to close the file explicitly
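
The real benefit shows up when something goes wrong part-way through. In the following sketch (which creates its own temporary file so that it is self-contained, and raises a made-up error purely for illustration) the file is still closed, even though the body of the with statement never finishes.

```python
import os
import tempfile

# Create a small temporary file so that the example is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

try:
    with open(path) as file:
        first_line = file.readline()
        raise RuntimeError('Something went wrong while reading')
except RuntimeError:
    pass

print(first_line, end='')
print(file.closed)      # The file was closed even though an exception was raised

os.remove(path)         # Tidy up the temporary file
```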

@property

Use the @property decorator when you want to define custom setter and getter methods that are called whenever a given property of an instance object is accessed. This is more of an advanced coding idiom, but does make it possible to add any kind of custom behavior or side-effect to a simple instance object property reference. When used appropriately, in a way that is clear and natural to the user of the class, this is very Pythonic. There are other ways of achieving the same thing in Python, but this seems to have become the preferred idiom:

GOOD

class C:
    def __init__(self):
        self.__x = None
        self.get_count = 0
        self.set_count = 0
        
    @property
    def x(self):            # Getter method for property x
        self.get_count += 1 # Custom behavior
        return self.__x
    
    @x.setter
    def x(self, value):     # Setter method for property x
        self.set_count += 1 # Custom behavior
        self.__x = value
        
    @x.deleter
    def x(self):
        print(f'Deleted after {self.get_count} gets and {self.set_count} sets')
        del self.__x   
c = C()
c.x = 1
c.x = 2
assert c.x == 2
del c.x
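
A common use of a custom setter is validation. The following is a sketch only, using a hypothetical Resistor class: the caller writes an ordinary assignment, and the setter rejects values that make no sense.

```python
class Resistor:
    """Hypothetical class: a resistance value that must not be negative."""
    def __init__(self, ohms):
        self.ohms = ohms            # Goes through the setter below

    @property
    def ohms(self):
        return self.__ohms

    @ohms.setter
    def ohms(self, value):
        if value < 0:
            raise ValueError('Resistance cannot be negative')
        self.__ohms = value

r = Resistor(100)
r.ohms = 220
try:
    r.ohms = -1                     # Rejected by the setter
except ValueError as details:
    print(details)
print(r.ohms)                       # Still holds the last valid value
```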

Use the Batteries!

Remember the Python design principle of batteries included. Do use the "batteries" that are built into the Python language and the Standard Library, rather than reinventing the wheel and coding up such features from scratch. Here are a few common examples, but the list could go on and on:


Use the Standard Library wherever it offers the feature you need. For example, creating file system directory paths.

directory = "my_dir"
file = "my_file"

GOOD

import os
os.path.join(directory, file)

BAD

directory + "/" + file
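
The Standard Library also offers the pathlib module as an object-oriented alternative to os.path. This is a brief sketch rather than a recommendation of one over the other: pathlib.Path objects are joined with the / operator and give direct access to the parts of the path.

```python
import os
import pathlib

directory = "my_dir"
file = "my_file"

path = pathlib.Path(directory) / file     # The / operator joins path components
print(path)
print(path.name, path.parent)
```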

Use the built-in sorted function or sort method to sort lists or strings, as opposed to implementing your own sorting algorithm from scratch.

GOOD

list_of_chars = sorted("Monty Python")
list_of_chars

Use join to convert lists of characters to strings (with or without a separator between the characters):

GOOD

''.join(list_of_chars)

BAD

result = ''
for char in list_of_chars:
    result += char
result

Use enumerate if you need an index variable when scanning through a collection.

my_collection = {'a': 1, 'b': 2, 'c': 3}

GOOD

for i, key in enumerate(my_collection):
    print(i, key, my_collection[key])

BAD

i = 0
for key in my_collection:
    print(i, key, my_collection[key])
    i += 1
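
Two refinements are worth a sketch: enumerate accepts a start argument when the count should not begin at zero, and the items method of a dictionary yields each key together with its value, which avoids the extra lookup.

```python
my_collection = {'a': 1, 'b': 2, 'c': 3}

lines = []
for i, (key, value) in enumerate(my_collection.items(), start=1):
    lines.append(f"{i}: {key} -> {value}")

print(lines)
```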

Use f-strings whenever you need to format a text string.

GOOD

a, b = 2, 3
print(f"a = {a}, b = {b}, a + b = {a + b}")

BAD

print("a = {0}, b = {1}, a + b = {2}".format(a, b, a + b))

BAD

print("a = %d, b = %d, a + b = %d" % (a, b, a + b))   # Older printf-style formatting

BAD

print("a = " + str(a) + ", b = " + str(b) + ", a + b = " + str(a + b))
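
f-strings also accept a format specification after a colon inside the braces. A few common cases are sketched below with made-up values. The last line uses the = specifier, which assumes Python 3.8 or later.

```python
value = 3.14159
count = 255
name = "abc"

text1 = f"{value:.2f}"     # Two decimal places
text2 = f"{count:#x}"      # Hexadecimal with 0x prefix
text3 = f"{name:>6}"       # Right-aligned in a field of width 6
text4 = f"{count = }"      # Shows the expression and its value (Python 3.8 and later)

print(text1, text2, text3, text4, sep=' | ')
```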

pytest

Use the pytest module for unit testing. pytest is not part of the Standard Library, so it usually needs to be installed using pip. The point is that pytest has become the de facto standard for unit testing in the Python world, so you should use it rather than inventing your own unit testing framework. (unittest is an older unit testing framework that is included in the Standard Library, but pytest is more Pythonic and has become more popular in recent years.)

Unfortunately it is hard to illustrate pytest from within a Jupyter Notebook, and unit testing is anyway such a broad topic that it is better described elsewhere.
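
Just to give the flavor, here is a minimal sketch of what a pytest test file might look like. The file name test_average.py and the function average are hypothetical. pytest collects functions whose names start with test_ from files whose names start with test_, and the checks themselves are plain assert statements. Assuming the file is saved under that name, it would be run from the command line with pytest test_average.py.

```python
# Contents of a hypothetical file named test_average.py

def average(a, b):
    """The function under test (normally imported from your own module)."""
    return (a + b) // 2

def test_average_of_equal_values():
    assert average(4, 4) == 4

def test_average_rounds_down():
    assert average(2, 3) == 2
```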

A Few General Guidelines

These are general coding guidelines that would apply for any programming language, but which have a particular twist for Python.

Assert

Do use assert statements to catch potential bugs and write defensive code. assert statements should be used as checks that expected conditions do indeed hold. A failing assert statement should always indicate an unexpected programming error. Never use assert instead of if or print, because Python programs can be executed with assertions disabled.

Because Python is a dynamic language without compile-time type checking, assertions can be a useful way to check the type of incoming argument values:

GOOD

def f(a, b):
    assert type(a) is int
    assert type(b) is int 
    return (a + b) // 2

f(2, 3)
f(2, 'b')

It is best to test just a single condition with each assert, because this makes it easier to debug assertion failures.

BAD

def f(a, b):
    assert type(a) is int and type(b) is int 
    return (a + b) // 2

f(2, 'b')
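
An assert statement can also carry an optional message after a comma, which makes a failure easier still to diagnose. A short sketch, reusing the function from the GOOD example above with messages added:

```python
def f(a, b):
    assert type(a) is int, f"a must be an int, got {type(a).__name__}"
    assert type(b) is int, f"b must be an int, got {type(b).__name__}"
    return (a + b) // 2

try:
    f(2, 'b')
except AssertionError as details:
    message = str(details)
    print(message)
```

Catching AssertionError is done here only so that the message can be shown. In real code a failing assertion should normally be left to stop the program.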

Comments

Wherever possible, re-write code to avoid the need for comments. But do write comments to explain anything surprising or obscure.

An assert is sometimes better than a comment:

GOOD

def f(arg):
    assert type(arg) is float
    return 10 ** arg
f(0.5)

BAD

def f(arg):
    # arg should be a float
    return 10 ** arg
f(0.5)

Do use docstrings to add appropriate documentation to modules, classes, and functions:

GOOD

"""This is a docstring that describes the purpose of this module"""

class C:
    """This is a docstring that describes the purpose of this class"""

    def f(self):
        """This is a docstring that describes the purpose of this function"""
        pass
    
    def g(self):
        """This is a docstring that describes the purpose of this function"""
        pass
    
help(C)
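
A docstring is more than a comment: it is stored on the object itself as the __doc__ attribute, which is where help finds it. A tiny sketch, with a hypothetical function:

```python
def area(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height

print(area.__doc__)
```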

Global

Avoid global and nonlocal variables whenever possible. The alternative would be to make the objects locally accessible without reference to global or nonlocal variables. This implies passing the objects as mutable arguments, that is, objects containing variables that can be assigned within functions by calling setter methods of the object. For example:

GOOD

class C:
    def set(self, arg):
        self.__v = arg
    def get(self):
        return self.__v

def set_v(arg):     # A mutable object is passed as an argument
    arg.set(1)  # Modifies the value of the object
    
def get_v(arg):
    print(arg.get())


    
def f():
    obj = C()   # obj is local to f, not global, and there are no global or nonlocal accesses to obj
    set_v(obj)
    get_v(obj)
    
f()

BAD

def set_v():
    global v
    v = 2
    
def get_v():
    print(v)
    
def f():
    set_v()
    get_v()
    
f()

del v  # Serves no purpose, but illustrates that v is defined and visible at global scope

Exactly the same point can be illustrated using nonlocal scope rather than global scope.

GOOD

def g():
    def set_v(arg):
        arg.set(3)
    
    def get_v(arg):
        print(arg.get())
    
    def f():
        obj = C()
        set_v(obj)
        get_v(obj)
    f()
g()

BAD

def g():
    def set_v():
        nonlocal v
        v = 4
    
    def get_v():
        print(v)
    
    def f():
        set_v()
        get_v()
    f()
    del v  # Serves no purpose, but illustrates that v is defined and visible at nonlocal scope
g()

In the same spirit of avoiding global variables, avoid static methods in Python, because their only significance is their scope (defined within a class). If you want variables that are common to all instance objects of a given class, it is better to use class variables that are accessed through class methods.

GOOD

class C:
    @classmethod
    def set(cls, arg):
        cls.classdata = arg

    @classmethod
    def get(cls):
        return cls.classdata
    
obj1 = C()
obj2 = C()   # Two independent instance objects sharing a common class variable

obj1.set(5)
obj2.get()

BAD

class C:
    @staticmethod
    def set(arg):
        global classdata
        classdata = arg

    @staticmethod
    def get():
        return classdata

obj1 = C()
obj2 = C()   # Two independent instance objects sharing a common global variable

obj1.set(6)
obj2.get()
del classdata  # Serves no purpose, but illustrates that classdata is global