How to Write Good Code
Overview
The ease with which other people can read and understand a program (often called “readability” in software engineering) is perhaps the most important quality of a program. Readable programs are used and extended by others, sometimes for decades. For this reason, we often repeat in CS 111 that programs are written to be read by humans, and only incidentally to be interpreted by computers.
In CS 111, each project has a composition score that is graded on the style of your code. This document provides some guidelines. A program is composed well if it is concise, well-named, understandable, and easy to follow.
Excellent composition does not mean adhering strictly to prescribed style conventions. There are many ways to program well, just as there are many styles of effective communication. However, the following guiding principles universally lead to better composition of programs:
- Names: To a computer, names are arbitrary symbols: “xegyawebpi” and “foo” are just as meaningful as “tally” and “denominator”. To humans, comprehensible names aid immensely in comprehending programs. Choose names for your functions and values that indicate their use, purpose, and meaning. See the lecture notes section on choosing names for more suggestions.
- Functions: Functions are our primary mechanism for abstraction, and so each function should ideally have a single job that can be used throughout a program. When given the choice between calling a function or copying and pasting its body, strive to call the function and maintain abstraction in your program. See the lecture notes section on composing functions for more suggestions.
- Purpose: Each line of code in a program should have a purpose. Statements should be removed if they no longer have any effect (perhaps because they were useful for a previous version of the program, but are no longer needed). Large blocks of unused code, even when turned into comments, are confusing to readers. Feel free to keep your old implementations in a separate file for your own use, but don’t turn them in as your finished product.
- Brevity: An idea expressed in four lines of code is often clearer than the same idea expressed in forty. You do not need to try to minimize the length of your program, but look for opportunities to reduce the size of your program substantially by reusing functions you have already defined.
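As a small sketch of the Functions and Brevity principles, the hypothetical helper `clamp` below is defined once and then called in two places instead of having its body pasted twice. All names here are illustrative and do not come from any CS 111 project.

```python
def clamp(value, low, high):
    """Return VALUE limited to the range [LOW, HIGH]."""
    return max(low, min(value, high))


def normalize_score(score):
    """Keep a score between 0 and 100 by reusing clamp."""
    return clamp(score, 0, 100)


def normalize_volume(volume):
    """Keep a volume between 0 and 10 by reusing clamp."""
    return clamp(volume, 0, 10)
```

If the clamping rule ever changes, only `clamp` needs to be edited.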
Names and variables
Variable and function names should be self-descriptive:
Good:
goal, score, opp_score = 100, 0, 0
greeting = 'hello world'
is_even = lambda x: x % 2 == 0
Bad:
a, b, m = 100, 0, 0
thing = 'hello world'
stuff = lambda x: x % 2 == 0
Indices and mathematical symbols
Using one-letter names and abbreviations is okay for indices, mathematical symbols, or if it is obvious what the variables are referring to.
Good:
i = 0  # a counter for a loop
x, y = 0, 0  # x and y coordinates
p, q = 5, 17  # mathematical names in the context of the question
In general, i, j, and k are the most common indices used.
‘o’ and ‘l’
Do not use the letters ‘o’ and ‘l’ by themselves as names:
Bad:
o = O + 4  # letter 'O' or number 0?
l = l + 5  # letter 'l' or number 1?
Unnecessary variables
Don’t create unnecessary variables. For example,
Good:
return answer(argument)
Bad:
result = answer(argument)
return result
However, if it is unclear what your code is referring to, or if the expression is too long, you should create a variable:
Good:
divisible_49 = lambda x: x % 49 == 0
score = (total + 1) // 7
do_something(divisible_49, score)
Bad:
do_something(lambda x: x % 49 == 0, (total + 1) // 7)
Profanity
Don’t leave profanity in your code. Even if you’re really frustrated.
Bad:
eff_this_class = 666
Naming convention
Use lower_case_and_underscores for variables and functions:
Good:
total_score = 0
final_score = 1
def mean_strategy(score, opp):
    ...
Bad:
TotalScore = 0
finalScore = 1
def Mean_Strategy(score, opp):
    ...
On the other hand, use CamelCase for classes:
Good:
class ExampleClass:
    ...
Bad:
class example_class:
    ...
Spacing and Indentation
Whitespace style might seem superfluous, but using whitespace in certain places (and omitting it in others) will often make it easier to read code. In addition, since Python code depends on whitespace (e.g. indentation), it requires some extra attention.
Spaces vs. tabs
Use spaces, not tabs, for indentation. Our starter code always uses 4 spaces instead of tabs. If you mix spaces and tabs inconsistently, Python will raise a TabError, which is a kind of IndentationError.
Many text editors, including VS Code and Atom, offer a setting to automatically use spaces instead of tabs.
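The following is a minimal sketch of what happens when tabs and spaces are mixed, assuming Python 3. The source string `mixed_source` and the helper `indentation_error_name` are hypothetical names made up for this illustration; the snippet compiles the text instead of running it, so nothing is executed.

```python
# Source text whose second line is indented with a tab and whose third
# line is indented with eight spaces (a made-up snippet for illustration).
mixed_source = "if True:\n\tx = 1\n        y = 2\n"


def indentation_error_name(source):
    """Return the name of the indentation error raised when compiling
    SOURCE, or None if the source compiles cleanly."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as error:
        return type(error).__name__
    return None
```

Calling `indentation_error_name(mixed_source)` reports the error class, while the same code indented only with spaces compiles without complaint.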
Indent size
Use 4 spaces to denote an indent. Technically, Python allows you to use any number of spaces as long as you are consistent across an indentation level. The conventional style is to use 4 spaces.
Line Length
Keep lines under 80 characters long. Other conventions use 70 or 72 characters, but 80 is usually the upper limit. 80 characters is not a hard limit, but exercise good judgement! Long lines might be a sign that the logic is too much to fit on one line!
Double-spacing
Don’t double-space code. That is, do not insert a blank line in between each line of code. It increases the amount of scrolling needed and goes against the style of the rest of the code we provide.
One exception to this rule is that there should be space between two functions or classes.
Spaces with operators
Use spaces around + and -. Depending on how illegible expressions get, you can use your own judgement for *, /, and ** (as long as it’s easy to read at a glance, it’s fine).
Good:
x = a + b*c*(a**2) / c - 4
Bad:
x=a+b*c*(a**2)/c-4
Spacing lists
When using tuples, lists, or function arguments, leave one space after each comma:
Good:
tup = (x, x/2, x/3, x/4)
Bad:
tup = (x,x/2,x/3,x/4)
Line wrapping
If a line gets too long, use parentheses to continue onto the next line:
Good:
def func(a, b, c, d, e, f,
         g, h, i):
    # body
tup = (1, 2, 3, 4, 5,
       6, 7, 8)
names = ('alice',
         'bob',
         'eve')
Notice that the subsequent lines line up with the start of the sequence. It can also be good practice to add an indent to imply expression continuation; use whichever format expresses the line continuation most clearly.
Good:
total = (this_is(a, very, lengthy) + line + of_code
            + so_it - should(be, separated)
            + onto(multiple, lines))
Blank lines
Leave a blank line between the end of a function or class and the next line:
Good:
def example():
    return 'stuff'

x = example()  # notice the space above
Trailing whitespace
Don’t leave whitespace at the end of a line.
Control Structures
Boolean comparisons
Don’t compare a boolean variable to True or False:
Bad:
if pred == True:  # bad!
    ...
if pred == False:  # bad!
    ...
Instead, do this:
Good:
if pred:  # good!
    ...
if not pred:  # good!
    ...
Use the “implicit” False value when possible. Examples include empty containers like [], (), {}, and set().
Good:
if lst:  # if lst is not empty
    ...
if not tup:  # if tup is empty
    ...
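A short sketch of the implicit truth value in action; `describe` is a hypothetical helper written only for this illustration. Note that values such as 0 and None are also treated as false, so this idiom fits best when the variable is known to hold a container.

```python
def describe(container):
    """Return 'empty' or 'non-empty' using the container's truth value."""
    if container:
        return 'non-empty'
    return 'empty'
```

For example, `describe([])` and `describe(set())` both take the empty branch, while `describe((1,))` does not.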
Checking None
Use is and is not for None, not == and !=.
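One reason for this rule, sketched under the assumption of a class that overrides `__eq__`: the == operator can be redefined by an object, while `is` always checks identity. `AlwaysEqual` and `is_missing` are hypothetical names invented for this example.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


def is_missing(value):
    """Return True only when VALUE is the None object itself."""
    return value is None


thing = AlwaysEqual()
equal_to_none = (thing == None)        # True, because __eq__ is overridden
identical_to_none = is_missing(thing)  # False, thing is not None
```

The identity check gives the answer a reader would expect, regardless of how the object defines equality.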
Redundant if/else
Don’t do this:
Bad:
if pred:  # bad!
    return True
else:
    return False
Instead, do this:
Good:
return pred  # good!
Likewise:
Bad:
if num != 49:
    total += example(4, 5, True)
else:
    total += example(4, 5, False)
In the example above, the only thing that changes between the conditionals is the boolean at the end. Instead, do this:
Good:
total += example(4, 5, num != 49)
In addition, don’t include the same code in both the if and the else clause of a conditional:
Bad:
if pred:  # bad!
    print('stuff')
    x += 1
    return x
else:
    x += 1
    return x
Instead, pull the line(s) out of the conditional:
Good:
if pred:  # good!
    print('stuff')
x += 1
return x
while vs. if
Don’t use a while loop when you should use an if:
Bad:
while pred:
    x += 1
    return x
Instead, use an if:
Good:
if pred:
    x += 1
    return x
Parentheses
Don’t use parentheses with conditional statements:
Bad:
if (x == 4):
    ...
elif (x == 5):
    ...
while (x < 10):
    ...
Parentheses are not necessary in Python conditionals (they are in other languages though).
Comments
Recall that Python comments begin with the # sign. Keep in mind that the triple-quotes are technically strings, not comments. Comments can be helpful for explaining ambiguous code, but there are some guidelines for when to use them.
Docstrings
Put docstrings only at the top of functions. Docstrings are denoted by triple-quotes at the beginning of a function or class:
Good:
def average(fn, samples):
    """Calls a 0-argument function SAMPLES times, and takes
    the average of the outcome.
    """
You should not put docstrings in the middle of the function – only put them at the beginning.
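To illustrate why placement matters, here is a hedged sketch: Python only records a string as the docstring when it is the first statement in the function. The body of `average` below is an assumed implementation (the guide shows only its signature and docstring), and `misplaced` is a hypothetical counterexample.

```python
def average(fn, samples):
    """Call the 0-argument function FN SAMPLES times and return the
    average of the outcomes.
    """
    # Assumed implementation, written only for this illustration.
    total = 0
    for _ in range(samples):
        total += fn()
    return total / samples


def misplaced(x):
    x = x + 1
    """This string is in the middle, so it is not a docstring."""
    return x
```

Here `average.__doc__` holds the description, while `misplaced.__doc__` is None, so tools like `help` have nothing to show for it.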
Remove commented-out code
Remove commented-out code from the final version. You can comment lines out when you are debugging, but make sure your final submission is free of commented-out code. This makes it easier for readers to identify relevant portions of code.
Unnecessary comments
Don’t write unnecessary comments. For example, the following is bad:
Bad:
def example(x):
    x += 1  # increments x by 1
    return square(x)  # returns the square of x
Your actual code should be self-documenting – try to make it as obvious as possible what you are doing without resorting to comments. Only use comments if something is not obvious or needs to be explicitly emphasized.
Repetition
In general, don’t repeat yourself (DRY). It wastes space and can be computationally inefficient. It can also make the code less readable.
Do not repeat complex expressions:
Bad:
if a + b - 3 * h / 2 % 47 == 4:
    total += a + b - 3 * h / 2 % 47
    return total
Instead, store the expression in a variable:
Good:
turn_score = a + b - 3 * h / 2 % 47
if turn_score == 4:
    total += turn_score
    return total
Don’t repeat computationally-heavy function calls either:
Bad:
if takes_one_minute_to_run(x) != ():
    first = takes_one_minute_to_run(x)[0]
    second = takes_one_minute_to_run(x)[1]
    third = takes_one_minute_to_run(x)[2]
Instead, store the expression in a variable:
Good:
result = takes_one_minute_to_run(x)
if result != ():
    first = result[0]
    second = result[1]
    third = result[2]
Semicolons
Do not use semicolons. Python statements don’t need to end with semicolons.
Generator expressions
Generator expressions and comprehensions (list, dictionary, and set comprehensions) are okay for simple expressions. They are neat ways to concisely create lists and other collections. Simple ones are fine:
Good:
ex = [x*x for x in range(10)]
L = [pair[0] + pair[1]
     for pair in pairs
     if len(pair) == 2]
However, complex generator expressions are very hard to read, even illegible. As such, do not use generator expressions for complex expressions.
Bad:
L = [x + y + z for x in nums if x > 10 for y in nums2 for z in nums3 if y > z]
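One possible alternative, sketched here with explicit loops: the function name `combine` and the local `sums` are hypothetical, and the logic is intended to produce the same list as the complex expression above.

```python
def combine(nums, nums2, nums3):
    """Return x + y + z for each x in NUMS greater than 10, each y in
    NUMS2, and each z in NUMS3 that is smaller than y."""
    sums = []
    for x in nums:
        if x <= 10:
            continue  # skip values of x that fail the first filter
        for y in nums2:
            for z in nums3:
                if y > z:
                    sums.append(x + y + z)
    return sums
```

The loop version is longer, but each filter sits next to the loop it applies to, which makes the order of evaluation easier to follow.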
Use your best judgement.