CS 111 Code Quality Guidelines
The ease with which other people can read and understand a program (often called “readability” in software engineering) is perhaps the most important quality of a program. Readable programs are used and extended by others, sometimes for decades. For this reason, we often repeat in CS 111 that programs are written to be read by humans, and only incidentally to be interpreted by computers.
In CS 111, each project has a composition score that is graded on the style of your code. This document provides some guidelines. A program is composed well if it is concise, well-named, understandable, and easy to follow.
Excellent composition does not mean adhering strictly to prescribed style conventions. There are many ways to program well, just as there are many styles of effective communication.
The following are the code style guidelines that we will be using throughout the CS 111 course. The standards that are used to grade your code will include entries related to these code quality requirements.
Not all the quality standards will apply to every assignment. However, if your code contains any element listed below, the quality standards apply for that code.
Variables & Constants
Variables Style
Use a consistent style for your variable names. The Python convention (and the recommended convention for this class) is to use lower snake case for variable names. In lower snake case, single-word variable names are all lowercase, and variable names that contain more than one word are all lowercase with underscores ‘_’ between the words.
Examples:
- height
- price_per_dozen
- current_grade
Alternatively, you may choose to use lower camel case for your variable names. In lower camel case, single-word variable names are all lowercase, while in multi-word variable names the first letter of each word after the first is capitalized.
Examples:
- height
- pricePerDozen
- currentGrade
Choose one style and be consistent; don’t mix and match throughout your code.
Constants Style
Values that are intended to be constants in your program follow a different style convention. For constants, you must use upper snake case naming. In upper snake case, the names use all capital letters and the words in multi-word constant names are separated by underscores ‘_’.
Examples:
- PI
- PEOPLE_PER_LARGE
- EARTH_RADIUS
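As a small illustration, the sketch below uses a constant and variables together with the naming styles described above. The function pizzas_needed, the value 4, and the reading of PEOPLE_PER_LARGE as "people fed by one large pizza" are assumptions made only for this example.

```python
import math

# Constant: upper snake case, defined once near the top of the file.
# The value 4 is a made-up number used only for this illustration.
PEOPLE_PER_LARGE = 4


def pizzas_needed(num_guests):
    """Return how many large pizzas are needed to feed num_guests people."""
    # Variable: lower snake case. Round up so a partial pizza counts as one.
    pizza_count = math.ceil(num_guests / PEOPLE_PER_LARGE)
    return pizza_count
```

Because the constant has a name, changing the assumed serving size later means editing one line rather than hunting for every 4 in the program.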
Names
Variable names should describe what the value held in the variable is for. Variable names should be long enough to convey meaning but not so long as to be a burden to type. If you use abbreviations for words in your variable names, they should still be readable (e.g. ‘ave’ or ‘avg’ for average, ‘num’ for number).
Single-character variable names are fine for items normally described by those names (e.g. i, j, k for loop variables or generic integer values; x, y, z for coordinates or generic floating point values; f, g, h for generic functions).
The goal with variable names is to make the functionality of your code readable just by reading the names and commands.
Example:
a = compute_average(t, y, m)
# what are a, t, y, & m?
average_temp = compute_average(temperature_data, year, month)
# now we know what the data stored in the variables are
Magic Numbers
Magic numbers are numbers that appear repeatedly in your code with no context to explain their meaning. You should give these numbers names (as constants).
Example:
What does the value 100 represent in the following code?
if num_guests > 100:
    print("Too many people registered, we have", num_guests - 100, "extra guests.")
We should give that value a name to make the code clearer:
MAX_CAPACITY = 100
if num_guests > MAX_CAPACITY:
    print("Too many people registered, we have", num_guests - MAX_CAPACITY, "extra guests.")
It is a little extra typing but makes the code much easier to understand and reason about.
Comments
In most cases, comments are optional. If you’ve written good names for your variables and functions, it should be fairly easy to understand what your code is doing. However, if there are places that are unclear, or you want to document why you did things a certain way, feel free to add comments to document your code and make it clearer.
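For instance, a comment is most useful when it explains why the code does something rather than restating what it does. The following is a minimal sketch; the function average_score and the "no scores yet" rule are hypothetical and exist only to illustrate the idea.

```python
def average_score(scores):
    """Return the average of scores, or 0.0 if there are no scores."""
    # Why: dividing by len(scores) would raise ZeroDivisionError for an
    # empty list, and this (hypothetical) report shows 0.0 when a student
    # has no scores yet.
    if not scores:
        return 0.0
    return sum(scores) / len(scores)
```

The names already say what the code does; the comment records a decision that a reader could not recover from the code alone.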
Functions
Names
Like variable names, function names should be long enough to describe the function but not so long as to make typing them burdensome. Again, the goal is to make it possible to understand what the code does just by reading it.
Typically, function names should describe either what the function does (e.g. calculate_average()), the effect it has on the parameters (e.g. to_lower_case()), or the value returned (e.g. is_empty()).
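The three kinds of names mentioned above could look like this in practice. The function names come from the examples in the text; the bodies are assumed implementations written only for illustration.

```python
def calculate_average(values):
    """Named for what it does: compute the average of values."""
    return sum(values) / len(values)


def to_lower_case(text):
    """Named for its effect on the parameter: text converted to lowercase."""
    return text.lower()


def is_empty(items):
    """Named for the value returned: True if items has no elements."""
    return len(items) == 0
```

A name beginning with is_ signals a True/False result, so a line such as `if is_empty(cart):` reads almost like a sentence.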
Name Style
As with variables, the Python convention for function names (and the preferred style for this class) is lower snake case. Alternatively, you may use lower camel case for function names as well.
Choose one style and be consistent; don’t mix and match throughout your code.
Function Size
Large functions are harder to read, understand, and debug. A good rule of thumb is that functions should be less than 20 lines long. This is not a hard and fast rule but more of a guideline. If you find yourself writing very long functions, consider breaking the functionality up into a series of smaller functions that the original function then calls. Give them appropriate names so that someone reading the original function can understand what it is doing by reading the names of the functions it calls.
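As a sketch of this idea, the top-level function below reads as a short summary because each step lives in a small, named helper. All function names and the grading rule (drop the lowest score, then average) are hypothetical.

```python
def read_scores(lines):
    """Convert a list of text lines into a list of float scores."""
    # Blank lines are skipped rather than treated as errors.
    return [float(line) for line in lines if line.strip()]


def drop_lowest(scores):
    """Return a sorted copy of scores without its single lowest value."""
    return sorted(scores)[1:]


def compute_final_grade(lines):
    """Return the average score after dropping the lowest one.

    Assumes lines contains at least two scores.
    """
    scores = read_scores(lines)
    kept_scores = drop_lowest(scores)
    return sum(kept_scores) / len(kept_scores)
```

Someone reading compute_final_grade can follow the whole computation from the helper names alone, without opening the helpers.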
Don’t Repeat Yourself
If you find yourself writing the same (or similar) code repeatedly in your programs, you should extract that code into a function (possibly with parameters to cover the variations) that is then called at each point in the code where that functionality is needed.
This results in shorter code that is easier to read (you now have a good name describing what is happening where the function is called) and easier to maintain and debug (there is only one instance of the code instead of multiple copies).
In this class we expect you to identify duplicated (or very similar) code and replace the instances of that code with function calls to handle the required work.
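A minimal sketch of this kind of extraction is shown below. The function format_price and the sample items are made up for illustration; the commented-out lines show what the duplicated version might have looked like.

```python
def format_price(label, cents):
    """Return a line such as 'Eggs: $3.50' for a price given in cents."""
    return f"{label}: ${cents / 100:.2f}"


# Before (duplicated logic, one copy per item):
#   egg_line = "Eggs: $" + format(350 / 100, ".2f")
#   milk_line = "Milk: $" + format(499 / 100, ".2f")

# After: the variation (label and price) becomes the parameters.
egg_line = format_price("Eggs", 350)
milk_line = format_price("Milk", 499)
```

If the display format ever changes, only format_price needs to be edited.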
Single Responsibility
The single responsibility principle refers to what the function is supposed to do. Good functions are responsible for a single bit of functionality, not multiple different aspects. For example, you wouldn’t write a print_and_fax() function but rather two functions, print() and fax(), each of which would focus on one single action.
Adhering to this principle produces code that is easier to debug, extend, and maintain, because functions are smaller and more modular.
In this class we expect your functions to adhere to this design principle and be responsible for a single piece of functionality. If a function is doing multiple things, break it into multiple functions.
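For example, rather than one function that both computes and displays a total, the work can be split so each function has one job. This is only a sketch; compute_total and format_total are hypothetical names.

```python
def compute_total(prices):
    """Return the sum of all prices. Does no formatting or printing."""
    return sum(prices)


def format_total(total):
    """Return the total as a display string with two decimal places."""
    return f"Total: ${total:.2f}"


# The caller combines the two single-purpose functions.
print(format_total(compute_total([2.50, 4.25])))  # prints "Total: $6.75"
```

Keeping the calculation separate from the formatting means compute_total can be reused (and tested) anywhere a number is needed, not only where text is displayed.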
Docstrings
All functions should contain a docstring at the very beginning. The docstring should include at least:
- A brief description of what the function does
- A description of each function parameter and its domain (range of input values)
- A description of the return values of the function and their range
- (optional) A description of the algorithm used and/or why it was chosen
- (optional*) Doctests for the function
* Some assignments will require doctests to be written for some functions.
Example:
This example is fairly complex in order to demonstrate all the required items plus doctests. Yours only need to be this detailed if necessary.
def compute_average(daily_data, year=None, month=None):
    """
    This function computes the average of all the values in the daily_data.
    If a year and/or month are specified, it only averages over the parts of the
    dictionary that match the specified values, i.e. if only a year is specified,
    it averages all the data over all the months of that year. If only a month is
    specified, it averages that month's data across all years. If both are
    specified, it averages the values for the specified month of the specified
    year.

    The function returns a single floating point value. If an invalid month
    or year is given, the function raises an out_of_range exception.

    daily_data is a dictionary that has integer years as the keys and a
    dictionary as the value. The inner dictionary uses integer values for
    the months as the keys and a list of numbers as the value. These values
    could be integers or floating point numbers.

    year can be any integer with negative numbers representing years BCE
    month is an integer with a value from 1 to 12

    >>> my_data = {1999: {1: [10, 12], 3: [15, 18]}, 2000: {1: [11, 13, 12]}}
    >>> compute_average(my_data)
    13.0
    >>> compute_average(my_data, year=2000)
    12.0
    >>> compute_average(my_data, month=1)
    11.6
    >>> compute_average(my_data, 1999, 1)
    11.0
    """
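The example above shows only the docstring. A minimal sketch of a body that would satisfy those doctests might look like the following. The docstring is shortened here to save space, and raising ValueError for invalid input is an assumption, since Python has no built-in out_of_range exception.

```python
def compute_average(daily_data, year=None, month=None):
    """Average the values in daily_data, optionally filtered by year/month.

    >>> my_data = {1999: {1: [10, 12], 3: [15, 18]}, 2000: {1: [11, 13, 12]}}
    >>> compute_average(my_data)
    13.0
    >>> compute_average(my_data, month=1)
    11.6
    """
    # Assumption: invalid input is reported with ValueError.
    if month is not None and not 1 <= month <= 12:
        raise ValueError("month must be from 1 to 12")

    # Collect every value whose year and month match the filters.
    selected_values = []
    for data_year, months in daily_data.items():
        if year is not None and data_year != year:
            continue
        for data_month, values in months.items():
            if month is None or data_month == month:
                selected_values.extend(values)

    if not selected_values:
        raise ValueError("no data matches the given year and month")
    return sum(selected_values) / len(selected_values)


if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs the >>> examples in the docstring
```

Running the file directly executes the doctests; no output means every example in the docstring produced the value shown.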
Classes
Class Names
The Python convention for class names (and the style required for this class) is to use upper camel case. In upper camel case, the first letter of the class name is capitalized, and if it is a multi-word class name, the first letter of each word is capitalized. It is essentially identical to lower camel case except that the first letter of the name is capitalized.
Examples:
- Book
- BookShelf
- LinkValidator
Variable & Function Names
Instance and class variables and functions that are intended to be publicly accessible follow the same naming and style conventions as regular variables and functions.
For “private” class variables or functions, i.e. those not intended to be accessed outside the class, names should be prefixed with an underscore ‘_’. This is the Python convention and what we will use in this class.
Examples:
from datetime import datetime

class Book:
    def __init__(self):
        # public instance variables
        self.author = "Anonymous"
        self.title = "Unknown"
        self.publication_year = datetime.now().year
        # private instance variables
        self._is_public_domain = False
        self._record_last_modified = datetime.now()
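One way code outside the class can work with a private variable is through public methods, so the underscore-prefixed name is only touched inside the class. The sketch below is a simplified, hypothetical variant of the Book class; the constructor parameters and the two methods are assumptions added for illustration.

```python
class Book:
    def __init__(self, title, publication_year):
        # public instance variables
        self.title = title
        self.publication_year = publication_year
        # private instance variable: only used inside the class
        self._is_public_domain = False

    def mark_public_domain(self):
        """Record that this book has entered the public domain."""
        self._is_public_domain = True

    def is_public_domain(self):
        """Return True if this book is in the public domain."""
        return self._is_public_domain


old_book = Book("Unknown", 1900)
old_book.mark_public_domain()
```

Note that Python does not enforce the underscore; it is a convention that tells other programmers not to rely on that name from outside the class.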