Lab 11 - Testing

Due by 11:59pm on 2023-10-17.

Starter Files

Download lab11.zip. Inside the archive, you will find starter files for the questions in this lab.

Topics

Pytest

If you ever want to use your code for something, you need to make sure it works properly. To ensure it works, we use testing. There are several libraries and modules you can use to test your code efficiently, and we are going to use pytest.

Installing Pytest

To install pytest, run one of the following:

pip install pytest
python3 -m pip install pytest

To make sure you installed it correctly, run one of the following:

pytest -h
python3 -m pytest -h

You can uninstall a Python library by typing pip uninstall <library name> or python3 -m pip uninstall <library name> into the terminal.

BYU Pytest Utils

The autograder uses pytest with an extra library containing extra testing utility tools. To run the autograder's tests locally, install the BYU pytest utils library:

pip install byu_pytest_utils
python3 -m pip install byu_pytest_utils

Using Pytest

Let's say we are trying to test our functions in example.py:

def square(x):
    return x * x

def find_factors(n):
    factors = []
    for i in range(1, n):
        if n % i == 0:
            factors.append(i)

    return factors

We can run our checks using pytest by writing test functions that check if the output matches what we expect.

def test_square():
    assert square(4) == 16
    assert square(0) == 0
    assert square(1/2) == 0.25

def test_find_factors():
    assert find_factors(15) == [1, 3, 5, 15]
    assert find_factors(20) == [1, 2, 4, 5, 10, 20]

Notice that the name of each test function starts with test_. For pytest to recognize a function as a test, its name must start with test_. At this point, we can type one of the following into the terminal:

pytest example.py
python3 -m pytest example.py

and it will run all the functions in example.py that start with test_* without us ever needing to call them in our code.

Here's what the output looks like:

[Image: pytest_example, a screenshot of the pytest output]

We can see that find_factors does not work.

If we want to display more detail about the failed test case, we run pytest with the -v option.

pytest example.py -v
python3 -m pytest example.py -v

Additionally, if we only want to run one test function from a test file, we can follow the format pytest <test_file.py>::<test_function>:

pytest example.py::test_square
python3 -m pytest example.py::test_square

Note: If you're having trouble running pytest, try each of the terminal commands. It's possible that only one of the commands works on your operating system.

Organization with Test Files

When there is a large amount of code in one file, it is worth moving our tests into a different file for better organization. We will move all our tests into test_example.py and import the functions from example.py:

from example import *  # from example.py import everything

def test_square():
    assert square(4) == 16
    assert square(0) == 0
    assert square(1/2) == 0.25

def test_find_factors():
    assert find_factors(15) == [1, 3, 5, 15]
    assert find_factors(20) == [1, 2, 4, 5, 10, 20]

Additionally, running pytest without specifying a file will cause pytest to automatically run all files in the format test_*.py or *_test.py in the current directory and subdirectories.

Running the command

pytest
python3 -m pytest

will make pytest automatically run test_example.py in this case.

Features of Pytest

Approx

When dealing with floating point numbers (i.e. decimals), computers have a hard time storing particular numbers within memory. For example,

>>> 0.1 + 0.2 == 0.3
False

To compensate for this limitation, pytest has an approx function.

>>> import pytest
>>> 0.1 + 0.2 == pytest.approx(0.3)
True

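By default, approx uses a relative tolerance of 1e-6, meaning the two values must agree to within one millionth of the expected value. Provide a second argument to change the relative tolerance. For example, pytest.approx(2, 0.1) accepts any value within 10% of 2, that is, within 0.2 of 2.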

>>> import pytest
>>> 1.5 + 0.4 == pytest.approx(2)
False
>>> 1.5 + 0.4 == pytest.approx(2, 0.1)
True
>>> 1.5 + 0.6 == pytest.approx(2, 0.1)
True
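As a minimal sketch of how approx fits into a test function, the example below passes the tolerance by keyword. The test name test_float_sum is hypothetical and not part of the lab; rel and abs are approx's keyword arguments for a relative and an absolute tolerance.

```python
import pytest

def test_float_sum():
    # Default tolerance: passes despite floating-point rounding error.
    assert 0.1 + 0.2 == pytest.approx(0.3)
    # rel=0.1 accepts values within 10% of the expected value (2 +/- 0.2).
    assert 1.5 + 0.4 == pytest.approx(2, rel=0.1)
    # abs=0.05 accepts only values within 0.05 of 2, so 1.9 is rejected.
    assert 1.5 + 0.4 != pytest.approx(2, abs=0.05)
```

Passing the tolerance by keyword makes it clear to a reader whether the tolerance is relative or absolute.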

Raises

Sometimes we design our code to raise errors. To test that our code does that, we can use pytest's raises function.

from math import sqrt
import pytest

def square_root(x):
    if x < 0:
        raise ValueError("Negative numbers not allowed")
    return sqrt(x)

def test_square_root_raises_exception():
    with pytest.raises(ValueError):
        square_root(-4)
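If it is also useful to check the error message, raises offers two options, sketched below under the assumption that the message text matters to you. The test names are hypothetical. The match argument performs a regular-expression search on the message, and the object bound by "as" exposes the raised exception through its value attribute.

```python
from math import sqrt
import pytest

def square_root(x):
    if x < 0:
        raise ValueError("Negative numbers not allowed")
    return sqrt(x)

def test_square_root_error_message():
    # match= is searched for in the exception message as a regular expression.
    with pytest.raises(ValueError, match="Negative numbers"):
        square_root(-4)

def test_square_root_exception_object():
    # exc_info.value is the ValueError instance that was raised.
    with pytest.raises(ValueError) as exc_info:
        square_root(-1)
    assert "not allowed" in str(exc_info.value)
```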

Required Questions

Write your code in lab11.py and your tests in test_lab11.py.

Q1: Product and Summation

Write the tests for the product and summation functions before writing any code for the functions themselves.

Write tests for a function called product that takes in an integer parameter n. product returns the result of 1 · 2 · 3 · ... · n; however, if n is less than one or not an integer, it raises a ValueError.

Additionally, write tests for a similar function called summation that takes in an integer parameter n. summation returns the result of 1 + 2 + ... + n; however, if n is less than zero or not an integer, it raises a ValueError.

To check if a number is an integer, use the isinstance() function. For example,

>>> value_in_question = 5
>>> isinstance(value_in_question, float)
False
>>> isinstance(value_in_question, int)
True
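A short sketch of how isinstance behaves on a few kinds of values follows. One detail worth knowing is that bool is a subclass of int in Python, so isinstance(True, int) is True. The lab does not say whether booleans should be rejected, so treating them specially would be your own assumption.

```python
values = [5, 5.0, "5", True]

# Check each value against int; only 5 and True count as integers.
results = [isinstance(v, int) for v in values]
print(results)  # [True, False, False, True]
```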

When writing the tests, make sure to consider all cases. For example, product should do the following:

  • If n is less than one or not an integer, raise a ValueError
  • If n is greater than or equal to one, compute 1 · 2 · ... · n

Write tests that check if your code follows these rules by thinking of what inputs would cause each case.

Make sure to use the raises function that comes with pytest.
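The pattern of writing one test per case can be sketched with a different function so the lab's own answers are left to you. Everything below is hypothetical: count_up is an illustrative function, not part of the lab, that returns [1, 2, ..., n] and follows the same validation rule as product.

```python
import pytest

def count_up(n):
    """Hypothetical example function: return [1, 2, ..., n]."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be an integer greater than or equal to 1")
    return list(range(1, n + 1))

def test_count_up_valid():
    # Valid case, including the boundary value n == 1.
    assert count_up(1) == [1]
    assert count_up(4) == [1, 2, 3, 4]

def test_count_up_invalid():
    # Invalid cases: below the lower bound, and non-integer inputs.
    for bad in [0, -3, 2.5, "3"]:
        with pytest.raises(ValueError):
            count_up(bad)
```

Choosing inputs right at the boundary (here 0 and 1) tends to catch off-by-one mistakes that inputs far from the boundary would miss.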

After writing the tests, implement both functions. When you are done, run one of the following pairs of commands in your terminal:

pytest test_lab11.py::test_summation
pytest test_lab11.py::test_product
python3 -m pytest test_lab11.py::test_summation
python3 -m pytest test_lab11.py::test_product

If you get an error, it is either due to poorly written tests or a poorly written function. If you are confident that your tests are correct, find the bug in the respective function.

Q2: Refactoring Product and Summation

You may have noticed that product and summation are very similar to each other in that they both raise a ValueError if n is less than some number or if n is not an integer. Additionally, both functions take the total of a function (add or multiply) applied on some range of values. Because of this, we can refactor our code so the functions have the same behavior but with a cleaner design.

To refactor our code, create three new functions:

  1. product_short(n) - same behavior as product, but with a cleaner design
  2. summation_short(n) - same behavior as summation, but with a cleaner design
  3. accumulate(merger, initial, n)

accumulate will contain the logic of applying some function merger to initial and to each value in the range from one to n. It will then return the total after merger has been applied to each value. (merger will be either the add or the mul function.) Additionally, if n is less than initial or not an integer, accumulate raises a ValueError. For example,

>>> from operator import add, mul
>>> accumulate(add, 0, 3) # 0 + 1 + 2 + 3
6
>>> accumulate(add, 2, 3) # 2 + 1 + 2 + 3
8
>>> accumulate(mul, 2, 4) # 2 * 1 * 2 * 3 * 4
48
>>> accumulate(mul, 5, 0) # Raises a ValueError

Write tests for accumulate and then implement accumulate. (Feel free to use the examples given above in addition to the tests you write yourself.)

pytest test_lab11.py::test_accumulate
python3 -m pytest test_lab11.py::test_accumulate

Hint: Using the second example given above, add(2, 1) gives 3, then add(3, 2) gives 5, then add(5, 3) gives 8.
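Tracing the second example step by step shows the running total after each call to add. The sketch below is only an illustration of that folding pattern, not necessarily the intended implementation of accumulate, and it leaves out the ValueError check. The standard library's functools.reduce applies the same pattern and is shown for comparison.

```python
from functools import reduce
from operator import add, mul

# Trace accumulate(add, 2, 3) by hand: start at 2, then merge in 1, 2, 3.
total = 2
steps = []
for value in range(1, 4):
    total = add(total, value)
    steps.append(total)
print(steps)  # [3, 5, 8]

# functools.reduce folds a function over a sequence from an initial value.
sum_result = reduce(add, range(1, 4), 2)      # 2 + 1 + 2 + 3
product_result = reduce(mul, range(1, 5), 2)  # 2 * 1 * 2 * 3 * 4
print(sum_result, product_result)  # 8 48
```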

After implementing accumulate, reuse the tests from test_product and test_summation in test_product_short and test_summation_short to ensure that the new versions of the functions behave exactly the same as the originals. After that, implement product_short and summation_short by calling accumulate with the right arguments. product_short and summation_short should each contain one line in their function bodies.

pytest test_lab11.py::test_summation_short
pytest test_lab11.py::test_product_short
python3 -m pytest test_lab11.py::test_summation_short
python3 -m pytest test_lab11.py::test_product_short

Q3: Statistics

Your younger sibling (or cousin) was covering statistics in math class today and learned about the mean, median, mode, and standard deviation of a dataset. After working on two problems where they had to calculate each statistic by hand, they had had enough. They chose to write a program with functions that would do their homework for them; however, it does not work 😞. Your sibling has already spent more time trying to debug their program than it would have taken to complete their homework, and they are too tired to keep debugging. Now, they need your help to figure out what is wrong.

Write tests for each function they wrote -- square, sqrt, mean, etc. If the functions fail the tests, try to find the error in their code and fix it.

When fixing errors, do not delete an entire line or rewrite a function. The errors are small and should require you to add, delete, or replace a few things.

Some of their functions may work while others do not. Some functions may rely on other broken functions. To find what the expected outputs should be, rather than calculating them by hand, it is worth searching for a calculator on the web that will do it for you. Down below is a quick review of the mean, median, mode, and standard deviation of a dataset that your sibling (or cousin) used as reference.

Mean

To calculate the mean, find the sum of the dataset and divide it by the size/length of the dataset. For example, if the dataset was [1, 1, 1, 3, 4], the sum would be 10 and the size would be 5, so the mean would be 10/5 or 2.

Median

The median is the middle value of a sorted dataset. For example, if the dataset was [1, 2, 3, 4, 5], the median would be 3. If there is no middle value in the dataset because there is an even number of elements, the median is the mean/average of the two values closest to the middle. For example, if the dataset was [1, 2, 3, 4, 5, 6], the two values closest to the middle are 3 and 4. Taking the mean/average of those numbers gives 3.5, which is the median.

Mode

The mode is the most common element in a dataset. For example, if the dataset was [1, 2, 1, 1], the mode of the dataset would be 1 because it appears the most times. If two elements appear the same number of times, the mode will be (for this lab) the element that appeared the most times first. For example, if the dataset was [1, 1, 2, 2], the mode would be 1.
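One way to get expected values for your tests without computing them by hand is Python's built-in statistics module, sketched below on the datasets used in this review. This assumes Python 3.8 or newer, where statistics.mode returns the first mode encountered when there is a tie. That agrees with the [1, 1, 2, 2] example here, but whether it matches the lab's tie-breaking rule for every dataset is an assumption you should verify.

```python
import statistics

mean_value = statistics.mean([1, 1, 1, 3, 4])         # 10 / 5
median_odd = statistics.median([1, 2, 3, 4, 5])       # middle value
median_even = statistics.median([1, 2, 3, 4, 5, 6])   # average of 3 and 4
mode_value = statistics.mode([1, 2, 1, 1])            # most common element
mode_tie = statistics.mode([1, 1, 2, 2])              # tie: first mode encountered (3.8+)

print(mean_value, median_odd, median_even, mode_value, mode_tie)
```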

Standard Deviation

The standard deviation represents the amount of variation of all the values in a dataset. To calculate it, we use the following formula:

$$\sigma = \sqrt{ \frac{\sum (x_i - \mu)^2 }{n} }$$

where

$\sigma$ = standard deviation

$x_i$ = individual data value

$\mu$ = mean

$n$ = dataset's size

We can read this formula as:

  1. For each data value in the dataset
    1. Find the data value minus the mean. Square that result, and add it to a sum.
  2. Divide the sum by the size of the dataset.
  3. Take the square root of the result from step 2
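The three steps above can be sketched directly in code. The function name population_std_dev is hypothetical and is not the name used in the starter files. Because the formula divides by n, it corresponds to the population standard deviation, which the standard library provides as statistics.pstdev, so the two results can be compared.

```python
import math
import statistics

def population_std_dev(data):
    """Hypothetical sketch of the formula: sqrt(sum((x - mean)^2) / n)."""
    mean = sum(data) / len(data)
    # Step 1: sum the squared differences from the mean.
    squared_diffs = 0
    for x in data:
        squared_diffs += (x - mean) ** 2
    # Step 2: divide the sum by the size of the dataset.
    variance = squared_diffs / len(data)
    # Step 3: take the square root.
    return math.sqrt(variance)

data = [1, 1, 1, 3, 4]
result = population_std_dev(data)  # mean is 2, sum of squares is 8, variance is 1.6
assert math.isclose(result, statistics.pstdev(data))
```

Note that statistics.stdev divides by n - 1 instead of n, so it gives a different answer from the formula above.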

Hint: Whenever you are working with floating point numbers, it is good practice to use the approx() function. Additionally, remember that the optional second parameter tolerance will be helpful.

Submit

If you attend the lab, you don't have to submit anything.

If you don't attend the lab, you will have to submit working code. Submit the lab11.py and test_lab11.py files on Canvas to Gradescope in the window on the assignment page.

Grading on Gradescope

If you submit your lab to Gradescope, you will be graded on two things:

  • Submitting working functions
    • This will require you to write tests to identify the bugs in both the functions you write and the starter functions you're given
    • This will be graded with regular tests
  • Submitting passing tests
    • You should just submit the tests you wrote as you looked for bugs in the functions
    • This will be graded by running your tests to make sure they pass

Normally, the starter files come with the tests that the autograder will run. But in this case, doing so would defeat the purpose of having you write tests in the first place! So, unlike other assignments, you won't be given any tests in the starter files.

Note: Gradescope has two naming conventions. As an example, test_invert will test the actual invert function you submit, and test_test_invert will test the test_invert test you submit.


Optional Questions

Q4: Invert and Change

Write the tests for the invert and change functions before writing any code for the functions themselves.

Write the tests for a function invert that takes in numbers x and limit as parameters. invert calculates 1/x, and if the quotient is less than the limit, the function returns 1/x; otherwise the function returns limit. However, if x is zero, the function raises a ZeroDivisionError.

Write the tests for a second function, change, that takes in numbers x, y and limit as parameters and returns abs(y - x) / x if it is less than the limit; otherwise the function returns the limit. If x is zero, the function raises a ZeroDivisionError.

Tests for Invert and Change

When writing the tests, make sure to consider all cases. For example, invert should do the following:

  • If 1/x is less than the limit, return 1/x
  • If 1/x is greater than the limit, return limit
  • If x is zero, raise a ZeroDivisionError

Write tests that check if your code follows these rules by thinking of what inputs would cause each case.

Now implement invert and change.

Check your work and run pytest in the terminal:

pytest

Q5: Refactor

Notice that invert and change have very similar logic in that you are dividing some numerator by x and if the result is greater than the limit then the function returns the limit. Because of this, we can refactor our code so it has the same behavior but with a cleaner design.

To do this we are going to add three new functions:

  • invert_short - same behavior as invert but designed differently
  • change_short - same behavior as change but designed differently
  • limited

limited will have three parameters numerator, denominator and limit. It will contain the logic of dividing a numerator by the denominator, and if the result is greater than the limit then the function returns the limit, and it returns the result otherwise. However, if the denominator is zero, it raises a ZeroDivisionError.

Now have invert_short and change_short call limited appropriately to maintain the same behavior as invert and change.

Note: invert_short and change_short should each have only one line in their bodies.

Tests for Refactor

Implement two more test functions, test_invert_short and test_change_short, that ensure that those two functions behave the same as invert and change.

Check your work and run pytest in the terminal:

pytest

© 2023 Brigham Young University, All Rights Reserved