Algorithm Design and Documentation

Jump To

Algorithms

Computers are very good at performing mathematical operations, comparing values, and storing vast amounts of data. Beyond that, however, a computer only performs exactly what it is told to do, even if that is not what a programmer has in mind. Because of this, it is extremely important for the programmer to understand exactly what the problem is, so that any ambiguity or misunderstanding does not translate into incorrect code.

To successfully solve a problem, it is necessary to establish a set of rules that will allow us to find the solution. In computer science and in mathematics, the term for this is an algorithm. A more precise definition of an algorithm would be something like this:

An algorithm is a well-ordered collection of unambiguous and effectively computable operations that when executed produces a result and halts in a finite amount of time.
— Schneider, M. and J. Gersting (1995), An Invitation to Computer Science, West Publishing Company, New York, NY, p. 9.

We use algorithms every day. Take, for example, the following algorithm for making scrambled eggs. Your method of making scrambled eggs may be different from the one below.

  1. Turn on the heat on the stove.
  2. Place a pan on the burner.
  3. Put a spoonful of butter in the pan.
  4. Crack an egg into a bowl.
  5. Whisk the egg until it is scrambled.
  6. Pour the scrambled egg into the heated pan.
  7. Stir the egg until it is cooked.
  8. Turn off the heat.
  9. Remove the pan from the burner.
  10. Transfer the egg to a plate.
  11. Enjoy breakfast.

Each step is clearly defined, and well-ordered. There is a result (tasty egg) produced in a finite amount of time (5 minutes). All recipes are analogous to algorithms, in this sense.

A slightly more complicated algorithm might be one for tying your shoes. Again, each statement must be unambiguous and precise. Most of us have probably learned how to tie our shoes so long ago that it is difficult to describe the process, since we are adept at doing it without much conscious effort; however, an algorithm might look something like the following.

  1. Take the left shoelace (henceforth called SL1) in your left hand, and the right shoelace (SL2) in your right hand.
  2. Cross SL1 over SL2, then let SL1 drop.
  3. Grab SL2 with your left hand, then release your right hand and use it to grab SL1, making an “X” with the two laces.
  4. Tuck the end of SL1 into the opening beneath the intersection of SL1 and SL2, then release it.
  5. Pick up SL1 in your right hand again, then pull both ends of SL1 and SL2 in opposite directions.
  6. Fold SL2 into a loop, then wrap SL1 around the loop and your thumb.
  7. Remove your thumb from the laces, then push SL1 through the resulting hole.
  8. Pull the two loops tight.

Of course, most mathematical processes can be formalized as algorithms. Take, for example, the following algorithm to calculate n factorial.

  1. Obtain n.
  2. Let f=n.
  3. Let x=n-1.
  4. While x > 1:
    1. Multiply f by x.
    2. Subtract 1 from x.
  5. Output f.

Compare this to the Python code below.

n = int(input("N: "))
f = n
x = n-1
while x > 1:
   f *= x
   x -= 1
print(f)

The algorithm looks nearly identical to the code itself, only the code is expressed using a more mathematical notation. Of course, it is possible to compute n factorial using a different algorithm, or to use a for loop in place of the while loop, similar to an earlier exercise. Nevertheless, our goal is achieved via a sequence of clear instructions. All computer programs, whether large or small, are performing algorithms as designed by the programmer. It is your responsibility to ensure that the algorithm is correct, if you are to guarantee that the program runs correctly.

Exercises

  1. Describe an algorithm for each task.
    1. Making a hamburger.
    2. Doing a cartwheel.
    3. Brushing your teeth.
    4. Making a paper airplane.
    5. Finding the average of n integers.

Pseudocode

While there are no official rules concerning pseudocode, here are some guidelines that I recommend:

  1. Write one statement per line. This ensures each task is clearly identifiable at a glance.
  2. Use a verb as a keyword, and capitalize it for emphasis. This makes it easy to spot the actions involved in the task.
  3. Use indentation to group related tasks, such as decisions or repetitive processes. This makes pseudocode more readable.
  4. Ensure that pseudocode is language-independent. An individual should be able to follow your pseudocode without requiring knowledge of a specific programming language.

Here is an example of pseudocode for making buttered toast.

PUT bread in toaster
PRESS lever down so toaster is on
WHILE bread is not toasted
   WAIT for bread to turn golden brown
POP toast up
PUT toast on plate
SPREAD butter onto toast

Computational algorithms are often described using pseudocode. In fact, if pseudocode is well-written, it is usually fairly easy to translate it into a computer program, assuming that the programmer is familiar with a language’s syntax and commands. Compare the following pseudocode with the Python program that follows it.

GENERATE random number between 1 and 10
READ guess
SET count = 1
WHILE guess ≠ random number
   READ guess
   INCREMENT count by 1
PRINT count
import random
num = random.randint(1,10)
guess = int(input("Guess: "))
count = 1
while guess != num:
   guess = int(input("Nope. Guess again: "))
   count += 1
print("Correct! You guessed the number in", count, "guesses!")

Pseudocode should be clear and precise, so that there is no misinterpretation. Take, for example, the following directions: “Go to the store to buy 2 loaves of bread, and if there are oranges, buy 6.” If there are oranges in the store, how many loaves of bread should you buy? (Answer: 6)

Exercises

  1. Write pseudocode for each task.
    1. Dividing D dollars among n people.
    2. Inflating a balloon.
    3. Making a bowl of oatmeal.
  2. Write pseudocode for each task.
    1. Given three integers, sort them in descending order.
    2. Given n integers, determine how many are odd.
    3. Ask for a two integers, the second of which must be greater than the first. Do not accept an invalid value for the second integer.
    4. Given two positive decimal numbers m and n, calculate the sum of all whole number between m and n inclusive.
    5. Given two positive decimal numbers m and n, calculate the product of all even whole number between m and n inclusive.

Flowcharts

Pseudocode is a way to organize the steps of an algorithm, using a human-readable syntax. Each step is clearly identified using a set of rules, such as the one used above. Some people prefer to organize the steps graphically, so that they can visualize program flow better. For example, a loop in pseudocode might be indented to show that the steps within the loop are related. The loop itself would be indicated using a verb like WHILE, FOR, REPEAT, or DO … UNTIL. Using a visual representation, however, a loop might be easier to identify, since there would be some symbol or construct that would make it stand out.

Flowcharts are graphical representations of algorithms. They use standard symbols to represent different actions, and connect each step in some logical sequence. The graphic to the left shows some of the more common symbols used. There are many more, including those for reading and writing files or accessing certain types of storage, but this tutorial only covers a small subset. We will introduce other symbols as necessary.

Each flowchart begins and ends with a terminator. This indicates where the algorithm starts, and where it finishes. Between the terminators, each step is connected by an arrow indicating which step follows another. If two paths must connect, a connector is used to indicate that the two paths join. This is for clarity — sometimes it is necessary to cross over an arrow to reach another step. In this case, the lack of connector indicates this cross-over.

Some other common symbols used are:

  • Parallelogram: indicates input or output, such as reading data from the user or displaying information to the screen.
  • Rectangle: indicates a process, such as assigning a value to a variable or performing some mathematical operation.
  • Diamond: indicates a decision, where the answer is typically “yes/no” or “true/false”. Each arrow leading out of a diamond should be labeled with the result of the decision.

To the right is a flowchart that illustrates how to add two values together. The algorithm is entirely sequential, with no loops or decisions in it. For the algorithm to work, the user must supply both numbers that are to be added. The flowchart uses parallelograms to indicate user input. After this, the algorithm processes the two numbers by adding them together. A rectangle indicates this processing. Then, the sum of the two numbers is displayed to the screen, as indicated by the final parallelogram.

Algorithms often include decision-making or repetition. A slightly more complicated example that uses both is the one below left, which adds all values until a negative value is entered.

Before the user enters any values, the sum of all entered values is zero. The user then enters a value. If this value is non-negative (zero or greater), then we must update the sum accordingly. After this, the algorithm loops back to read another value. Note the connector that indicates that the arrow leaving the summation process rejoins the path just before the next value is entered. This arrow literally makes a loop, making it very easy to identify the repetition involved in the algorithm. It is very important that it rejoins the path after the sum is initialized to zero. Can you explain why?

At some point, the user will enter a negative value. Once this occurs, the algorithm displays the sum of the non-negative values and ends. The decision has exactly two outputs: either the number is greater than or equal to zero (non-negative), or it is below zero (negative). For any decision, there should always be two outputs to cover both possibilities.

Finally, consider the problem of determining the largest of three numbers. The flowchart below uses only three comparisons to do this.

Description.

Exercises

  1. Create flowcharts for each task.
    1. Crossing the street.
    2. Making mashed potatoes.
    3. Filling the gas tank of a car.
  2. Create flowcharts for each task.
    1. Determine an employee’s net pay, given the number of hours worked, the hourly wage, and the taxation rate.
    2. Determine if two given integers have a sum that is positive, negative or zero.
    3. Determine if the values of three given integers are consecutive.
    4. Calculate the value of n factorial.
    5. Determine the largest power of 2 that does not exceed a given integer.

Comments and In-line Documentation

Up to this point, most programs have been simple enough that a competent programmer could determine what they do with minor effort. As programs get larger and more complex, however, it might not be so obvious what a certain program, or a particular section of code, does. For this reason, programmers typically use comments within the body of a program’s code. These comments can describe complicated functions, identify variables, or state conditions that must be met in order for the code to run.

Recall that a comment in indicated by the # character. Anything after a # will not be interpreted when a program is executed. The interpreter simply ignores everything after the # until the end of the line. This means that comments can be placed either before a command, or after a command on the same line, as shown in the following two examples.

# print the numbers 1-10
for count in range(1,11):
   print(count)
for count in range(1,11): # print the numbers 1-10
   print(count)

For readability, I recommend placing comments just before the relevant commands, but the choice is up to you. Use comments liberally throughout your programs, whenever you feel that a block of code could use some explanation.

When commenting code, it is not necessary to provide a comment for every single command in a program. This would not only take much more time to enter each comment, but would clutter the code with (possibly) unnecessary descriptions of code that should be “obvious” to a typical programmer. Here, “obvious” is subjective — what you consider “obvious” may not be for another person. For example, the example below is much too cluttered.

# hypotenuse.py
# J. Garvin
# 2011-10-28
# Calculate the length of a hypotenuse, given a right-triangle's arms.
# Uses Pythagorean Theorem, c=sqrt(a**2+b**2)

# import the math module
import math
# read an integer from the user
a = int(input("Arm 1: "))
# read another integer from the user
b = int(input("Arm 2: "))
# calculate the hypotenuse using Pythagorean Theorem
c = math.sqrt(a**2+b**2)
# output the length of the hypotenuse
print("The hypotenuse has a length of", c, "units.")

A cleaner version of the code, which is still “obvious”, is below.

# hypotenuse.py
# J. Garvin
# 2011-10-28
# Calculate the length of a hypotenuse, given a right-triangle's arms.
# Uses Pythagorean Theorem, c=sqrt(a**2+b**2)
import math

# read values for both arms
a = int(input("Arm 1: "))
b = int(input("Arm 2: "))

# calculate and display the hypotenuse
c = math.sqrt(a**2+b**2)
print("The hypotenuse has a length of", c, "units.")

Related commands are grouped together, so the two input statements share a comment. Similarly, the calculation and display commands are grouped, since the program’s purpose is to perform both actions after another. It is cleaner, and easier to read, especially by using whitespace to separate the blocks of code.

Another issue to consider is that, for large projects, there is typically more than one programmer involved. A large and complex piece of software, such as a video game or a spreadsheet, may have hundreds of programmers focusing on certain portions of the code. Since these programmers often need to write code that interacts with that of others, they must be able to understand code that was not written by themselves. Comments can make this task easier, ensuring that the bulk of a programmer’s time is not spent trying to figure out what a certain function does, or what a certain variable represents.

Consider the following program below. Try to figure out what it does before scrolling further down this page.

x = int(input("Integer: "))
y = 1
for z in range(2,x+1):
   if x % z == 0:
      y += z
print(y)

Did you determine what this program does? If so, how long did it take for you to figure it out? Compare the code above to the commented code below.

# sumoffactors.py
# J. Garvin
# 2011-10-28
# For a given integer >= 2, find the sum of all of its factors.
# (e.g. 6 has factors 1, 2, 3 and 6, so 1+2+3+6=12)

# User enters an integer.
x = int(input("Integer: "))

# Loop through all factors from 2 to the integer, checking if they
# divide cleanly into the integer. Since all integers are divisible
# by 1, start with sum initialized to 1.
y = 1
for z in range(2,x+1):

   # If a factor is found, add it to the running sum
   if x % z == 0:
      y += z

# Output the sum of the factors.
print(y)

Much clearer, isn’t it? Even if you did not understand all of the Python commands (such as the % or += operators), the comments are probably sufficient to explain what the code is doing. To further illustrate this value of comments, the code below also calculates the sum of an integer’s factors, albeit much quicker. Can you determine how the program works, without any comments to explain the logic behind it?

import math
x = int(input("Integer: "))
y = x + 1
w = int(math.ceil(math.sqrt(x)))
for z in range(2, w):
   if x % z == 0:
      y += (z + x // z)
if w * w == x:
    y += w
print(y)

In this case, comments would definitely be nice! Here’s an explanation.

# sumoffactors.py
# J. Garvin
# 2011-10-28
# For a given integer >= 2, find the sum of all of its factors.
# (e.g. 6 has factors 1, 2, 3 and 6, so 1+2+3+6=12)
import math

# User enters an integer.
x = int(input("Integer: "))

# The integer will have 1 and itself as factors, so initialize
# the sum with those values.
y = x + 1

# Loop through all factors from 2 to the square root of the integer,
# checking if they divide cleanly into the integer. If a factor less
# than the square root is found, add both that factor and the other
# factor that multiply together to the integer.
w = int(math.ceil(math.sqrt(x)))
for z in range(2, w):
   if x % z == 0:
      y += (z + x // z)

# Check if the square root of the integer is a repeated factor
# (e.g. 10 * 10 = 100). If the square root is a factor, add it
# to the sum only once.
if w * w == x:
    y += w

# Output the sum of the factors.
print(y)

On my machine, the first version of the code takes 4 minutes and 45 seconds to find the sum of the factors of 1,000,000,000. The second version of the code, however, runs in a mere 0.15 seconds — that’s nearly 1,900 times faster! The second program has a more efficient algorithm, and terminates sooner. Since larger numbers take computational power, the restricted sample space makes the second program perform magnitudes faster than the first.

Of course, there are many programmers who feel that comments should be used sparingly, or even not at all. One common argument is that a good programmer should write code in such a way that it is “obvious”, and so all comments are redundant. This is fine if you are familiar with every command that has been used in a program, but it does not address the issue of any underlying algorithms. Why did the programmer choose to loop from x to y+5, then square the sum of the values? Here, comments to describe how the algorithm works would be beneficial. Another criticism is that if the code is changed, then the programmer must ensure that all comments are still relevant to the new code. Since programs are often modified to reflect the changing needs of a client, this may occur fairly often. In my opinion, however, the benefits of using comments and in-line documentation in your code — clarification of code, and explanation of algorithms — outweigh the negative aspects.

Exercises

  1. Run the programs below, then write program headers and comments for each program.
    1. L = input("Letter: ")
      N = int(input("Integer: "))
      for x in range(1,N):
          print(L*x)
      print(L*N)
      for y in range(N,1,-1):
          print(L*y)
      
    2. n = -1
      while n < 0:
         n = int(input("Positive integer: "))
      total = n
      for x in range(1, n):
         print(str(x) + " + ", end="")
         total += x
      print(str(n) + " =", total)
      
    3. import random
      p1 = p2 = 0
      for x in range(10):
          n1 = random.randint(1,10)
          n2 = random.randint(1,10)
          print(n1, n2)
          if n1 > n2:
              p1 += 1
          elif n2 > n1:
              p2 += 1
      if p1 > p2:
          print("p1")
      else:
          print("p2")
      
  2. Write well-documented programs to solve each task. Include pseudocode that explains how your algorithms work.
    1. Compute a student's average grade, given the marks from any number of assessments (tests, quizzes, assignments), each out of a certain number of marks. Express the student's mark using a letter grade, with the scale A (100-80), B (70-79.9), C (60-69.9), D (50-50.9) and F (below 50).
    2. The Fibonacci sequence of integers is defined as follows.
      • The first value in the sequence is 1
      • The second value in the sequence is 1
      • All subsequent values are equal to the sum of the previous two values.

      The first few values of the Fibonacci sequence are 1, 1, 2, 3, 5, 8, 13, and so forth. Write a program that will display the first n numbers in the Fibonacci sequence.

    3. Simulate the roll of two dice, then display the result graphically. For example, a roll of 5 and 2 should look something like this.
      +-------+ +-------+
      | o   o | | o     |
      |   o   | |       |
      | o   o | |     o |
      +-------+ +-------+
      

      Try to use the minimum number of if statements in your program. Hint: Remember that if you do not want to end a print statement with a newline, use print("TEXT HERE", end="").

5 Responses to Algorithm Design and Documentation

  1. BE says:

    For the ‘sum of factors’ example, the line ‘ith sum initialized to 1.’ should have a # in front of it.

  2. h says:

    like the blog :)

  3. JS says:

    Thx for the info, and the web-site honestly looks outstanding. Just what wordpress theme are you employing?