Testing, Documentation, and Design

Testing

Program testing can be used to show the presence of bugs, never to show their absence!

\(-\) Edsger Dijkstra

Unit Testing vs Integration Testing

Unit Testing usually refers to testing a single function
Unit tests should ideally be run after every "compile" and can also be run after every push to a git repository using continuous integration
Thus, unit tests should execute quickly and test only a single isolated item.
Integration tests are tests that require multiple components to work together to function
- These tests can take longer to run, are typically run less often then unit tests, and encompass a broader range of functionality

Why Automating Testing

Provides some level of assurance that your code is correct
When your code changes, it helps you avoid breaking existing functionality
Provides built in working examples of how to use your code
Designing your code to be tested leads to better code
Why Test?

Test Strategies

Write tests whenever you encounter a bug
- helps you reproduce the bug
- ensures that you won't encounter the same exact problem again
Test Driven Development: write tests prior to writing code
In robotics, testing is generally significantly harder than in most other disciplines
- Testing everything often requires physical hardware to move or sense in the real world
- Try to isolate as much of the "software" part of the robot from the hardware to facilitate testing

Testing in ROS and Python

Testing ROS python code involves several layers
At its core, ROS uses the built-in python unit testing framework, unittest
To create unit tests
- import the code you wish to test
- import unitttest
- Write a class that serves as a "Test Case"
  - the class inherits from unittest.TestCase
  - each method in the class that starts with the word "test" defines a unit test
  - Each unit test calls the code it wants to test and asserts the result using special unittest assertion functions (e.g., self.assertTrue, self.assertEqual)
In ROS, you use unittest as normal, but apply some special ROS wrappers to put the output in a format ROS understands (see unittest).
- Tests can be run either as plane python scripts or as ROS Nodes

Test as plain python Code

Place your test in a test/ directory

Your main code should be

import unittest
import <module_to_test>
class MyTestCase(unittest.TestCase):
    def test_something(self):
        self.assertEquals(<expression1>, <expression2>)

if __name__ == "__main__":
    import rosunit
    rosunit.unitrun(PackageName, 'test_class_name', TestClass)

Make sure the test is not executable

Add the test (if CATKIN_ENABLE_TESTING is true) in the CMakeLists.txt

if(CATKIN_ENABLE_TESTING)
   catkin_add_nosetests(directory_with_tests) 
endif()

You must and <test_depend> on rosunit in your package.xml
Your test will run with catkin_make run_tests
- You should see information about the tests that were executing when using catkin_make run_tests
- You can see more information about the tests with catkin_test_results
- The actual test results are output to build/test_results/

Test a ROS node

Testing with ROS nodes is done with rostest
rostest reads launchfiles, enabling you to create tests involving multiple nodes
Launchfiles used exclusively for testing typically end in .test and use the tag <test> rather than <node> to launch the nodes.
- The <test> tag uses a test-name attribute instead of a name attribute to find the test node
You can add <test> nodes to your existing launchfiles and those nodes will only run when you are testing

In your CMakeLists.txt add

if(CATKIN_ENABLE_TESTING)
    find_package(rostest REQUIRED)
    add_rostest(test/mytest.test)
endif()

Then test/mytest.test will be run on a call to catkin_make run_tests. #+END_SRC

You can write a test node by calling rospy functions within unittests.

The main for your unittest should be

if __name__ == "__main__":
   import rostest
   rostest.rosrun(PackageName, 'test_class_name', TestClassName)

You must <test_depend> on rostest in your package.xml
You can also invoke .test files using rostest (like roslaunch)
Nodes that you are testing should be executable
based on the existing example nodes, I call rospy.init_node from the constructor of the test case.
Alert! Writing testing nodes can expose various race conditions in your code (which is a good thing) but these can be hard to diagnose. The order of nodes in a launchfile (including testing nodes) is not specified and cannot be relied on.
Until you call rospy.init_node, the services and subscribers you setup cannot be called: therefore it makes sense to do as much as possible after calling rospy.init_node

If you do create a service/subscriber after rospy.init_node has been called, it can be invoked immediately: for example:

srv = rospy.Service("myserve", Type, callback) # create the service
self.i = 2 # it is possible that callback is called prior to this line!

Documentation

Sphinx is a tool that converts docstrings in your python source code into well-formatted documentation
- To install: sudo apt install python3-sphinx
rosdoc_lite Is a ROS tool that integrates Sphinx with ROS and helps produce ROS documentation
It also helps setup Sphinx appropriately for a ROS project

Quick Setup

Create a doc directory in your package

In the root directory of your package, create a rosdoc.yaml file with contents:

- builder: sphinx
  sphinx_root_dir: doc # this is where documentation will be created

Copy conf.py and index.rst to the doc directory
Run sphinx-apidoc -o doc src to add your python modules to the documentation
- These files are provided for convenience. To make the most out of sphinx you will need to edit them and create additional documentation files
- Think of sphinx-apidoc much like catkin_create_pkg, it is a convenience function used to generate some files that will then become a part of your project.
Run rosdoc_lite .
View doc/html/index-msg.html in web browser.

Setup

These are more complete setup instructions that explain what I did to create the example conf.py file above.

See the ROS Sphinx wiki page for details.
Create the doc/ directory and the rosdoc.yaml file
Run sphinx-quickstart in the root of your package directory to create conf.py
- conf.py is the configuration file used by Sphinx
- See ROS Sphinx wiki for answers to the questions
- They also have code to "rosify" your configuration. Modify the resulting conf.py to follow their conventions
- When modifying conf.py see catkin_pkg to view how to customize some values based on what is in your package.xml
  - I also import string and use "".join(catkin_package.authors) to automatically fill in authors from the package.xml, for example
Add sphinx.ext.napoleon to the extensions list to allow sphinx to parse google and numpy style docstrings. Otherwise you have to annotate your functions using a hard-to-read format.

Git

Git is an important part of the documentation process
Think of git as documentation for your collaborators (including your future-self)
If you write good log messages, then you can go back and answer "Why was this done?"
Look at the commit messages in the Linux Kernel Source Code!
For more information about git, see Git Introduction

Design Guidelines

ROS adds methods for abstraction beyond those provided by a programming language. When thinking about the design of a ROS system, here are some questions that often arise and some guidelines about the trade-offs experienced in these decisions.

How should the code be divided across nodes?
- Code in a single node is simpler and can take advantage of the abstractions provided by the programming language
- Code that is split across multiple nodes can be re-used and mixed-and-matched with other ROS nodes more easily
- Code that is split across multiple nodes is inherently parallel (since nodes run in parallel).
- My (very loose) guideline is that if two nodes are
  1. Always are either both running or not running
  2. Subscribed to each other's topics
  3. Do not need to be running in parallel
  then you should keep them in a single node.
How should the code be divided across packages?
- A single package promotes simplicity
- Multiple packages enable re-use
- I start with a single package. If part of the package seems individually useful, I will spin it off into a separate package.
How should the code be divided across repositories?
- Single repository is simpler and makes it easier to keep related changes between the packages synchronized.
- Multiple repositories promote decoupling between the packages, potentially making one more generally useful.
- For me, if the packages will be released together and are used for the same project, I keep them in the same repository

There is also a list of ROS design patterns here.

Node Design Patterns

I often find myself writing two basic types of nodes in ROS which I call reactive and fixed.

A reactive node is entirely driven by subscribers: the goal is to perform a task every time a message is received, and this task is performed directly in the subscriber callback. If the task involves publishing a ROS message, then the node publishes a message from the callback and is effectively translating from one message type to another.

The other type of node, a fixed node is designed to output data at a fixed frequency. It may have subscribers and services that modify what is published from a main, fixed frequency timer or loop.

These patterns can be implemented with a python class, or using functions and global variables. The class-based method is arguably more "pythonic", has a clear and organized structure, but has a bit more syntactic overhead. The global-variable method is a bit simpler, but also it becomes harder to turn into an importable python library, which makes testing more difficult. Both methods are practically pretty similar, because within an object member variables are effectively global, and in this pattern a single object implements everything.

Reactive

Here is an example of a the reactive pattern, using a class.

#!/usr/bin/env python

import rospy
# Other imports go here

class Node:
    """ This class manages the node lifecycle 
        For bigger projects, I try to limit the scope of this node
        to code that must directly interact with ROS and use classes 
        in a python package to manage everything
    """
    def __init__(self):
        self.sub = rospy.Subscriber("topic", MyType, self.callback)
        # Wait for a service to become available for creating a client
        rospy.wait_for_service('helper_service')
        self.serv = rospy.ServiceProxy('helper_service', SomeType) 
        self.pub = rospy.Publisher(...) # if you need a publisher
        self.other_var = 2 # if you need to share data between states
        rospy.init_node('name')

    def callback(self, msg):
        """ This is the callback for the topic subscriber """
        # Do calculations here
        # Note that you can read and save state through the self variable
        # and it is shared with other callbacks in this class;
        # However, usually we just need to access the publisher here
        self.other_var = self.serv(...)
        self.pub(...) # publish the message

def main():
    """ The main() function makes the code easier to import in the REPL"""
    n = Node()
    # Spin is outside of the Node constructor
    # This way the Node() object is fully constructed when it is used
    rospy.spin()


if __name__ == "__main__":
    main()

The example above could be made to not use a class by

Removing class Node: and removing self and self.
Removing the main() function
Removing __init__ and placing its contents where main() is called, followed by rospy.spin()
Adding the global keyword as needed (if a function assigns to a variable it is automatically made local unless otherwise specified)

Fixed

For the fixed pattern the basic idea is that all calculation, and publishing occurs from the main timer callback. The subscribers and services that are offered modify variables that are then read by the main timer. There are two advantages to structuring code this way

You have more control over the frequency of publication: the node publishes at a fixed frequency regardless of input data
The structure means that the primary action of the node is happening in one place, instead of being scattered across multiple callbacks.

from enum import Enum, auto

class State(Enum):
    """ The state of the control loop 
        These are different modes that the controller can be in
    """
    RUNNING = auto()
    STOPPED = auto()

class Node:
    def __init__(self):
        # We subscribe to sensor data
        self.sub = rospy.Subscriber("sensor", MySensor, feedback)
        # We publish a control signal
        self.pub = rospy.Publisher("control", ControlType)

        # Definie some services that become events in the internal state machine
        # Stop the controller
        self.stop = rospy.Service("stop", EmptyType, stop_cback)
        self.start = rospy.Service("start", IntType, start_cback)
        self.state = State.STOPPED
        self.sensor = 0
        self.goal = 0
        rospy.init_node('name')

        # The main loop runs in a timer (timers must be created after rospy.init_node() is called)
        self.tmr = rospy.Timer(rospy.Duration(0.01), main_loop)

    def feedback(self, data):
        """ Subscribe to the sensor data and store it for use by the main loop
        In this pattern, subscribers mainly just store data for the main loop
        to use
        """
        # note: assignment of this source is atomic in python
        self.sensor = data

    def stop_cback(self, req):
        """ Response to the stop service """
        # Put the main loop into the Stopped state
        self.state = State.Stopped
        return EmptyResponse()

    def start_cback(self, req):
        """ Response to the start service """
        # Put the main loop into the RUNNING state
        self.state = State.Running
        self.goal = req.value
        return EmptyResponse()

    def main_loop(self, event):
        """ The main control loop timer callback

             Args:
                event: A rospy.Timer event
        """
        if self.state == State.STOPPED:
            self.pub(0) # publish no control
        elif self.state == State.RUNNING:
            u = compute_control(self.goal, self.sensor_value)
            self.pub(u)
        else:
            raise Exception("Always Handle the Else Case in a state machine")

State Machine

When implementing the main loop you may find yourself with a complex tangle of if statement, checking various state variable. The logic can become quite complex. Often this can be simplified by using a Finite State Machine (FSM).

An FSM is characterized by a finite number of states and transitions between states.
The current state determines what action should be taken at a given moment
Transitions can be triggered when certain events occur

Each time the main loop is called, the state is checked, and an action is taken. The main loop itself, or a callback may initiate a state transition, changing the behavior of a main loop. In this way, each complex logical condition is mapped to a single state, and all combinations are handled with an overall if statement.

Traffic Light Example

Here is a graphical depiction of a FSM for a traffic light controller at the intersection of Sheridan Rd. and Noyes St.

Figure 1: State Machine for a Basic Traffic Light Controller

The Sheridan and Noyes lights both have three possible values: Red, Yellow, Green. Transitions between lights are driven by an external event, namely time. The main loop (which controls the lights) happens at a fixed frequency, and each time the main loop runs, it is able to make one state transition.

We could write out complex logic to maintain a safe intersection: for example

If Sheridan.Red and Noyes.Green and Time > T1 then Noyes.Yellow
If Sheridan.Red and Noyes.Yellow and Time > T2 then Noyes.Red
If Sheridan.Red and Noyes.Red and Noyes.Yellow(T-1) and Time > T3 then Sheridan.Green
Etc…

Notice that even for a simple intersection the logic above gets complex, especially because when both lights are Red, the next transition depends on the previous state.

Instead, we can identify states of the system and associate them with behaviors:

SheridanGo: Sheridan shows Green, Noyes shows Red
NoyesGo: Sheridan shows Red, Noyes shows Green
SheridanSlow: Sheridan shows Yellow, Noyes shows Red
NoyesSlow: Sheridan shows Red, Noyes shows Yellow
SheridanStop: Sheridan shows Red, Noyes shows Red (when the last Yellow light was Sheridan)
NoyesStop: Sheridan Shows Red, Noyes shows Red (when the last Yellow light was Noyes)

And identify the state transitions

SheridanGO goes to SheridanSlow after T1 seconds
SheridanSlow goes to SheridanStop after T2 seconds
SheridanStop goes to NoyesGo after T3 seconds
NoyesGo goes to NoyesSlow after T4 seconds
NoyesSlow goes to NoyesStop after T5 seconds
NoyesStop goes to SheridanGo after T6 seconds

Example

See github for an example combining Sphinx documentation, unit testing, and integration testing.

Other Resources

Standard Test Nodes Some nodes that perform some standard tests Tutorials on ROS Quality