Introduction to Image Processing in ROS
1 Introduction
This page will describe some of the basic image processing concepts in ROS. The goal of this page is not to teach computer vision or image processing. Rather, the goal is to introduce the ROS packages and related non-ROS packages that are useful for image processing when using ROS.
2 Primary Concepts
The first step to working with any sort of camera in ROS is to find a driver that will work with your camera. If you are using a simple USB or built-in webcam, there are a variety of drivers to choose from, such as:
Note that in this instance "UVC" stands for USB video device class, the protocol that these cameras communicate over, and uvcvideo is the kernel driver responsible for communicating with them. It may take a bit of testing and tweaking of parameters to get a particular camera to work as you would like. Additionally, these packages have overlapping, but not identical, functionality. I typically use uvc_camera simply out of habit, but I have found instances where the other packages had features that I wanted to use.
If you have some sort of special machine vision camera, there may already be a ROS driver (e.g. mv_bluefox_driver, pointgrey_camera_driver, or pylon_camera). If there is no ROS driver available for your camera, then it is up to you to write one (this is generally not too difficult).
2.1 What does a camera driver do?
The camera driver's primary goal is to get data off of the camera, convert it into a sensor_msgs/Image message, and then publish it on some topic (often called image_raw). The drivers usually also offer a variety of parameters for configuring the functionality of the camera, defining the namespaces that should be used, and defining camera calibration information. If you are using a pre-made driver, it is up to you to determine how to properly set these parameters. If you are writing your own driver, it is up to you to decide the extent of the reconfiguration options that your driver will support.
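To make the "convert it into a sensor_msgs/Image message" step concrete, here is a minimal sketch in plain Python. It does not use ROS: `ImageMsg` is a stand-in dataclass that mirrors the main fields of sensor_msgs/Image, and `grab_fake_frame` is a hypothetical placeholder for whatever call reads a frame off your camera. A real driver would fill in the actual message type (including its header) and publish it.

```python
from dataclasses import dataclass


@dataclass
class ImageMsg:
    """Stand-in mirroring the main fields of sensor_msgs/Image (not the real class)."""
    height: int        # number of rows
    width: int         # number of columns
    encoding: str      # pixel layout, e.g. "rgb8" or "mono8"
    is_bigendian: int  # byte order of multi-byte pixel values
    step: int          # length of one row in bytes
    data: bytes        # raw pixel buffer, height * step bytes


# Bytes per pixel for the few encodings used in this sketch.
BYTES_PER_PIXEL = {"mono8": 1, "rgb8": 3, "bgr8": 3}


def pack_frame(buffer: bytes, width: int, height: int, encoding: str) -> ImageMsg:
    """Wrap a raw frame buffer in an Image-like message, checking its size."""
    step = width * BYTES_PER_PIXEL[encoding]
    if len(buffer) != step * height:
        raise ValueError("buffer size does not match width, height, and encoding")
    return ImageMsg(height, width, encoding, 0, step, buffer)


def grab_fake_frame(width: int, height: int) -> bytes:
    """Hypothetical stand-in for reading a frame: a flat gray RGB image."""
    return bytes([128]) * (width * height * 3)


msg = pack_frame(grab_fake_frame(4, 2), width=4, height=2, encoding="rgb8")
print(msg.step, len(msg.data))
```

The size check is the part worth keeping in a real driver: a mismatch between the declared width, height, and encoding and the actual buffer length is a common source of garbled images downstream.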
2.2 Image Pipeline
The image_pipeline is a meta-package that contains many of the key tools for working with cameras in the ROS world. This meta-package is "designed to process raw camera images into useful inputs to vision algorithms: rectified mono/color images, stereo disparity images, and stereo point clouds." Generically, the tools in this meta-package are designed to interact with a camera driver and metadata on the calibration parameters of a specific camera to automatically provide a variety of image streams that have been pre-processed for easy use in image processing algorithms. Here are the most important components:
- image_proc
- This implements a set of nodelets that listen to a raw camera stream (image_raw) and information on camera calibration (camera_info) and produce four output topics. The output topics are:
  - image_mono: Monochrome unrectified image
  - image_rect: Monochrome rectified image
  - image_color: Color unrectified image
  - image_rect_color: Color rectified image
  If you need to rectify or convert your image's colorspace, you are far better off running the image_proc node and having it do the work.
- camera_calibration
- This package uses OpenCV's camera calibration tools to calibrate monocular or stereo cameras using a checkerboard calibration target. Once calibration is complete, the calibration information is written to a YAML file that can then be used with your camera driver to provide a camera_info topic. There is a great tutorial on using this package.
- image_view
- This is a simple image viewer for viewing camera topics. You can also view camera streams directly in rviz. This is nice because markers and robot models are automatically superimposed onto the image streams in rviz (assuming that your /tf tree is defined appropriately).
- depth_image_proc
- This package contains a variety of nodelets for processing depth images. A depth image is the type of data that is produced by a Kinect. A depth image is basically just a grayscale image where the pixel values correspond to depth (instead of intensity). Combining a calibrated depth camera with a depth image, it is possible to build a point cloud. This is one of the primary features of this package.
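The step from a depth image to a point cloud can be sketched in a few lines of plain Python. This is not the depth_image_proc implementation, only the underlying pinhole back-projection. It assumes depth values in meters, treats non-positive depth as invalid, and uses made-up intrinsics (`fx`, `fy`, `cx`, `cy`); in practice these come from the camera's calibration.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (list of rows, meters) into (x, y, z) points.

    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):        # v: row index (pixel y)
        for u, z in enumerate(row):        # u: column index (pixel x)
            if z <= 0:
                continue
            x = (u - cx) * z / fx          # offset from principal point, scaled by depth
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth image with one invalid pixel; intrinsics are illustrative only.
depth = [[1.0, 2.0],
         [0.0, 1.0]]
cloud = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(cloud)
```

Each valid pixel becomes one 3D point in the camera frame, so the cloud has at most width × height points.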
2.3 Image Transport
There are several packages available for transporting image data in non-standard formats or through non-standard mechanisms. These are often very helpful.
- image_transport
- This package allows subscribing to and publishing images in low-bandwidth, compressed formats. Transporting images in a compressed format can save considerable bandwidth at the cost of a small amount of computation. This is especially useful when sending images over a network. As of right now, this package only supports C++, but even if you ultimately want to use Python for image processing, this package could be useful for writing a middle-man decompression node to save bandwidth.
- web_video_server
- This is a new version of mjpeg_server, and is very similar to jpeg_streamer. When configured properly, this package provides a video stream of a ROS image transport topic that can be accessed via HTTP. This could be useful whenever you are trying to view an image stream in a browser.
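To get a feel for why compressed transport matters, the following sketch computes the raw bandwidth of an uncompressed stream and compresses one synthetic frame. It assumes a 640x480 rgb8 stream at 30 frames per second. zlib is used only as a lossless stand-in available in the standard library; it is not what the transport packages use (they rely on image and video codecs), and a smooth synthetic gradient compresses far better than a real camera frame would.

```python
import zlib


def raw_bandwidth(width, height, bytes_per_pixel, fps):
    """Bytes per second needed to send uncompressed frames."""
    return width * height * bytes_per_pixel * fps


def synthetic_frame(width, height):
    """Horizontal gray gradient in rgb8 layout (a hypothetical test image)."""
    row = bytes(c for u in range(width) for c in (u % 256,) * 3)
    return row * height


frame = synthetic_frame(640, 480)
compressed = zlib.compress(frame)

print(raw_bandwidth(640, 480, 3, 30))   # 27648000 bytes per second, uncompressed
print(len(frame), len(compressed))      # one frame before and after compression
```

Roughly 27.6 MB/s for a single modest-resolution color camera is enough to saturate many wireless links, which is why compression is usually worth its computational cost.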
3 Pinhole Cameras
3.1 Camera Calibration Resources
- Expanded explanation of the parameters in a sensor_msgs/CameraInfo message.
- ROS Tutorial on calibrating a monocular camera
- Raw message definition of a sensor_msgs/CameraInfo message
- Caltech Camera Calibration Toolbox for MATLAB. This contains many useful descriptions of camera calibration in general.
- Documentation on the MathWorks website about camera calibration.
- Useful presentation on camera calibration from the University of Freiburg:
- The following are three blog posts that provide very detailed and clear analysis of the pinhole camera model:
4 Using OpenCV with ROS
4.1 vision_opencv
- cv_bridge
- This tool is used for converting ROS messages of type sensor_msgs/Image into OpenCV image formats (and back). You should always expect to use this package when combining ROS and OpenCV.
- image_geometry
- This package can utilize a sensor_msgs/CameraInfo message to instantiate a class containing a pinhole camera model. This class then has many image-related geometry computations available. For example, image_geometry.PinholeCameraModel.project3dToPixel takes in a tuple/list of \((x,y,z)\) coordinates of a point, expressed in the camera's frame, and it returns the corresponding pixel coordinates \((u,v)\). The inverse of this function is image_geometry.PinholeCameraModel.projectPixelTo3dRay, which returns the direction of the ray through a given pixel (a single pixel does not determine depth). This package helps to guarantee that you correctly interpret and use the information from a calibrated camera (which can be tricky). Using this package for geometry calculations is highly recommended.
- opencv_apps
- This package contains many image processing nodes that perform a single, common operation. Using these nodes, one can build up a sequence of image processing operations using only a single launch file. Example nodes include an edge detection node, a face detection node, multiple optical flow nodes, etc. This can be a great way to prototype new applications.
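The geometry behind the two image_geometry projection functions mentioned above can be sketched in plain Python. This is not the library's code: the snake_case function names are hypothetical, the intrinsics (`fx`, `fy`, `cx`, `cy`) are made-up values, and lens distortion is ignored, so it corresponds to working with an already rectified image.

```python
import math


def project_3d_to_pixel(point, fx, fy, cx, cy):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


def project_pixel_to_3d_ray(pixel, fx, fy, cx, cy):
    """Return the unit vector of the ray from the camera center through (u, v)."""
    u, v = pixel
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


# Illustrative intrinsics only; real values come from a calibrated camera.
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0

u, v = project_3d_to_pixel((0.2, -0.1, 2.0), fx, fy, cx, cy)
ray = project_pixel_to_3d_ray((u, v), fx, fy, cx, cy)

# Scaling the ray so that its z component equals the known depth recovers the point.
scale = 2.0 / ray[2]
recovered = tuple(scale * c for c in ray)
print((u, v), recovered)
```

The round trip only works because the depth (2.0 here) is supplied separately. The ray alone fixes a direction, not a distance, which is why the library returns a ray rather than a point.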
5 Miscellaneous Useful Image Processing Packages
5.1 Tag Tracking
There are a variety of tag tracking solutions available as ROS packages. They vary in terms of their robustness, ease of use, number of tags tracked, etc. A good introduction can be found in the ROS Tag Tracking mini project from this class in 2014.
- ar_sys
- ar_pose
- ar_track_alvar
- visp_auto_tracker
- apriltags (perhaps the ethzasl_apriltag version is better; its authors claim lower memory usage)