Image Processing with ROS
Overview
This lecture is about how to interface ROS with cameras and vision processing libraries.
Cameras On Linux
- Cameras on Linux are viewed as files (like any other device).
- Cameras on Linux are usually called `/dev/videoX`, where `X` is a number.
  - Currently, some cameras create two video devices; the lower-numbered one is the one to use.
- `v4l2-ctl` is a command-line tool for interfacing with cameras. See `man v4l2-ctl`.
  - `v4l2-ctl --list-formats-ext` prints useful information about the formats supported by the cameras connected to your computer.
    - You often need to specify these formats to the ROS camera node as parameters.
  - `v4l2-ctl --all` prints all information about your cameras.
  - `v4l2-ctl --list-devices` lists all the USB cameras. Some cameras have multiple `/dev/video*` devices, so it is useful to see what goes with what.
- When working with cameras, you should use udev rules to give your cameras the proper permissions and a persistent name (a sketch of such a rule follows this list).
- More information can be found here: V4l2
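A udev rule is a single line in a file under `/etc/udev/rules.d`. A minimal sketch (the VID/PID values and symlink name are placeholders; substitute the values that `lsusb` reports for your camera):

`SUBSYSTEM=="video4linux", ATTR{index}=="0", ATTRS{idVendor}=="046d", ATTRS{idProduct}=="0825", SYMLINK+="my_camera", MODE="0666"`

Saved as, for example, `/etc/udev/rules.d/99-camera.rules` and reloaded with `sudo udevadm control --reload && sudo udevadm trigger`, this gives you a stable `/dev/my_camera` symlink that any user can open. The `ATTR{index}=="0"` match is intended to pick only the first of the two video devices some cameras create; drop it if your camera exposes a single device.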
ROS Camera "Drivers"
- usb_cam The main and most up-to-date ROS camera driver. Try this one first and prefer it to the others.
- cv_camera A camera driver that uses OpenCV to open the camera. So if you can use your camera in OpenCV, this node should work.
- libuvc_camera (not released for ROS Noetic)
Depending on your camera, one may work better than the others: libuvc is for cameras that follow the UVC standard (most modern webcams), and usb_cam uses the v4l framework (Video4Linux).
- These camera nodes access your camera via the video device file (`/dev/videoX`).
- These camera "drivers" accept several private parameters for setting camera options and formats.
- Essentially, the node reads from the camera and publishes an image message
- Sometimes, you need to experiment with various options to get a usable webcam image.
- If the node prints lots of warnings, it usually means a parameter like the image format should be changed
- Other information that the node prints can be useful for debugging issues or tweaking parameters:
- If possible, try to fix any warnings that you may find
- The image display type in rviz can be used to see the video from a camera
- Other industrial cameras may have their own ROS node drivers, or you may need to write your own.
- `usb_cam` lets you access the camera via `/dev/videoX`, whereas `libuvc_camera` lets you access it via Vendor ID (VID), Product ID (PID), and serial number.
Using usb_cam
- The camera is one of the `/dev/videoX` devices.
- Run the `usb_cam` node. The basic command for most webcams is `rosrun usb_cam usb_cam_node _pixel_format:=yuyv`
- Run `rosrun image_view image_view image:=/usb_cam/image_raw` to view the image.
- You can also view the image in `rviz` or `rqt_image_view`.
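If the defaults do not work, usb_cam accepts more private parameters on the command line; the values below are examples (use a device, resolution, and format that `v4l2-ctl --list-formats-ext` reports for your camera):

`rosrun usb_cam usb_cam_node _video_device:=/dev/video0 _image_width:=640 _image_height:=480 _pixel_format:=yuyv _framerate:=30`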
Using cv_camera
- The camera is one of the `/dev/videoX` devices.
- Run the `cv_camera_node`. The defaults work for most webcams: `rosrun cv_camera cv_camera_node`
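If the wrong camera opens or the frame rate is off, cv_camera has private parameters for that as well. The parameter names below are the commonly documented ones (verify them against the cv_camera wiki), and the values are examples:

`rosrun cv_camera cv_camera_node _device_id:=1 _rate:=30.0`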
Using libuvc_camera
- Find the VID, PID, and serial number of your camera (requires some investigation):
  - `lsusb` prints the VID and PID of the cameras on your computer.
  - `v4l2-ctl --all` shows the cameras attached to the computer (you can use this to guess which USB device it is).
- Create udev rules for the camera:
  - Copy the udev rules to `/etc/udev/rules.d`
  - `sudo udevadm control --reload` (reloads the udev rules)
  - `sudo udevadm trigger` (re-runs the udev rules)
- Start with the launchfile provided on the wiki and edit it to match your camera's settings.
The Image Pipeline
image_pipeline contains several packages relating to image manipulation in ROS. The idea is to chain various calibration and image processing steps together to complete computer vision tasks. The pipeline also applies to stereo-vision cameras and 3D point clouds as well as monocular images.
- Note: if you get errors about the Python interpreter not being found or a missing `cv2` module, then you should install the perception pipeline from source in `~/customws`: https://github.com/ros-perception/image_pipeline
camera_calibration
- Finds the intrinsic parameters of a camera using a checkerboard pattern and OpenCV.
- The calibration can be stored and used to provide a `camera_info` topic so that the calibration can be used by other nodes.
  - The CameraInfo message definition has some useful information.
- See this tutorial
- More information about camera calibration: CameraInfo
- Some more information on calibration matrices for the pinhole camera model
- Matlab tutorial on camera calibration
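A typical invocation, following the camera_calibration tutorial (the checkerboard dimensions, square size, and topic names are examples; match them to your board and camera):

`rosrun camera_calibration cameracalibrator.py --size 8x6 --square 0.108 image:=/usb_cam/image_raw camera:=/usb_cam`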
image_proc
Camera lenses cause distortion. Rectification is a transformation to make the image "rectangular" by accounting
for camera lens distortion. This node handles rectification for you:
It subscribes to a raw camera image (`image_raw`) and camera calibration information (`camera_info`), and publishes:
- `image_mono`: a monochrome unrectified image
- `image_rect`: a monochrome rectified image
- `image_color`: a color unrectified image. The color images are demosaiced (that is, they account for the RGB pixel pattern of the camera).
- `image_rect_color`: a color rectified image
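image_proc is normally run inside the camera's namespace so the topic names line up automatically. For a camera publishing under `/usb_cam` (an example namespace; use whichever namespace your driver publishes `image_raw` and `camera_info` in):

`ROS_NAMESPACE=usb_cam rosrun image_proc image_proc`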
image_view
- Simple, stand-alone image viewer. Specify the image topic (e.g., `image:=/usb_cam/image_raw`) and it will display the image.
- There is also `rqt_image_view` (although that is part of rqt, not the image pipeline).
depth_image_proc
Processes depth images.
- In a depth image, each pixel corresponds to the depth (i.e., how far light has to travel before hitting something) in the scene.
- Used by RGB-D cameras such as the Intel RealSense.
- Used to make point clouds (basically 3-dimensional pixelized images) from calibrated depth cameras.
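As a minimal sketch of working with depth images in Python (the topic name and units here are assumptions; depth topics and encodings vary by camera driver, so check `rostopic list` and the message's encoding field):

```python
# Print the depth at the center pixel of an incoming depth image.
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def depth_callback(msg):
    # "passthrough" keeps the original encoding (e.g., 16UC1 in millimeters or 32FC1 in meters)
    depth = bridge.imgmsg_to_cv2(msg, desired_encoding="passthrough")
    center = depth[depth.shape[0] // 2, depth.shape[1] // 2]
    rospy.loginfo("Depth at image center: %s", center)

rospy.init_node("depth_probe")
rospy.Subscriber("/camera/depth/image_rect_raw", Image, depth_callback)  # topic name is an assumption
rospy.spin()
```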
Other Image Tools
- Image Transport determines how image data is sent across the system.
  - Rather than sending raw camera data, the image data can be compressed to use less bandwidth.
  - For Python usage, the `republish` node is the most important feature (see the example after this list).
- Web Video Server is used to create an HTTP stream of video, useful for interfacing your robot with a website.
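For example, to turn a compressed stream back into a raw image topic that a Python node can subscribe to (the topic names are examples):

`rosrun image_transport republish compressed in:=/usb_cam/image_raw raw out:=/usb_cam/image_repub`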
OpenCV
OpenCV is a major image processing library. Their website has many tutorials on how to process images.
Some packages that use/relate to OpenCV are:
- cv_bridge Converts between OpenCV data and ROS messages.
  - If you want to use OpenCV with ROS, you should use this package (see the sketch after this list).
- opencv_apps Each "app" is a node that performs a single image-processing function using OpenCV. Thus you can chain together many computer vision algorithms by connecting nodes in a launchfile.
- image_geometry Using a camera calibration, lets you convert between pixel coordinates and the other frames in your system.
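A minimal sketch of cv_bridge in a Python node (the topic name is an example):

```python
# Convert incoming ROS Image messages to OpenCV/numpy arrays and log their size.
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def callback(msg):
    cv_image = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")  # numpy array in BGR order
    rospy.loginfo("Received a %dx%d image", cv_image.shape[1], cv_image.shape[0])
    # bridge.cv2_to_imgmsg(cv_image, encoding="bgr8") converts back for publishing

rospy.init_node("cv_bridge_demo")
rospy.Subscriber("/usb_cam/image_raw", Image, callback)
rospy.spin()
```

A fuller version that processes and republishes the image appears under Example Image Processing below.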
Tag Tracking
Tags are special 2-dimensional patterns that are designed to be viewed with a camera. Computer vision algorithms can then determine the 6 degree-of-freedom pose of the tag and often an identifier (thus tags are like QR codes, but they also contain geometric information).
The best tags to use are AprilTags.
Prior to the invention of AprilTags, AR tags were a popular choice for tag tracking. These were popular enough for a time that many people (including me) still say "AR tag".
- If I say this, please correct me; I mean AprilTags.
I see no reason to use the AR tag package over apriltag_ros, but you may encounter other packages that use it.
There are many other types of tags with different properties: for example, ArUco tags.
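As a sketch, assuming apriltag_ros is running and publishing on its default `tag_detections` topic (the topic name and namespace depend on your launchfile), a Python node can read the detected tag IDs and poses like this:

```python
# Print the ID and pose of each AprilTag that apriltag_ros detects.
import rospy
from apriltag_ros.msg import AprilTagDetectionArray

def tag_callback(msg):
    for det in msg.detections:
        # det.id is a list because a detection can refer to a tag bundle
        pos = det.pose.pose.pose.position  # PoseWithCovarianceStamped -> Pose -> Point
        rospy.loginfo("Tag %s at x=%.2f y=%.2f z=%.2f", det.id, pos.x, pos.y, pos.z)

rospy.init_node("tag_listener")
rospy.Subscriber("tag_detections", AprilTagDetectionArray, tag_callback)
rospy.spin()
```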
Other Image Packages and Libraries
- Darknet ROS - realtime object detection
- Up to date list of visual SLAM solutions
- face_detector
- facial_recognition You will need to compile this from source since it was last released for ROS Indigo.
- 2D tracking of multiple objects
- Find Object 2D Uses feature detection to find objects
Example Image Processing
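Below is a minimal sketch of an image-processing node of the kind this section refers to (the topic names and the choice of Canny edge detection are illustrative): it subscribes to a camera image, runs one OpenCV operation, and republishes the result so it can be viewed in `rqt_image_view`.

```python
#!/usr/bin/env python
"""Minimal image-processing node: subscribe, detect edges with OpenCV, republish."""
import rospy
import cv2
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


class EdgeDetector:
    def __init__(self):
        self.bridge = CvBridge()
        # Topic names are examples; remap them to match your camera driver
        self.pub = rospy.Publisher("image_edges", Image, queue_size=1)
        self.sub = rospy.Subscriber("/usb_cam/image_raw", Image, self.callback)

    def callback(self, msg):
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")  # ROS Image -> OpenCV
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)
        out = self.bridge.cv2_to_imgmsg(edges, encoding="mono8")  # OpenCV -> ROS Image
        out.header = msg.header  # keep the original timestamp and frame_id
        self.pub.publish(out)


if __name__ == "__main__":
    rospy.init_node("edge_detector")
    EdgeDetector()
    rospy.spin()
```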
Other Learning Resources
- Cameras and Lenses (Highly recommend this)
- mrcal NASA-quality lens modeling and calibration software.