Image Processing with ROS

Overview

This lecture is about how to interface ROS with cameras and vision processing libraries.

Cameras On Linux:

  • Cameras on Linux are viewed as files (like any other device).
  • Cameras on Linux are usually called /dev/videoX, where X is a number
  • Some cameras create two video devices; the lower-numbered one is typically the one to use
  • v4l2-ctl is a command line tool for interfacing with cameras. See man v4l2-ctl
  • v4l2-ctl --list-formats-ext prints useful information about the formats supported by cameras connected to your computer
    • You often need to pass these formats and resolutions to the ROS camera node as parameters
  • v4l2-ctl --all prints all information about your cameras
  • v4l2-ctl --list-devices lists all the USB cameras. Some cameras have multiple /dev/video* devices, so this is useful for seeing which device goes with which camera
  • When working with cameras, you should use udev rules to give your cameras the proper permissions and a persistent name (see the example rule after this list).
  • More information can be found here: V4l2
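A minimal sketch of such a rule, placed in a file like /etc/udev/rules.d/99-camera.rules (the vendor and product IDs and the symlink name here are made up; find your camera's IDs with lsusb):

  # Match the capture device (index 0) of a particular USB camera and create a
  # stable /dev/mycamera symlink that any user can open
  SUBSYSTEM=="video4linux", ATTR{index}=="0", ATTRS{idVendor}=="046d", ATTRS{idProduct}=="0825", SYMLINK+="mycamera", MODE="0666"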

ROS Camera "Drivers"

  1. usb_cam The main and most up-to-date ROS camera driver. Try this one first and prefer it to the others.
  2. cv_camera A camera driver that uses OpenCV to open the camera, so if your camera works in OpenCV this node should work
  3. libuvc_camera (not released for noetic)

Depending on your camera, one may work better than the others: libuvc_camera is for cameras that follow the UVC standard (most modern webcams) and usb_cam uses the V4L2 framework (Video4Linux).

  • These camera nodes access your camera via the video device file (/dev/videoX)
  • These camera "drivers" accept several private parameters for setting camera options and formats.
    • Essentially, the node reads from the camera and publishes an image message
  • Sometimes, you need to experiment with various options to get a usable webcam image.
    • If the node prints lots of warnings, it usually means a parameter like the image format should be changed
    • Other information that the node prints can be useful for debugging issues or tweaking parameters
    • If possible, try to fix any warnings that you may find
  • The image display type in rviz can be used to see the video from a camera
  • Other industrial cameras may have their own ROS node drivers, or you may need to write your own.
  • usb_cam lets you access the camera via /dev/videoX, whereas libuvc_camera lets you access it via Vendor Id (VID), Product Id (PID), and Serial number

Using usb_cam

  1. The camera is one of the /dev/videoX devices
  2. Run the usb_cam node (a launch file version is sketched after these steps). The basic parameters for most webcams are: rosrun usb_cam usb_cam_node _pixel_format:=yuyv
  3. rosrun image_view image_view image:=/usb_cam/image_raw to view the image
  4. You can also view the image in rviz or rqt_image_view
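A minimal launch file version of step 2 (the device path, pixel format, and resolution are assumptions; check v4l2-ctl --list-formats-ext for values your camera actually supports):

  <launch>
    <node name="usb_cam" pkg="usb_cam" type="usb_cam_node" output="screen">
      <!-- These values are examples only; set them to match your camera -->
      <param name="video_device" value="/dev/video0"/>
      <param name="pixel_format" value="yuyv"/>
      <param name="image_width" value="640"/>
      <param name="image_height" value="480"/>
    </node>
  </launch>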

Using cv_camera

  1. The camera is one of the /dev/videoX devices
  2. Run the cv_camera_node. For most webcams the default parameters work: rosrun cv_camera cv_camera_node
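cv_camera selects the camera with its ~device_id parameter (an integer index, typically the X in /dev/videoX) and publishes /cv_camera/image_raw by default, so (assuming those defaults) you can run and view it with:

  rosrun cv_camera cv_camera_node _device_id:=0
  rosrun image_view image_view image:=/cv_camera/image_raw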

Using libuvc_camera

  1. Find the VID, PID, and Serial number of your camera: this requires some investigation
    • lsusb - Print VID and PID of cameras on your computer
    • v4l2-ctl --all - See the cameras attached to the computer (can use this to guess which USB device is which)
  2. Create udev rules for the camera:
    • Copy udev rules to /etc/udev/rules.d
    • sudo udevadm control --reload (reloads udev rules)
    • sudo udevadm trigger (re-runs udev rules)
  3. Start with the launchfile provided on the wiki and edit it to match your camera settings

The Image Pipeline

image_pipeline contains several packages relating to image manipulation in ROS. The idea is to chain various calibration and image processing steps together to complete computer vision tasks. The pipeline applies to stereo cameras and 3D point clouds as well as to monocular images.

Camera Calibration

image_proc Camera lenses cause distortion. Rectification is a transformation that makes the image "rectangular" by accounting for the lens distortion. This node handles rectification for you: it subscribes to a raw camera image (image_raw) and camera calibration information (camera_info) and publishes the following topics (an example invocation follows this list):

  1. image_mono: A monochrome unrectified image
  2. image_rect: A monochrome rectified image
  3. image_color: A color unrectified image. These images are demosaiced (that is, the RGB pixel pattern of the camera sensor is accounted for)
  4. image_rect_color: A color rectified image
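For example, image_proc should be run in the same namespace as the camera driver. Assuming the usb_cam node from earlier is publishing /usb_cam/image_raw and /usb_cam/camera_info (and that camera_info contains a real calibration), rectification can be started with:

  ROS_NAMESPACE=usb_cam rosrun image_proc image_proc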

image_view

  • Simple, stand-alone image viewer. Remap image to your camera's image topic (as in the rosrun image_view command above) and it will display the image
  • There is also rqt_image_view (although that is part of rqt, not the image pipeline)

depth_image_proc Processes depth images

  • In a depth image, each pixel corresponds to the depth (i.e., how far light travels before hitting something) at that point in the scene.
  • Used by RGB-D cameras such as the Intel RealSense.
  • Used to make point clouds (essentially 3-dimensional pixelized images) from calibrated depth cameras (a sketch is shown below)
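A minimal sketch of turning a depth image into a point cloud with the depth_image_proc/point_cloud_xyz nodelet (the camera topic names are assumptions; remap them to whatever your depth camera publishes):

  <launch>
    <node pkg="nodelet" type="nodelet" name="depth_manager" args="manager"/>
    <node pkg="nodelet" type="nodelet" name="point_cloud_xyz"
          args="load depth_image_proc/point_cloud_xyz depth_manager">
      <!-- point_cloud_xyz subscribes to camera_info and image_rect and publishes points -->
      <remap from="camera_info" to="/camera/depth/camera_info"/>
      <remap from="image_rect" to="/camera/depth/image_rect"/>
    </node>
  </launch>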

Other Image Tools

  1. Image Transport determines how image data is sent across the system.

Rather than sending raw camera data, the image data can be compressed to use less bandwidth. For Python usage the republish node is the most important feature (see the example command after this list), since image_transport plugins can only be used directly from C++.

  2. Web Video Server used to create an HTTP stream of video, useful if interfacing your robot to a website.
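For example, the republish node can convert a compressed stream back into a plain sensor_msgs/Image topic that a Python node can subscribe to directly (the topic names here are assumptions):

  rosrun image_transport republish compressed in:=/usb_cam/image_raw raw out:=/usb_cam/image_raw_uncompressed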

OpenCV

OpenCV is a major image processing library. Its website has many tutorials on how to process images.

Some packages that use/relate to OpenCV are:

  1. cv_bridge Converts between OpenCV images and ROS image messages.
    • If you want to use OpenCV with ROS, you should use this package (a minimal example follows this list)
  2. opencv_apps Each "app" is a node that performs a single image processing function using OpenCV. Thus you can chain together many computer vision algorithms by connecting nodes in a launchfile.
  3. image_geometry Uses a camera calibration to convert between pixel coordinates and coordinates in the other frames of your system.
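A minimal cv_bridge sketch, assuming a camera publishing on image_raw (the topic names and the blur operation are placeholders, not something specific to this lecture):

  #!/usr/bin/env python
  """Subscribe to a ROS image, process it with OpenCV, and republish the result."""
  import rospy
  import cv2
  from cv_bridge import CvBridge
  from sensor_msgs.msg import Image

  class Blur:
      def __init__(self):
          self.bridge = CvBridge()
          self.pub = rospy.Publisher("image_blurred", Image, queue_size=10)
          self.sub = rospy.Subscriber("image_raw", Image, self.callback)

      def callback(self, msg):
          # ROS Image message -> OpenCV BGR image
          img = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
          # Any OpenCV processing can go here; a blur is just a placeholder
          blurred = cv2.GaussianBlur(img, (11, 11), 0)
          # OpenCV image -> ROS Image message
          self.pub.publish(self.bridge.cv2_to_imgmsg(blurred, encoding="bgr8"))

  if __name__ == "__main__":
      rospy.init_node("blur_example")
      Blur()
      rospy.spin()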

Tag Tracking

Tags are special 2-dimensional patterns that are designed to be viewed with a camera. Computer vision algorithms can then determine the 6 degree-of-freedom pose of the tag and often an identifier (thus tags are like QR codes, but they also contain geometric information).

The best tags to use are April tags.

Prior to the invention of April tags, AR tags were a popular choice for tag tracking. These were popular enough at one time that many people (including me) still say "AR tag".

  • If I say this, please correct me; I mean April tags.

I see no reason to use the AR tag packages over apriltag_ros, but you may encounter other packages that use them.

There are many other types of tags with different properties: for example, ArUco tags.

Other Image Packages and Libraries

  1. Darknet ROS - realtime object detection
  2. Up to date list of visual SLAM solutions
  3. face_detector
  4. facial_recognition You will need to compile this from source, since it was last released for ROS Indigo.
  5. 2D tracking of multiple objects
  6. Find object 2D Feature detection to find objects

Example Image Processing

Other Learning Resources

Author: Matthew Elwin