Image Processing with ROS
Overview
This lecture is about how to interface ROS with cameras and vision processing libraries.
Cameras On Linux
- Cameras on Linux are viewed as files (like any other device).
- Cameras on Linux are usually called `/dev/videoX`, where `X` is a number.
  - Currently, some cameras create two video devices; the lower-numbered one is the one to use.
- `v4l2-ctl` is a command-line tool for interfacing with cameras. See `man v4l2-ctl`.
  - `v4l2-ctl --list-formats-ext` prints useful information about the formats supported by the cameras connected to your computer.
    - You often need to specify these formats to the ROS camera node as parameters.
  - `v4l2-ctl --all` prints all information about your cameras.
  - `v4l2-ctl --list-devices` lists all the USB cameras. Some cameras have multiple `/dev/video*` devices, so it is useful to see what goes with what.
- When working with cameras, you should use udev rules to give your cameras the proper permissions and a persistent name (a sketch of such a rule follows this list).
- More information can be found here: V4l2
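A udev rule is a single line in a file under `/etc/udev/rules.d`. A minimal sketch (the VID/PID values and symlink name are placeholders; substitute the values that `lsusb` reports for your camera):

`SUBSYSTEM=="video4linux", ATTR{index}=="0", ATTRS{idVendor}=="046d", ATTRS{idProduct}=="0825", SYMLINK+="my_camera", MODE="0666"`

Saved as, for example, `/etc/udev/rules.d/99-camera.rules` and reloaded with `sudo udevadm control --reload && sudo udevadm trigger`, this gives you a stable `/dev/my_camera` symlink that any user can open. The `ATTR{index}=="0"` match is intended to pick only the first of the two video devices some cameras create; drop it if your camera exposes a single device.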
ROS Camera "Drivers"
- usb_cam The main and most up-to-date ROS camera driver. Try this one first and prefer it to the others.
- cv_camera A camera driver that uses OpenCV to open the camera. So if you can use your camera in OpenCV, this node should work.
- libuvc_camera (not released for ROS Noetic)
Depending on your camera, one may work better than the others: libuvc is for cameras that follow the UVC standard (most modern webcams), and usb_cam uses the v4l framework (Video4Linux).
- These camera nodes access your camera via the video device file (`/dev/videoX`).
- These camera "drivers" accept several private parameters for setting camera options and formats.
- Essentially, the node reads from the camera and publishes an image message
- Sometimes, you need to experiment with various options to get a usable webcam image.
- If the node prints lots of warnings, it usually means a parameter like the image format should be changed
- Other information that the node prints can be useful for debugging issues or tweaking parameters:
- If possible, try to fix any warnings that you may find
- The image display type in rviz can be used to see the video from a camera
- Other industrial cameras may have their own ROS node drivers, or you may need to write your own.
- `usb_cam` lets you access the camera via `/dev/videoX`, whereas `libuvc_camera` lets you access it via Vendor ID (VID), Product ID (PID), and serial number.
Using usb_cam
- The camera is one of the `/dev/videoX` devices.
- Run the `usb_cam` node. The basic command for most webcams is `rosrun usb_cam usb_cam_node _pixel_format:=yuyv`
- Run `rosrun image_view image_view image:=/usb_cam/image_raw` to view the image.
- You can also view the image in `rviz` or `rqt_image_view`.
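If the defaults do not work, usb_cam accepts more private parameters on the command line; the values below are examples (use a device, resolution, and format that `v4l2-ctl --list-formats-ext` reports for your camera):

`rosrun usb_cam usb_cam_node _video_device:=/dev/video0 _image_width:=640 _image_height:=480 _pixel_format:=yuyv _framerate:=30`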
Using cv_camera
- The camera is one of the `/dev/videoX` devices.
- Run the `cv_camera_node`. The defaults work for most webcams: `rosrun cv_camera cv_camera_node`
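If the wrong camera opens or the frame rate is off, cv_camera has private parameters for that as well. The parameter names below are the commonly documented ones (verify them against the cv_camera wiki), and the values are examples:

`rosrun cv_camera cv_camera_node _device_id:=1 _rate:=30.0`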
Using libuvc_camera
- Find the VID, PID, and serial number of your camera (requires some investigation):
  - `lsusb` prints the VID and PID of the cameras on your computer.
  - `v4l2-ctl --all` shows the cameras attached to the computer (you can use this to guess which USB device it is).
- Create udev rules for the camera:
  - Copy the udev rules to `/etc/udev/rules.d`
  - `sudo udevadm control --reload` (reloads the udev rules)
  - `sudo udevadm trigger` (re-runs the udev rules)
- Start with the launchfile provided on the wiki and edit it to match your camera's settings.
The Image Pipeline
image_pipeline contains several packages relating to image manipulation in ROS. The idea is to chain various calibration and image processing steps together to complete computer vision tasks. The pipeline also applies to stereo-vision cameras and 3D point clouds as well as monocular images.
- Note: if you get errors about the Python interpreter not being found or a missing `cv2` module, then you should install the perception pipeline from source in `~/customws`: https://github.com/ros-perception/image_pipeline
camera_calibration
- Finds the intrinsic parameters of a camera using a checkerboard pattern and OpenCV.
- The calibration can be stored and used to provide a `camera_info` topic so that the calibration can be used by other nodes.
  - The CameraInfo message definition has some useful information.
- See this tutorial
- More information about camera calibration: CameraInfo
- Some more information on calibration matrices for the pinhole camera model
- Matlab tutorial on camera calibration
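A typical invocation, following the camera_calibration tutorial (the checkerboard dimensions, square size, and topic names are examples; match them to your board and camera):

`rosrun camera_calibration cameracalibrator.py --size 8x6 --square 0.108 image:=/usb_cam/image_raw camera:=/usb_cam`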
image_proc
Camera lenses cause distortion. Rectification is a transformation to make the image "rectangular" by accounting
for camera lens distortion. This node handles rectification for you:
It subscribes to a raw camera image (`image_raw`) and camera calibration information (`camera_info`), and publishes:
- `image_mono`: a monochrome unrectified image
- `image_rect`: a monochrome rectified image
- `image_color`: a color unrectified image. The color images are demosaiced (that is, they account for the RGB pixel pattern of the camera).
- `image_rect_color`: a color rectified image
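image_proc is normally run inside the camera's namespace so the topic names line up automatically. For a camera publishing under `/usb_cam` (an example namespace; use whichever namespace your driver publishes `image_raw` and `camera_info` in):

`ROS_NAMESPACE=usb_cam rosrun image_proc image_proc`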
image_view
- Simple, stand-alone image viewer. Specify the image topic (e.g., `image:=/usb_cam/image_raw`) and it will display the image.
- There is also `rqt_image_view` (although that is part of rqt, not the image pipeline).
depth_image_proc
Processes depth images.
- In a depth image, each pixel corresponds to the depth (i.e., how far light has to travel before hitting something) in the scene.
- Used by RGB-D cameras such as the Intel RealSense.
- Used to make point clouds (basically 3-dimensional pixelized images) from calibrated depth cameras.
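As a minimal sketch of working with depth images in Python (the topic name and units here are assumptions; depth topics and encodings vary by camera driver, so check `rostopic list` and the message's encoding field):

```python
# Print the depth at the center pixel of an incoming depth image.
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def depth_callback(msg):
    # "passthrough" keeps the original encoding (e.g., 16UC1 in millimeters or 32FC1 in meters)
    depth = bridge.imgmsg_to_cv2(msg, desired_encoding="passthrough")
    center = depth[depth.shape[0] // 2, depth.shape[1] // 2]
    rospy.loginfo("Depth at image center: %s", center)

rospy.init_node("depth_probe")
rospy.Subscriber("/camera/depth/image_rect_raw", Image, depth_callback)  # topic name is an assumption
rospy.spin()
```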
Other Image Tools
- Image Transport determines how image data is sent across the system.
  - Rather than sending raw camera data, the image data can be compressed to use less bandwidth.
  - For Python usage, the `republish` node is the most important feature (see the example after this list).
- Web Video Server is used to create an HTTP stream of video, useful for interfacing your robot with a website.
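For example, to turn a compressed stream back into a raw image topic that a Python node can subscribe to (the topic names are examples):

`rosrun image_transport republish compressed in:=/usb_cam/image_raw raw out:=/usb_cam/image_repub`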
OpenCV
OpenCV is a major image processing library. Their website has many tutorials on how to process images.
Some packages that use/relate to OpenCV are:
- cv_bridge Converts between OpenCV data and ROS messages.
  - If you want to use OpenCV with ROS, you should use this package (see the sketch after this list).
- opencv_apps Each "app" is a node that performs a single image-processing function using OpenCV. Thus you can chain together many computer vision algorithms by connecting nodes in a launchfile.
- image_geometry Using a camera calibration, lets you convert between pixel coordinates and the other frames in your system.
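A minimal sketch of cv_bridge in a Python node (the topic name is an example):

```python
# Convert incoming ROS Image messages to OpenCV/numpy arrays and log their size.
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

bridge = CvBridge()

def callback(msg):
    cv_image = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")  # numpy array in BGR order
    rospy.loginfo("Received a %dx%d image", cv_image.shape[1], cv_image.shape[0])
    # bridge.cv2_to_imgmsg(cv_image, encoding="bgr8") converts back for publishing

rospy.init_node("cv_bridge_demo")
rospy.Subscriber("/usb_cam/image_raw", Image, callback)
rospy.spin()
```

A fuller version that processes and republishes the image appears under Example Image Processing below.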
Tag Tracking
Tags are special 2-dimensional patterns that are designed to be viewed with a camera. Computer vision algorithms can then determine the 6 degree-of-freedom pose of the tag and often an identifier (thus tags are like QR codes, but they also contain geometric information).
The best tags to use are AprilTags.
Prior to the invention of AprilTags, AR tags were a popular choice for tag tracking. These were popular enough for a time that many people (including me) still say "AR tag".
- If I say this, please correct me; I mean AprilTags.
I see no reason to use the AR tag package over apriltag_ros, but you may encounter other packages that use it.
There are many other types of tags with different properties: for example, ArUco tags.
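As a sketch, assuming apriltag_ros is running and publishing on its default `tag_detections` topic (the topic name and namespace depend on your launchfile), a Python node can read the detected tag IDs and poses like this:

```python
# Print the ID and pose of each AprilTag that apriltag_ros detects.
import rospy
from apriltag_ros.msg import AprilTagDetectionArray

def tag_callback(msg):
    for det in msg.detections:
        # det.id is a list because a detection can refer to a tag bundle
        pos = det.pose.pose.pose.position  # PoseWithCovarianceStamped -> Pose -> Point
        rospy.loginfo("Tag %s at x=%.2f y=%.2f z=%.2f", det.id, pos.x, pos.y, pos.z)

rospy.init_node("tag_listener")
rospy.Subscriber("tag_detections", AprilTagDetectionArray, tag_callback)
rospy.spin()
```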
Other Image Packages and Libraries
- Darknet ROS - realtime object detection
- Up to date list of visual SLAM solutions
- face_detector
- facial_recognition You will need to compile this from source since it was last released for ROS Indigo.
- 2D tracking of multiple objects
- Find Object 2D Uses feature detection to find objects
Example Image Processing
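Below is a minimal sketch of an image-processing node of the kind this section refers to (the topic names and the choice of Canny edge detection are illustrative): it subscribes to a camera image, runs one OpenCV operation, and republishes the result so it can be viewed in `rqt_image_view`.

```python
#!/usr/bin/env python
"""Minimal image-processing node: subscribe, detect edges with OpenCV, republish."""
import rospy
import cv2
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


class EdgeDetector:
    def __init__(self):
        self.bridge = CvBridge()
        # Topic names are examples; remap them to match your camera driver
        self.pub = rospy.Publisher("image_edges", Image, queue_size=1)
        self.sub = rospy.Subscriber("/usb_cam/image_raw", Image, self.callback)

    def callback(self, msg):
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")  # ROS Image -> OpenCV
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)
        out = self.bridge.cv2_to_imgmsg(edges, encoding="mono8")  # OpenCV -> ROS Image
        out.header = msg.header  # keep the original timestamp and frame_id
        self.pub.publish(out)


if __name__ == "__main__":
    rospy.init_node("edge_detector")
    EdgeDetector()
    rospy.spin()
```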
Other Learning Resources
- Cameras and Lenses (Highly recommend this)
- mrcal NASA-quality lens modeling and calibration software.