Autonomous navigation using Intel® RealSense™ Technology
Most of us are familiar with the concept of autonomous, self-driving cars, but perhaps less familiar with other types of autonomous technology outside of science fiction. In 2019 the robotics market was valued at almost $40 billion and is forecast to grow by 25% in 2020. The commercial drone market is valued at around $6 billion. While a significant portion of both markets is comprised of operator-controlled robots or drones, or robots that perform repetitive tasks, there is a growing need for autonomous versions of both.
The difference between autonomous and automatic
Autonomous devices do not require a human to have programmed or anticipated every action they will take in response to a changing environment. An autonomous robot can be directed to a location but react to obstacles intelligently along the way. For example, this video shows a prototype robot that uses the Intel® RealSense™ D435 and T265 to plan a path from A to B, but also to react to objects thrown in its intended path.
It’s easy to think of robots in the science fiction sense: humanoid machines able to respond to us in any situation and act much like an artificial human. The truth is that this kind of robot is still far in the future. Robots today are not designed to replace humans, but rather to augment and extend our capabilities by taking over dangerous or repetitive tasks like hospital food delivery or industrial cleaning. In any of these uses, there are some common capabilities that an autonomous robot needs in order to be able to navigate safely.
Object detection
A very basic capability: robots and drones need to be able to detect objects in order to react appropriately to them. A depth camera like the Intel RealSense depth camera D455, D435i or D415 can be used to detect many different objects easily, but some challenging materials are hard to sense reliably. In this paper from the Beijing University of Chemical Technology, the team explores the use of an Intel RealSense depth camera along with some additional object recognition algorithms to identify transparent objects. By moving the camera around the object and taking a few different images, they successfully created scans of transparent objects that they were able to reconstruct and prepare for robot grasping.
Detecting objects and recognizing objects are two different capabilities. Intel RealSense depth cameras can detect and measure objects within their field of view fairly trivially, since measuring size and distance is an inherent capability of the cameras. Recognizing what an object is, however, requires additional effort, typically using a machine learning framework. Using a depth camera to train an object recognition system does offer some advantages, such as the ability to easily segment an object from the background, so less training data may be necessary than with a 2D image-based approach.
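As a rough illustration of how directly a depth camera exposes distance measurements, the short sketch below uses the pyrealsense2 Python wrapper for the Intel RealSense SDK to read the distance to whatever sits at the center of the depth image. The stream resolution and frame rate here are just reasonable defaults, not requirements; any of the depth cameras mentioned above would work.

```python
# Minimal sketch, assuming pyrealsense2 is installed and a RealSense
# depth camera (e.g. D435 or D455) is connected.
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
# Request a standard 640x480 depth stream at 30 frames per second.
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    if depth_frame:
        # Distance in meters to whatever is at the center pixel.
        distance = depth_frame.get_distance(320, 240)
        print(f"Object at image center is {distance:.2f} m away")
finally:
    pipeline.stop()
```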
Collision avoidance
A key factor in safety, collision avoidance is the ability of a robot to navigate around or avoid obstacles. This includes static obstacles like walls or uneven surfaces as well as temporary or moving obstacles like people or pets. While some robots are robust enough to handle literal bumps in the road, an autonomous robotic wheelchair needs to be able to segment a ‘drivable’ area from obstacles and undrivable areas in order to safely carry people without injury. Researchers at the Hong Kong University of Science and Technology tackle this problem space in this paper. They use an Intel RealSense D415 depth camera to measure the depth difference between drivable surfaces and road anomalies higher than 5 cm above that surface.
Using the depth camera allows them to use a Self-Supervised Label Generator (SSLG) to automatically label road anomalies and drivable areas. This approach gives better results than traditional methods because it can adapt autonomously at run time to new obstacles the system may not have seen before. They trained the system on around 4,000 hand-labeled depth images gathered in environments wheelchairs commonly encounter, such as sidewalks. The system does not require this hand-labeled ground truth; it is used only to evaluate the effectiveness of the autonomous approach. Areas beyond the range of the camera are automatically labeled as ‘unknown’, while everything else in the scene is segmented into drivable areas and non-drivable or road anomaly areas.
While the main goal of this paper is to create a system that works for autonomous wheelchairs, there is no reason that such an approach would not also work well for delivery robots or any other wheeled autonomous robot.
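To make the idea concrete, here is a deliberately simplified sketch of depth-based labeling in Python. It is not the SSLG pipeline from the paper; it only shows the core notion of flagging pixels that deviate from an expected ground surface by more than 5 cm and marking out-of-range pixels as unknown. The `expected_ground_depth` map is a hypothetical input representing the depth the flat ground would have at each pixel.

```python
# Simplified illustration of depth-based drivable-area labeling.
# NOT the SSLG method from the paper; thresholds and inputs are
# illustrative assumptions.
import numpy as np

ANOMALY_THRESHOLD_M = 0.05  # obstacles taller than 5 cm


def label_scene(depth_m: np.ndarray, expected_ground_depth: np.ndarray) -> np.ndarray:
    """Return a per-pixel label map: 0 = drivable, 1 = road anomaly, 2 = unknown."""
    labels = np.zeros(depth_m.shape, dtype=np.uint8)

    # Pixels with no depth reading (beyond range or invalid) -> unknown.
    unknown = depth_m <= 0
    labels[unknown] = 2

    # Pixels that deviate from the expected ground surface by more than
    # the threshold -> road anomaly / non-drivable.
    deviation = np.abs(depth_m - expected_ground_depth)
    labels[(deviation > ANOMALY_THRESHOLD_M) & ~unknown] = 1

    return labels
```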
Path Planning
Once drivable areas are identified, the next stage of autonomous navigation is to plan a path of motion that avoids obstacles. In this paper, researchers developed an algorithm that takes data from a D435 depth camera to plan paths through agricultural greenhouses. In previous research they had used a 2D laser range finder to sense drivable areas prior to planning their paths, but the laser range finder had issues identifying the plants as obstacles. Using the depth camera allowed them to refine the algorithm so the robot could move through the rows without collisions.
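While the paper’s algorithm is more sophisticated, a common pattern in depth-based path planning is to collapse the camera’s point cloud into a top-down occupancy grid that a planner such as A* can then search. The sketch below shows only that step; the coordinate convention, grid size and height threshold are assumptions chosen for illustration, not values from the paper.

```python
# Hedged sketch: building a top-down occupancy grid from a depth camera's
# point cloud. Points are assumed to already be in robot coordinates
# (x forward, y left, z up); constants are illustrative.
import numpy as np

GRID_SIZE = 100        # 100 x 100 cells
CELL_M = 0.05          # 5 cm per cell -> a 5 m x 5 m map
OBSTACLE_MIN_Z = 0.05  # points higher than 5 cm count as obstacles


def occupancy_grid(points_xyz: np.ndarray) -> np.ndarray:
    """Return a boolean grid where True marks an occupied (non-drivable) cell."""
    grid = np.zeros((GRID_SIZE, GRID_SIZE), dtype=bool)

    # Keep only points tall enough to be obstacles.
    obstacles = points_xyz[points_xyz[:, 2] > OBSTACLE_MIN_Z]

    # Convert metric x/y to grid indices, with the robot at the grid center.
    ix = (obstacles[:, 0] / CELL_M + GRID_SIZE / 2).astype(int)
    iy = (obstacles[:, 1] / CELL_M + GRID_SIZE / 2).astype(int)

    # Drop points that fall outside the mapped area.
    in_bounds = (ix >= 0) & (ix < GRID_SIZE) & (iy >= 0) & (iy < GRID_SIZE)
    grid[ix[in_bounds], iy[in_bounds]] = True
    return grid
```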
Simultaneous Localization and Mapping
Collision avoidance, path planning and object detection algorithms inherently have no historical data about the location of a robot. They act in real time or plan future actions, but they have no awareness or memory of where the robot has been. This can be a challenge: in the last paper, the researchers found that once a plant was out of the camera’s view, collisions could occur even with obstacles that had previously been detected and avoided.
For awareness of a space and location within it, Simultaneous Localization and Mapping, or SLAM, is necessary. SLAM algorithms like those embedded in the Intel RealSense Tracking Camera T265 are designed to help a device know not only where it is but also where it has been within a space. They do this by taking the visual feed from two or more cameras, combining it with other sensors like wheel odometers or inertial measurement units (IMUs), and comparing how visual features and the sensor data change over time. This happens extremely rapidly, and with this changing data, the algorithms create a ‘map’ of the area a robot or drone has traveled. This map can then be shared with other robots: for example, a swarm of robots designed to travel through the same space every ten minutes could reuse the map and path created by the first one.
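Because the SLAM runs on the T265 itself, using it from a host application is largely a matter of reading the pose stream. The minimal pyrealsense2 sketch below prints the camera’s estimated position relative to where tracking started; the loop length is arbitrary and only keeps the example short.

```python
# Minimal sketch of reading pose data from a T265 tracking camera with
# pyrealsense2; the SLAM runs on the device, so the host simply receives
# the estimated position and orientation.
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.pose)
pipeline.start(config)

try:
    for _ in range(100):  # read a short burst of pose samples
        frames = pipeline.wait_for_frames()
        pose_frame = frames.get_pose_frame()
        if pose_frame:
            data = pose_frame.get_pose_data()
            t = data.translation  # meters, relative to where tracking started
            print(f"x={t.x:.2f} y={t.y:.2f} z={t.z:.2f} "
                  f"confidence={data.tracker_confidence}")
finally:
    pipeline.stop()
```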