Home > Technology peripherals > AI > body text

From a technical perspective, let's talk about why binocular autonomous driving systems are difficult to popularize?

王林
Release: 2023-05-19 10:12:20
forward
1553 people have browsed it

Monocular vision is the magic weapon of Mobileye (ME). In fact, it also considered binocular vision at that time, but finally chose to give up.

What do monocular ranging and 3-D estimation rely on? It is the Bounding Box (BB) that detects the target. If the obstacle cannot be detected, the system cannot estimate its distance and 3-D attitude/orientation. Without deep learning, ME mainly estimates the distance based on the BB, the attitude and height obtained by camera calibration, and the assumption that the road surface is straight.

With deep learning, the NN model can be trained based on 3-D ground truth to obtain 3D size and attitude estimates. The distance is obtained based on the parallel line principle (single view metrology) . The monocular L3 solution announced by Baidu Apollo not long ago explains it clearly. The reference paper is "3D Bounding Box Estimation by Deep Learning and Geometry".

Binocular can certainly calculate parallax and depth, even if no obstacle is detected (because of the additional depth information, the detector will be better than the monocular), it will alarm. The problem is that it is not that easy for a binocular vision system to estimate disparity. Stereo matching is a typical problem in computer vision. A wide baseline results in accurate ranging results for far targets, while a short baseline results in good ranging results for near targets. There is a trade-off here.

The current ADAS binocular vision system on the market is Subaru EyeSight, and its performance is said to be okay.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

The Apollo L4 shuttle bus launched by Baidu was mass-produced with 100 units installed, and a binocular system was installed. The EU autonomous parking project V-Charge also uses a forward binocular vision system, as does the autonomous driving research and development system Berta Benz, which is integrated with the radar system. Among them, the binocular matching obstacle detection algorithm Stixel is very famous. Tier-1 companies such as Bosch and Conti have also developed binocular vision solutions in the past, but they did not have an impact on the market and were reportedly axed.

Talking about the difficulties of the binocular system, in addition to stereo matching, there is also calibration. The calibrated system will "drift", so online calibration is a must. The same is true for monocular, because tire deformation and vehicle body bumps will affect the changes in external parameters of the camera, and some parameters must be calibrated and corrected online, such as pitch angle and yaw angle.

Binocular online calibration is more complicated. Because binocular matching is simplified as much as possible into a 1-D search, it is necessary to use stereo rectification to make the optical axes of the two lenses parallel and perpendicular to the baseline. . Therefore, compared with the gained gain, merchants will give up if it is not cost-effective due to the increased complexity and cost.

Binocular vision has been mentioned recently because Silicon Valley chip company Ambarella acquired Vis Lab of the University of Parma in Italy in 2014 and developed binocular ADAS and autonomous driving chips. , after CES last year, it began to enter car companies and Tier-1. Moreover, Ambarella is currently continuing research to improve the performance of the system.

The picture below is a schematic diagram of six pairs of stereo vision systems installed on the roof of the car. Their baseline widths can be different, and the effective detection distances are correspondingly different. The author once took a ride in its self-driving car and could see 200 meters in the distance and 20-30 meters in the distance. It can indeed perform online calibration and adjust some binocular vision parameters at any time.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

01 Stereo matching

Let’s talk about stereo matching first, that is, disparity/depth estimation. As shown in the figure, assume that the focal length of the left and right cameras is f, the width of the baseline (the line connecting the two optical centers) is B, the depth of the 3-D point

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

#Visible parallax can inversely calculate the depth value. But the most difficult thing here is how to determine that the images seen by the left and right lenses are the same target, that is, the matching problem.

There are two matching methods, global method and local method. There are four steps for binocular matching:

  • Matching cost calculation;
  • Cost aggregation;
  • Disparity calculation/optimization;
  • Disparity Refinement.

The most famous local method is SGM (semi-global matching). The method used by many products is based on this improvement. Many vision chips use this method. algorithm.

SGM approximates a global optimization into a combination of multiple local optimization problems. The following formula is the optimization objective function of 2-D matching. SGM is implemented as multiple 1-D optimization paths. The sum

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

The following figure is the path optimization function along the horizontal direction

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

Census Transform converts 8/24-bit pixels into a binary sequence. Another binary feature called LBP (local binary pattern) is similar to it. The stereo matching algorithm is based on this transformation and turns the matching into a minimization search of Hamming distance. Intel's RealSense acquired a binocular vision startup company founded in 1994 based on this technology, and also acquired several other small companies and combined them to create it.

The following figure is a schematic diagram of CS transformation:

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

PatchMatch is an algorithm that accelerates image template matching , is used in optical flow calculation and disparity estimation. Microsoft Research has previously done a project based on 3-D reconstruction of a monocular mobile phone camera, imitating the previously successful KinectFusion based on RGB-D algorithm, with a name similar to MonoFusion, in which the depth map estimation uses a modified PatchMatch method.

The basic idea is to randomly initialize the disparity and plane parameters, and then update the estimate through information propagation between neighboring pixels. The PM algorithm is divided into five steps:

  • 1) Spatial propagation: Each pixel checks the disparity and plane parameters of the left and upper neighbors. If the matching cost becomes smaller, Replace the current estimate;
  • 2) View propagation: Transform pixels from other perspectives, check the estimate of the corresponding image, and replace it if it becomes smaller;
  • 3) Temporal propagation: Consider the estimation of corresponding pixels in the previous and next frames;
  • 4) Plane refinement: Randomly generate samples , if the estimate brings the matching cost down, update.
  • 5) Post-processing: Left-right consistency and weighted median filter remove outliers.

The following picture is a schematic diagram of PM:

02 Online calibration

More on Online calibration.

This is a calibration method that uses road markings (zebra crossings): the parallel line pattern of the zebra crossing is known, the zebra crossing is detected and the corner points are extracted, and the zebra crossing pattern is calculated to match the road surface. Homography parameters are used to obtain calibration parameters.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

The other method is based on VO and SLAM, which is more complicated, but it can do map-based positioning at the same time. Using SLAM for online calibration is not suitable for high-frequency operations. The following figure is the flow chart of its algorithm: Steps 1-4, obtain the global continuous map through stereo vision SLAM; Step 5 gives the initial estimate of the binocular camera transformation, Step 6 Aggregate the maps of all stereo cameras into one map; obtain the poses between multiple cameras in steps 7-8.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

Similar to the monocular method, online calibration can be quickly completed using the assumption that the lane lines are parallel and the road is flat, that is, the vanishing point theory: assuming a flat road Model, clear longitudinal lane lines, no other objects have edges parallel to them; the driving speed is required to be slow, the lane lines are continuous, and the binocular configuration of the left and right cameras requires a comparison of the elevation/roll angles of the left camera relative to the road surface (yaw/roll angles) Small; in this way, the drift amount of the binocular external parameters can be calculated by comparing it with the initialized vanishing point (related to offline calibration) (Figure 5-269). The algorithm is to estimate the camera elevation/oblique angle from the vanishing point.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

03 Typical binocular automatic driving system

The following introduces several typical binocular automatic driving systems driving system.

The obstacle detection algorithm Stixel adopted by Berta Benz is based on the following assumptions: the targets in the scene are described as columns, the center of gravity of the target is standing on the ground, and the upper part of each target is larger than The lower part has great depth. The following figure (a-d) introduces how SGM disparity results generate Stixel segmentation results:

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

The following figure is a schematic diagram of Stixels calculation: (a) Based on dynamics Planned free driving space calculation (b) Attribute values ​​in height segmentation (c) Cost image (grayscale values ​​reversed) (d) Height segmentation.

This is the block diagram and new results of Stixel after they added deep learning to do parallax fusion:

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

##Introducing a VisLab early binocular obstacle algorithm, Generic Obstacle and Lane Detection system (GOLD). Based on IPM (Inverse Perspective Mapping), detect lane lines and calculate obstacles on the road based on the difference between the left and right images:

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

(a) Left . (b) Right (c) Remapped left. (d) Remapped right. (e) Thresholded and filtered difference between remapped views. (f) In light gray, the road area visible from both cameras.

(a) Original. (b) Remapped. (c) Filtered. (d) Enhanced. (e) Binarized.

GOLD system architecture

This is VisLab’s vehicle participating in the autonomous driving competition VIAC (VisLab Intercontinental Autonomous Challenge). In addition to binocular cameras, there is also a laser on the vehicle Radar as an aid to road classification.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

#This is its binocular obstacle detection flow chart: disparity estimation utilizes the SGM algorithm and related algorithms based on SAD.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

Two DSI (Disparity Space Image) space filters are added to the post-processing, as shown in Figure 5-274. One is smoothing processing. The other is motion trajectory processing based on inertial navigation (IMU).

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

The obstacle detection algorithm adopts the JPL method and clusters obstacles based on the spatial layout characteristics and the physical characteristics of the vehicle. The physical characteristics include the maximum height (vehicle), the minimum height (obstacle) and the maximum passable range of the road. These constraints define a spatial truncated cone, as shown in the figure, then during the clustering process, everything falls Points within the truncated cone are designated as obstacles.

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

In order to speed up the disparity estimation algorithm, the method of dividing DSI is adopted:

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

Another classic method is to obtain the road parallax based on the road equation (stereo vision), and calculate the obstacles on the road based on this:

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

From a technical perspective, lets talk about why binocular autonomous driving systems are difficult to popularize?

04 Summary

In general, the method of binocular detection of obstacles is basically based on disparity maps, and there are many methods based on road disparity. Perhaps with the rapid development of deep learning and the enhancement of computing platforms, binocular autonomous driving systems will also become popular.

The above is the detailed content of From a technical perspective, let's talk about why binocular autonomous driving systems are difficult to popularize?. For more information, please follow other related articles on the PHP Chinese website!

Related labels:
source:51cto.com
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template