Over 70% mAP for the first time! GeMap: Local high-precision map SOTA refreshed again

WBOY

Dec 15, 2023 am 10:46 AM

Preface and the author's personal understanding

Building vectorized high-precision maps from sensor data in real time is crucial for downstream tasks such as prediction and planning, and it effectively compensates for the poor real-time performance of offline high-precision maps. With the development of deep learning, online vectorized high-precision map construction has gradually emerged, with representative works such as HDMapNet and MapTR appearing one after another. However, existing online vectorized high-precision map construction methods lack exploration of the geometric properties of map elements, including the shapes of elements and geometric relationships such as perpendicularity and parallelism.

The geometric properties of vectorized high-precision maps

Vectorized high-precision maps highly abstract the elements on the road and represent each map element as a two-dimensional point sequence. Urban road design follows specific standards. For example, in most cases pedestrian crosswalks are rectangles or parallelograms, and on road sections without diverging or merging, two adjacent lanes are parallel to each other. Different elements in high-precision maps also share many similar characteristics. These common-sense rules can be abstracted into the geometric properties of high-precision maps, including the shapes of map elements (rectangles, parallelograms, straight lines, etc.) and the associations between different map elements (parallel, perpendicular, etc.). Geometric properties strongly constrain the representation of map elements. If an online construction model fully understands these geometric properties, it can produce more accurate results.

The importance of geometric representation for high-precision maps

Although existing models could in theory learn the geometric properties of map elements, the characteristics of these properties mean that, at least under traditional designs, they are not easy for a model to learn.

Invariance of geometric properties

When the ego vehicle drives straight, changes lanes, or turns, the absolute coordinates of map elements (in the vehicle coordinate system) change constantly. The shapes of crosswalks, lanes, road boundaries, and so on do not change; similarly, the parallel relationship between lanes does not change. The geometric properties of map elements are objective, and one of their important characteristics is invariance, more specifically rigid invariance (invariance to rotation and translation). Previous work, whether using a simple polyline representation or polynomial curves with control points (such as Bezier curves or piecewise Bezier curves), is based on absolute coordinates and performs end-to-end optimization on absolute coordinates. An optimization objective based on absolute coordinates is not itself rigidly invariant, so it is hard to expect that the local optimum the model falls into encodes an understanding of geometric properties. Therefore, a representation that fully characterizes geometric properties and has this invariance is necessary.

Figure 1. Example of geometric invariance.

When the vehicle turns right, the absolute coordinates will change significantly. The image on the right shows a corresponding real-life scenario.

Diversity of geometric properties

Furthermore, despite strong prior knowledge, the geometric properties of roads are still diverse. They can generally be divided into two categories: the geometric shape of a single map element, and the geometric association between different map elements. Because of this diversity, it is impossible to exhaustively convert geometric properties into constraints by hand, so we prefer that the model learn diverse geometric properties autonomously and end-to-end.

Design of GeMap

Geometric representation

In view of the above two problems, we first improve the representation. In addition to the traditional representation based on absolute coordinates, we want to introduce a good geometric representation, which needs to:

Be able to describe the shape of map elements
Be able to describe the association between map elements
Be invariant to rigid transformations (rotation and translation)

To ensure translation invariance, we use a relative quantity, namely the offset vector between points. To further ensure rotation invariance, we choose the length of each offset vector and the angle between different offset vectors. These two quantities, length and angle, form the basis of the geometric representation we propose. In addition, to better distinguish and describe the two types of geometric properties, shape and association, we further refine the design following the principle of simplicity:

To describe shape, we compute the offset vectors between adjacent points within a single map element, then compute the length of each offset vector and the angle between adjacent offset vectors. This representation uniquely identifies any polyline or polygon up to rotation and translation. Two examples are shown below:

Figure 2. Geometric shape representation.

A rectangle can be described by a right angle and two pairs of equal sides; for a straight line, all included angles are 0 or 180 degrees.
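The shape representation described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `shape_representation` and `rigid_transform`, the choice of a signed turning angle (so a straight line gives 0), and the example crosswalk coordinates are all assumptions made here.

```python
import numpy as np

def shape_representation(points, closed=False):
    """Return (lengths, angles) for a polyline or polygon given as an (n, 2) array.

    lengths[i] is the length of the offset vector from point i to point i + 1.
    angles[i] is the signed turning angle (radians) between consecutive offsets.
    Assumes no two consecutive points coincide.
    """
    pts = np.asarray(points, dtype=float)
    if closed:
        # For a polygon, connect the last point back to the first one.
        pts = np.vstack([pts, pts[:1]])
    offsets = np.diff(pts, axis=0)
    lengths = np.linalg.norm(offsets, axis=1)
    if closed:
        first, second = offsets, np.roll(offsets, -1, axis=0)
    else:
        first, second = offsets[:-1], offsets[1:]
    cross = first[:, 0] * second[:, 1] - first[:, 1] * second[:, 0]
    dot = np.sum(first * second, axis=1)
    angles = np.arctan2(cross, dot)
    return lengths, angles

def rigid_transform(points, theta, shift):
    """Rotate by theta (radians) and then translate by shift."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s], [s, c]])
    return np.asarray(points, dtype=float) @ rotation.T + np.asarray(shift, dtype=float)

# Hypothetical 4 m x 2 m crosswalk, seen before and after the vehicle moves.
crosswalk = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 2.0], [0.0, 2.0]])
moved = rigid_transform(crosswalk, theta=0.7, shift=[10.0, -3.0])

len_a, ang_a = shape_representation(crosswalk, closed=True)
len_b, ang_b = shape_representation(moved, closed=True)

print(np.allclose(crosswalk, moved))                        # absolute coordinates differ
print(np.allclose(len_a, len_b), np.allclose(ang_a, ang_b))  # lengths and angles agree
```

The absolute coordinates of the two crosswalks differ, while the side lengths (4, 2, 4, 2) and the right angles are identical, which is the rigid invariance the representation is meant to provide.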

To characterize association, we similarly first consider the distance between any two points. However, if angles are computed for all point-to-point offset vectors, the representation becomes too complex and the computational cost unaffordable. Specifically, suppose there are N map elements in total and each element is represented by N_v points; the amount of angle data then reaches O((N·N_v)^4). When N·N_v is 1000 and each angle is stored as a 32-bit floating-point number, this representation alone occupies terabytes. In fact, this is not necessary for common relationships such as perpendicular and parallel. Therefore, we first compute the offsets within each element, and then compute only the angles between pairs of these offsets as part of the geometric representation. This simplified association representation retains the ability to describe parallel, perpendicular, and other relationships, while the corresponding data amount is only O((N·N_v)^2) (roughly 4 MB under the aforementioned conditions). For ease of understanding, we also provide some examples:

Figure 3. Geometric association representation.

Parallel and perpendicular relationships are expressed by the angle between offset vectors being 0 degrees or 90 degrees; the distance between two points reflects the lane width to a certain extent.
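As a rough illustration of the association representation, the sketch below computes point-to-point distances between two elements and the cosine and sine of the angles between their within-element offsets, then repeats the order-of-magnitude storage estimate from the text. The function name `association_representation`, the lane and stop-line coordinates, and the approximation of 1000 points as roughly 1000 offsets are assumptions made for this example, not details taken from the original work.

```python
import numpy as np

def association_representation(elem_a, elem_b):
    """Distances between points and angle cosines/sines between within-element offsets.

    Assumes neither element contains two identical consecutive points.
    """
    pa = np.asarray(elem_a, dtype=float)
    pb = np.asarray(elem_b, dtype=float)
    # Pairwise distances between every point of A and every point of B.
    dists = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    # Offsets are taken only inside each element, then normalised to unit length.
    oa, ob = np.diff(pa, axis=0), np.diff(pb, axis=0)
    ua = oa / np.linalg.norm(oa, axis=1, keepdims=True)
    ub = ob / np.linalg.norm(ob, axis=1, keepdims=True)
    cos = ua @ ub.T
    sin = ua[:, None, 0] * ub[None, :, 1] - ua[:, None, 1] * ub[None, :, 0]
    return dists, cos, sin

# Hypothetical scene: two parallel lane lines 3.5 m apart and a stop line across them.
lane_left = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])
lane_right = np.array([[0.0, 3.5], [10.0, 3.5], [20.0, 3.5]])
stop_line = np.array([[20.0, 0.0], [20.0, 3.5]])

d_lr, cos_lr, sin_lr = association_representation(lane_left, lane_right)
_, cos_ls, _ = association_representation(lane_left, stop_line)
print(cos_lr)         # all ones: the lane lines are parallel
print(cos_ls)         # all zeros: the stop line is perpendicular to the lane
print(np.diag(d_lr))  # 3.5 everywhere: the lane width

# Storage estimate with 1000 points and 4-byte (32-bit) floats.
n_points = 1000
bytes_per_value = 4
all_pairs_angles = (n_points ** 2) ** 2 * bytes_per_value  # angles between all point-pair offsets
within_element_angles = n_points ** 2 * bytes_per_value    # angles between within-element offsets
print(all_pairs_angles / 1e12, "TB versus", within_element_angles / 1e6, "MB")
```

Storing cosine and sine instead of the angle itself is a choice made in this sketch; it keeps the example free of inverse trigonometric functions.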

To optimize the representation of geometric shapes and associations, we adopt the simplest approach: directly compute the geometric representation of both predictions and labels, then use the norm of their difference as the optimization target. The loss compares the lengths and angles computed from the labels with the lengths and angles computed from the predictions.

One trick is used when handling included angles: computing the angle directly involves a discontinuous arctan function, which causes difficulties during optimization (there is a vanishing-gradient problem near ±90 degrees), so what we actually compare are the cosine and sine of the included angle. The loss is also robust to rotation and translation.
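A minimal sketch of such a loss is given below. The text does not say which norm is used, so the L1 (mean absolute) difference here is an assumption, as are the function names `lengths_and_turns`, `shape_loss`, and `rigid` and the equal weighting of the length and angle terms. Cosine and sine are obtained from dot and cross products of unit offsets, so no arctan is evaluated.

```python
import numpy as np

def lengths_and_turns(points):
    """Offset lengths plus cosine and sine of the angle between adjacent offsets."""
    pts = np.asarray(points, dtype=float)
    offsets = np.diff(pts, axis=0)
    lengths = np.linalg.norm(offsets, axis=1)
    units = offsets / lengths[:, None]
    first, second = units[:-1], units[1:]
    cos = np.sum(first * second, axis=1)
    sin = first[:, 0] * second[:, 1] - first[:, 1] * second[:, 0]
    return lengths, cos, sin

def shape_loss(pred, label):
    """Assumed L1 loss on lengths and on the cosine/sine of the included angles."""
    len_p, cos_p, sin_p = lengths_and_turns(pred)
    len_l, cos_l, sin_l = lengths_and_turns(label)
    length_term = np.abs(len_p - len_l).mean()
    angle_term = (np.abs(cos_p - cos_l) + np.abs(sin_p - sin_l)).mean()
    return length_term + angle_term

def rigid(points, theta, shift):
    """Rotate by theta (radians) and then translate by shift."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s], [s, c]])
    return np.asarray(points, dtype=float) @ rotation.T + np.asarray(shift, dtype=float)

# Hypothetical L-shaped label, a rigidly moved copy, and a prediction with a wrong corner.
label = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0]])
rigid_copy = rigid(label, theta=0.5, shift=[5.0, 5.0])
bent = np.array([[0.0, 0.0], [2.0, 0.0], [3.0, 1.0]])

loss_rigid = shape_loss(rigid_copy, label)   # close to zero: same shape
loss_bent = shape_loss(bent, label)          # clearly positive: different shape
abs_l1 = np.abs(rigid_copy - label).mean()   # large: absolute coordinates differ
print(loss_rigid, loss_bent, abs_l1)
```

The rigidly moved copy incurs a large absolute-coordinate error but essentially zero geometric loss, while the prediction with the wrong corner is penalised.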

Geometric decoupling attention

The architecture adopted by MapTR, PivotNet, and others maps each point on a map element to one Transformer query. The problem with this architecture is that it does not distinguish between the two major categories of geometric properties.

In self-attention, all queries (that is, "points") interact equally with each other. However, the shape of a map element corresponds to one group of queries. Interaction across groups becomes a burden when perceiving the shape of an element; conversely, when perceiving the relationships between elements, shape becomes a redundant factor. This means that decoupling the perception of shape and association may lead to better results.

To decouple shape and association processing, we adopt a two-step self-attention process:

Each map element consists of a group of queries, and attention is performed within this group to process the shape of the element
Complementary attention across elements is then performed to process geometric associations

Geometric decoupling attention is illustrated in the figure below. Our implementation is relatively simple and directly uses masks to control the scope of attention. Since these two types of attention are complementary, with a reasonable implementation the time complexity may be equivalent to performing a single self-attention.

Figure 4. Geometric decoupling attention.

The left side shows shape attention within a single element; the right side shows association attention between elements.
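The mask-based decoupling can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual GeMap code: it uses a single attention head without learned projections, treats the association mask as the strict complement of the shape mask, and needs at least two elements so that every query has something to attend to in the second step. The names `decoupled_masks` and `masked_self_attention` are hypothetical.

```python
import numpy as np

def decoupled_masks(num_elements, points_per_element):
    """Boolean masks (True = may attend) for shape and association attention."""
    element_id = np.repeat(np.arange(num_elements), points_per_element)
    same_element = element_id[:, None] == element_id[None, :]
    shape_mask = same_element    # queries attend only within their own element
    assoc_mask = ~same_element   # queries attend only to other elements
    return shape_mask, assoc_mask

def masked_self_attention(x, mask):
    """Single-head self-attention without learned projections, restricted by mask."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights

# Hypothetical setup: 3 map elements, 4 point queries each, 8-dimensional features.
shape_mask, assoc_mask = decoupled_masks(num_elements=3, points_per_element=4)
queries = np.random.default_rng(0).normal(size=(12, 8))

after_shape, w_shape = masked_self_attention(queries, shape_mask)      # step 1: shape
after_assoc, w_assoc = masked_self_attention(after_shape, assoc_mask)  # step 2: association

print(shape_mask.sum(), assoc_mask.sum())  # 48 and 96 allowed query pairs out of 144
```

Together the two masks cover every query pair exactly once, which is consistent with the remark that the combined cost can be comparable to a single full self-attention.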

Experimental results

We conducted extensive experiments on the nuScenes and Argoverse 2 datasets. Both are commonly used large-scale autonomous driving datasets, and both provide map annotations.

Main results

We conducted three sets of experiments on nuScenes. First, we used a relatively pure combination of objective functions, including only the geometric loss and other necessary losses (such as point-to-point distance, edge direction, and classification). This combination aims to highlight the value of the geometric properties we propose without overly pursuing SOTA results. The results show that our method improves mAP over MapTR in this setting. To explore the limits of GeMap, we also added some auxiliary objectives, including segmentation and depth estimation. In this setting we also achieved SOTA results with an mAP improvement. It is worth noting that this improvement does not require sacrificing much inference speed. Finally, we also tried introducing additional LiDAR input. With this additional modality, the performance of GeMap improved further.

Similarly, on the Argoverse 2 dataset, our method also achieved outstanding results.

Ablation experiments

Ablation experiments on nuScenes further demonstrate the value of the geometric loss and geometric decoupling attention. Interestingly, as we expected, using the geometric loss directly leads to a decrease in model performance. We believe this is because the structural coupling of shape and association processing makes it difficult for the model to optimize the geometric representation; once combined with geometric decoupling attention, the geometric loss plays its due role (from "Euclidean Loss" to "Full").

More results

In addition, we performed a visual analysis on nuScenes. The visualization results show that GeMap is not only robust to rotation and translation but also shows certain advantages in handling occlusion, as shown in the figure below. Challenging map elements are marked with orange boxes in the figure.

Figure 5. Visual comparison results.

In experiments on rainy-day scenes, we also quantitatively verified the robustness to occlusion (see the table below), since rain naturally occludes the camera.

This can be explained by the model having learned geometric properties, which lets it better infer map elements even under occlusion. For example, if the model understands the shape of lane lines, it only needs to "see" part of a line to estimate the rest; if it understands the parallel relationship between lane lines, or the characteristic lane width, then even when one line is occluded it can infer the occluded part from the parallel relationship and the width.

Summary

We pointed out the geometric properties of map elements and their value for online vectorized high-precision map construction. Based on this, we proposed a strong method that provides initial verification of this value. In addition, GeMap's robustness to occlusion suggests that geometric properties might also help handle occlusion in other autonomous driving tasks (such as detection and occupancy prediction), because both vehicles and roads have relatively standardized geometric properties. Of course, our method leaves much to explore. For example, can geometric elements of different complexity be adaptively described using different numbers of points? Is it possible to understand the geometric representation from a probabilistic perspective and make it more robust to noise? Since we have simplified the element association, is there a better representation of geometric association? These are all directions for further optimization.

Original link: https://mp.weixin.qq.com/s/BoxlskT68Kjb07mfwQ7Swg
