Can self-driving cars signal their intentions to pedestrians?
Deciding whether a wide road can be crossed safely depends on social cues and cooperative communication between pedestrians and drivers. So what happens when the car is driving itself? Self-driving car company Motional believes that making vehicles more expressive may be the key to preserving these important signals.
While waiting at a crosswalk, Paul Schmitt, Motional's chief engineer, experienced what he calls "a dance with a glance": a quick, almost subconscious assessment. Where are the drivers of the oncoming cars looking? Did they notice him? "With autonomous vehicles, half of those interactions don't exist," Schmitt said. "So what cues are there for pedestrians to understand the vehicle's intentions?"
To answer this question, his team hired animation studio CHRLX to build a highly realistic VR experience designed to test pedestrian reactions to various signaling mechanisms. The results were published in IEEE Robotics and Automation Letters. Schmitt and his team report that exaggerated driving maneuvers, such as braking early or stopping well short of the pedestrian, are the most effective ways for a vehicle to communicate its intentions.
The company is now integrating the most promising expressive behaviors into its motion-planning system and has also open-sourced its virtual-reality traffic environment so that other teams can experiment with it.
The study tested several expressive behaviors that implicitly signal to pedestrians that the vehicle is stopping for them: braking harder and farther from the crosswalk than the baseline, stopping a full car's length away, adding hard-braking and low-rpm sounds, and finally combining those sounds with an exaggerated nose dive, as if the vehicle were braking hard.
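The article does not describe how these behaviors were encoded, but the stopping profiles listed above can be sketched as a simple configuration. Everything below, including the `StoppingProfile` class, its field names, and the numeric values, is an illustrative assumption and not Motional's actual code or the parameters used in the study.

```python
# Illustrative sketch only: names and values are assumptions, not from the paper.
from dataclasses import dataclass


@dataclass
class StoppingProfile:
    """One expressive stopping behavior to compare against a baseline."""
    name: str
    peak_decel_mps2: float   # how hard the vehicle brakes
    stop_offset_m: float     # distance short of the crosswalk at the final stop
    brake_sound: bool        # audible hard-braking / low-rpm cue
    nose_dive_gain: float    # exaggeration of forward pitch under braking (1.0 = none)


PROFILES = [
    StoppingProfile("baseline",         peak_decel_mps2=2.0, stop_offset_m=0.0, brake_sound=False, nose_dive_gain=1.0),
    StoppingProfile("early_hard_brake", peak_decel_mps2=3.5, stop_offset_m=0.0, brake_sound=False, nose_dive_gain=1.0),
    StoppingProfile("short_stop",       peak_decel_mps2=2.0, stop_offset_m=4.5, brake_sound=False, nose_dive_gain=1.0),
    StoppingProfile("with_sound",       peak_decel_mps2=3.5, stop_offset_m=4.5, brake_sound=True,  nose_dive_gain=1.0),
    StoppingProfile("sound_nose_dive",  peak_decel_mps2=3.5, stop_offset_m=4.5, brake_sound=True,  nose_dive_gain=1.6),
]
```

Parameterizing the conditions this way makes it easy to drive both the VR playback and the later analysis from the same list of profiles.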
The team measured how quickly pedestrians decided to cross, and after each trial gave them a short survey asking how safe they felt, how confident they were in their decision to cross, and how well they understood the car's intentions. The short stop scored highest for perceived safety and for understanding of the car's intentions.
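Here is a minimal sketch of the kind of per-trial analysis described above, assuming each trial records a crossing-decision time and three post-trial ratings. The field names, the rating scale, and the aggregation are hypothetical and not taken from the paper.

```python
# Hypothetical per-trial analysis sketch; field names and scale are assumptions.
from statistics import mean
from typing import NamedTuple


class Trial(NamedTuple):
    condition: str           # e.g. "baseline", "short_stop"
    decision_time_s: float   # time until the pedestrian started to cross
    felt_safe: int           # post-trial survey ratings, e.g. on a 1-7 scale
    confident: int
    understood_intent: int


def summarize(trials):
    """Average crossing-decision time and survey ratings per condition."""
    by_condition = {}
    for trial in trials:
        by_condition.setdefault(trial.condition, []).append(trial)
    return {
        cond: {
            "n": len(ts),
            "mean_decision_time_s": mean(t.decision_time_s for t in ts),
            "mean_felt_safe": mean(t.felt_safe for t in ts),
            "mean_confident": mean(t.confident for t in ts),
            "mean_understood_intent": mean(t.understood_intent for t in ts),
        }
        for cond, ts in by_condition.items()
    }
```

Feeding a list of such trial records into `summarize` yields per-condition means of the same kind of measures the study compares across behaviors.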
Schmitt said the short stop's strong showing was not surprising, since the maneuver was inspired by how human drivers slow down for pedestrians. What was surprising, he added, was how little the responses to the baseline scenario differed with or without a visible driver, suggesting that pedestrians were paying more attention to the vehicle's movement than to the person behind the wheel.