Executive Summary
Conventional metrics for evaluating LiDAR systems designed for autonomous driving are problematic because they often fail to adequately or explicitly address real-world scenarios. Therefore, AEye, the developer of iDAR™ technology, proposes a number of new metrics to better assess the safety and performance of advanced automotive LiDAR sensors.
In Part 1 of this series, two metrics (frame rate and fixed [angular] resolution over a fixed Field-of-View) were discussed in relation to the more meaningful metrics of object revisit rate and instantaneous (angular) resolution. Now in Part 2, we’ll explore the metrics of detection range and velocity, and propose two new corresponding metrics for consideration: object classification range and time to true velocity.
Introduction
How is the effectiveness of an autonomous vehicle’s perception system measured? Performance metrics matter because they ultimately determine how designers and engineers approach problem-solving. Defining problems accurately makes them easier to solve, saving time, money, and resources.
When it comes to measuring how well automotive LiDAR systems perceive the space around them, manufacturers commonly agree that it's valuable to determine detection range. To optimize safety, the on-board computer system should detect obstacles as far ahead as possible, and the speed with which it can do so theoretically determines whether control systems can plan and perform timely, evasive maneuvers. However, AEye believes that detection range is not the most important measurement in this scenario. Ultimately, it's the control system's ability to classify an object (here we refer to low-level classification, e.g., a blob plus dimensionality) that enables it to decide on a basic course of action.
What matters most, then, is how quickly an object can be identified and classified, and how quickly a decision can be made about it so that an appropriate response can be calculated. In other words, it is not enough simply to quantify the distance at which a potential object can be detected at the sensor. One must also quantify the latency from the actual event to the sensor detection, plus the latency from the sensor detection to the CPU decision.
Similarly, the conventional metric of velocity has limitations. Today, some lab prototype frequency modulated continuous wave (FMCW) LiDAR systems can determine the radial velocity of nearby objects by interrogating them continuously for a period of time sufficient to observe a discernible change in position. However, this has two disadvantages: 1) the beam must remain locked on a fixed position for a certain period of time, and 2) only velocity in the radial direction can be discerned. Lateral velocity must still be calculated with the standard change-in-position method. Exploring these disadvantages illustrates why, to achieve the highest degree of safety, time to true velocity is a much more useful metric. In other words: how long does it take a system to determine the velocity, in any direction, of a newly identified or appearing object?
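To make the distinction concrete, the sketch below (illustrative Python with example numbers, not AEye's implementation) estimates a full velocity vector from two successive position detections, and contrasts it with the radial component that is all a single FMCW interrogation can observe:

```python
import numpy as np

def true_velocity(p0, p1, dt):
    """Estimate the full velocity vector from two successive position
    detections (the standard change-in-position method).
    p0, p1: (x, y) positions in meters; dt: elapsed time in seconds."""
    return (np.asarray(p1) - np.asarray(p0)) / dt

def radial_velocity(p, v):
    """Project a velocity vector onto the line of sight to the sensor,
    the only component a single FMCW interrogation can observe."""
    los = np.asarray(p) / np.linalg.norm(p)  # unit line-of-sight vector
    return np.dot(v, los)

# Example: a car 50 m ahead cuts across the lane at 15 m/s laterally
# while closing at 5 m/s radially.
v = true_velocity((0.0, 50.0), (1.5, 49.5), dt=0.1)
print(v)                             # [15. -5.] -> full velocity vector
print(radial_velocity((0, 50), v))   # ~ -5.0  -> radial component only
```

In this framing, time to true velocity is effectively the dt required to obtain that second, confident detection of the same object.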
Both object classification range and time to true velocity are more relevant metrics for assessing what a LiDAR system can and should achieve in tomorrow’s autonomous vehicles. In this white paper, we examine how these new metrics better measure and define the problems solved by more advanced LiDAR systems, such as AEye’s iDAR (Intelligent Detection and Ranging).
Conventional Metric #1: Detection Range
A single point detection (where the LiDAR registers one detect on a new object or person entering the scene) is indistinguishable from noise. Therefore, we will use a common industry definition of detection that involves persistence across adjacent shots within a frame and/or across frames. For example, we might require 5 detects on an object per frame (5 points at the same range) and/or from frame to frame (a single related point in 5 consecutive frames) to declare that a detection is a valid object.
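As an illustration only (the thresholds below are example values, not an industry standard), such a persistence rule across frames might look like this:

```python
from collections import deque

class PersistenceFilter:
    """Illustrative N-of-M detection rule: declare a valid object only
    after `n_required` related detects within a sliding window of
    `m_frames` consecutive frames."""

    def __init__(self, n_required=5, m_frames=5):
        self.n_required = n_required
        self.history = deque(maxlen=m_frames)

    def update(self, detected_this_frame: bool) -> bool:
        """Record one frame's result; return True once persistence is met."""
        self.history.append(detected_this_frame)
        return sum(self.history) >= self.n_required

f = PersistenceFilter(n_required=5, m_frames=5)
for frame in range(5):
    valid = f.update(True)  # one related detect arrives each frame
print(valid)                # True only after 5 consecutive frames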
It is a widely held belief that a detection range of 200+ meters is required for vehicles traveling at highway speeds to react effectively to changing road conditions and surroundings. Conventional LiDAR sensors scan and collect data about the occupancy grid in a uniform pattern, without discretion. This forms part of a constant stream of gigabytes of data sent to the vehicle's on-board controller in order to detect objects, a design that puts a massive strain on resources: anywhere from 70 to 90+ percent of the data is redundant or useless, which means it is discarded.
Under these conditions, even a system that operates at a 10-30 Hz frame rate will struggle to deliver low latency while supporting high frame rates and high performance. And if the latency for newly appearing objects is even 0.25 seconds, the frame rate hardly matters: by the time the data reaches the central compute platform, in some circumstances it is practically worthless. On the road, driving conditions can change dramatically in a tenth of a second. After 0.1 seconds, two cars closing at a combined speed of 200 km/h are 18 feet (about 5.6 meters) closer. While predictive algorithms counter this latency well in structured, well-behaved environments, there are several cases where they do not. One such case is the fast, "head-on" approaching small object. Here, a newly appearing object presents "head-on" as a single LiDAR point, and it requires N consecutive single-point detects before it can be classified as an object. In this example, it is easy to see that detection range and object classification range are two vastly different things.
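These figures are easy to verify. The short calculation below (illustrative) converts a stated closing speed and latency into distance lost:

```python
def closing_distance(speed_kmh: float, latency_s: float) -> float:
    """Distance (m) two vehicles close during a given latency,
    at the stated combined closing speed."""
    return speed_kmh / 3.6 * latency_s

print(closing_distance(200, 0.10))  # ~5.6 m (about 18 feet) in 0.1 s
print(closing_distance(200, 0.25))  # ~13.9 m lost to a 0.25 s latency
```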
With a variety of factors influencing the domain controller's processing speed, measuring the efficacy of a system by its detection range alone is problematic. Without knowledge of latency and other pertinent factors, unwarranted trust is placed in the controller's ability to manage competing priorities. And while it is generally assumed that LiDAR manufacturers need not know or care how the domain controller classifies (or how long classification takes), we propose that this assumption ultimately leaves designers vulnerable to very dangerous situations.
AEye’s Metric
Object Classification Range
Currently, classification takes place somewhere in the domain controller. It is at this point that objects are labeled as such and, eventually, identified more precisely. At some level of identification, this data is used to predict known behavior patterns or trajectories. Because classification is so important, AEye argues that a better measurement for assessing an automotive LiDAR's capability is its object classification range. This metric reduces the unknowns, such as the latency associated with noise suppression (e.g., N of M detections), early in the perception stack, pinpointing the salient question of whether a LiDAR system can operate at optimal safety.
Because automotive LiDAR is a relatively new field, there is not yet a standard definition of how much data is necessary for classification. AEye therefore proposes that adopting the perception standards used in video classification provides a valuable provisional definition. Under video standards, classification becomes possible starting with a 3×3 pixel grid on an object. By this definition, an automotive LiDAR system might be assessed by how fast it can generate a high-quality, high-resolution 3×3 point cloud that enables the domain controller to comprehend objects and people in a scene.
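As a rough illustration of how such a metric could be computed (assuming the small-angle approximation and example resolution figures, not the specification of any particular sensor):

```python
import math

def classification_range(obj_w_m, obj_h_m, res_h_deg, res_v_deg, grid=3):
    """Farthest range (m) at which an object still subtends a
    `grid` x `grid` sample pattern, given fixed angular resolution.
    Small-angle approximation: samples across ~= size / (R * dtheta)."""
    r_h = obj_w_m / (grid * math.radians(res_h_deg))
    r_v = obj_h_m / (grid * math.radians(res_v_deg))
    return min(r_h, r_v)  # the coarser axis limits classification

# A 0.5 m x 0.5 m obstacle, with illustrative 0.1 deg horizontal and
# 0.5 deg vertical resolution:
print(classification_range(0.5, 0.5, 0.1, 0.5))  # ~19 m, vertically limited
```

Note that in this example it is the coarser vertical resolution, not the advertised horizontal resolution, that limits the classification range, which foreshadows the density imbalance discussed next.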
Generating a 3×3 point cloud is a struggle for conventional LiDAR systems. While many tout an ability to produce point clouds of half a million or more points per second, these images lack uniformity. Point clouds created by most LiDAR systems exhibit dense horizontal lines coupled with very coarse vertical spacing, or else low overall density. Either way, these fixed angular sampling patterns can be difficult for classification routines because the domain controller must grapple with half a million points per second that are, in many cases, out of balance with the resolution required for critical sampling of the object in question. Such an uneven mish-mash of points forces the controller to do additional interpretation, putting extra strain on CPU resources.
A much more efficient approach is to gather about 10 percent of this data, focusing on special Regions of Interest (e.g., moving vehicles and pedestrians) while keeping tabs on the background scene (trees, parked cars, buildings, etc.). Collecting only the salient data in the scene significantly speeds up classification. AEye's agile iDAR is a LiDAR system integrated with AI that can intelligently concentrate shots in a Region of Interest (ROI). This comes from its ability to selectively revisit a point within tens of microseconds, an improvement of three orders of magnitude over conventional 64-line systems, which can only hit an object once per frame (every 100 milliseconds). Future white papers will discuss various methods of using iDAR to ensure that important background information is not discounted, by correctly employing the concepts of Search, Acquisition, and Tracking, much as humans perceive.
In summary, low-level object detection can be moved to the sensor level by, for example, triggering a dense 3×3 voxel grid every time a significant detection occurs, more or less in real time. This happens before the data is sent to the central controller, allowing for higher instantaneous resolution than a fixed-pattern system can offer and, ultimately, better object classification range under the video detection analogy.
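A minimal sketch of this idea follows, assuming a hypothetical schedule_shot() sensor-control hook rather than any actual iDAR API: on a significant single-point detect, a dense 3×3 grid of follow-up shots is queued around it within the same frame.

```python
def foveate(detect_az, detect_el, schedule_shot, spacing_deg=0.1):
    """On a significant detect, queue a dense 3x3 grid of follow-up
    shots centered on it, so low-level object confirmation happens at
    the sensor before data reaches the domain controller.
    `schedule_shot(az, el)` is a hypothetical sensor-control hook."""
    for d_az in (-spacing_deg, 0.0, spacing_deg):
        for d_el in (-spacing_deg, 0.0, spacing_deg):
            schedule_shot(detect_az + d_az, detect_el + d_el)

# Example: print the 9 shot directions around a detect at (2.0, -1.0) deg
foveate(2.0, -1.0, lambda az, el: print(f"shot at ({az:.1f}, {el:.1f}) deg"))
```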
Real-World Applications: Imagine that an autonomous vehicle is driving on a desolate highway. Ahead, the road appears empty. Suddenly, the sensor per..