Rethinking the Three “Rs” of LiDAR: Rate, Resolution and Range

Extending Conventional LiDAR Metrics to Better Evaluate Advanced Sensor Systems

Executive Summary

As the autonomous vehicle market matures, sensor and perception engineers have become increasingly sophisticated in how they evaluate system efficiency, reliability, and performance. Many industry leaders have recognized that the conventional metrics for LiDAR data collection (such as frame rate, full-frame resolution, and detection range) no longer adequately measure how effectively a sensor solves the real-world use cases that underlie autonomous driving.

First generation LiDAR sensors passively search a scene and detect objects using background patterns that are fixed in both time (no ability to enhance with a faster revisit) and in space (no ability to apply extra resolution to high interest areas like the road surface or intersections).

A new class of advanced solid-state LiDAR sensors enables intelligent information capture that expands their capabilities, moving from passive “search” or detection of objects to active search and, in many cases, to the real-time acquisition of classification attributes of objects.

Because early generation LiDARs used fixed raster scans, the industry was forced to adopt overly simplistic performance metrics that did not capture all the nuances of the sensor requirements needed to enable AVs. In response, AEye, the developer of iDAR technology (which includes agile LiDAR), is proposing three corresponding metrics that extend LiDAR evaluation: extending the metric of frame rate to include intra-frame object revisit rate; expanding resolution to capture instantaneous enhanced resolution; and enhancing detection range to reflect the more critically important object classification range.

We propose that these new metrics be used in conjunction with existing measurements of basic camera and passive LiDAR performance, because they capture a sensor’s ability to intelligently enhance perception and therefore create a more complete evaluation of a sensor system’s efficacy in improving safety and performance in real-world scenarios.


Introduction

We have often found it useful to leverage proven frameworks from advanced robotic vision research and apply them to LiDAR-specific product architecture. One that has proven to be both versatile and instructive has been work around object identification that connects search, acquisition (or classification) and action.

  • Search is the ability to detect any and all objects without the risk of missing anything.
  • Acquire is defined as the ability to take a search detection and enhance the understanding of an object’s attributes to accelerate classification and determine possible intent (this could be by calculating velocity or by classifying object type).
  • Act defines an appropriate sensor response as trained or as recommended by the vehicle’s perception system or domain controller. Responses largely fall into four categories (see the sketch after this list):
    • Continue scan for new objects (no enhanced information needed)
    • Continue scan but also interrogate the object further and gather more information on an acquired object’s attributes to enable classification
    • Continue scan but also continue to track an object classified as currently non-threatening
    • Continue scan while the control system takes evasive action.
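
To make these four categories concrete, here is a minimal Python sketch that encodes them as an enumeration with a toy decision rule. The names (`ActResponse`, `choose_response`) and the decision logic are illustrative assumptions for this paper, not part of any actual perception stack.

```python
from enum import Enum, auto

class ActResponse(Enum):
    """The four broad response categories listed above (names are illustrative)."""
    CONTINUE_SCAN = auto()        # keep scanning for new objects; no enhanced information needed
    INTERROGATE_FURTHER = auto()  # keep scanning, but gather more attributes to enable classification
    TRACK_NON_THREAT = auto()     # keep scanning, but continue tracking a non-threatening object
    EVASIVE_ACTION = auto()       # keep scanning while the control system takes evasive action

def choose_response(detected: bool, classified: bool, threatening: bool) -> ActResponse:
    """Toy decision rule mapping acquisition results onto a response category."""
    if not detected:
        return ActResponse.CONTINUE_SCAN
    if not classified:
        return ActResponse.INTERROGATE_FURTHER
    return ActResponse.EVASIVE_ACTION if threatening else ActResponse.TRACK_NON_THREAT
```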

Within this framework, performance specifications and system effectiveness need to be assessed with an “eye” firmly on the ultimate objective: completely safe operation of the vehicle. However, as most LiDAR systems today are passive, they are only capable of basic search. Therefore, conventional metrics used for evaluating these systems’ performance relate to basic object detection capabilities – frame rate, resolution, and detection range. If safety is the ultimate goal, then search needs to be more intelligent and acquisition (and classification) done more quickly and accurately so that the sensor or the vehicle can determine how to act immediately.

Rethinking the Metrics

Makers of automotive LiDAR systems are frequently asked about their frame rate, and whether or not their technology can detect objects with 10 percent reflectivity at some range (often 230 meters). We believe these benchmarks are required but insufficient, as they don’t capture critical details such as the size of the target, the speed at which it needs to be detected and recognized, or the cost of collecting that information. We believe it would be productive for the industry to adopt a more holistic approach when it comes to assessing LiDAR systems for automotive use. Additionally, we argue that we must look at metrics as they relate to the perception system as a whole – rather than to an individual point sensor – and ask ourselves: “What information would enable a perception system to make better, faster decisions?” Below, we have outlined the three conventional LiDAR metrics and a recommendation on how to extend each of them.

Conventional Metric #1: Frame rate of 10Hz – 20Hz

Check!

New Metric: Object Revisit Rate
(The time between two shots at the same point or set of points)

Defining single-point detection range alone is insufficient because a single interrogation point (shot) rarely delivers sufficient confidence – it is only suggestive. Therefore, passive LiDAR systems need multiple interrogations/detects at the same point, or multiple interrogations/detects on the same object, to validate an object or scene. The time it takes to detect an object depends on many variables, such as distance, interrogation pattern and resolution, reflectivity, and the shape of the object being interrogated, and can “traditionally” take several full frames to achieve.

A key factor that is missing from the conventional metric is a finer definition of time. Thus, we propose Object Revisit Rate as a new, more refined metric for automotive LiDAR, because an agile LiDAR, such as AEye’s iDAR, can revisit an object within the same frame. The time between the first measurement of an object and the second is critical, as shorter object revisit times help keep processing times low for advanced algorithms that need to correlate between multiple moving objects in a scene. The best algorithms used to associate/correlate multiple moving objects can be confused when many objects are in the scene and the time elapsed between samples is high. This lengthy combined processing time is a primary issue for the industry.

The agile AEye iDAR platform accelerates revisit rate by allowing for intelligent shot scheduling within a frame. Not only can iDAR interrogate a position or object multiple times within a conventional frame, it can maintain a background search pattern while overlaying additional intelligent shots within the same frame. For example, an iDAR sensor can schedule two repeated shots on a point of interest in quick succession (30ms). These multiple interrogations can then be contextually integrated with the needs of the user (either human or computer) to increase confidence, reduce latency, or extend ranging performance.

These interrogations can also be data dependent. For example, an object can be revisited if a low-confidence detection occurs and it is desirable to quickly validate or reject it with a secondary measurement, as seen in Figure 1. A typical full frame rate for conventional sensors is approximately 10Hz, or 100 msec per frame; for such sensors, this is also equivalent to the “object revisit rate.” With AEye’s flexible iDAR technology, the object revisit rate is decoupled from the frame rate and can be as low as 10s of microseconds between revisits to key points/objects, as the user/host requires – easily 100x to 1000x faster than alternative fixed-scan sensors.
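
As a rough sanity check on that speedup claim, the short calculation below compares a frame-rate-bound revisit interval with an assumed intra-frame revisit interval. The 100 µs value is an illustrative figure within the “10s of microseconds” to milliseconds range cited above, not a measured specification.

```python
# Rough comparison of revisit intervals (illustrative values, not measured specs).
conventional_frame_rate_hz = 10
conventional_revisit_s = 1 / conventional_frame_rate_hz   # 100 ms: revisit interval == frame period

agile_revisit_s = 100e-6                                   # assumed intra-frame revisit of ~100 microseconds

speedup = conventional_revisit_s / agile_revisit_s
print(f"Conventional revisit interval: {conventional_revisit_s * 1e3:.0f} ms")
print(f"Agile intra-frame revisit:     {agile_revisit_s * 1e6:.0f} us")
print(f"Speedup:                       ~{speedup:.0f}x")   # ~1000x with these assumed numbers
```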

Figure 1. Advanced agile LiDAR sensors enable intelligent scan patterns such as the “Foveation in Time” intra-frame revisit interval and random scan pattern of iDAR (B), compared to the revisit interval of a typical fixed-pattern LiDAR (A)

What this means is that a perception engineering team using dynamic object revisit capabilities can create a perception system that is at least an order of magnitude faster than what can be delivered by conventional LiDAR, without disrupting the background scan patterns. We believe this capability is invaluable in delivering Level 4/5 autonomy, as the vehicle will need to handle highly complex corner cases, such as identifying a pedestrian next to oncoming headlights or a semi-trailer laterally crossing the path of the vehicle.

Within the “Search, Acquire, and Act” framework, an accelerated object revisit rate, therefore, allows for faster acquisition because it can identify and automatically revisit an object, painting a more complete picture of it within the context of the scene. Ultimately, this allows for collection of object classification attributes in the sensor, as well as efficient and effective interrogation and tracking of a potential threat.

Real-World Applications

Use Case: Head-On Detection

When you’re driving, the world can change dramatically in a tenth of a second. In fact, two cars traveling towards each other at 100 kph are 5.5 meters closer to each other after 0.1 seconds. By having an accelerated revisit rate, we increase the likelihood of hitting the same target with a subsequent shot due to the decreased likelihood that the target has moved significantly in the time between shots. This helps the user solve the “Correspondence Problem” (determining which parts of one “snapshot” of a dynamic scene correspond to which parts of another snapshot of the same scene), while simultaneously enabling the user to quickly build statistical measures of confidence and generate aggregate information that downstream processors might require (such as object velocity and acceleration). The ability to selectively increase revisit rate on points of interest while lowering the revisit rate in sparse areas like the sky can significantly aid higher level inferencing algorithms, allowing perception and path planning systems to more quickly determine optimum autonomous decision making.
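
The arithmetic behind the 5.5-meter figure, and the effect of a shorter revisit interval, can be checked in a few lines; the 1 ms intra-frame revisit below is an assumed example value, not a specification.

```python
# Distance closed between two vehicles approaching head-on at 100 kph each.
speed_kph = 100.0
closing_speed_mps = 2 * speed_kph * 1000 / 3600   # ~55.6 m/s combined closing speed

for revisit_s, label in [(0.1, "10 Hz frame-rate revisit"),
                         (0.001, "1 ms intra-frame revisit (assumed)")]:
    print(f"{label}: gap closes by {closing_speed_mps * revisit_s:.2f} m between samples")
# ~5.56 m per 0.1 s (the ~5.5 m cited above) versus ~0.06 m per 1 ms
```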

Use Case: Lateral Detection

A vehicle entering a scene laterally is the most difficult to track. Even Doppler Radar has a difficult time with this scenario. However, selectively allocating shots to extract velocity and acceleration when detections have occurred (part of the acquisition chain) vastly reduces the required number of shots per frame. Adding a second detection, via iDAR, to build a velocity estimate on each object detection increases the overall number of shots by only 1%, whereas obtaining velocity everywhere with a fixed scan system doubles the required shots (100%, i.e., 2x increase). This speed and shot saliency makes autonomous driving much safer because it eliminates ambiguity and allows for more efficient use of processing resources.
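
A back-of-the-envelope version of that shot-budget argument, using assumed (illustrative) shot and detection counts rather than any real sensor’s numbers:

```python
# Extra shots needed to add a second (velocity) measurement per detected object.
shots_per_frame = 50_000        # assumed background search pattern
detections_per_frame = 500      # assumed object detections per frame (~1% of shots)

fixed_scan_extra = shots_per_frame        # fixed scan: revisit everything -> shot count doubles
selective_extra = detections_per_frame    # agile scan: revisit only actual detections

print(f"Fixed scan:     +{fixed_scan_extra / shots_per_frame:.0%} shots")   # +100%
print(f"Selective scan: +{selective_extra / shots_per_frame:.0%} shots")    # +1%
```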

The AEye Advantage: Whereas other LiDAR systems are limited by the physics of fixed laser pulse energy, fixed dwell time, and fixed scan patterns, iDAR is a software definable system that allows perception, path and motion planning modules to dynamically customize their data collection strategy to best suit their information processing needs at design time and/or run time.

iDAR starts with a unique bore-sighted, non-co-axial design that eliminates parallax between the camera and the LiDAR, bringing it extremely close to solving the correspondence problem. AEye’s software agility then allows iDAR to push the limits of physics in a tailored (as opposed to a static, one-time) fashion. The achievable object revisit rate of AEye’s iDAR system for points of interest (not merely the exact point just visited) is microseconds to a few milliseconds – up to 3000x faster than conventional LiDAR systems, which require many tens or hundreds of milliseconds between revisits. This gives the vehicle the unprecedented ability to calculate valuable attributes such as object velocity (both lateral and radial) faster than any other system, allowing it to act more readily on immediate threats and track them through time and space more accurately.

This ability to define the new metric, Object Revisit Rate, which is decoupled from the traditional “frame rate,” is also important for the next metric we introduce. This second metric helps to segregate basic “search” algorithms from “acquisition” algorithms: two algorithm types that should never be confused. Separating these two basic types of algorithms provides insight into the heart of iDAR, which is the Principle of Information Quality (as opposed to Data Quantity): “more information, less data.”

Conventional Metric #2: Fixed (angular) resolution over a fixed Field-of-View

Check!

New Metric: Instantaneous (Angular) Resolution
(The degree to which a LiDAR sensor can apply additional resolution to key areas within a frame)

The use of resolution as a conventional metric assumes that the Field-of-View will be scanned with a constant pattern and uniform power. This makes perfect sense for less intelligent traditional sensors that have limited or no ability to adapt their collection capabilities. The conventional metric also assumes that the salient information within a scene is uniform in space and time, which we know is not true. Because of these assumptions, conventional LiDAR systems indiscriminately collect gigabytes of data from a vehicle’s surroundings, sending those inputs to the CPU for decimation and interpretation (wherein an estimated 70 to 90 percent of this data is found to be useless or redundant and thrown out). In addition, these systems apply the same level of power everywhere, such that the sky is scanned at the same power as an object directly in the path of the vehicle. It’s an incredibly inefficient process.

As humans, we don’t “take in” everything around us equally. Rather, the visual cortex filters out irrelevant information, such as an airplane flying overhead, while simultaneously (not serially) focusing our eyes on a particular point of interest. Focusing on a point of interest allows other, less important objects to be pushed to the periphery. This is called foveation: the target of our gaze falls on a higher concentration of retinal cones, allowing it to be seen more vividly.

iDAR improves on the visual cortex. Whereas humans typically only foveate on one area, iDAR can do it on multiple areas simultaneously (and in multiple ways) while also maintaining a background scan to assure it never misses new objects. We describe this capability as a Region of Interest (ROI). Furthermore, since humans rely entirely on light from the sun, moon, or artificial lighting, human foveation is receive only, i.e., passive. iDAR, in contrast, foveates on both transmit (regions that the laser light chooses to “paint”) and receive (where/when the processing chooses to focus).

An example of this follows.

Figure 2 below shows two squares, Square A and Square B. Both squares contain a similar number of shot points. Square A represents a uniform scan pattern, typical of conventional LiDAR sensors. These fixed scan patterns produce a fixed frame rate with no concept of an ROI. Square B shows an adjusted, unfixed scan pattern. As we can see, the shots in Square B are gathered more densely within and around the ROI (the small box) within the square. You can also see in Square B that the background scan continues searching to ensure no new objects are missed, while additional resolution is focused on a fixed area to aid acquisition. In essence, it is using intelligence to optimize the use of power and shots.

Looking at the graphs associated with Squares A and B, we also see that the unfixed scan pattern of Square B produces revisits to an ROI within a much shorter interval than Square A. Square B can complete not just one ROI revisit interval but multiple ROI revisits within a single frame, whereas Square A cannot complete even one. iDAR does what conventional LiDAR cannot: it enables dynamic perception, allowing the system to focus in on, and gather more comprehensive data about, a particular Region of Interest at unprecedented speed.

Figure 2. Region of Interest (ROI) Revisit Rate and foveation of iDAR (B) compared to conventional scan patterns (A)
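
A minimal sketch of the idea behind Figure 2: a coarse, uniform background scan with a much denser Region of Interest overlaid on it. The function name, parameter names, and step sizes below are hypothetical, chosen only to illustrate the density contrast, not to describe iDAR’s actual scan engine.

```python
import numpy as np

def foveated_pattern(fov_deg=(120.0, 30.0), background_step_deg=1.0,
                     roi_center=(10.0, 0.0), roi_size_deg=(10.0, 5.0), roi_step_deg=0.2):
    """Toy scan pattern: coarse uniform background plus a dense Region of Interest.

    Returns an (N, 2) array of (azimuth, elevation) shot angles in degrees.
    All parameters and defaults are illustrative assumptions.
    """
    az = np.arange(-fov_deg[0] / 2, fov_deg[0] / 2, background_step_deg)
    el = np.arange(-fov_deg[1] / 2, fov_deg[1] / 2, background_step_deg)
    background = np.array(np.meshgrid(az, el)).reshape(2, -1).T

    raz = np.arange(roi_center[0] - roi_size_deg[0] / 2,
                    roi_center[0] + roi_size_deg[0] / 2, roi_step_deg)
    rel = np.arange(roi_center[1] - roi_size_deg[1] / 2,
                    roi_center[1] + roi_size_deg[1] / 2, roi_step_deg)
    roi = np.array(np.meshgrid(raz, rel)).reshape(2, -1).T

    return np.vstack([background, roi])

pattern = foveated_pattern()
print(pattern.shape)   # coarse background (3,600 shots) plus a ~25x denser ROI (1,250 shots)
```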

Within the “Search, Acquire, and Act” framework, Instantaneous Resolution allows the iDAR system to search for and acquire multiple targets, capturing useful information about them beyond the mere fact that they exist. It can apply intelligent fixed resolution to high-probability areas such as road surfaces and intersections. iDAR also allows for the creation of multiple simultaneous Regions of Interest within a scene, letting the system focus on and gather more comprehensive data about specific objects, interrogate them more completely, and track them more effectively.

Real-World Application

Use Case: Object Interrogation

When objects of interest have been identified, iDAR can “foveate” its scanning to gather more useful information about them and acquire additional classification attributes. For example, let’s say the system encounters a pedestrian jaywalking quickly across the street, directly in the path of the vehicle. Because iDAR enables a dynamic change in both temporal and spatial sampling density within a Region of Interest (Instantaneous Resolution), the system can focus more of its attention on this jaywalker and less on irrelevant information, such as parked vehicles along the side of the road (which iDAR identified long ago and is simply tracking). Regions of Interest allow iDAR to quickly, efficiently, and accurately identify critical information about the jaywalker. The iDAR system provides the most useful, actionable data to the domain controller to help determine the best timely course of action.

We see Instantaneous Resolution being utilized in three primary techniques that address additional use cases (a minimal configuration sketch follows this list).

  1. Fixed Region of Interest (ROI): Today, passive systems can allocate more density at the horizon – a very simple foveation technique driven by their limited control over frequency, placement, and power within a frame. An OEM or Tier 1 will be able to utilize advanced simulation programs to test hundreds (or even thousands) of shot patterns – varying speed, power, and other constraints to identify an optimal pattern that integrates a fixed ROI with higher Instantaneous Resolution to achieve their desired results. For example, in urban environments, threats are more likely to come from the side of the road – car doors opening, pedestrians, cross traffic, etc. ROIs can be defined for both sides of the road at a fixed distance in front of the vehicle, instantly providing superior resolution (both vertical and horizontal) in the area of greatest concern.
  2. Triggered ROI: This is only possible with an intelligent system that can be programmed to accept a trigger. The perception software team may determine that when certain conditions are met, an ROI is generated within the existing scan pattern. For example, a mapping or navigation system might signal that you are approaching an off-ramp or intersection, which generates an appropriately targeted ROI on key areas of the scene with greater detail.
  3. Dynamic ROI: The highest level of intelligence, this is the same technique and methodology deployed by Automatic Targeting Systems (ATS) to continuously interrogate objects of high interest over time. As these objects move closer or further away, the size and density of the ROI is varied. For example, an ambulance or fire truck is detected and classified several hundred meters behind the vehicle, and a Dynamic ROI could be applied to the emergency response vehicle to track its movements.
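
The sketch below shows what requests for these three ROI types might look like from the perception side. The `RegionOfInterest` dataclass, all field names, and all numeric values are hypothetical stand-ins used for illustration; they are not AEye’s API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RegionOfInterest:
    """Hypothetical ROI request: extra resolution and revisit rate over an angular window."""
    az_deg: tuple      # (min, max) azimuth window
    el_deg: tuple      # (min, max) elevation window
    step_deg: float    # denser angular sampling inside the ROI
    revisit_hz: float  # how often to re-interrogate the ROI

# 1. Fixed ROI: chosen at design time, e.g., both road edges ahead of the vehicle.
fixed_rois = [RegionOfInterest((-40, -25), (-5, 2), 0.1, 40.0),
              RegionOfInterest((25, 40), (-5, 2), 0.1, 40.0)]

# 2. Triggered ROI: generated when an external condition fires (e.g., the map reports an off-ramp).
def on_map_event(event: str) -> Optional[RegionOfInterest]:
    if event == "approaching_off_ramp":
        return RegionOfInterest((10, 30), (-3, 3), 0.1, 40.0)
    return None

# 3. Dynamic ROI: re-centered and re-sized each frame as a tracked object moves.
def dynamic_roi(track_az: float, track_el: float, range_m: float) -> RegionOfInterest:
    half_width = max(0.5, 100.0 / range_m)   # the angular window shrinks as the object recedes
    return RegionOfInterest((track_az - half_width, track_az + half_width),
                            (track_el - half_width, track_el + half_width),
                            0.05, 100.0)
```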

The AEye Advantage: A major advantage of iDAR is that it is agile in nature, meaning that its main parameters do not have to be fixed; it can therefore take advantage of concepts like time multiplexing. It can trade off temporal sampling resolution, spatial sampling resolution, and even range against one another, simultaneously, at multiple points within the “frame.” This gives the system tremendous value in perception and lets it do things no other system can.

In a conventional LiDAR system, there is (i) a fixed Field-of-View and (ii) a fixed uniform or patterned sampling density, choreographed to (iii) a fixed laser shot schedule. AEye’s technology allows these three parameters to vary almost independently. This leads to an almost endless stream of potential innovations and will be the topic of a later paper. The agile iDAR system allows the user to dynamically change the angular density over the entire Field-of-View, enabling the robust collection of useful, actionable information.

The term Instantaneous Resolution conveys that resolution is not dictated by physical constraints alone, such as beam divergence or the number of shots per second. Rather, it is determined by starting with a faster, more efficient agile LiDAR and then intelligently optimizing resources. The ability to instantaneously increase resolution is a critical enabler of the next metric we introduce.

Conventional Metric #3: Object Detection Range

Check!

New Metric: Object Classification Range
(Range at which you have sufficient data to classify an object)

When it comes to measuring how well automotive LiDAR systems perceive the space around them, manufacturers commonly agree that it’s valuable to determine their detection range. To optimize safety, the on-board computer system should detect obstacles as far ahead as possible. The speed with which they can do so theoretically determines whether control systems can plan and perform timely, evasive maneuvers (i.e., the 230 meters at 10% reflectivity example cited earlier). However, AEye believes detection range is a required but insufficient measurement in this scenario. Ultimately, it’s the control system’s ability to classify an object (here we refer to low-level classification [e.g., blob plus dimensionality]) that enables it to decide on a basic course of action.

What matters most, then, is how quickly an object can be detected, identified, and classified, a threat-level decision made, and an appropriate response calculated. A single point detection is indistinguishable from noise. Therefore, we will use a common industry definition of detection that involves persistence across adjacent shots per frame and/or across frames. For example, we might require 5 detects on an object per frame (5 points at the same range) and/or from frame to frame (1 related point in each of 5 consecutive frames) to declare a valid object. At 20Hz, that takes 0.25 seconds to define a simple detect.
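
The difference an intra-frame revisit makes to this persistence requirement is easy to quantify. A minimal calculation, assuming one detect per frame for the conventional case and an assumed ~100 µs agile revisit interval (illustrative numbers only):

```python
# Time to accumulate N corroborating detections on a single object.
required_detections = 5

frame_rate_hz = 20.0
frame_by_frame_s = required_detections / frame_rate_hz   # one detect per frame -> 0.25 s

intra_frame_revisit_s = 100e-6                            # assumed agile revisit interval
intra_frame_s = required_detections * intra_frame_revisit_s

print(f"Frame-by-frame persistence: {frame_by_frame_s:.2f} s")      # 0.25 s
print(f"Intra-frame persistence:    {intra_frame_s * 1e3:.1f} ms")  # 0.5 ms
```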

Currently, classification typically takes place in the perception stack. It’s at this point that objects are labeled and, eventually, more clearly identified. This data is used to predict behavior patterns or trajectories. The more classification attributes the sensor can provide, the faster the perception system can confirm them. AEye argues that a better measurement for assessing this critical automotive LiDAR capability is its impact on Object Classification Range. This metric reduces the unknowns – such as latency associated with noise suppression (e.g., N of M detections) – early in the perception stack, pinpointing the salient information.

Because automotive LiDAR is a relatively new field, how much data is necessary for classification has not yet been standardized. Thus, we propose that adopting the perception standards used for video classification provides a valuable provisional definition. According to video standards, classification begins with a 3×3 pixel grid on an object. Under this definition, an automotive LiDAR system might be assessed by how fast it is able to generate a high-quality, high-resolution 3×3 point cloud that enables the perception stack to comprehend objects and people in a scene.
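
The connection between angular resolution and classification range can be illustrated with a simple small-angle estimate: the farthest range at which a 3×3 grid still fits across an object of a given size, for two different angular step sizes. The step sizes and the pedestrian width below are assumed example values, not sensor specifications.

```python
import math

def max_classification_range(object_size_m: float, step_deg: float, points_across: int = 3) -> float:
    """Small-angle estimate of the farthest range at which `points_across` samples
    spaced `step_deg` apart still land on an object of the given size."""
    required_angle_rad = math.radians((points_across - 1) * step_deg)
    return object_size_m / required_angle_rad

pedestrian_width_m = 0.5  # assumed
print(f"0.2 deg background grid: {max_classification_range(pedestrian_width_m, 0.2):.0f} m")   # ~72 m
print(f"0.05 deg ROI:            {max_classification_range(pedestrian_width_m, 0.05):.0f} m")  # ~286 m
```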

Generating a 3×3 point cloud is a struggle for conventional LiDAR systems. While many systems tout an ability to produce point clouds comprised of half a million or more points per second, these images lack uniformity. Fixed angular sampling patterns can be difficult for classification routines because the domain controller has to grapple with half a million points per second that are, in many cases, out of balance with the resolution required for critical sampling of the object in question. Such a skewed “mish-mash” of points requires additional interpretation, putting extra strain on CPU resources.

Figure 3. Packing a dense 3×3 grid around a detect allows the collection of more useful data and greatly speeds up classification. We have left a single detect on a vehicle. Rather than wait for the next frame to resample this vehicle (as is the traditional mode in LiDAR) we instead quickly form a dedicated dense ROI, as indicated in “Scan 2” on the right. This is done almost immediately after the initial single detect, and before completing the next scan.

In summary, one can bring low-level object classification down to the sensor level by employing a dense 3×3 voxel grid every time a significant detection occurs in real time. This happens before the data is sent to the central controller, allowing for higher Instantaneous Resolution than a fixed-pattern system can offer and, ultimately, better object classification ranges when using the video detection range analogy.

Returning to the “Search, Acquire, Act” framework, once we have acquired a target and determined that it is valid (and is potentially encroaching on our planned path), we can allocate more shots to it and take action if need be. Alternatively, if we determine that the target is not an immediate threat, we can more fully interrogate the object for additional classification data or simply track it with a few shots per scan.

Real-World Applications

Use Case: Unprotected Left-Hand Turn

Different objects demand different responses. This is especially true in challenging driving scenarios such as an unprotected left-hand turn – especially when traversing across high-speed, oncoming traffic.

Imagine an autonomous vehicle on a four-lane road with a speed limit of 100kph needing to make an unprotected left-hand turn across two lanes of traffic. In the oncoming traffic, one lane has a motorcycle and the other has a car. In this situation, object classification range is critical, as classifying one of the objects as a motorcycle at sufficient range would indicate that the AV should behave more cautiously in proceeding as motorcycles are capable of traveling at higher speeds and can take more unpredictable paths.

Use Case: School Bus

The fundamental value of being able to classify objects at range is greatest in instances where the identity of the object defines a specific and immediate response from the vehicle. An excellent example of this is encountering a school bus full of children. The faster that object is classified specifically as a school bus, the faster the autonomous vehicle can initiate an appropriate protocol – slowing the vehicle and deploying Instantaneous Resolution ROIs in areas around the school bus to immediately capture any movement of children toward the path of the vehicle. This would enable similarly specific responses for police cars, ambulances, fire trucks, or any vehicle whose presence would require the autonomous vehicle to alter how it interrogates the scene and/or change its speed or path.

The AEye Advantage: LiDAR sensors embedded with AI for perception are very different from those that passively collect data. AEye’s agile system can acquire targets and enable classification in far less time than conventional LiDAR systems require merely to register a detection. With the ability to modulate revisit rate up to 3000x faster within a frame, we no longer focus on detection alone: it is now more important to gauge speed of acquisition (i.e., classification range). This brings to light the difference between detection range and basic object classification range.

If metrics like detection range are to accurately score how LiDAR systems contribute to autonomous vehicle safety, then evaluators should also consider how long it takes these systems to identify hazards. Thus, Object Classification Range is a far more meaningful metric.

Conclusion

In this white paper, we have discussed why reducing the time between object detections is critical. Because multiple detects at the same point/object are required to fully comprehend an object or scene, object revisit rate is a more critical metric for automotive LiDAR than frame rate.

Additionally, we have argued that quantifying (angular) resolution is insufficient. It is important to further quantify instantaneous resolution because intelligent, agile resolution is more efficient and provides greater safety through faster response times, especially when pairing ROIs with convolutional neural networks (a future paper).

Lastly, we have shown the criticality of moving from measuring a basic detection range to measuring how quickly an object can be identified and classified. It is not enough simply to quantify the distance at which a potential object can be detected at the sensor. One must also quantify the latency from the actual event to the sensor detection – plus the latency from the sensor detection to the CPU decision. Under this framework, the more attributes a LiDAR system can provide, the faster a perception system can classify.

The agile iDAR system enables the type of toolkit that reduces latency and bandwidth in a dramatic way. It allows for dynamic “Search, Acquire, Act” functions to be implemented at the sensor level, mimicking the process of human perception: new objects can be detected efficiently, classified with multiple supporting sensors, acted upon if the object is perceived as an immediate threat, fully interrogated for more information about the object, or tracked with real time data. Therefore, equipped with iDAR, an autonomous vehicle can spot hazards sooner and respond more quickly and accurately than other sensor systems. This avoids accidents and gives credibility to the safety promise of self-driving vehicles.

While groundbreaking in their time, earlier LiDAR sensors passively search with scan patterns that are fixed in both time and space. A new generation of intelligent sensors expands these capabilities, moving from passive detection of objects to active search and identification of classification attributes of objects in real time. As perception technology and sensor systems evolve, it’s imperative that the metrics used to measure their capabilities also evolve.

With safety of paramount importance, these metrics should not only indicate what LiDAR systems are capable of achieving, but also how those capabilities bring vehicles closer to optimal safety conditions in real world driving scenarios.

Throughout this series of white papers, AEye will continue to propose new, interconnected metrics that build on each other to help create a more complete and accurate picture of what makes a LiDAR system effective.
