“Truck-truckin’ down the highway
Sittin’ in the cab of
A ten ton machine”
The rock group Bread came up with this song 7 years before the Enterprise was hauled to the Marshall Space Flight Center in Alabama (Figure 1) for its maiden flight. It is highly unlikely that an autonomous truck will do anything like this even a century from now. But bringing autonomy to trucks for logistics and the movement of goods has received increased focus over the past 3 years and is becoming real. According to Richard Bishop, an autonomous trucking expert, the primary driving factors include driver fatigue and health, COVID-19, and the more immediate financial benefits of autonomous trucking (as compared to autonomous cars). Ever since Otto (later acquired by Uber for ~$500M) made the first autonomous trucking run on a Colorado highway to transport 51,744 cans of beer (at ~$10K/can, it must have been excellent!), significant financial investments have been made in start-ups and unicorns like TuSimple, Embark, Kodiak, Gatik, and Plus. Aurora and Waymo (which initially focused on autonomous cars) are aggressively pursuing this opportunity. OEMs like Daimler, Volvo, and Navistar have allocated significant internal resources to make autonomous trucking a reality. Amazon, UPS, and FedEx have joined the battle, since losing here is an existential threat.
The operational and business models for autonomy in trucking and cars are different. The former is focused on moving goods primarily on well-structured and well-mapped routes (in many cases with human drivers swapping trailer sections at appropriate interface points along the route). Between these points, autonomy eliminates the issue of driver fatigue and enables high utilization and duty cycles. It also provides a path for drivers to work locally (higher quality of life and better health), reduces risks in more cluttered, unpredictable, and dense traffic, allows more efficient handling for the loading and unloading of goods, and reduces overall labor costs. For autonomous cars (whether for ride-hailing or consumer use), highway, rural, and urban streets must all be addressed, since the option of a human driver does not exist, and utilization rates may be low or dependent on ride-sharing demand models, population density, and competing transportation options. The difficulty this entails has delayed and stalled autonomy efforts for ride-hailing and consumer cars, as exemplified by Uber’s recent fire sale of its autonomy effort to Aurora.
Perception is an important component of autonomy: it provides situational awareness of static events (road surface, pavements, lane markings, highway signs, traffic lights, construction barriers and zones, accidents, stationary vehicles, railings, parked cars, tire debris, etc.) and dynamic events (pedestrians, animals, other vehicles, emergency vehicles, etc.). This is required, along with knowledge of the current position of the vehicle (localization), to control the vehicle (speed, direction, braking, acceleration, lane changes). All of this is intuitive for experienced human drivers, but non-trivial for sensors and computers.
Does autonomy for trucks pose different perception challenges as compared to cars? The simple answer is yes: trucks differ from cars in weight, maneuverability, stability (higher center of gravity), and design (pneumatic braking, for example). From an obstacle avoidance and lane control perspective, significantly greater stopping distances are required; the longer braking distance in turn mandates longer-range perception. Accurate positioning within a lane is critical due to the size of a truck relative to a car. Perception sensors need to cover larger blind spots, since placement may be constrained to the cab section of the truck (the trailer/container section may be swapped). Typical truck lifetimes are ~1M miles, compared to ~200K miles for cars. This, along with the higher capital cost of a truck, makes it possible to use more expensive, higher-performance perception sensor stacks to guarantee safer operation. This is vital: autonomous trucking cannot afford a disaster like the Uber car accident that destroyed public confidence and significantly delayed the path to autonomous cars.
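To make the stopping-distance argument concrete, here is a back-of-the-envelope sketch in Python. The reaction time and deceleration values are illustrative assumptions, not figures from any of the companies discussed.

```python
# Back-of-the-envelope stopping distance: reaction distance + braking distance.
# All numbers below are illustrative assumptions, not company figures.

def stopping_distance_m(speed_mps: float, reaction_s: float, decel_mps2: float) -> float:
    """Distance covered during reaction time plus kinematic braking distance."""
    return speed_mps * reaction_s + speed_mps ** 2 / (2.0 * decel_mps2)

speed = 65 * 0.44704                 # 65 mph in m/s (~29 m/s)
car = stopping_distance_m(speed, reaction_s=0.5, decel_mps2=7.0)    # assumed car braking
truck = stopping_distance_m(speed, reaction_s=0.5, decel_mps2=3.5)  # assumed loaded truck

print(f"car:   {car:.0f} m")         # ~75 m
print(f"truck: {truck:.0f} m")       # ~135 m, well inside a 250 m sensing range
```

Even with these rough numbers, the truck needs nearly twice the car's stopping distance, which is why a 250 m front-facing perception range keeps recurring as a requirement below.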
Discussions with experts at 4 different autonomous trucking companies provide more insights into some of the specific perception challenges and performance requirements for LiDAR and other sensors, and how these are utilized in the overall goal of providing safe and efficient autonomy.
Kodiak: Andreas Wendel is the VP of Engineering at Kodiak. With a track record of having led the autonomous car perception team at Google (and later Waymo), and now solving complex challenges in autonomous trucking, he is in an authoritative position to compare the relative challenges. At Kodiak, he was instrumental in establishing design principles for the perception system (the Kodiak Driver) that rely on established information theory principles, “with a strong commitment to avoiding rules, heuristics, and shortcuts”. What this really means is that if a particular sensor provides uncertain data, the uncertainty should be acknowledged explicitly and worked through during the training and operational phases (as opposed to approaches that simply use heuristics to force a decision). The uncertainty can be reduced over time as more data is collected and used by the Kodiak Driver. If a high level of uncertainty persists, control actions like reducing speed or changing lanes are initiated.
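The idea of carrying uncertainty forward rather than forcing a decision can be sketched in a few lines. This is a toy illustration of the principle, not Kodiak's implementation; the confidence threshold and the action names are hypothetical.

```python
# Toy sketch: keep per-detection confidence, and let persistent low
# confidence trigger a conservative action instead of forcing a call.
# Threshold and action names are invented for illustration.

from dataclasses import dataclass

@dataclass
class Detection:
    obstacle_range_m: float
    confidence: float          # 0..1, as reported by the sensor/model

def control_action(history, threshold: float = 0.6) -> str:
    """Pick a (hypothetical) action from a short history of detections."""
    if not history:
        return "maintain_speed"
    avg_conf = sum(d.confidence for d in history) / len(history)
    if avg_conf < threshold:
        # Uncertainty persisted across frames: act conservatively.
        return "reduce_speed"
    if min(d.obstacle_range_m for d in history) < 100.0:
        return "brake"
    return "maintain_speed"

recent = [Detection(220.0, 0.4), Detection(215.0, 0.5), Detection(210.0, 0.45)]
print(control_action(recent))  # low average confidence -> "reduce_speed"
```

The point of the sketch is the control flow: uncertainty is a first-class input to the decision, rather than being collapsed away by a heuristic upstream.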
In terms of LiDAR requirements, the 250 m perception range for a front-facing unit is critical, as is the ability of the sensor stack to work reliably in all weather conditions. Since LiDAR performance is impacted by weather, Kodiak relies on radar and cameras to deliver consistent performance even in bad weather. A total of 3 LiDARs are used: a forward-facing, long-range, 1550 nm scanning LiDAR mounted on the roof of the cab, and shorter-range, 360° FoV, 8XX-9XX nm LiDARs (along with cameras) mounted on the rear-view mirrors on either side. Kodiak pays particular attention to how sensors are mounted: specially designed mounts minimize the performance and reliability impact of the higher shock loads typical in a trucking application. High-reliability LiDARs that can hold up to harsh environmental conditions are critical, and what matters is the performance within a trustworthy region under worst-case conditions. Higher resolution within a specified FoV near the horizon could help find roadway debris earlier. Finally, for Kodiak, a close relationship with its LiDAR suppliers is critical so that capabilities can be refined as the Kodiak Driver evolves. The graceful integration of new sensors like LiDAR as they become available is aided by the design of the Kodiak Driver, which is based on fundamental mathematics, making it easier to integrate new detectors. All detectors speak the same language, allowing the system to assimilate information from next-generation detectors without requiring new code. This aspect of the system design gives Kodiak a tremendous advantage in the rapidly evolving truck autonomy industry.
Plus is focused on implementing L4 autonomy (high automation within a well-defined ODD, or Operational Design Domain), connecting hubs on highways and local routes to distribution centers. The company does not expect fully autonomous driving to occur until 2024. Until then, it is launching its autonomous driving system on trucks with human drivers. This will help develop and prove the safety of the autonomy stack over billions of real-world road miles. Once enough experience and confidence has been established, the operation of the truck will transition to complete autonomy.
Tim Daly is the Chief Architect at Plus. One important element of the system is the reliance on internal HD maps, generated on routes that Plus vehicles traverse with human drivers. Mr. Daly believes that using externally procured maps is not optimal from a cost and information-currency perspective. Mapping is done with a combination of GNSS, LiDAR, and cameras. LiDARs are also used for range determination, and 250 m range performance with high resolution and Field of View (FoV) is critical. Frequency Modulated Continuous Wave (FMCW) LiDAR would be ideal since it solves the aperture problem. Specifically, an object tracker using Time-of-Flight or Amplitude Modulated (AM) LiDAR extracts texture or shape and associates these across multiple scans to calculate a motion vector; when there are not enough points or the shape is too smooth, the motion is unobservable. FMCW LiDAR allows motion estimation without this association step since, like radar, it makes direct velocity measurements. The concern is that it is not yet ready for prime time in terms of reliable performance and productization. Flash LiDAR is not required, since a rolling-shutter image (flash provides a global shutter) is not a real concern. However, a potential benefit of a solid-state LiDAR system (flash or otherwise) is higher robustness. One key challenge with current LiDARs in trucking is handling a large FoV at short range and a smaller FoV at long range (with adequate resolution); a stepped FoV in both directions would be a welcome addition to the perception suite. The exciting thing about LiDAR is that, due to its relative immaturity (compared to cameras and radar), improvements are dramatic year over year. The perception stack needs to accommodate these gracefully, and data augmentation approaches that Plus is building into its systems now are crucial to enabling this.
Specifically, this refers to training neural networks on aspects of the LiDAR data that are not specific to a particular LiDAR (the scan pattern, for example). The idea is to randomly perturb the training data to keep the model from fitting aspects that are likely to change as LiDAR architectures evolve.
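A minimal sketch of this kind of LiDAR-agnostic augmentation (illustrative only, not Plus's implementation) might randomly drop and jitter points so a model cannot overfit to one sensor's exact scan pattern or point density:

```python
# Illustrative point-cloud augmentation: random dropout breaks dependence
# on a sensor's point density, and small positional jitter breaks
# dependence on its exact scan pattern. Parameters are invented.

import numpy as np

def augment_point_cloud(points: np.ndarray, rng: np.random.Generator,
                        keep_frac: float = 0.8, jitter_std: float = 0.02) -> np.ndarray:
    """points: (N, 3) array of x, y, z in metres. Returns an augmented copy."""
    keep = rng.random(points.shape[0]) < keep_frac            # random return dropout
    kept = points[keep]
    kept = kept + rng.normal(0.0, jitter_std, kept.shape)     # small positional noise
    return kept

rng = np.random.default_rng(0)
cloud = rng.uniform(-50, 50, size=(10_000, 3))                # synthetic stand-in cloud
aug = augment_point_cloud(cloud, rng)
print(cloud.shape, aug.shape)                                 # fewer, perturbed points
```

In practice the augmentations would be applied on the fly during training, so each epoch sees a differently thinned and jittered version of the same drives.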
Embark is focused on long-haul autonomous trucking on highways, which offer structured traffic, well-mapped roads, lighting, and road signs. The exclusive focus on this use-case allows them to build a technology stack tailored to the specific needs of freight trucking. This includes dealing with the challenges of automating a vehicle with variable weight and variable trailer configuration, operating at highway speeds.
Gilbran Alvarez is the perception lead at Embark, and he explains that the stack uses a combination of sensors (LiDAR, radar, cameras, GPS) to create an accurate understanding of the truck’s position and its surrounding environment. Combining inputs from each sensor type provides redundancy and takes advantage of the strengths of each sensor while mitigating its weaknesses. GPS and IMUs (Inertial Measurement Units) are used to localize the truck in conjunction with visual information from the camera and LiDAR sensors. When GPS signals degrade, dead reckoning is used, based on the last known position and IMU information. LiDAR requirements for trucking applications differ in multiple dimensions. Styling is less important, and there is a large amount of surface area on which multiple sensors can be mounted. Sensor cost is less of an issue given the overall capital cost, lifetime, and duty cycle of a truck. Longer stopping distances (due to higher weight and volume) require longer-range perception. Finally, performance, durability, and reliability under more extreme conditions of shock and vibration need to be addressed. Currently, scanning LiDAR solutions meet the performance requirements demanded by trucking applications. A flash LiDAR solution could be advantageous since it improves reliability and allows mature image processing techniques, developed over the past 30 years, to be applied to LiDAR point clouds. However, such LiDARs do not currently provide the range, resolution, and FoV performance required for trucking automation, and will need to mature before the image processing and reliability advantages discussed above can be exploited. Embark’s sensor suite is custom designed to provide the range needed to operate a truck at highway speeds in varied conditions, including at night and in rain, while providing a view around the entire truck.
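The dead-reckoning fallback described above can be sketched as a simple propagation step. This is the generic textbook form, not Embark's code, and the speeds and rates below are made up for illustration.

```python
# Dead reckoning during a GPS outage: propagate the last known pose
# using speed and yaw rate integrated from IMU/odometry measurements.
# All values are illustrative.

import math

def dead_reckon(x: float, y: float, heading_rad: float,
                speed_mps: float, yaw_rate_rps: float, dt: float):
    """One propagation step from the last known pose."""
    heading_rad += yaw_rate_rps * dt
    x += speed_mps * math.cos(heading_rad) * dt
    y += speed_mps * math.sin(heading_rad) * dt
    return x, y, heading_rad

# Last GPS fix at the origin, then 2 s of outage at 25 m/s, heading east.
x, y, h = 0.0, 0.0, 0.0
for _ in range(20):                      # 20 steps of 0.1 s
    x, y, h = dead_reckon(x, y, h, speed_mps=25.0, yaw_rate_rps=0.0, dt=0.1)
print(round(x, 1), round(y, 1))          # prints 50.0 0.0
```

The weakness, of course, is that IMU errors integrate over time, which is why dead reckoning is only a bridge until GPS (or map-based localization) recovers.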
Close cooperation with sensor partners to design and test hardware capable of withstanding the vibration, weather, and other conditions of long periods of highway driving is critical to reliable sensing. The design and validation process includes hazard risk analysis, component-level testing, and closed-course system-level testing. Ground Penetrating Radar is an emerging technology that could help with localization even in bad weather conditions (when visual sensors like LiDAR and cameras suffer performance degradation). This is something Embark is monitoring, since it could offer an important sensor modality in the future.
Gatik is a Sanskrit word meaning “speed” and “progressive”, and it captures the essence of the company perfectly. The company was founded in 2017 and in 3 years has built an impressive list of corporate customers, most notably Walmart. Gatik focuses on the “middle mile” in a hub-and-spoke model, with Class 2-6 box trucks (not cab-trailer configurations that allow swapping) moving goods between micro-fulfillment centers, warehouses, and stores. This makes the entire perimeter of the truck available for the placement of perception sensors (rather than just the cab area, as in tractor-trailer configurations). The operation is completely autonomous (no transition between autonomous and human drivers).
Arjun Narang, co-founder and CTO at Gatik, emphasized the two important dimensions that underpin Gatik’s perception and autonomy stacks: “Explainable AI” and “Structured Autonomy”. The former refers to a deterministic approach of decomposing massive Deep Learning Neural Networks (DNNs) into micro-models while building rule-based validation systems around them. The latter refers to a policy of operating the autonomous fleet on known routes with fixed pick-up and drop-off locations. The perception stack needs to be good enough to enable the detection and classification of any object at a variety of distances (up to 250 m, at high resolution) and to attach semantic understanding to the data. A multi-sensor suite including LiDARs, cameras, and radars provides an awareness radius around the vehicle at varying ranges and in varying environmental conditions. Localization is achieved through multi-sensor fusion, which tracks measurements and uncertainties across LiDAR, camera, GNSS, and IMU data to deliver high-fidelity localization and lane control. LiDAR technologies in production today meet Gatik’s needs for operations in Arkansas, where the trucks operate at a top speed of 45 mph while handling lane-change/merge operations, negotiating intersections, and navigating around other, faster-moving vehicles. Higher-speed routes like highways will require higher-performance sensor capabilities. Compared to radar and cameras, some of the new, promising LiDAR technologies are in the early stages of development, with mass production planned for 2021 and beyond. Wish-list items include flash LiDAR with no moving parts (primarily for the processing and speed advantage: all points are received simultaneously, so distortion and time-lag effects do not have to be addressed), foveated imaging across the FoV (higher resolution in regions of interest), and stepped-FoV systems.
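The uncertainty-weighted fusion idea can be illustrated with a one-dimensional Kalman-style update. This toy example is not Gatik's system; the sensor pairing and the variance numbers are invented for illustration.

```python
# Toy 1-D fusion: combine two independent position estimates weighted
# by their variances, as in a one-dimensional Kalman update.
# Sensor roles and numbers below are invented for illustration.

def fuse(mean_a: float, var_a: float, mean_b: float, var_b: float):
    """Variance-weighted fusion of two independent estimates."""
    k = var_a / (var_a + var_b)          # Kalman gain
    mean = mean_a + k * (mean_b - mean_a)
    var = (1.0 - k) * var_a
    return mean, var

# Lateral offset from lane centre, in metres.
lidar = (0.10, 0.01)   # precise LiDAR map-match estimate
gnss = (0.40, 0.25)    # noisier GNSS-derived estimate
m, v = fuse(*lidar, *gnss)
print(round(m, 3), round(v, 4))          # prints 0.112 0.0096
```

Note that the fused estimate sits close to the more confident sensor and its variance is lower than either input, which is the mechanism that lets a multi-sensor stack deliver lane-keeping precision no single sensor provides.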
The Rangwala Take: Autonomous trucks are here. The common themes are safety, higher performance, and higher-reliability sensors (especially LiDAR). The companies interviewed realize that LiDAR is still maturing, with innovation to come, and are making sure that their perception and control stacks can seamlessly leverage new sensors as they evolve and as ODDs expand. Also of note, the manic focus on the cost of sensors, and LiDAR in particular (which seems to have become a theme lately as LiDAR companies go public), is not prevalent among the trucking companies. They realize that safe autonomy is a critical and valuable feature and are willing to pay more to get it.