I like this report from Ford on how they keep self-driving sensors clean. They have air blowers that can divert any insects heading for the sensor up and over it, and sprayers and other air blowers to clean off the LIDAR when it gets hit.
The issue of cleaning sensors is a serious one. Waymo has also demonstrated a system of wipers for their cameras. You need to deal with bugs, and also rain, snow and slush. You don’t want to shut down for a dirty windshield. This is why Tesla puts their main cameras at the rear view mirror, which the car’s wipers already keep clean. Their side and rear cameras have no means of cleaning, which may be a challenge for Tesla when they try to move into “real, true, actual, full self driving” someday using those cameras, as they say they intend to.
The original Velodyne LIDAR, which changed the self-driving world starting with the 2007 DARPA Urban Challenge, was a physical spinning device. That gave it an advantage: any water that got on it was usually wicked away. Modern designs tend to have a housing around them, even if they spin, and that means things that get on the housing stay there. If the original spinning unit got dirt on the laser or detector array, that could be a big problem, losing the whole field. Dirt on a housing is a problem but only in that one direction.
It’s vital that cars can “fail semi-operational” when any component, like a sensor, degrades or is lost. This means that even though the component has failed, the system remains at a certain operational level, limiting but not necessarily stopping its drive. This is better than “fail-safe” which is a basic minimum. A car that comes to a stop in its lane, or immediately pulls over would be following a fail-safe strategy. Even better is a “fail fully operational” approach where there is not even a hiccup after a component fails. A vehicle of this type will complete its task, then head for service.
With a robotaxi fleet, you don’t have to fail that operational. That’s because you have a fleet of taxis at your disposal. If a failure means your car has to pull off to a suitable spot you can immediately dispatch another taxi vehicle to meet it and take the passengers on their way. Another service vehicle can be dispatched too. This changes the dynamics of your fleet in a potentially cost saving way.
Today, if you own a car and it breaks down, it’s a big annoyance. You might need to get it towed. You need to arrange repairs and possibly get a loaner rental car. You’re not happy. To prevent that, today’s cars are engineered to have very few customer-enraging failures.
While safety is not to be compromised, non-dangerous mechanical problems are not a big problem if they just mean a two-minute delay. You give the customer their ride free, maybe another free ride and they go away happy. You go away happy because you can build and maintain your fleet to a modest level of (non-safety-related) reliability.
This may mean that designs that depend on one core sensor, such as a 360 degree spinning LIDAR, a single forward camera or a close pair of forward cameras, may not be ideal. If you lose the core sensor, you need to get off the road reasonably soon. Designs that use cheaper LIDARs install more than one and overlap their coverage in the forward direction. That way the loss of one may mean the loss of some perception to the side, but forward perception still works. With cameras, two cameras are common (as well as cameras with different fields of view). These systems lose some abilities if they lose one of the cameras, but they can still do basic stuff with only one. You would not want to drive 10,000 miles with only one, but the risk of driving a couple of miles to exit the highway might be easily within acceptable limits if you do it right.
Most teams use a triad of radar, LIDAR, and cameras today. Any two of those can do the job if you slow down and follow certain cautious principles. With two LIDARs you can do even better.
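To make the "any two of three" idea concrete, here is a minimal Python sketch of how a car might map the set of working sensor types to an operating mode. The mode names, the `choose_mode` function and the rule that fewer than two sensor types means pulling over are my own illustrative assumptions, not any team's actual policy.

```python
from enum import Enum
from typing import Iterable


class Mode(Enum):
    FULL = "full"            # all three sensor types healthy
    DEGRADED = "degraded"    # two remain: slow down, drive cautiously
    PULL_OVER = "pull_over"  # fewer than two: get off the road


# The triad named in the text. Everything else here is hypothetical.
SENSOR_TYPES = frozenset({"radar", "lidar", "camera"})


def choose_mode(healthy: Iterable[str]) -> Mode:
    """Pick an operating mode from the sensor types still working."""
    working = SENSOR_TYPES & set(healthy)  # ignore unknown names
    if len(working) == 3:
        return Mode.FULL
    if len(working) == 2:
        return Mode.DEGRADED
    return Mode.PULL_OVER


if __name__ == "__main__":
    print(choose_mode({"radar", "lidar", "camera"}))
    print(choose_mode({"radar", "camera"}))
    print(choose_mode({"camera"}))
```

A real system would track far more than a healthy/unhealthy flag per sensor type (partial blockage, which direction is lost, how long the fault has lasted), but the shape of the decision is the same: degrade in steps rather than going straight from full operation to a stop.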
Still, driving on a slushy road with constant dirty spray from passing trucks will present a problem. You can’t afford to pull off the road every time that happens. And while you can ask your passenger to clean things in an emergency (and a privately owned car that never runs unoccupied can do that more easily), it’s not a great long-term solution.
This philosophy has to extend inside the vehicle too. If it’s cheap, you have duplicates of components and can survive the failure of any one. Last week, I attended the Drive World Conference in Santa Clara and saw the pitches of processor makers who have built redundant processors which have two identical halves which operate in lockstep, each doing the same thing. They can indeed survive the loss of one processor without a hiccup. That’s good, but less useful than they imagine. The truth is, processor core failures are a fairly rare thing. Most computing errors are software-caused, and two lockstep processors will duplicate a software mistake. The lockstep processor is a hardware designer’s solution to reliability. It’s not wrong, but it misses the software designer’s approaches.
It may be better to have two different computers, not identical, but both able to handle the vehicle. The bigger computer does everything, and asks the small computer to execute it. But if the small computer doesn’t hear from the big computer every millisecond, as it should expect, it starts to take over the task. It doesn’t do the full task. Maybe it’s only able to keep the car driving in a lane and avoid other cars, while the big system can do the full driving task. This way the big computer can fail and reboot without causing a safety risk. Alternate designs have three different systems all deciding the same thing, like how much to turn the steering wheel. They all come up with their answer and vote. Most of the time, they all agree. If one disagrees with the other two, though, the two that agree win and we have a fail-operational situation. There’s lots of debate in the systems design world over just what the best design principles are in this area.
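Both designs are easy to sketch. The Python below is a rough illustration, not anybody's production architecture: `HeartbeatWatchdog`, `select_command` and `vote` are hypothetical names, the one-millisecond timeout simply echoes the figure above, and the agreement tolerance in the voter is an arbitrary assumption. The voter treats two outputs as agreeing when they are within the tolerance, and returns the median when it agrees with at least one neighbor.

```python
import time
from typing import Callable, Optional, Sequence


class HeartbeatWatchdog:
    """Runs on the small computer; tracks when the big one last spoke."""

    def __init__(self, timeout_s: float = 0.001,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_beat = clock()

    def beat(self) -> None:
        """Record a heartbeat from the big computer."""
        self.last_beat = self.clock()

    def primary_alive(self) -> bool:
        """True while the last heartbeat is within the timeout."""
        return (self.clock() - self.last_beat) <= self.timeout_s


def select_command(watchdog: HeartbeatWatchdog,
                   primary_cmd: str, fallback_cmd: str) -> str:
    """Use the big computer's command unless its heartbeat has lapsed."""
    return primary_cmd if watchdog.primary_alive() else fallback_cmd


def vote(outputs: Sequence[float], tolerance: float = 0.5) -> Optional[float]:
    """2-of-3 vote on a numeric output such as a steering angle.

    Returns the median if it is within `tolerance` of at least one
    other output, otherwise None (no two systems agree).
    """
    if len(outputs) != 3:
        raise ValueError("vote() expects exactly three outputs")
    low, mid, high = sorted(outputs)
    if (mid - low) <= tolerance or (high - mid) <= tolerance:
        return mid
    return None


if __name__ == "__main__":
    print(vote([10.0, 10.2, 35.0]))  # the outlier 35.0 is outvoted
    print(vote([0.0, 10.0, 20.0]))   # nobody agrees
```

The watchdog takes an injectable clock so it can be tested without real delays. In a real vehicle the heartbeat would arrive over a bus or network link, and a Python loop would not hold a one-millisecond deadline anyway. When the voter returns `None`, the caller still needs a fallback, which is where the debate over design principles starts.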
Most agree, though, that the right approach is often not to simply make more reliable hardware. It’s to expect and plan for the hardware to fail. Because even the most reliable hardware and software still fail sometimes; they just fail less often. If you can tolerate failures, you are not just more reliable, you can also be more reliable with lower cost components. Lower cost components can save money, of course, but they can also mean you can do more with the same money.
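A bit of arithmetic shows why cheaper components can win. The sketch below assumes failures are independent, which is a big assumption: as noted above, a software mistake will show up identically on duplicated hardware. The failure probabilities are made-up illustrative numbers, not measurements of any real part.

```python
def prob_all_fail(p_single: float, copies: int) -> float:
    """Probability that every one of `copies` units fails in the same
    period, ASSUMING failures are independent of each other."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


# Hypothetical numbers for illustration only.
premium_single = prob_all_fail(1e-4, copies=1)  # one high-grade unit
cheap_pair = prob_all_fail(1e-3, copies=2)      # two cheaper units

if __name__ == "__main__":
    print(f"one premium unit fails:  {premium_single:.1e}")
    print(f"both cheap units fail:   {cheap_pair:.1e}")
```

Under these assumed numbers, a pair of parts that are each ten times less reliable still loses the function less often than the single premium part. Any shared cause of failure (the same bug, the same splash of slush, the same power feed) erodes that advantage, which is the argument for making the redundant systems different from each other.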
Vehicles can also be constantly planning for problems. In the background, the computer can be thinking, “What would I do if 0.5 seconds from now, my sensor is blinded or my perception module crashes?” It can create an emergency plan (while those things still work) of the right thing to do, and that plan can feed the decision made after the failure. Any crippled computer was fully functional just before.
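Here is one way that background planning might look, as a hedged Python sketch. The class and method names, the maneuver strings and the half-second freshness limit are all assumptions for illustration. The healthy system keeps overwriting a stored emergency plan, and the failure handler uses it only if it is recent enough to still describe the road.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class EmergencyPlan:
    created_at: float  # clock reading when the plan was made
    maneuver: str      # hypothetical label, e.g. "pull onto right shoulder"


class ContingencyPlanner:
    """Keeps the most recent emergency plan made while still healthy."""

    def __init__(self, max_age_s: float = 0.5,
                 default: str = "brake gently in lane",
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_age_s = max_age_s
        self.default = default
        self.clock = clock
        self._plan: Optional[EmergencyPlan] = None

    def update(self, maneuver: str) -> None:
        """Called in the background while sensors and perception work."""
        self._plan = EmergencyPlan(self.clock(), maneuver)

    def on_failure(self) -> str:
        """Called after a failure: use the stored plan if it is fresh,
        otherwise fall back to the conservative default."""
        plan = self._plan
        if plan is not None and (self.clock() - plan.created_at) <= self.max_age_s:
            return plan.maneuver
        return self.default


if __name__ == "__main__":
    planner = ContingencyPlanner()
    planner.update("pull onto right shoulder")
    print(planner.on_failure())
```

The freshness check matters because a plan made with full perception goes stale quickly in traffic. An old plan is worse than a cautious default.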
If you’re going to bet your life on the safety of a car, it is comforting to know that any component within it can be unplugged, destroyed or go bad without causing a hiccup, or at most a delay of a couple of minutes. It’s going to make everybody more comfortable and safer.