Andrej Karpathy, former director of AI at Tesla (Photo by Michael Macor/The San Francisco Chronicle …)
In a recent interview, Andrej Karpathy, who previously led AI for Tesla’s Autopilot and FSD products, explained the company’s reasoning behind removing radar and ultrasonic sensors from Tesla cars, as well as never using LIDAR or maps. While Elon Musk is best known for his statements on this subject, Karpathy was his chief lieutenant in backing up that reasoning. Karpathy, however, raised eyebrows when he took a sabbatical earlier this year and eventually announced he would be leaving.
Karpathy’s main points:
- Additional sensors increase system cost and, more importantly, complexity. They complicate the task of the software and increase the cost of all data pipelines. They add risk and complexity to the supply chain and manufacturing.
- Elon Musk pushes a “the best part is no part” philosophy that can be seen throughout the car, in things like doing everything via the touchscreen. Removing sensors is an expression of this philosophy.
- Vision is necessary for the task (which almost everyone agrees on) and it should also be sufficient. If sufficient, the cost of additional sensors and tools outweighs their benefits.
- Sensors change as parts change or become available and unavailable. They must be maintained and the software adapted to these changes. They also need to be calibrated for fusion to work properly.
- Having a fleet collecting more data is more important than having more sensors.
- Having to deal with LIDAR and radar produces a lot of bloat in the code and data pipelines. He predicts that other companies will also abandon these sensors in time.
- Mapping the world and keeping the map up to date is far too expensive. You won’t change the world with that limitation. You have to focus on vision, which is what matters most. Roads are designed to be interpreted with vision.
Complexity of sensor fusion
In my recent interview with Jesse Levinson, CEO and co-founder of Zoox, I asked him the same question. While he agreed that having more sensors definitely means more work and more noise, he said these problems are not unsolvable and that solving them is worth the effort. He thinks that if you’re smart and do your sensor fusion correctly, you can ensure that new sensor data and conflicting data aren’t a downside. Although each input has noise, if you are good you can draw the true signal from it and win.
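To make the claim about drawing the signal out of noisy inputs concrete, here is a minimal sketch of one textbook fusion technique, inverse-variance weighting. It is not Zoox’s or any other team’s method. The `Measurement` type, the `fuse` function and every number below are invented for illustration, and the sketch assumes the sensors’ errors are independent and their noise variances are known.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    value: float     # e.g. estimated distance to an obstacle, in meters
    variance: float  # assumed-known noise variance of the sensor, in m^2


def fuse(measurements):
    """Combine independent measurements by inverse-variance weighting.

    Each reading is weighted by 1/variance, so a noisier sensor counts
    for less. The fused variance is 1 / (sum of the weights), which is
    always smaller than the variance of the best single input.
    """
    weights = [1.0 / m.variance for m in measurements]
    total = sum(weights)
    value = sum(w * m.value for w, m in zip(weights, measurements)) / total
    return Measurement(value=value, variance=1.0 / total)


# Hypothetical readings: a noisy camera range estimate and a precise LIDAR one.
camera = Measurement(value=52.0, variance=9.0)
lidar = Measurement(value=50.0, variance=0.25)

fused = fuse([camera, lidar])
print(f"fused distance: {fused.value:.2f} m, variance: {fused.variance:.3f} m^2")
```

Under these assumptions the fused estimate sits close to the more precise sensor, and its variance is lower than either input’s. That is the sense in which a well-built fusion stage does not have to be hurt by an extra, noisier sensor. Real systems also have to handle correlated errors, calibration drift and outright sensor faults, which is where the complexity Karpathy describes comes from.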
In general, other teams won’t necessarily disagree with many of Karpathy’s points. Having multiple sensors and fusing them adds a lot of complexity and cost. Many will even agree that one day down the road, vision may be enough and the other sensors can be left behind. However, everyone (probably including Karpathy and Musk) would agree that vision is not enough today. Others would add that it’s not at all clear when vision will be enough. Karpathy and many others argue that humans drive primarily with vision, so it’s clearly possible, but the reality is that computers don’t yet have the power that human brains bring to the task. Very few technologies work the way human minds do – the fact that birds fly by flapping their wings does not imply that aircraft designers should follow that route. It is more common to use the different or, in some cases, superhuman abilities of machines to compensate for their lack of brainpower.
Tesla’s approach would be quite rare in the world of AI: deliberately limiting a system to only the sensing capabilities humans have, and hoping to build the equivalent of a human brain that works with those constrained sensors.
Cost as a driver or time to market?
This difference in perspective stems partly from the fact that Tesla is a car manufacturer, and further from its goal of having the system work on the cars it has already shipped, or at worst on those cars with a minor retrofit. (That retrofit is already underway: owners of older cars have seen one main processor upgrade, with a second pending, as well as replacement cameras in some cases, and a new camera system is rumored.)
Automakers are very, very cost conscious. Everything they add to a vehicle adds 2-5 times its cost to the list price of the vehicle. Anything they can remove adds to their bottom line. The philosophy of removing parts makes sense here and has worked well for Tesla, although many drivers complain that the company went a bit too far in some cases.
But the case for removing a part is less clear when the system does not work without it. After Tesla removed radar support, it downgraded a number of features in Tesla Autopilot, and even a year later the system hasn’t returned to the speed it was capable of with radar. Many Tesla owners complain that the radar-less system has far more frequent “phantom braking” events, where the vehicle brakes, sometimes hard, for obstacles that aren’t there or aren’t a problem.
Tesla’s new cars shipped without ultrasonic sensors have lost almost all the functions that relied on them, such as park assist, auto-park, summon and more. Tesla promises these will return in the near future.
Most self-driving teams believe the shortest path to deployable self-driving is using LIDAR, radar, and in some cases other sensors. They see it as the shortest and safest way, not the easiest and cheapest. As they do not sell vehicles, cost constraints are not a priority for them. Zoox’s Jesse Levinson says that because their custom robotaxi will get a lot of use and charge a good fee, the added cost of the special sensors isn’t the barrier it would be on a car sold to consumers.
But while cost is a factor, speed of development matters most. LIDAR today performs fully reliable detection of a large class of obstacles, at a level of reliability one can bet one’s life on. Cameras don’t, and while they probably will one day, neither Tesla nor the other teams know when. The date when LIDARs will become low-cost is much better understood.
This question of when affects the complexity of the software. Today, it is harder to make cameras provide the necessary reliability – so much harder that nobody has managed it yet. That vision could allow for a simpler system in the future is not a consideration for most teams today. The leading teams all invest billions of dollars and accept the cost of added complexity. A theoretically simpler solution that does not yet work is no simpler than a more complex solution that does.
Naturally, it’s worth noting that none of the other self-driving teams have production deployments, although several have pilots operating in complex cities without an onboard safety driver. Earlier I posted a series of articles and videos on what teams see as the remaining grand challenges, and overall, getting reliable perception isn’t one of the big hurdles for teams using LIDAR and maps. Rather, the challenge lies in the immense detail work required to be sure the vehicle can handle any unusual cases, especially never-seen-before cases.
Maps and the fleet
The value of maps is another issue on which Tesla/Karpathy and other teams differ. While Karpathy hoped to create a car that could fully understand the road and where it needed to go without a detailed map, such a car is also one that can remember what it has learned and use that to create a map to help the next car of its kind travel the same road. Ironically, Karpathy’s own statement about the enormous value of a large fleet applies well here – if one has a large fleet, it is possible to build complete and detailed maps of the whole world and keep them up to date, and it is foolish to throw away the useful information learned by this fleet.
These issues were discussed in more detail in my article and video on Tesla’s mapping decisions.
The way to the future
Karpathy is right that at some point a breakthrough is likely to come that will allow computer vision to perform the task of driving with great safety. Most other teams don’t disagree with him on that. He may be right in his prediction that they will eventually get rid of their LIDARs to cut costs. But they believe they will do so once they are in production, having taken the lead in the robotaxi sector while Tesla is still offering only driver assistance. They may be wrong – this breakthrough could come sooner, in which case Tesla will be very successful. But they don’t think that’s the way to bet.
It is also true that as time passes and all tools improve, additional sensors may not cost much more or add much more complexity. LIDAR, radar and thermal cameras provide superhuman detection. They can detect things that cameras cannot. Even if this advantage diminishes, it will not fall to zero – the debate will be whether their cost is justified. But when it comes to digital technology, costs have historically dropped. The immense complexity of a modern mobile phone would have baffled anyone not so long ago, and its low cost would have shocked them even more. People who have bet on technology staying expensive have rarely won the technology race. Tesla is actually a prime example of a company that won by betting that technologies would get better and cheaper.
Karpathy’s view of that future is hard to discern. His position at Tesla was highly coveted and lucrative in his field. For someone who believes in the path Tesla has taken, it is an especially important place from which to change the world. However, he didn’t leave Tesla to start another project, at least as far as public announcements go. His departure suggests (but doesn’t guarantee) that he had some kind of trouble – perhaps with the project, or with working for his notoriously difficult boss. It could be something else or something personal, of course – this is just speculation.
What is true is that the bet Tesla has made on these principles is a big one – with a big payoff or a big risk of falling behind. Luckily for Tesla, it has so many resources that even if its internal research fails, it can afford to change direction. In fact, had it wanted to, it probably could have bought Argo.AI last week, but Argo’s assets do not match Tesla’s current plan. Maybe if the plan changes, another player will be available for acquisition.