We have previously reported on what Tesla’s Autopilot can see with its suite of 8 cameras around the car, but we rarely got into how Autopilot understands and interprets what it is looking at through its computer vision system.
That’s changing today after a few hackers were able to overlay Autopilot data on top of snapshots of what the system is seeing.
As with our previous looks at what Autopilot can see, it’s thanks to our favorite Tesla hacker ‘verygreen’, who has been able to intercept the data that Tesla gathers from his own vehicle.
With the help of a few other Tesla hackers, he was able to overlay that data on snapshots of what the vehicle sees, which results in an interesting visualization of what Autopilot understands when looking through its sensors.
It’s not a complete look at what Autopilot can see. For example, it doesn’t include lane tracking.
But it does show how the system is tracking objects in its field of view, in a way that we have never seen before.
verygreen sent us some pictures and videos along with an interesting explanation of what they managed to do.
Here it is in his own words:
Wouldn’t it be great if Tesla published videos about how their cars see things, in a fashion similar to what Google does? Well, they don’t, and so the exercise is left to hackers to try to piece things together.
Even though my unicorn car with the hacked autopilot board is long lost, I still have a whole bunch of snapshots from those days. I am sure you have seen the pictures. But what’s not widely known is that alongside those pictures, radar data was included too. Now TMC user DamianXVI (the same guy who wrote the color interpolator for the originally black-and-white images) came up with a way to overlay the radar data on the pictures.
It looks like this:
Now a bit of an explanation of what the circles mean. The circle color represents the type of the object:
green – moving
yellow – stopped
orange – stationary
red – unknown.
The circle size represents the distance to the object: the bigger the circle, the closer the object (so we are not trying to encircle the object or approximate its size – the radar has no way of knowing that).
A thick circle means this object has a label and is therefore likely being tracked by the autopilot. Fading in and out is due to the probability of existence changing.
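To make that encoding concrete, here is a rough Python sketch of how such a styling rule could look. The type names, color values, and distance-to-radius mapping are illustrative assumptions, not values taken from Tesla’s firmware or from DamianXVI’s actual tool.

```python
# Hypothetical sketch of the circle styling described above.
# Object types, colors (BGR) and the distance-to-radius mapping are
# made up for illustration; they are not Tesla's actual data format.

OBJECT_TYPE_COLORS = {
    "moving":     (0, 255, 0),    # green
    "stopped":    (0, 255, 255),  # yellow
    "stationary": (0, 165, 255),  # orange
    "unknown":    (0, 0, 255),    # red
}

def circle_style(obj_type, distance_m, has_label, existence_prob):
    """Return (color, radius_px, thickness, alpha) for one radar return."""
    color = OBJECT_TYPE_COLORS.get(obj_type, (0, 0, 255))
    # Closer objects get bigger circles; the radius says nothing about object size.
    radius = int(max(6, min(80, 600.0 / max(distance_m, 1.0))))
    thickness = 3 if has_label else 1   # thick ring = labelled / tracked object
    alpha = existence_prob              # fade in/out with probability of existence
    return color, radius, thickness, alpha
```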
Sometimes a circle sits in empty space – this is likely due to the radar having trouble determining object elevation, so try looking higher or lower for the relevant object. At times, the radar cannot determine elevation at all (a value of -5 is used in that case), and the circle is then drawn at elevation 0.
Also keep in mind that the radar reports coordinates in 3D and we need to project them onto a 2D picture. As you might expect, such conversions sometimes introduce errors.
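For the curious, a minimal sketch of that 3D-to-2D step using a simple pinhole camera model is below. The intrinsics, image size, and the car-to-camera axis convention are assumptions chosen for illustration, not the calibration of an actual Tesla camera.

```python
# Minimal pinhole-projection sketch for placing a radar return on the image.
# FX/FY/CX/CY are made-up illustration values for a 1280x720 frame.
FX, FY = 1000.0, 1000.0      # focal lengths in pixels (assumed)
CX, CY = 640.0, 360.0        # principal point (assumed)

def project_radar_point(x_forward, y_left, z_up, elevation_known=True):
    """Project a radar point (car frame: x forward, y left, z up) to pixel coords."""
    if not elevation_known:
        z_up = 0.0           # radar gave no elevation; draw at elevation 0
    # Convert to a camera-style frame: X right, Y down, Z forward.
    X, Y, Z = -y_left, -z_up, x_forward
    if Z <= 0:
        return None          # point is behind the camera, nothing to draw
    u = FX * X / Z + CX
    v = FY * Y / Z + CY
    return int(round(u)), int(round(v))
```

Small errors in the assumed transform or in the radar’s elevation estimate are exactly what makes some circles land in empty space on the overlays.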
Here’s a video with the radar data overlaid:
Now for an exciting new development. While the unicorn is gone, I came upon a stash of video and picture snapshots that you have not seen before.
Here’s a video from that stash, from sometime in October 2017, with radar data overlaid:
But that’s not all: the stash also includes pretty recent snapshots from March 2018, where Tesla was apparently debugging its radar-vision fusion implementation of depth perception.
Every one of the pictures below (captured by firmware 2018.10.4) came with a detailed list of objects autopilot sees and tracks. It reports things like dimensions, relative velocity, confidence that the object actually exists, the probability that the object is an obstacle, and so on.
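As a rough illustration of what such a per-object record might look like, here is a small Python sketch. The field names and types are guesses for illustration; the real firmware structures are not public.

```python
from dataclasses import dataclass

# Hypothetical shape of one entry in the per-picture object list described above.
@dataclass
class TrackedObject:
    obj_id: int                  # track identifier
    length_m: float              # estimated dimensions
    width_m: float
    rel_velocity_mps: float      # relative velocity along the direction of travel
    existence_prob: float        # confidence the object actually exists (0..1)
    obstacle_prob: float         # probability the object is an obstacle (0..1)
    label: str = "unknown"       # e.g. "vehicle", "pedestrian"
```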
Now this last one is interesting in that the autopilot is clearly recognizing and tracking a stopped work truck; it even assigned a 25% obstacle rating to it (not to the poor worker, though). This should put to rest the theory that stopped vehicles are not seen (autopilot might still see and ignore them – but that’s a whole other topic).
Another interesting observation comes from the second and third pictures – they depict the same scene taken with the narrow and main cameras. What’s interesting there is that apparently only the main (wider-angle) camera is used to detect oncoming traffic.
These bounding boxes also validate a lot of the theories about vision NN outputs that TMC user jimmy_d presented in a forum post here.
Of course, this does not represent the full autopilot internal state – we don’t see the detected/tracked lanes or information gleaned from maps – but it gives us another glimpse behind the curtain.