Implementation Question
These questions are for Waymo employees or other industry people, if there are any here. I am certain there is some kind of predictive control going on so that the car can plan a few steps ahead. A commonly taught approach is to use YOLO to detect objects of interest frame by frame and then match the detections across frames with some association algorithm. Do actual self-driving cars do this and then predict a trajectory for each object? If the number of discrete objects varies from frame to frame, how do you train a neural net to learn their interactions? I guess my core question is: does the predictive model operate on discrete objects, or on a single representation of the whole frame? Then of course there is depth, whether estimated by a neural net or measured by LiDAR or radar. At what stage of the pipeline is it incorporated? Are object trajectories predicted in 2D or in 3D? Please point me to any relevant reading. Thanks!
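
To make the matching step concrete, this is roughly the frame-to-frame association I have in mind. It is only a minimal sketch: it assumes detections arrive as (x1, y1, x2, y2) pixel boxes and uses greedy IoU matching. The function names and the 0.3 threshold are my own placeholders, not anything taken from YOLO or from a production driving stack.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(prev_boxes, curr_boxes, min_iou=0.3):
    """Greedily pair boxes from the previous frame with boxes from the
    current frame, highest IoU first. Returns (prev_idx, curr_idx) pairs.
    Boxes left unpaired would start new tracks or end old ones."""
    candidates = sorted(
        (
            (iou(p, c), i, j)
            for i, p in enumerate(prev_boxes)
            for j, c in enumerate(curr_boxes)
        ),
        reverse=True,
    )
    used_prev, used_curr, matches = set(), set(), []
    for score, i, j in candidates:
        if score < min_iou:
            break  # remaining candidates overlap even less
        if i in used_prev or j in used_curr:
            continue  # each box can be matched at most once
        used_prev.add(i)
        used_curr.add(j)
        matches.append((i, j))
    return matches


# Two objects that each moved by one pixel and swapped list order.
prev_frame = [(0, 0, 10, 10), (20, 20, 30, 30)]
curr_frame = [(21, 21, 31, 31), (1, 1, 11, 11)]
print(sorted(match_detections(prev_frame, curr_frame)))
```

Greedy matching is just the simplest choice here; an optimal assignment (for example with `scipy.optimize.linear_sum_assignment` on a cost matrix of 1 - IoU) is a common alternative. Note that this works purely in 2D image space, which is part of why I am asking where depth comes in.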