u/EchoImpressive6063

These questions are for Waymo employees or industry people if there are any here. I am certain there is some kind of predictive control going on so the car can plan a few steps ahead.

An approach commonly taught is to use YOLO to identify objects of interest, frame-by-frame, and then match them across frames with some algorithm. Do actual self-driving cars do this, and predict trajectories for each object? If you have a variable number of discrete objects, how do you train a neural net to learn their interactions? I guess my question is, does the predictive model deal in discrete objects or is there a single representation of each frame?

Then of course there is depth, whether from a neural net, LiDAR or radar. At what level is this incorporated? Are the trajectories of objects predicted in 2D or 3D?

Please point me to any relevant reading. Thanks!
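
To be concrete about the "detect per frame, then match across frames" approach I mean, here is a minimal sketch. It assumes the detector (YOLO or anything else) has already produced 2D boxes as `(x1, y1, x2, y2)` tuples, and it uses greedy IoU matching as the "some algorithm" part. The names `iou`, `match_tracks` and `min_iou` are my own for illustration, not from any library, and I am not claiming this is what Waymo or anyone else runs in production.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(
    tracks: Dict[int, Box],
    detections: List[Box],
    min_iou: float = 0.3,  # hypothetical threshold, chosen for illustration
) -> Dict[int, Box]:
    """Greedily assign this frame's detections to last frame's tracks.

    Returns the updated {track_id: box} map. Tracks with no match are
    dropped and unmatched detections start new tracks (a simplification:
    real trackers usually keep lost tracks alive for a few frames).
    """
    # Score every (track, detection) pair, best overlap first.
    pairs = sorted(
        (
            (iou(box, det), tid, j)
            for tid, box in tracks.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    updated: Dict[int, Box] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if tid in updated or j in used:
            continue  # track or detection already assigned
        updated[tid] = detections[j]
        used.add(j)

    # Unmatched detections become new tracks with fresh ids.
    next_id = max(tracks, default=-1) + 1
    for j, det in enumerate(detections):
        if j not in used:
            updated[next_id] = det
            next_id += 1
    return updated


if __name__ == "__main__":
    previous = {0: (0, 0, 10, 10), 1: (50, 50, 60, 60)}
    current = [(51, 50, 61, 60), (1, 0, 11, 10), (100, 100, 110, 110)]
    print(match_tracks(previous, current))
```

Each track id then gives a per-object sequence of boxes over time, which is the "discrete objects" representation I am asking about, as opposed to feeding a single whole-frame representation to the predictor.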

Implementation Question