Is this an accurate analogy for JEPA?
It is two models working together:
- Professor Compressor🧑🏼🏫
- Johnny Guesser🙋🏻
Professor Compressor🧑🏼🏫
- sees currentWorldState, converts it to compressed(currentState)
- sees nextWorldState, converts it to compressed(nextState)
The constraint is that Professor Compressor🧑🏼🏫 has to compress a state from 100% of its detail down to something like 5% (just an example figure).
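To make the "100% to 5%" constraint concrete, here is a minimal sketch in which a state is a list of 100 numbers and the compressed version is a list of 5. The sizes, the names `make_encoder` and `compress`, and the choice of a plain linear projection are all illustrative assumptions, not anything taken from JEPA itself.

```python
import random

STATE_DIM = 100   # "100% detail": size of the raw world state (illustrative)
LATENT_DIM = 5    # "5% detail": size of the compressed state (illustrative)

def make_encoder(seed=0):
    """Return a hypothetical linear encoder: a LATENT_DIM x STATE_DIM weight matrix."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(STATE_DIM)]
            for _ in range(LATENT_DIM)]

def compress(weights, state):
    """Project a raw state down to LATENT_DIM numbers (one dot product per row)."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]

encoder = make_encoder()
current_world_state = [float(i % 7) for i in range(STATE_DIM)]  # made-up data
compressed_current = compress(encoder, current_world_state)
print(len(current_world_state), "->", len(compressed_current))  # prints "100 -> 5"
```

The weights here are random, so this compressor keeps an arbitrary 5%; the learning part is what decides which 5% is worth keeping.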
Professor Compressor🧑🏼🏫 gives Johnny Guesser🙋🏻 the compressed(currentState), and asks him to guess the compressed(nextState).
Professor Compressor🧑🏼🏫 learns by trial and error which 5% of the details to keep and which 95% to discard, so that Johnny Guesser🙋🏻 can guess better.
And our Johnny Guesser🙋🏻, of course, is trying his best to guess the compressed next state, also by trial and error.
Over time, Professor Compressor🧑🏼🏫 produces better and better compressions for Johnny Guesser🙋🏻 to guess from.
And Johnny Guesser🙋🏻 gives better and better predictions of what the compressed next state will be.
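Here is a toy sketch of that joint trial-and-error loop, under heavy assumptions. The "world" is made up: each state is two numbers, a predictable signal and an unpredictable noise value. Professor Compressor is a pair of weights, Johnny Guesser is a single multiplier, and both are updated by plain gradient descent on the squared guessing error in compressed space. The unit-length rescaling step is my own stand-in to stop the compressor from cheating by compressing everything to zero (which would make guessing trivially easy); it is not claimed to be how JEPA handles that problem.

```python
import math
import random

rng = random.Random(0)

def sample_transition():
    """One (current, next) pair from a made-up world: [signal, noise]."""
    signal, noise = rng.uniform(-1, 1), rng.uniform(-1, 1)
    current_state = [signal, noise]
    # The signal evolves predictably; the noise is redrawn at random.
    next_state = [0.8 * signal, rng.uniform(-1, 1)]
    return current_state, next_state

def compress(weights, state):
    """Professor Compressor: squeeze a 2-number state into 1 number."""
    return weights[0] * state[0] + weights[1] * state[1]

weights = [0.6, 0.8]   # Professor starts out paying more attention to the noise
guess_scale = 0.0      # Johnny's only parameter: guess = guess_scale * compressed
lr = 0.05
losses = []

for step in range(3000):
    current_state, next_state = sample_transition()
    z_current = compress(weights, current_state)
    z_next = compress(weights, next_state)
    error = guess_scale * z_current - z_next   # Johnny's miss, in compressed space
    losses.append(error ** 2)

    # Gradients of error**2 with respect to each learner's parameters.
    grad_guess = 2 * error * z_current
    grad_weights = [2 * error * (guess_scale * c - n)
                    for c, n in zip(current_state, next_state)]

    guess_scale -= lr * grad_guess
    weights = [w - lr * g for w, g in zip(weights, grad_weights)]

    # Assumption: keep the compressor at unit length so it cannot "win"
    # by compressing every state to zero.
    norm = math.hypot(weights[0], weights[1])
    weights = [w / norm for w in weights]

early = sum(losses[:100]) / 100
late = sum(losses[-100:]) / 100
print(f"mean loss: first 100 steps {early:.4f}, last 100 steps {late:.4f}")
print(f"compressor weights [signal, noise]: {weights}")
print(f"Johnny's multiplier: {guess_scale:.3f}")
```

In this toy setup the compressor should end up putting most of its weight on the signal and very little on the noise, which is the "which details to keep, which to discard" part of the analogy: the details worth keeping are the ones that make the next compressed state guessable.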