Struggling to learn the Spark UI on Databricks: all the tutorials are outdated. Any good resources?
Hey everyone, I'm fairly new to Spark and trying to understand how it actually executes jobs, specifically the DAG visualization, stages, task metrics, and executor stats in the Spark UI.
The problem I'm running into: almost every video tutorial I find was recorded on an older version of Databricks, and the UI looks completely different from what I see today. The gap is big enough that I can't follow along at all.
A few specific issues I've hit:
- `spark.databricks.io.cache.enabled` throws a CONFIG_NOT_AVAILABLE error on newer runtimes
- `spark.catalog.clearCache()` throws a NOT_SUPPORTED error because I'm on Serverless compute (Community Edition)
- The Spark UI itself looks different from what tutorials show
I'm using Databricks Community Edition (free tier), which I've now learned only gives Serverless compute, so some things just aren't available.
My questions:
1. Is there a good up-to-date resource (video, blog, or docs) for understanding the Spark UI on the current Databricks version?
2. For learning Spark internals (DAG, stages, task metrics), is it better to just use local Spark or Google Colab instead of the Databricks free tier?
3. Any tips for following older Spark UI tutorials and mentally mapping them to the current UI?
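For the local Spark option, this is roughly the setup I have in mind. It's a minimal sketch and not taken from any tutorial: it assumes `pip install pyspark` plus a local Java install, and the app name, function names, port, and partition count are just my own placeholder choices. I turn off adaptive query execution on the assumption that the stage layout will then look closer to what older videos show.

```python
def local_ui_settings(port=4040):
    """Settings for a small local session (hypothetical helper, my own naming)."""
    return {
        "spark.master": "local[2]",             # two local worker threads
        "spark.ui.port": str(port),             # where the Spark UI is served
        "spark.sql.adaptive.enabled": "false",  # assumption: keeps stages closer to older tutorials
        "spark.sql.shuffle.partitions": "4",    # few tasks, so the stage page stays readable
    }


def run_demo():
    """Start a local session, run one shuffle job, and keep the UI open."""
    # Imported here so the settings helper works even without pyspark installed.
    from pyspark.sql import SparkSession, functions as F

    builder = SparkSession.builder.appName("spark-ui-practice")
    for key, value in local_ui_settings().items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()

    # The address to open in a browser while the session is alive.
    print("Spark UI:", spark.sparkContext.uiWebUrl)

    # groupBy forces a shuffle, so the job should show more than one stage.
    df = spark.range(1_000_000).withColumn("bucket", F.col("id") % 10)
    df.groupBy("bucket").count().collect()

    # The UI goes away when the session stops, so wait before shutting down.
    input("Press Enter to stop Spark and close the UI...")
    spark.stop()
```

Calling `run_demo()` from a terminal and opening the printed URL should give me the Jobs, Stages, and Executors tabs to poke around in, if I understand the docs correctly.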
Thanks in advance!