u/Salt_Macaron_6582


Learning (Py)Spark the easy way

Hi guys, I'm starting a job as a Junior Data Engineer soon and will be using PySpark a lot, but I have no experience with it. I want to grasp the basics and then start digging into the engine architecture and optimization, but I'm kind of lazy, so I'm looking for the easy way. I do have experience with Python and SQL, as I've worked as a SWE and DevOps Engineer before.

I was wondering if there are any good courses I can just work through that will teach me the basic commands and concepts, ideally something low-effort that I can put an hour into every now and then.

I'm also looking for a book that goes deeper into architecture and optimization so I can start building more in-depth knowledge. I have read books like 'Designing Data-Intensive Applications' and am looking for something similar, where the concepts are mostly explained in self-contained chunks so I can stop reading for a week without being lost when I pick it up again.

Recommendations for YouTube channels with content I can half tune out to while still learning a little bit would also be appreciated. Or anything else for lazy engineers like me.

Thanks in advance!

u/Salt_Macaron_6582 — 8 days ago