pySparkGuide.com

Reference and tutorials for busy professionals


Why this website?

Right now, finding anything on pySpark beyond the official documentation is painful and time-consuming:

  • Articles are scattered across the internet. Finding good ones is a pain, as most are shallow blogspam on personal blogs, Medium, etc.
  • Most of these websites have no structure, which makes it hard to discover and study related topics.
  • ChatGPT gives a LOT of wrong explanations and wrong answers. If ChatGPT were good, I would never have gone through the pain of creating this website.
  • In pySpark, there are too many ways to achieve the same output. You need articles that give you a map of the land and tell you which approach you should actually prefer.

This website tries to solve these problems by becoming a go-to destination for structured pySpark articles that go in depth when needed. The articles come from my personal research notes, where I tend to go deep and break things until I understand them.

I'm not a great technical writer, but I'm improving, so the articles here will keep evolving over time.

And obviously,
Everything here is free.
Technical knowledge is a powerful thing. Become a good engineer, and build better solutions for the world.
Enjoy.