Right now, if you want to find anything on PySpark beyond the official documentation, the experience is painful and time-consuming:

  • Articles are scattered across the internet. Finding good ones is a pain, as most are shallow blogspam on personal blogs, Medium, etc.
  • Most of these websites have no structure, which makes it hard to discover and study related topics.
  • ChatGPT gives a LOT of wrong explanations and wrong answers. If ChatGPT were good enough, I would never have gone through the pain of creating this website.
  • In PySpark, there are too many ways to achieve the same output. You need articles that give you a map of the land and tell you which approach you should actually prefer.

This website tries to solve this problem by becoming a go-to destination for structured PySpark articles that go in depth when needed. These articles come from my personal research notes, where I tend to go deep and break things until I understand them.

I'm not a great technical writer, but I'm improving, so the articles here will keep evolving over time.

And obviously, everything here is free.
Technical knowledge is a powerful thing. Become a good engineer, and build better solutions for the world.

