How to kill your PySpark performance with these simple tricks
Language: Hebrew
The presentation was given on 2023.07.04 at PyCon Israel 2023 - Conference.
The talk starts by explaining what Spark is, what problems it solves, and why you might want to use it. Then I'll describe common anti-patterns, especially in data engineering and data science code, and what you should probably do instead.
PySpark, Spark's Python interface, is a potent data processing tool that can deliver very high performance. This talk covers PySpark's strong points and shows how common anti-patterns hurt the performance of PySpark applications, forcing you to spend more money and give up many of Spark's benefits. But there is a better way: using the native PySpark tools and patterns that I'll present.