Spark Performance Tuning
- Caching Data
- Tuning Partitions
- Leveraging Statistics
- Optimizing the Join Strategy
- Adaptive Query Execution
  - Coalescing Post Shuffle Partitions
  - Splitting skewed shuffle partitions
  - Converting Shuffle Sort Merge Join (SMJ) to Broadcast Hash Join (BHJ)
  - Converting Shuffle Sort Merge Join to Shuffle Hash Join
  - Optimizing Skew Join
  - Advanced Customization
- Storage Partition Join