Uber engineering has recently open-sourced its highly scalable and reliable shuffle as a service for Apache Spark. Spark is one of the most important tools and platforms in data engineering and analytics. It is shuffling data on local machines by default and causes challenges while the scale is getting very large. Shuffle as a service is a solution developed at Uber for this problem.
By Reza Rahimi
Leave a Reply