In this module, we cover the tools needed for data processing with Spark: Spark development and PySpark, including the PySpark architecture, PySpark DataFrames, PySpark SQL, accumulators and broadcast variables, serialization, caching, optimization techniques, and wide transformations.