You can purchase the book on Amazon and Packt.
In this book, you will learn about Apache Spark and the Spark 2.0 architecture; build and interact with Spark DataFrames using Spark SQL; solve graph and deep learning problems using GraphFrames and TensorFrames, respectively; and read, transform, and understand data, then use it to train machine learning models with MLlib and ML.
[Figure: Spark SQL Engine / Catalyst Optimizer]
Table of contents:
- Understanding Spark
- Resilient Distributed Dataset
- DataFrames
- Preparing Data for Modeling
- Introducing MLlib
- Introducing the ML Package
- GraphFrames
- TensorFrames
- Polyglot Persistence with Blaze
- Structured Streaming
- Packaging Spark Applications
The code samples for this book are available at https://github.com/drabastomek/learningPySpark.