I keep getting asked "Why is Spark so popular?" My view is that it comes down to addressing four major requirements:
(1) Easy access to the right data - Spark abstracts away a lot of the complexity of accessing data and handles it in a uniform way using Resilient Distributed Datasets (RDDs) and DataFrames - more on this in other blogs. It can access both local and remote data - which one you use depends on how critical performance is.
(2) Cognitive capabilities - Spark's machine learning capabilities let your analytics become more intelligent and learn from the data. Goodbye static algorithms - hello dynamic learning algorithms.
(3) Ubiquity - Spark runs on many platforms and environments, so your Spark analytics can run where the data is. It brings the analytics to the data.
(4) Super-duper fast - the data can all be analyzed in memory.
I'm going to be discussing this in more detail on March 1. But what's your view on why it is so popular?
http://www.worldofdb2.com/events/the-ultimate-platform-for-spark-hadoop-projects