What is the difference between map and flatMap and a good use case for each?
Here is an example of the difference, as a spark-shell session.
First, some data: two lines of text.
Now, map transforms an RDD of length N into another RDD of length N. For example, it maps two lines into two line lengths.
But flatMap (loosely speaking) transforms an RDD of length N into a collection of N collections, then flattens …
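The contrast between the two operations can be sketched without a Spark installation. The following is a minimal illustration that uses plain Python lists in place of RDDs. The sample data and the variable names `lines`, `line_lengths`, and `words` are assumptions made for illustration and are not taken from the original session.

```python
from itertools import chain

# Hypothetical sample data: two lines of text (not the original post's data).
lines = ["hello world", "hi"]

# map: exactly one output element per input element, so N inputs give N outputs.
line_lengths = list(map(len, lines))

# flatMap: each input element produces a collection (here, its words),
# and the N resulting collections are flattened into one sequence.
words = list(chain.from_iterable(line.split(" ") for line in lines))

print(line_lengths)  # [11, 2]
print(words)         # ['hello', 'world', 'hi']
```

On an actual RDD in PySpark, the corresponding calls would be `rdd.map(len)` and `rdd.flatMap(lambda line: line.split(" "))`. As a rule of thumb, map suits one-to-one transformations such as computing a length per record, while flatMap suits cases where each record yields zero or more outputs, such as splitting lines into words.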