Map: Parallelize processing on different chunks of data stored on different nodes
The processing function is cloned and ‘mapped’ to each data chunk to key-value pairs
Translation: Breaking down the elements to key-value pairs. There can be many of the same keys
For example, the processing function keeps a tally of the number of occurrences of every word in the doc
Reduce: Consolidating the results of each mapping function into a common unit
The final output might be a tally of the occurrence of each word in all the documents
Translation: Adding the values of the key-value pairs in the Map task based on the keys
Intermediate Step: shuffling and sorting the results produced by each mapping function.
E.g. all words from all the documents that start with ‘a’ to ‘g’ go to one intermediate unit, all words that start with ‘h’ to ‘p’ go to another, and so on, and then they are sorted alphabetically.
MapReduce Example #1
Given this variable:
some_string = “Deer, Bear, River, Car, Car, River, Deer, Car and Bear”
MapReduce Example #2
Hire Map Reduce Expert to get instant help in map reduce coding or project help in any programming languages like python, java, machine learning.
Contact Us at: realcode4you@gmail.com
Comments