
What is the basic logic of MapReduce client for map phase parallelism planning?

2022-02-04 16:52:31 Alibaba cloud Q & A

What is the basic logic the MapReduce client uses to plan the parallelism of the map phase?




Answer 1:

The client first performs logical slicing of the data to be processed: according to a specific slice size, it divides the data into multiple logical splits. Each split is then assigned one MapTask instance, and these instances run in parallel. This slicing logic, together with the slice plan description file it produces, is implemented by the getSplits() method of the FileInputFormat implementation class. The method returns a List<InputSplit>. Each InputSplit encapsulates the information of one logical slice, including its length and position, so getSplits() returns a group of InputSplit objects.
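The slicing idea described above can be sketched roughly as follows. This is a standalone Python illustration, not Hadoop code: the `InputSplit` dataclass and the `get_splits` function here are hypothetical stand-ins that only mirror the names in the answer, and the file path and sizes are made up. The real FileInputFormat.getSplits() applies additional rules that this sketch leaves out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InputSplit:
    """Hypothetical stand-in for one logical slice: its position and length."""
    path: str
    start: int   # byte offset where the slice begins
    length: int  # number of bytes in the slice


def get_splits(path: str, file_length: int, split_size: int) -> List[InputSplit]:
    """Cut a file of file_length bytes into logical slices of at most split_size bytes.

    Nothing is read or copied; each slice is only a (start, length) description.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    splits = []
    offset = 0
    while offset < file_length:
        # The last slice may be shorter than split_size.
        length = min(split_size, file_length - offset)
        splits.append(InputSplit(path, offset, length))
        offset += length
    return splits


MB = 1024 * 1024
# Assumed example: a 300 MB file sliced with a 128 MB slice size.
plan = get_splits("/data/input.log", 300 * MB, 128 * MB)
for number, split in enumerate(plan):
    # One map task would be planned per slice.
    print(f"map task {number}: start={split.start} length={split.length}")
```

In this simplified model the number of planned map tasks equals the number of slices, so the slice size is what controls the degree of map-phase parallelism.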


copyright notice
Author: Alibaba cloud Q & A. Please include the original link when reprinting, thank you.
https://en.cdmana.com/2022/02/202202041652298901.html
