Yarn provides greater scalability than MapReduce

MapReduce could have resource utilization problems because an arbitrary process could have allocated all map slots, while some reduce slots are empty (and vise-versa). Yarn (Yet Another Resource Negotiator) splits the JobTracker into a global ResourceManager (RM) and a per-application ApplicationMaster (AM) which works with the per-machine NodeManagers to execute and monitor tasks.  The ResourceManager has a Scheduler which only schedules (does not monitor).

The ApplicationMaster is far more scalable in 2.0 than 1.0. HortonWorks has successful simulated a 10k node cluster. This is possible due to the ApplicationMaster not being global to the entire cluster but rather has an instance per application so is no longer a bottleneck through which all applications must pass.

The ResourceManager is also more scalable in 2.0 since it’s scope is reduced to scheduling and no longer is responsible for fault tolerance of the entire cluster.

Source:

Advertisements

Comments are closed.