Not sure how well this will work, or whether the use cases support it. Rather than optimizing Hadoop for the use cases it was designed for, Teradata is merging Hadoop into its legacy core data warehouse. Will Hadoop add value there, or just make the warehouse overly complex?
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques. One of R’s strengths is the ease with which well-designed publication-quality plots can be produced.
The update to R, from Revolution Analytics, is significant because previously data had to be moved into an R environment in order to process and plot it. This update enables R to run within the Cloudera Hadoop environment, so that data does not need to be moved out of HDFS, across the network, and onto another machine for processing.
I think this matters because R makes it possible to represent the analysis of a data set (potentially petabytes in size) in a single-page graphic. It seems to me that R takes as its input the data generated by the Reduce portion of MapReduce.
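To sketch that data flow: a reduce phase typically emits compact, aggregated key/value records (in Hadoop streaming, tab-separated text lines), and it is that small summary, not the raw petabytes, which feeds the plotting step. A minimal illustration, where the sample lines and field layout are hypothetical:

```python
# Hypothetical reduce-phase output: key<TAB>count lines, the format
# Hadoop streaming reducers emit. The keys and counts are made up.
sample_reduce_output = """\
error\t42
info\t1870
warn\t311
"""

def summarize(lines):
    """Fold key<TAB>count lines into a dict of totals per key."""
    counts = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key, value = line.rstrip("\n").split("\t")
        counts[key] = counts.get(key, 0) + int(value)
    return counts

counts = summarize(sample_reduce_output.splitlines(True))
print(counts)  # {'error': 42, 'info': 1870, 'warn': 311}
```

In R the same summary would go straight into a plotting call; the point of the Revolution Analytics update is that this step can now happen inside the Hadoop cluster rather than after shipping the data elsewhere.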