Hadoop mindset glorifies having as much raw data as possible. Just build more nodes if necessary. However, there is currently a lack of good meta data tools. Where did the data come from? What’s the retention policy? Who has access to read it, delete it?
CTO of Sqrrl thinks this is a result of the Hadoop environment being designed for developers, not for business users.