Don’t run Hadoop on a SAN

By definition, a SAN is about consolidating data and Hadoop is about distributing data. Can they co-exist? Not according to this article.

Moving data off a Hadoop node and onto a SAN reduces performance. Hadoop is designed to run computation on the node that holds the data, so the data should reach the CPU at bus speed, not network speed. A heavy Hadoop load could also saturate the network between the nodes and the SAN.
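A rough back-of-the-envelope sketch can make the bus-versus-network argument concrete. The Python below compares the time for a full scan when each node reads from its own local disks with the time when every node pulls its data over one shared link to a SAN. All figures (node count, disk throughput, link speed) and both function names are illustrative assumptions, not measurements or values from the article.

```python
def local_scan_seconds(data_per_node_gb: float,
                       disks_per_node: int,
                       disk_mb_s: float) -> float:
    """Scan time when every node reads its own data from local disks.

    Nodes work in parallel, so the total time equals the time for one node.
    """
    node_throughput_mb_s = disks_per_node * disk_mb_s
    return data_per_node_gb * 1000 / node_throughput_mb_s


def shared_scan_seconds(nodes: int,
                        data_per_node_gb: float,
                        link_mb_s: float) -> float:
    """Scan time when all nodes read through one shared link to the SAN.

    The whole data set has to cross the same link, so it is the bottleneck.
    """
    total_mb = nodes * data_per_node_gb * 1000
    return total_mb / link_mb_s


# Hypothetical cluster: 10 nodes, 1 TB per node, 4 local disks per node
# at 100 MB/s each, and a shared SAN link of 1000 MB/s (about 8 Gbit/s).
NODES = 10
DATA_PER_NODE_GB = 1000
DISKS_PER_NODE = 4
DISK_MB_S = 100
SAN_LINK_MB_S = 1000

if __name__ == "__main__":
    local = local_scan_seconds(DATA_PER_NODE_GB, DISKS_PER_NODE, DISK_MB_S)
    shared = shared_scan_seconds(NODES, DATA_PER_NODE_GB, SAN_LINK_MB_S)
    print(f"local disks: {local:.0f} s")
    print(f"shared SAN link: {shared:.0f} s")
```

With these assumed figures the local scan takes 2,500 seconds and the shared-link scan takes 10,000 seconds. The gap widens as nodes are added, because local throughput grows with the cluster while the shared link does not.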

source:
