Category Archives: NSA

Sqrrl co-founder explains how NSA uses Accumulo

At its core, what the NSA is doing is finding anti-patterns. Crunching through huge sets of non-interesting data is the only way to find the interesting data.

Also, the Department of Defense sees the success the NSA is having with Hadoop technologies and is considering using them (most likely Accumulo) to store large amounts of unstructured and schemaless data.


Sqrrl Enterprise, Accumulo, and Encryption

Sqrrl is powered by Apache Accumulo, a low-latency NoSQL database that was originally developed for the NSA in 2008 and uses Hadoop as its file system.

  • Support for both role based and attribute based security controls
  • Encryption at rest and in motion
  • Can use multiple keys
  • Trust boundaries limit the admin’s access to data
  • Encryption costs only about 10% in performance
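
To make the attribute-based control in the first bullet more concrete: Accumulo attaches a visibility expression to each cell, built from labels combined with & (and) and | (or), and a reader only sees the cell if their authorizations satisfy the expression. Below is a minimal Python sketch of that check. It is not Accumulo's code or API; the function names and the labels (admin, audit, ops) are made up for illustration, and giving & precedence over | when there are no parentheses is an assumption of this sketch only.

```python
import re

# A token is either a label (letters, digits, underscore) or one of & | ( )
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[&|()])")


def tokenize(expr):
    """Split a visibility expression into labels and operators."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos} in {expr!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expr, auths):
    """Return True if the set of authorizations satisfies the expression."""
    tokens = tokenize(expr)
    if not tokens:
        # An empty expression is treated as "visible to everyone".
        return True
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse the right side, then combine
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_term()
            result = result and rhs
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("expression ended unexpectedly")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected {tok!r}")
        # A plain label is satisfied when the reader holds that authorization.
        return tok in auths

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

With this sketch, is_visible("admin&(audit|ops)", {"admin", "ops"}) is True, while the same expression with only {"ops"} is False, because the reader lacks the admin label.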


Good article about how the NSA replicates data from Yahoo and Google Hadoop file systems

Most of the discussions about NSA data collection are devoid of technical facts. The media just likes to throw around the word “metadata” as if that means nothing to those of us who work all day with nothing other than metadata.

Here’s an article that doesn’t talk down to us, but explains how simple it is to replicate the HDFS nodes from Yahoo and Google data centers.

The problem seems to be that Yahoo and Google encrypt data in motion, but not data at rest. Would Accumulo solve the encryption problem for data at rest? Perhaps, but Accumulo was originally developed for the NSA, which can likely break the encryption using the processing power of huge Hadoop clusters.