Hydra is not built on top of Hadoop, but it functions similarly to Summingbird, Storm, and Spark.
Data can stream into it, and analytics can be run in real time, rather than only in batch.
AddThis is the company that originally developed Hydra, which is now open source under the Apache license. AddThis runs six Hydra clusters, one of which comprises 156 servers and processes 3.5 billion transactions per day.
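To make the difference between streaming and batch analytics concrete, here is a minimal sketch in Python. It does not use Hydra's API; the `RunningCounts` class, the `page` field, and the sample events are all hypothetical and exist only for illustration. The point is that a streaming aggregate can be queried after every event, while a batch job answers only once the whole data set has been collected.

```python
from collections import defaultdict


class RunningCounts:
    """Hypothetical streaming aggregate: counts stay current as events arrive."""

    def __init__(self):
        self.counts = defaultdict(int)

    def ingest(self, event):
        # Update the aggregate incrementally for a single incoming event.
        self.counts[event["page"]] += 1

    def top(self, n=1):
        # Highest counts first; ties are broken alphabetically by key.
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


def batch_counts(events):
    """Batch equivalent: needs the complete list before it can answer."""
    counts = defaultdict(int)
    for event in events:
        counts[event["page"]] += 1
    return dict(counts)


# Made-up sample events standing in for a live feed.
EVENTS = [{"page": "/home"}, {"page": "/about"}, {"page": "/home"}]

stream = RunningCounts()
snapshots = []
for event in EVENTS:
    stream.ingest(event)
    snapshots.append(stream.top(1))  # a result is available after every event
```

Both approaches reach the same final counts. The streaming version simply makes intermediate results available while data is still arriving.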
One advantage is that schemas don’t need to be created in order to search for patterns, since Hadoop is leveraged. This makes sense, since by creating a schema the user is already making assumptions about where the patterns exist. By doing a schema-less analysis, it’s possible to find unexpected anomalies within patterns and to find entirely new patterns.
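The idea of schema-less analysis can be sketched in a few lines of Python. This is an illustration only, not how any particular product implements it: the `key=value` log format, the sample events, and the function names are assumptions. Fields are discovered when the data is read rather than declared in advance, so a field nobody planned for still shows up in the results.

```python
import re
from collections import Counter

# Hypothetical raw machine-generated events; no schema is declared up front.
RAW_EVENTS = [
    "ts=1 device=thermostat temp=21 status=ok",
    "ts=2 device=thermostat temp=22 status=ok",
    "ts=3 device=meter watts=140 status=ok",
    "ts=4 device=thermostat temp=21 status=ok fw=2.1",
    "ts=5 device=meter watts=9000 status=ok",
]

# Assumed event format: whitespace-separated key=value pairs.
PAIR = re.compile(r"(\w+)=(\S+)")


def extract_fields(line):
    """Discover key=value pairs at read time instead of from a fixed schema."""
    return dict(PAIR.findall(line))


def rare_fields(lines, max_count=1):
    """Return field names that appear in at most max_count events."""
    counts = Counter(key for line in lines for key in extract_fields(line))
    return sorted(key for key, n in counts.items() if n <= max_count)


events = [extract_fields(line) for line in RAW_EVENTS]
unexpected = rare_fields(RAW_EVENTS)  # fields seen only once across all events
```

Here the `fw` field appears in a single event and is reported as unexpected. A fixed schema listing only `ts`, `device`, `temp`, `watts`, and `status` would have silently dropped it.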
Splunk includes visualization components.
Splunk’s Director of Big Data Marketing, Brett Sheppard, says that this is well suited for the Internet of Things (IoT), which can leverage visualization tools that report on the results of searching for anomalies in large amounts of machine-generated data.
Elasticworks enables real-time searching and analytics. YARN is supported. Integration extends into Hive and Pig.
An open source framework for the collection and analysis of data for real-time applications such as energy usage and fraud monitoring.