- Teradata Aster 6 platform
- Includes graph analysis engine (visualization), in addition to traditional rows/columns.
- Enables execution of SQL across multiple NoSQL repositories
- Integrates with multiple third parties for solutions such as analytical workflow (Alteryx) and advanced analytics algorithms (Fuzzy Logix).
- Cloud services at comparable cost to on-premises
1. Big Data Exploration
I don’t agree with the author’s category. He admits that this is a “one size fits all” category. It almost seems as if he had four use cases and decided to make it five by adding that you can search, visualize, and understand data from multiple sources to help decision making. Haven’t we been doing this all along, with whatever database tools we’ve had?
2. Enhanced 360 degree view of the customer
From my own experience, I had a project where we did this for a call center. The key was that we ran real-time queries to generate the 360-degree view as the call center agent took the call from the customer. The problem was that in order to produce the view within a couple of seconds, we were very limited in what data we could access and how we could analyze it. The Big Data take on the 360-degree view assumes that the Hadoop repository retains a persistent copy of the data, something many organizations don’t want, and that copy will likely not be real time. On the other hand, having a copy of the data, and the time to crunch it in batch mode, gives deeper insight into the customer. Perhaps what’s needed is a hybrid of real-time and batch, sort of like what Twitter is doing with Storm.
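The hybrid idea can be sketched as follows: serve a deep-but-stale profile computed in batch, merged with whatever events have arrived since the last batch run. This is a minimal illustrative sketch; the class, field names, and data are my own assumptions, not any vendor's API.

```python
from collections import defaultdict


class CustomerView:
    """Hypothetical hybrid of a batch-computed profile and real-time events."""

    def __init__(self, batch_view):
        # batch_view: deep metrics crunched overnight from the persistent Hadoop copy
        self.batch_view = batch_view
        # realtime_events: interactions that arrived since the last batch run
        self.realtime_events = defaultdict(list)

    def record_event(self, customer_id, event):
        self.realtime_events[customer_id].append(event)

    def get_360_view(self, customer_id):
        # Merge the stale batch profile with fresh events at query time
        view = dict(self.batch_view.get(customer_id, {}))
        view["recent_events"] = list(self.realtime_events[customer_id])
        return view


# Illustrative data only
batch = {"cust-42": {"lifetime_value": 1800.0, "segment": "frequent caller"}}
cv = CustomerView(batch)
cv.record_event("cust-42", "opened support ticket")
print(cv.get_360_view("cust-42"))
```

The call center agent gets the batch depth without waiting for batch latency, at the cost of maintaining two paths for the same data.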
3. Security/Intelligence Extension
Searching for past occurrences of fraud, or creating a predictive model of possible future occurrences, is very much a batch operation, and Hadoop works well here since the scope of the analysis is limited only by the depth of the data and the duration of the operations upon it.
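The batch-scan flavor of this can be sketched in a few lines: establish a baseline from the full transaction history, then flag outliers against it. This is purely illustrative; the function, field names, and the crude ratio test are my assumptions, and real fraud models use far richer features.

```python
from statistics import mean


def flag_suspicious(history, new_transactions, ratio=5.0):
    """Hypothetical batch fraud scan: flag transactions whose amount is far
    above the customer's historical average. Illustrative only."""
    baseline = mean(t["amount"] for t in history)
    return [t for t in new_transactions if t["amount"] > ratio * baseline]


# Illustrative data: a deep history establishes the norm
history = [{"amount": a} for a in (10.0, 12.0, 11.0, 9.0)]
flags = flag_suspicious(history, [{"amount": 500.0}, {"amount": 11.0}])
print(flags)
```

The point is the shape of the job, not the model: the more history the batch can chew through, the better the baseline.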
4. Operations Analysis
I think that the author’s example of the “internet of things” might be a stretch, but commingling and analysis of unstructured and/or semi-structured server and application logs is a perfect use case for Hadoop. This is especially true if the log data streams in, so that the results of your analysis are updated as each batch cycle completes.
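A log-analysis batch cycle of this kind can be sketched simply: parse semi-structured log lines and fold each new batch's counts into running totals. The log format, regex, and function are illustrative assumptions, not from the article.

```python
import re
from collections import Counter

# Assumed log line shape: "<timestamp> <LEVEL> <service>: <message>"
LOG_PATTERN = re.compile(r"^(?P<ts>\S+) (?P<level>\w+) (?P<service>\S+): (?P<msg>.*)$")


def analyze_batch(lines, totals=None):
    """Fold one batch of log lines into running error counts per service."""
    totals = totals if totals is not None else Counter()
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m and m.group("level") == "ERROR":
            totals[m.group("service")] += 1
    return totals


batch1 = ["2014-01-01T00:00:01 ERROR auth: timeout",
          "2014-01-01T00:00:02 INFO web: ok"]
batch2 = ["2014-01-01T01:00:01 ERROR auth: timeout",
          "2014-01-01T01:00:02 ERROR db: deadlock"]

totals = analyze_batch(batch1)
totals = analyze_batch(batch2, totals)  # results refresh as each batch lands
print(totals)
```

In a real Hadoop deployment the same logic would run as a MapReduce job over each new batch of log files, but the accumulate-per-batch pattern is the same.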
5. Data Warehouse Augmentation
Some data can be pre-processed in Hadoop before loading into a traditional data warehouse. Other data can be analyzed without being loaded into a data warehouse at all, where it might just clutter up other queries. Hadoop lets you dump everything in and sort it out later; data warehouses are intended to be kept tidy.
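The pre-processing step might look like the following sketch: split raw records into those clean enough to load into the warehouse and those left behind in the raw store. The record schema and cleaning rules here are my own illustrative assumptions.

```python
def preprocess(raw_records):
    """Hypothetical Hadoop-side cleanup before a warehouse load: keep only
    records that conform to the warehouse schema, leave the rest raw."""
    clean, leftover = [], []
    for rec in raw_records:
        if "customer_id" in rec and isinstance(rec.get("amount"), (int, float)):
            # Normalize to exactly the columns the warehouse expects
            clean.append({"customer_id": rec["customer_id"],
                          "amount": round(float(rec["amount"]), 2)})
        else:
            leftover.append(rec)
    return clean, leftover


# Illustrative data: one conforming record, one malformed
clean, leftover = preprocess([{"customer_id": "c1", "amount": 19.999},
                              {"amount": 5}])
print(clean, leftover)
```

Everything stays in the "dump it all in" store; only the tidied subset moves on to the warehouse.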
This is a good article, but true to form, the mainstream media has latched onto a technology buzzword and assumes that the dictionary definition is as deep as they need to go. So to them, “big data” means only “a lot of data”, just as “cloud” is anything that doesn’t run on a desktop or laptop computer.
However, they make good points about showing trends via a single graphic rather than in the rows and columns of a spreadsheet. They mention the work of Charles Minard in 1869 showing the impact of the Russian campaign on Napoleon’s troops. Yet The Journal fails to mention the excellent work of Edward Tufte, especially “The Visual Display of Quantitative Information”. You can’t talk about this topic without including Tufte.