HADOOP
Main Purpose
Apache™ Hadoop® is an open source software project that enables distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software's ability to detect and handle failures at the application layer.
Benefits of this Resource
Incredibly robust for large sets of data
Does not require high-end hardware
Currently open source but there are other vendors that have more sophisticated software utilizing the Hadoop platform (Cloudera, IBM BigInsights)
Limitations
Primarily used for visualizing big data. SQL based.
File Type Compatibility
Level of Expertise Required
High
Where to access
Apache Hadoop
Cloudera
SAS
IBM
Sources to Learn
coming soon