Apache™ Hadoop® is an open source software project that enables distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software's ability to detect and handle failures at the application layer.
Benefits of this Resource
Incredibly robust for large sets of data
Does not require high-end hardware
Currently open source but there are other vendors that have more sophisticated software utilizing the Hadoop platform (Cloudera, IBM BigInsights)
Primarily used for visualizing big data. SQL based.
File Type Compatibility
Level of Expertise Required
Where to access
Sources to Learn