top of page




Main Purpose

Apache™ Hadoop® is an open source software project that enables distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software's ability to detect and handle failures at the application layer.

Benefits of this Resource

Incredibly robust for large sets of data
Does not require high-end hardware
Currently open source but there are other vendors that have more sophisticated software utilizing the Hadoop platform (Cloudera, IBM BigInsights)


Primarily used for visualizing big data.  SQL based.

File Type Compatibility

Level of Expertise Required


Where to access

Apache Hadoop

Sources to Learn

coming soon

bottom of page