Apache Hadoop


Topic | v1 | created by janarez
Description

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still its most common use, though it has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework.

The core of Apache Hadoop consists of a storage part, the Hadoop Distributed File System (HDFS), and a processing part based on the MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster, then ships computation to the nodes that hold the data.
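
To make the MapReduce model concrete, below is a minimal sketch of the canonical word-count job, written against Hadoop's org.apache.hadoop.mapreduce API and closely following the official Hadoop tutorial example. The class name WordCount and the argument-supplied input and output paths are placeholders, not anything prescribed by the framework:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: for each line of the input split, emit (word, 1) per token.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: sum all counts emitted for each distinct word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation on map nodes
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // input directory (HDFS)
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory (HDFS)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A job like this is typically packaged as a jar and submitted with something like hadoop jar wordcount.jar WordCount /input /output, where /input and /output are hypothetical HDFS paths. The framework then schedules map tasks on the nodes holding the input blocks and transparently retries tasks when hardware or tasks fail, which is the fault-tolerance assumption described above.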


Relations

is Big data framework

Frameworks for processing Big data.

a tool for Big data

Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.


Resources

is compared in Hadoop vs Spark: Detailed Comparison of Big Data Frameworks

Rating: 7.0 | Level: 3.0 | Clarity: 8.0 | Background: 2.0 (1 rating)

We compare Hadoop vs Spark platforms in multiple categories including use cases. Which big data framework...
