What is Apache Hadoop?

Hadoop is open source software framework for distributed processing of large datasets.

Scalable data storage and processing

  • Open source Apache project
  • Harnesses the power of commodity servers
  • Distributed and fault-tolerant

“Core” Hadoop consists of two main parts

  • HDFS (storage)
  • MapReduce (processing)

Leave a Reply

Your email address will not be published. Required fields are marked *