The core of Hadoop is the distributed file system HDFS and MapReduce. HDFS provides storage for massive amounts of data, while MapReduce provides computation over that data.
Hadoop is a distributed system infrastructure developed by the Apache Foundation. It lets users develop distributed programs without understanding the underlying details of distribution, and make full use of the power of a cluster for high-speed computation and storage.
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and designed to be deployed on low-cost hardware, and it provides high-throughput access to application data, making it well suited to applications with large data sets.
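Much of that fault tolerance comes from HDFS replicating each block across several nodes. As a minimal sketch (the file path here is hypothetical), the Hadoop Java FileSystem API can be used to inspect or raise a file's replication factor:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical file path; fault tolerance comes from block replicas
        Path file = new Path("/data/example.txt");
        FileStatus status = fs.getFileStatus(file);
        System.out.println("Current replication: " + status.getReplication());

        // Raise the replication factor for extra fault tolerance
        fs.setReplication(file, (short) 3);
    }
}
```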
HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
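To illustrate that streaming access, here is a small sketch (path and file name are hypothetical) that opens a file on HDFS with the Java FileSystem API and reads it sequentially:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsStreamRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path; the file is read sequentially as a stream
        // rather than through POSIX-style random access
        Path file = new Path("/data/input.log");
        try (FSDataInputStream in = fs.open(file);
             BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```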
Hadoop is made up of many elements. At the bottom is the Hadoop Distributed File System (HDFS), which stores files across all storage nodes in the Hadoop cluster. Above HDFS sits the MapReduce engine, which consists of JobTrackers and TaskTrackers. Together, the core distributed file system HDFS and MapReduce processing of the Hadoop distributed computing platform, along with the data warehouse tool Hive and the distributed database HBase, cover most of the technical core of the Hadoop distributed platform.
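To show how the MapReduce engine splits work into a map step and a reduce step, here is a condensed sketch of the classic word-count job written against the Hadoop MapReduce Java API (class names and input/output paths are illustrative, not from the original article):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map step: emit (word, 1) for every word in a line of input
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws java.io.IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce step: sum the counts for each word
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws java.io.IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Hypothetical HDFS input/output paths
        FileInputFormat.addInputPath(job, new Path("/data/input"));
        FileOutputFormat.setOutputPath(job, new Path("/data/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The map output is shuffled by key to the reducers, so each reducer receives every count emitted for a given word; the same class is reused as a combiner here to pre-aggregate counts on the map side.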