HBase relies on HDFS to store its underlying data. HBase uses Hadoop HDFS as its file storage system, which gives it highly reliable low-level storage: HDFS is highly fault tolerant and is designed to be deployed on low-cost commodity hardware.
HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system. Using HBase, a large-scale structured storage cluster can be built on inexpensive commodity servers.
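To make "column-oriented" concrete, here is a minimal in-memory sketch of HBase's logical data model: values are addressed by row key, column family, and column qualifier, and each cell keeps timestamped versions. This toy class is purely illustrative (the names `ToyHTable`, `put`, and `get` are ours, not HBase API), but it mirrors how an HBase `Put`/`Get` addresses a cell.

```python
from collections import defaultdict
import time

class ToyHTable:
    """Toy sketch of HBase's data model: values live under
    (row key, column family, qualifier) and keep timestamped versions."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed at table creation
        # row_key -> {(family, qualifier): [(timestamp, value), ...]}
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, family, qualifier, value, ts=None):
        if family not in self.families:
            raise ValueError(f"unknown column family: {family}")
        ts = ts if ts is not None else time.time_ns()
        versions = self.rows[row_key][(family, qualifier)]
        versions.append((ts, value))
        versions.sort(reverse=True)  # newest version first

    def get(self, row_key, family, qualifier):
        versions = self.rows[row_key].get((family, qualifier))
        return versions[0][1] if versions else None  # newest version wins

t = ToyHTable(families=["info"])
t.put("user1", "info", "name", "Alice", ts=1)
t.put("user1", "info", "name", "Alicia", ts=2)
print(t.get("user1", "info", "name"))  # → Alicia
```

Note the design choice this models: reads return the newest version by timestamp, which is how HBase resolves multiple cell versions by default.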
HBase is an open-source implementation of Google Bigtable, and the two systems are analogous layer by layer: Bigtable uses GFS as its file storage system, while HBase uses Hadoop HDFS; Google runs MapReduce to process the massive data in Bigtable, while HBase uses Hadoop MapReduce for the same purpose; and Bigtable uses Chubby as its coordination service, while HBase uses ZooKeeper as its counterpart.
In the layered Hadoop ecosystem, HBase sits in the structured storage layer: Hadoop HDFS provides HBase with highly reliable underlying storage, Hadoop MapReduce provides it with high-performance computing capabilities, and ZooKeeper provides stable coordination services and a failover mechanism.
HDFS
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but the differences are also significant: HDFS is highly fault tolerant and is suitable for deployment on cheap machines, and it provides high-throughput access to application data, making it well suited to applications with large data sets. To achieve streaming access to file system data, HDFS relaxes some POSIX requirements. HDFS was originally built as the infrastructure for the Apache Nutch search engine project and is now part of the Apache Hadoop Core project.
HDFS adopts a master/slave architecture. An HDFS cluster consists of a single NameNode and a number of DataNodes: the NameNode acts as the master server, managing the file system namespace and clients' access to files, while the DataNodes manage the data stored on their nodes.
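The master/slave split above can be sketched in a few lines: the master holds only metadata (the namespace and the block-to-DataNode mapping), while the slaves hold the actual block bytes. This is a toy simulation, not real HDFS code; the class names, the tiny block size, and the round-robin replica placement are our simplifying assumptions (real HDFS defaults to 128 MB blocks, 3 replicas, and rack-aware placement).

```python
import itertools

BLOCK_SIZE = 4    # toy block size in bytes (HDFS default: 128 MB)
REPLICATION = 2   # toy replication factor (HDFS default: 3)

class DataNode:
    """Slave: stores raw block data."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes

class ToyNameNode:
    """Master: owns the namespace and the block -> DataNode mapping,
    but never stores block contents itself."""
    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.namespace = {}            # path -> [block_id, ...]
        self._ids = itertools.count()
        self._placement = itertools.cycle(datanodes)  # naive round-robin

    def write(self, path, data):
        block_ids = []
        for i in range(0, len(data), BLOCK_SIZE):
            chunk = data[i:i + BLOCK_SIZE]
            bid = next(self._ids)
            # place REPLICATION copies on successive DataNodes
            for _ in range(REPLICATION):
                next(self._placement).blocks[bid] = chunk
            block_ids.append(bid)
        self.namespace[path] = block_ids

    def read(self, path):
        out = b""
        for bid in self.namespace[path]:
            # any replica will do; pick the first DataNode holding it
            dn = next(d for d in self.datanodes if bid in d.blocks)
            out += dn.blocks[bid]
        return out

dns = [DataNode(f"dn{i}") for i in range(3)]
nn = ToyNameNode(dns)
nn.write("/logs/a.txt", b"hello hdfs!")
print(nn.read("/logs/a.txt"))  # → b'hello hdfs!'
```

Because every block lives on more than one DataNode, a read still succeeds after any single node is lost, which is the fault-tolerance property the prose above describes.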
The above is the detailed content of What hbase relies on to store underlying data. For more information, please follow other related articles on the PHP Chinese website!