A Brief Introduction to Hadoop
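I. The two problems Hadoop solves:
Before getting into Hadoop's history and concepts, let's first look at what Hadoop is actually used for.
Hadoop addresses two problems:
1. Storing massive amounts of data: HDFS
2. Analyzing massive amounts of data: MapReduce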
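To make the MapReduce idea concrete, here is a minimal single-process sketch of the programming model in plain Python, using word counting as the task. It does not use Hadoop at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (illustration only)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["hadoop stores data", "hadoop analyzes data"]
    print(word_count(sample))
```

The point of the split is that each map call only looks at one line and each reduce call only looks at one key, which is what lets a framework distribute the work.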
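II. Hadoop history:
2002: The open-source web search project Nutch (later an Apache project) gets started.
2003: Google publishes its paper on GFS (the Google File System).
2004: The Nutch developers implement NDFS (the Nutch Distributed File System).
2004: Google publishes its paper on MapReduce.
2005: MapReduce is implemented in Nutch, running on top of NDFS.
2006: NDFS and MapReduce are split out of Nutch into a project named Hadoop. The creator of NDFS (Doug Cutting) joins Yahoo, and Yahoo sets up a dedicated team to develop Hadoop.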
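III. Why learn Hadoop:
Hadoop has become a hot topic in the IT industry and is a common foundation for large-scale data processing in cloud computing environments.
Hadoop itself is technically rich, which makes it a worthwhile subject for IT engineers to study.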
IV. HDFS design goals:
1. Very large files
2. Streaming data access (write once, read many times)
3. Commodity hardware
V. Scenarios Hadoop is not suited for:
1. Low-latency data access
2. Lots of small files
3. Multiple writers, arbitrary file modifications
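VI. HDFS architecture:
(1) Suppose we have a 600 GB file, a.txt. With a block size of 64 MB (the default in Hadoop 1.x; Hadoop 2.x and later default to 128 MB), HDFS splits the file into 64 MB blocks and spreads them across the machines in the cluster. Because different blocks can be read from different machines in parallel, read throughput is much higher than reading the file from a single disk.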
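As a quick check of the arithmetic, the sketch below computes how many blocks such a file occupies. It assumes "600G" means 600 GiB and "64M" means 64 MiB; the helper name `block_count` is hypothetical.

```python
MIB = 1024 ** 2  # bytes in one mebibyte
GIB = 1024 ** 3  # bytes in one gibibyte


def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed; the last block may be only partly filled."""
    # Integer ceiling division, which avoids floating-point rounding.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


if __name__ == "__main__":
    print(block_count(600 * GIB, 64 * MIB))   # 9600 blocks at 64 MiB
    print(block_count(600 * GIB, 128 * MIB))  # 4800 blocks at 128 MiB
```

A file that is not an exact multiple of the block size simply ends with a smaller final block, which is why the calculation rounds up.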
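(2) Each block is stored as several replicas on different nodes, so if a block on one node is corrupted or lost, it can be restored from a replica on another node. The number of replicas is a configuration setting (the dfs.replication property), and Hadoop uses its record of where each replica lives to locate a surviving copy automatically.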
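(3) The block placement information mentioned above has to be kept in one central place that records how each file is split into blocks and where those blocks are stored: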
/home/asdf/a.txt.part1,3,(dm1,dm2,dm3)
/home/asdf/a.txt.part2,3,(dm2,dm3,dm4)
/home/asdf/a.txt.part3,3,(dm6,dm11,dm28)
The 3 here is the replica count: each block exists in three copies in total, backups included.
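The entries above can be read as a map from block to replica locations. The sketch below parses that illustrative text format and reports which blocks would lose a replica if one node failed. The `path,count,(nodes)` layout is the article's own notation, not HDFS's actual metadata format, and the function names are hypothetical.

```python
def parse_entry(line):
    """Parse 'path,replica_count,(node1,node2,...)' into its three parts."""
    path, replication, nodes = line.strip().split(",", 2)
    return path, int(replication), nodes.strip("()").split(",")


def build_block_map(lines):
    """Map each block path to (replica count, list of nodes holding it)."""
    block_map = {}
    for line in lines:
        path, replication, nodes = parse_entry(line)
        block_map[path] = (replication, nodes)
    return block_map


def affected_blocks(block_map, failed_node):
    """Blocks that lose a replica when failed_node goes down, mapped to
    the nodes that still hold a copy."""
    return {
        path: [node for node in nodes if node != failed_node]
        for path, (_, nodes) in block_map.items()
        if failed_node in nodes
    }


if __name__ == "__main__":
    entries = [
        "/home/asdf/a.txt.part1,3,(dm1,dm2,dm3)",
        "/home/asdf/a.txt.part2,3,(dm2,dm3,dm4)",
        "/home/asdf/a.txt.part3,3,(dm6,dm11,dm28)",
    ]
    block_map = build_block_map(entries)
    for path, survivors in affected_blocks(block_map, "dm2").items():
        print(path, "still available on", survivors)
```

Keeping this map in one place is what lets the system both serve reads from any surviving replica and notice which blocks need to be copied again after a failure.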
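(4) Block: one chunk of a file, 64 MB by default in Hadoop 1.x (128 MB from Hadoop 2.x onward).
NameNode: stores the directory tree of the whole file system, the file metadata, and the list of blocks that make up each file.
DataNode: stores the blocks themselves.
HDFS high availability (HA): if the NameNode goes down, the whole file system becomes unavailable, and if the NameNode's data is lost, the whole file system is effectively lost with it. Starting with Hadoop 2.x, HDFS supports running NameNodes in an active-standby configuration.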
