Home Database Mysql Tutorial 为已存在的Hadoop集群配置HDFS Federation

为已存在的Hadoop集群配置HDFS Federation

Jun 07, 2016 pm 02:50 PM
hadoop hdfs Configuration cluster

一、实验目的 1. 现有Hadoop集群只有一个NameNode,现在要增加一个NameNode。 2. 两个NameNode构成HDFS Federation。 3. 不重启现有集群,不影响数据访问。 二、实验环境 4台CentOS release 6.4虚拟机,IP地址为 192.168.56.101 master 192.168.56.102 slave

一、实验目的
1. 现有Hadoop集群只有一个NameNode,现在要增加一个NameNode。
2. 两个NameNode构成HDFS Federation。
3. 不重启现有集群,不影响数据访问。

二、实验环境
4台CentOS release 6.4虚拟机,IP地址为
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
192.168.56.104 kettle

其中kettle是新增的一台“干净”的机器,已经配置好免密码ssh,将作为新增的NameNode。

软件版本:
hadoop 2.7.2
hbase 1.1.4
hive 2.0.0
spark 1.5.0
zookeeper 3.4.8
kylin 1.5.1

现有配置:
master作为hadoop的NameNode、SecondaryNameNode、ResourceManager,hbase的HMaster
slave1、slave2作为hadoop的DataNode、NodeManager,hbase的HRegionServer
同时master、slave1、slave2作为三台zookeeper服务器

三、配置步骤
1. 编辑master上的hdfs-site.xml文件,修改后的文件内容如下所示。
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
	<name>dfs.namenode.name.dir</name>
	<value>file:/home/grid/hadoop-2.7.2/hdfs/name</value>
</property>
<property>
	<name>dfs.datanode.data.dir</name>
	<value>file:/home/grid/hadoop-2.7.2/hdfs/data</value>
</property>
<property>
	<name>dfs.replication</name>
	<value>1</value>
</property>
<property>
	<name>dfs.webhdfs.enabled</name>
	<value>true</value>
</property>

<!-- 新增属性 -->
<property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>master:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>master:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>master:9001</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>kettle:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>kettle:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>kettle:9001</value>
</property>
</configuration>
Copy after login
2. 拷贝master上的hdfs-site.xml文件到集群上的其它节点
scp hdfs-site.xml slave1:/home/grid/hadoop-2.7.2/etc/hadoop/
scp hdfs-site.xml slave2:/home/grid/hadoop-2.7.2/etc/hadoop/
Copy after login
3. 将Java目录、Hadoop目录、环境变量文件从master拷贝到kettle
scp -rp /home/grid/hadoop-2.7.2 kettle:/home/grid/
scp -rp /home/grid/jdk1.7.0_75 kettle:/home/grid/
# 用root执行
scp -p /etc/profile.d/* kettle:/etc/profile.d/
Copy after login
4. 启动新的NameNode、SecondaryNameNode
# 在kettle上执行
source /etc/profile
ln -s hadoop-2.7.2 hadoop
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode
Copy after login

执行后启动了NameNode、SecondaryNameNode进程,如图1所示。


图1

5. 刷新DataNode收集新添加的NameNode
# 在集群中任意一台机器上执行均可
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave1:50020
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave2:50020
Copy after login
至此,HDFS Federation配置完成,从web查看两个NameNode的状态分别如图2、图3所示。


图2


图3


四、测试
# 向HDFS上传一个文本文件
hadoop dfs -put /home/grid/hadoop/NOTICE.txt /
# 分别在两台NameNode节点上运行Hadoop自带的例子
# 在master上执行
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /NOTICE.txt /output
# 在kettle上执行
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /NOTICE.txt /output1
Copy after login
用下面的命令查看两个输出结果,分别如图4、图5所示。
hadoop dfs -cat /output/part-r-00000
hadoop dfs -cat /output1/part-r-00000
Copy after login
图4


图5


参考:
http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

Repo: How To Revive Teammates
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to set up Git configuration in PyCharm How to set up Git configuration in PyCharm Feb 20, 2024 am 09:47 AM

Title: How to correctly configure Git in PyCharm In modern software development, the version control system is a very important tool, and Git, as one of the popular version control systems, provides developers with powerful functions and flexible operations. As a powerful Python integrated development environment, PyCharm comes with support for Git, allowing developers to manage code versions more conveniently. This article will introduce how to correctly configure Git in PyCharm to facilitate better development during the development process.

The working principle and configuration method of GDM in Linux system The working principle and configuration method of GDM in Linux system Mar 01, 2024 pm 06:36 PM

Title: The working principle and configuration method of GDM in Linux systems In Linux operating systems, GDM (GNOMEDisplayManager) is a common display manager used to control graphical user interface (GUI) login and user session management. This article will introduce the working principle and configuration method of GDM, as well as provide specific code examples. 1. Working principle of GDM GDM is the display manager in the GNOME desktop environment. It is responsible for starting the X server and providing the login interface. The user enters

The perfect combination of PyCharm and PyTorch: detailed installation and configuration steps The perfect combination of PyCharm and PyTorch: detailed installation and configuration steps Feb 21, 2024 pm 12:00 PM

PyCharm is a powerful integrated development environment (IDE), and PyTorch is a popular open source framework in the field of deep learning. In the field of machine learning and deep learning, using PyCharm and PyTorch for development can greatly improve development efficiency and code quality. This article will introduce in detail how to install and configure PyTorch in PyCharm, and attach specific code examples to help readers better utilize the powerful functions of these two. Step 1: Install PyCharm and Python

Understand Linux Bashrc: functions, configuration and usage Understand Linux Bashrc: functions, configuration and usage Mar 20, 2024 pm 03:30 PM

Understanding Linux Bashrc: Function, Configuration and Usage In Linux systems, Bashrc (BourneAgainShellruncommands) is a very important configuration file, which contains various commands and settings that are automatically run when the system starts. The Bashrc file is usually located in the user's home directory and is a hidden file. Its function is to customize the Bashshell environment for the user. 1. Bashrc function setting environment

Simple and easy-to-understand PyCharm configuration Git tutorial Simple and easy-to-understand PyCharm configuration Git tutorial Feb 20, 2024 am 08:28 AM

PyCharm is a commonly used integrated development environment (IDE). In daily development, using Git to manage code is essential. This article will introduce how to configure Git in PyCharm and use Git for code management, with specific code examples. Step 1: Install Git First, make sure Git is installed on your computer. If it is not installed, you can go to [Git official website](https://git-scm.com/) to download and install the latest version of Git

How to configure workgroup in win11 system How to configure workgroup in win11 system Feb 22, 2024 pm 09:50 PM

How to configure a workgroup in Win11 A workgroup is a way to connect multiple computers in a local area network, which allows files, printers, and other resources to be shared between computers. In Win11 system, configuring a workgroup is very simple, just follow the steps below. Step 1: Open the "Settings" application. First, click the "Start" button of the Win11 system, and then select the "Settings" application in the pop-up menu. You can also use the shortcut "Win+I" to open "Settings". Step 2: Select "System" In the Settings app, you will see multiple options. Please click the "System" option to enter the system settings page. Step 3: Select "About" In the "System" settings page, you will see multiple sub-options. Please click

MyBatis Generator configuration parameter interpretation and best practices MyBatis Generator configuration parameter interpretation and best practices Feb 23, 2024 am 09:51 AM

MyBatisGenerator is a code generation tool officially provided by MyBatis, which can help developers quickly generate JavaBeans, Mapper interfaces and XML mapping files that conform to the database table structure. In the process of using MyBatisGenerator for code generation, the setting of configuration parameters is crucial. This article will start from the perspective of configuration parameters and deeply explore the functions of MyBatisGenerator.

Flask installation and configuration tutorial: a tool to easily build Python web applications Flask installation and configuration tutorial: a tool to easily build Python web applications Feb 20, 2024 pm 11:12 PM

Flask installation and configuration tutorial: A tool to easily build Python Web applications, specific code examples are required. Introduction: With the increasing popularity of Python, Web development has become one of the necessary skills for Python programmers. To carry out web development in Python, we need to choose a suitable web framework. Among the many Python Web frameworks, Flask is a simple, easy-to-use and flexible framework that is favored by developers. This article will introduce the installation of Flask framework,

See all articles