Home Database Mysql Tutorial 为Hadoop集群选择合适的硬件配置

为Hadoop集群选择合适的硬件配置

Jun 07, 2016 pm 04:38 PM
hadoop suitable hardware choose Configuration along with cluster

随着Apache Hadoop的起步,云客户的增多面临的首要问题就是如何为他们新的的Hadoop集群选择合适的硬件。 尽管Hadoop被设计为运行在行业标准的硬件上,提出一个理想的集群配置不想提供硬件规格列表那么简单。?选择硬件,为给定的负载在性能和经济性提供最佳平

随着Apache Hadoop的起步,云客户的增多面临的首要问题就是如何为他们新的的Hadoop集群选择合适的硬件。 尽管Hadoop被设计为运行在行业标准的硬件上,提出一个理想的集群配置不想提供硬件规格列表那么简单。?选择硬件,为给定的负载在性能和经济性提供最佳平衡是需要测试和验证其有效性。(比如,IO密集型工作负载的用户将会为每个核心主轴投资更多)。 在这个博客帖子中,你将会学到一些工作负载评估的原则和它在硬件选择中起着至关重要的作用。在这个过程中,你也将学到Hadoop管理员应该考虑到各种因素。 结合存储和计算 过去的十年,IT组织已经标准化了刀片服务器和存储区域网(SAN)来满足联网和处理密集型的工作负载。尽管这个模型对于一些方面的标准程序是有相当意义 的,比如网站服务器,程序服务器,小型结构化数据库,数据移动等,但随着数据数量和用户数的增长,对于基础设施的要求也已经改变。网站服务器现在有了缓存 层;数据库需要本地硬盘支持大规模地并行;数据迁移量也超过了本地可处理的数量。 大部分的团队还没有弄清楚实际工作负载需求就开始搭建他们的Hadoop集群。 硬件提供商已经生产了创新性的产品系统来应对这些需求,包括存储刀片服务器,串行SCSI交换机,外部SATA磁盘阵列和大容量的机架单元。然 而,Hadoop是基于新的实现方法,来存储和处理复杂数据,并伴随着数据迁移的减少。 相对于依赖SAN来满足大容量存储和可靠性,Hadoop在软件层次处理大数据和可靠性。 Hadoop在一簇平衡的节点间分派数据并使用同步复制来保证数据可用性和容错性。因为数据被分发到有计算能力的节点,数据的处理可以被直接发送到存储有数据的节点。由于Hadoop集群中的每一台节点都存储并处理数据,这些节点都需要配置来满足数据存储和运算的要求。   ?工作负载很重要吗? 在几乎所有情形下,MapReduce要么会在从硬盘或者网络读取数据时遇到瓶颈(称为IO受限的应用),要么在处理数据时遇到瓶颈(CPU受限)。排序是一个IO受限的例子,它需要很少的CPU处理(仅仅是简单的比较操作),但是需要大量的从硬盘读写数据。模式分类是一个CPU受限的例子,它对数据进行复杂的处理,用来判定本体。 下面是更多IO受限的工作负载的例子: 索引 分组 数据导入导出 数据移动和转换 下面是更多CPU受限的工作负载的例子: 聚类/分类 复杂文本挖掘 自然语言处理 特征提取 Cloudera的客户需要完全理解他们的工作负载,这样才能选择最优的Hadoop硬件,而这好像是一个鸡生蛋蛋生鸡的问题。大多数工作组在没有彻底剖 析他们的工作负载时,就已经搭建好了Hadoop集群,通常Hadoop运行的工作负载随着他们的精通程度的提高而完全不同。而且,某些工作负载可能会被 一些未预料的原因受限。例如,某些理论上是IO受限的工作负载却最终成为了CPU受限,这是可能是因为用户选择了不同的压缩算法,或者算法的不同实现改变 了MapReduce任务的约束方式。基于这些原因,当工作组还不熟悉要运行任务的类型时,深入剖析它才是构建平衡的Hadoop集群之前需要做的最合理 的工作。 接下来需要在集群上运行MapReduce基准测试任务,分析它们是如何受限的。完成这个目标最直接的方法是在运行中的工作负载中的适当位置添加监视器来 检测瓶颈。我们推荐在Hadoop集群上安装Cloudera Manager,它可以提供CPU,硬盘和网络负载的实时统计信息。(Cloudera Manager是Cloudera 标准版和企业版的一个组件,其中企业版还支持滚动升级)Cloudera Manager安装之后,Hadoop管理员就可以运行MapReduce任务并且查看Cloudera Manager的仪表盘,用来监测每台机器的工作情况。 第一步是弄清楚你的作业组已经拥有了哪些硬件 在为你的工作负载构建合适的集群之外,我们建议客户和它们的硬件提供商合作确定电力和冷却方面的预算。由于Hadoop会运行在数十台,数百台到数千台节 点上。通过使用高性能功耗比的硬件,作业组可以节省一大笔资金。硬件提供商通常都会提供监测功耗和冷却方面的工具和建议。 为你的CDH(Cloudera?distribution?for?Hadoop) Cluster选择硬件 选择机器配置类型的第一步就是理解你的运维团队已经在管理的硬件类型。在购买新的硬件设备时,运维团队经常根据一定的观点或者强制需求来选择,并且他们倾 向于工作在自己业已熟悉的平台类型上。Hadoop不是唯一的从规模效率上获益的系统。再一次强调,作为更通用的建议,如果集群是新建立的或者你并不能准 确的预估你的极限工作负载,我们建议你选择均衡的硬件类型。 Hadoop集群有四种基本任务角色:名称节点(包括备用名称节点),工作追踪节点,任务执行节点,和数据节点。节点是执行某一特定功能的工作站。大部分你的集群内的节点需要执行两个角色的任务,作为数据节点(数据存储)和任务执行节点(数据处理)。 ?这是在一个平衡Hadoop集群中,为数据节点/任务追踪器提供的推荐规格: 在一个磁盘阵列中要有12到24个1~4TB硬盘 2个频率为2~2.5GHz的四核、六核或八核CPU 64~512GB的内存 有保障的千兆或万兆以太网(存储密度越大,需要的网络吞吐量越高) 名字节点角色负责协调集群上的数据存储,作业追踪器协调数据处理(备用的名字节点不应与集群中的名字节点共存,并且运行在与之相同的硬件环境上。)。 Cloudera推荐客户购买在RAID1或10配置上有足够功率和企业级磁盘数的商用机器来运行名字节点和作业追踪器。 ? [...]
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Can wallpaper engine be shared among families? Can wallpaper engine be shared among families? Mar 18, 2024 pm 07:28 PM

Does Wallpaper support family sharing? Unfortunately, it cannot be supported. Still, we have solutions. For example, you can purchase with a small account or download the software and wallpapers from a large account first, and then change to the small account. Simply launching the software is perfectly fine. Can wallpaperengine be family shared? Answer: Wallpaper does not currently support the family sharing function. 1. It is understood that WallpaperEngine does not seem to be suitable for family sharing environments. 2. In order to solve this problem, it is recommended that you consider purchasing a new account; 3. Or download the required software and wallpapers in the main account first, and then switch to other accounts. 4. Just open the software with a light click and it will be fine. 5. You can view the properties on the above web page"

How to set lock screen wallpaper on wallpaper engine? How to use wallpaper engine How to set lock screen wallpaper on wallpaper engine? How to use wallpaper engine Mar 13, 2024 pm 08:07 PM

WallpaperEngine is a software commonly used to set desktop wallpapers. Users can search for their favorite pictures in WallpaperEngine to generate desktop wallpapers. It also supports adding pictures from the computer to WallpaperEngine to set them as computer wallpapers. Let’s take a look at how wallpaperengine sets the lock screen wallpaper. Wallpaperengine setting lock screen wallpaper tutorial 1. First enter the software, then select installed, and click "Configure Wallpaper Options". 2. After selecting the wallpaper in separate settings, you need to click OK on the lower right. 3. Then click on the settings and preview above. 4. Next

The working principle and configuration method of GDM in Linux system The working principle and configuration method of GDM in Linux system Mar 01, 2024 pm 06:36 PM

Title: The working principle and configuration method of GDM in Linux systems In Linux operating systems, GDM (GNOMEDisplayManager) is a common display manager used to control graphical user interface (GUI) login and user session management. This article will introduce the working principle and configuration method of GDM, as well as provide specific code examples. 1. Working principle of GDM GDM is the display manager in the GNOME desktop environment. It is responsible for starting the X server and providing the login interface. The user enters

Is there any virus when watching wallpaper engine movies? Is there any virus when watching wallpaper engine movies? Mar 18, 2024 pm 07:28 PM

Users can download various wallpapers when using WallpaperEngine, and can also use dynamic wallpapers. Many users do not know whether there are viruses when watching videos on WallpaperEngine, but video files cannot be used as viruses. Is there any virus when watching movies on wallpaperengine? Answer: No. 1. Just video files cannot be used as viruses. 2. Just make sure to download videos from trusted sources and maintain computer security measures to avoid the risk of virus infection. 3. Application wallpapers are in apk format, and apk may carry Trojan viruses. 4. WallpaperEngine itself does not have viruses, but some application wallpapers in the creative workshop may have viruses.

In which folder are the wallpapers of wallpaper engine located? In which folder are the wallpapers of wallpaper engine located? Mar 19, 2024 am 08:16 AM

When using wallpaper, users can download various wallpapers they like for use. Many users do not know which folder the wallpapers are in. The wallpapers downloaded by users are stored in the content folder. Which folder is the wallpaper in? Answer: content folder. 1. Open File Explorer. 2. Click "This PC" on the left. 3. Find the "STEAM" folder. 4. Select "steamapps". 5. Click “workshop”. 6. Find the “content” folder.

Understand Linux Bashrc: functions, configuration and usage Understand Linux Bashrc: functions, configuration and usage Mar 20, 2024 pm 03:30 PM

Understanding Linux Bashrc: Function, Configuration and Usage In Linux systems, Bashrc (BourneAgainShellruncommands) is a very important configuration file, which contains various commands and settings that are automatically run when the system starts. The Bashrc file is usually located in the user's home directory and is a hidden file. Its function is to customize the Bashshell environment for the user. 1. Bashrc function setting environment

Does wallpaper engine consume a lot of power? Does wallpaper engine consume a lot of power? Mar 18, 2024 pm 08:30 PM

Users can change their computer wallpapers when using WallpaperEngine. Many users don't know that WallpaperEngine consumes a lot of power. Dynamic wallpapers consume a little more power than static wallpapers, but not a lot. Does wallpaperengine consume a lot of power? Answer: Not much. 1. Dynamic wallpapers consume a little more power than static wallpapers, but not a lot. 2. Turning on dynamic wallpaper will increase the computer's power consumption and take away a small amount of memory usage. 3. Users do not need to worry about the serious power consumption of dynamic wallpapers.

How to change font size in Microsoft Edge browser - How to change font size in Microsoft Edge browser How to change font size in Microsoft Edge browser - How to change font size in Microsoft Edge browser Mar 04, 2024 pm 05:58 PM

I guess you are not familiar with the Microsoft Edge browser, but do you know how to change the font size in the Microsoft Edge browser? The following article describes how to change the font size in the Microsoft Edge browser. Let's study it together. First, find the Microsoft Edge browser and double-click it to open it. You can find the Microsoft Edge browser in the desktop shortcut, start menu or taskbar, and double-click to open it. Secondly, open the [Settings] interface to enter this browser interface, click the [...] logo in the upper left corner; double-click [Settings] to open the settings interface. Again, find and open the [Appearance] interface and scroll down with the mouse

See all articles