Home Database Mysql Tutorial MapReduce文件切分个数计算方法

MapReduce文件切分个数计算方法

Jun 07, 2016 pm 04:33 PM
mapreduce number document method calculate

Hadoop的MapReduce计算的第一个阶段是InputFormat处理的,先将文件进行切分,然后将每个切分传递给每个Map任务来执行,本文阐述切分个数,也就是Map任务数目的计算方法; Hadoop首先会计算每个切分的大小,然后使用文件总大小/每个切分的大小来决定划分的总

Hadoop的MapReduce计算的第一个阶段是InputFormat处理的,先将文件进行切分,然后将每个切分传递给每个Map任务来执行,本文阐述切分个数,也就是Map任务数目的计算方法;

Hadoop首先会计算每个切分的大小,然后使用”文件总大小/每个切分的大小“来决定划分的总数,如果不足一个切分的大小,则当做1个;

在org.apache.hadoop.mapred.FileInputFormat中给出了计算每个划分大小的方法:

  protected long computeSplitSize(long goalSize, long minSize,
                                       long blockSize) {
    return Math.max(minSize, Math.min(goalSize, blockSize));
  }

其中几个变量的解释如下:

  • blockSize:HDFS存储的基本单元,默认为64MB或者128MB;
  • minSize:由用户设置的最小切分大小,配置项为mapred.min.split.size;
  • goalSize:计算公式为"文件总大小/用户设定的Map任务个数",即用户间接期望的大小;

由此可以推断出选定策略:

  • 划分大小为blockSize:blockSize小于用户期望的大小,比用户设定的最小值要大;也就是说如果用户设定的最小值太小的话,会使用block size作为划分大小
  • 划分大小为goalSize:用户设定了Map的任务数目,那么即使算出来的划分大小比block size小也会使用,这个时候出现了两个用户设定值:最小值和期望值,hadoop会选择两者中大的那个;
  • 划分大小为minSize:如果用户期望的值,还有blocksize只都比用户设定的最小值要小,那么就会使用这个最小值;

针对这几个值,用户可以根据输入数据的情况,合理的设置mapred.min.split.size和map.tasks.size来实现优化,InputSplit和blockSize相同是非常好的一种方法,因为不需要拆分block了.

文章地址:http://www.crazyant.net/1423.html

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Repo: How To Revive Teammates
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to write a novel in the Tomato Free Novel app. Share the tutorial on how to write a novel in Tomato Novel. How to write a novel in the Tomato Free Novel app. Share the tutorial on how to write a novel in Tomato Novel. Mar 28, 2024 pm 12:50 PM

Tomato Novel is a very popular novel reading software. We often have new novels and comics to read in Tomato Novel. Every novel and comic is very interesting. Many friends also want to write novels. Earn pocket money and edit the content of the novel you want to write into text. So how do we write the novel in it? My friends don’t know, so let’s go to this site together. Let’s take some time to look at an introduction to how to write a novel. Share the Tomato novel tutorial on how to write a novel. 1. First open the Tomato free novel app on your mobile phone and click on Personal Center - Writer Center. 2. Jump to the Tomato Writer Assistant page - click on Create a new book at the end of the novel.

How to recover deleted contacts on WeChat (simple tutorial tells you how to recover deleted contacts) How to recover deleted contacts on WeChat (simple tutorial tells you how to recover deleted contacts) May 01, 2024 pm 12:01 PM

Unfortunately, people often delete certain contacts accidentally for some reasons. WeChat is a widely used social software. To help users solve this problem, this article will introduce how to retrieve deleted contacts in a simple way. 1. Understand the WeChat contact deletion mechanism. This provides us with the possibility to retrieve deleted contacts. The contact deletion mechanism in WeChat removes them from the address book, but does not delete them completely. 2. Use WeChat’s built-in “Contact Book Recovery” function. WeChat provides “Contact Book Recovery” to save time and energy. Users can quickly retrieve previously deleted contacts through this function. 3. Enter the WeChat settings page and click the lower right corner, open the WeChat application "Me" and click the settings icon in the upper right corner to enter the settings page.

How to calculate addition, subtraction, multiplication and division in word document How to calculate addition, subtraction, multiplication and division in word document Mar 19, 2024 pm 08:13 PM

WORD is a powerful word processor. We can use word to edit various texts. In Excel tables, we have mastered the calculation methods of addition, subtraction and multipliers. So if we need to calculate the addition of numerical values ​​in Word tables, How to subtract the multiplier? Can I only use a calculator to calculate it? The answer is of course no, WORD can also do it. Today I will teach you how to use formulas to calculate basic operations such as addition, subtraction, multiplication and division in tables in Word documents. Let's learn together. So, today let me demonstrate in detail how to calculate addition, subtraction, multiplication and division in a WORD document? Step 1: Open a WORD, click [Table] under [Insert] on the toolbar, and insert a table in the drop-down menu.

CUDA's universal matrix multiplication: from entry to proficiency! CUDA's universal matrix multiplication: from entry to proficiency! Mar 25, 2024 pm 12:30 PM

General Matrix Multiplication (GEMM) is a vital part of many applications and algorithms, and is also one of the important indicators for evaluating computer hardware performance. In-depth research and optimization of the implementation of GEMM can help us better understand high-performance computing and the relationship between software and hardware systems. In computer science, effective optimization of GEMM can increase computing speed and save resources, which is crucial to improving the overall performance of a computer system. An in-depth understanding of the working principle and optimization method of GEMM will help us better utilize the potential of modern computing hardware and provide more efficient solutions for various complex computing tasks. By optimizing the performance of GEMM

How to transfer files from Quark Cloud Disk to Baidu Cloud Disk? How to transfer files from Quark Cloud Disk to Baidu Cloud Disk? Mar 14, 2024 pm 02:07 PM

Quark Netdisk and Baidu Netdisk are currently the most commonly used Netdisk software for storing files. If you want to save the files in Quark Netdisk to Baidu Netdisk, how do you do it? In this issue, the editor has compiled the tutorial steps for transferring files from Quark Network Disk computer to Baidu Network Disk. Let’s take a look at how to operate it. How to save Quark network disk files to Baidu network disk? To transfer files from Quark Network Disk to Baidu Network Disk, you first need to download the required files from Quark Network Disk, then select the target folder in the Baidu Network Disk client and open it. Then, drag and drop the files downloaded from Quark Cloud Disk into the folder opened by the Baidu Cloud Disk client, or use the upload function to add the files to Baidu Cloud Disk. Make sure to check whether the file was successfully transferred in Baidu Cloud Disk after the upload is completed. That's it

What to do if the 0x80004005 error code appears. The editor will teach you how to solve the 0x80004005 error code. What to do if the 0x80004005 error code appears. The editor will teach you how to solve the 0x80004005 error code. Mar 21, 2024 pm 09:17 PM

When deleting or decompressing a folder on your computer, sometimes a prompt dialog box "Error 0x80004005: Unspecified Error" will pop up. How should you solve this situation? There are actually many reasons why the error code 0x80004005 is prompted, but most of them are caused by viruses. We can re-register the dll to solve the problem. Below, the editor will explain to you the experience of handling the 0x80004005 error code. Some users are prompted with error code 0X80004005 when using their computers. The 0x80004005 error is mainly caused by the computer not correctly registering certain dynamic link library files, or by a firewall that does not allow HTTPS connections between the computer and the Internet. So how about

What is hiberfil.sys file? Can hiberfil.sys be deleted? What is hiberfil.sys file? Can hiberfil.sys be deleted? Mar 15, 2024 am 09:49 AM

Recently, many netizens have asked the editor, what is the file hiberfil.sys? Can hiberfil.sys take up a lot of C drive space and be deleted? The editor can tell you that the hiberfil.sys file can be deleted. Let’s take a look at the details below. hiberfil.sys is a hidden file in the Windows system and also a system hibernation file. It is usually stored in the root directory of the C drive, and its size is equivalent to the size of the system's installed memory. This file is used when the computer is hibernated and contains the memory data of the current system so that it can be quickly restored to the previous state during recovery. Since its size is equal to the memory capacity, it may take up a larger amount of hard drive space. hiber

Quickly master: How to open two WeChat accounts on Huawei mobile phones revealed! Quickly master: How to open two WeChat accounts on Huawei mobile phones revealed! Mar 23, 2024 am 10:42 AM

In today's society, mobile phones have become an indispensable part of our lives. As an important tool for our daily communication, work, and life, WeChat is often used. However, it may be necessary to separate two WeChat accounts when handling different transactions, which requires the mobile phone to support logging in to two WeChat accounts at the same time. As a well-known domestic brand, Huawei mobile phones are used by many people. So what is the method to open two WeChat accounts on Huawei mobile phones? Let’s reveal the secret of this method. First of all, you need to use two WeChat accounts at the same time on your Huawei mobile phone. The easiest way is to

See all articles