Oozie Shell Action Configuration
Contents
1. Shell Action
2. Shell Action Logging
3. Shell Action Limitations
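1. Shell Action
The shell action runs a shell command. It must be configured with job-tracker, name-node, and the arguments the command needs.
The action can be configured to create or delete HDFS directories before the shell job starts.
Configuration can be supplied through a file (via the job-xml element) or through the inline configuration element.
EL expressions may be used in the inline configuration. Values set in the inline configuration override values with the same name from job-xml.
Note that the Hadoop properties mapred.job.tracker and fs.default.name must not be set in the inline configuration.
As with Hadoop map-reduce jobs, files and archives can be attached to the shell job. See http://archive.cloudera.com/cdh/3/oozie/WorkflowFunctionalSpec.html#a3.2.2.1_Adding_Files_and_Archives_for_the_Job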
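The standard output (STDOUT) of the shell command is available to the workflow once the command finishes, and it can be used by decision nodes. If output capture is enabled for the shell job, the output must satisfy two constraints:
- The output must be in valid Java properties file format.
- The output must not exceed 2 KB.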
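The two constraints above can be sketched from the script's side. This is a minimal illustration, not taken from the Oozie documentation: the function names emit_props and within_limit and the keys record_count and status are hypothetical, and 2 KB is assumed to mean 2048 bytes.

```shell
#!/bin/sh
# Hypothetical helper: print key=value lines, which is valid
# Java properties format.
emit_props() {
    printf 'record_count=%s\n' "$1"
    printf 'status=%s\n' "$2"
}

# Hypothetical guard: succeed only if the text is at most 2048 bytes
# (assumed meaning of the 2 KB limit).
within_limit() {
    size=$(printf '%s' "$1" | wc -c | tr -d ' ')
    [ "$size" -le 2048 ]
}

out=$(emit_props 42 OK)
if within_limit "$out"; then
    # Only the properties lines go to stdout.
    printf '%s\n' "$out"
else
    echo "output exceeds 2 KB" >&2
    exit 1
fi
```

Keeping everything except the key=value lines off stdout is what keeps the captured output parseable.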
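Syntax:
    <workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.3">
        ...
        <action name="[NODE-NAME]">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>[JOB-TRACKER]</job-tracker>
                <name-node>[NAME-NODE]</name-node>
                <prepare>
                    <delete path="[PATH]"/>
                    ...
                    <mkdir path="[PATH]"/>
                    ...
                </prepare>
                <job-xml>[SHELL SETTINGS FILE]</job-xml>
                <configuration>
                    <property>
                        <name>[PROPERTY-NAME]</name>
                        <value>[PROPERTY-VALUE]</value>
                    </property>
                    ...
                </configuration>
                <exec>[SHELL-COMMAND]</exec>
                <argument>[ARG-VALUE]</argument>
                ...
                <argument>[ARG-VALUE]</argument>
                <env-var>[VAR1=VALUE1]</env-var>
                ...
                <env-var>[VARN=VALUEN]</env-var>
                <file>[FILE-PATH]</file>
                ...
                <archive>[FILE-PATH]</archive>
                ...
                <capture-output/>
            </shell>
            <ok to="[NODE-NAME]"/>
            <error to="[NODE-NAME]"/>
        </action>
        ...
    </workflow-app>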
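The prepare element lists the directories to delete or create before the job starts. Each path must start with hdfs://HOST:PORT.
The job-xml element points to an existing configuration file.
The configuration element holds the properties passed to the shell job.
The exec element holds the path of the shell command to execute. Arguments can be passed to the command.
The argument element specifies an argument passed to the shell script.
The env-var element holds an environment variable passed to the shell command. Each env-var may contain only one variable and its value. If the value refers to a variable such as $PATH, write it as PATH=$PATH:mypath. Do not write ${PATH}, because EL would try to resolve it.
The capture-output element tells Oozie to capture the STDOUT of the shell script. The output can be read from the workflow with the EL function String action:output(String node, String key). Oozie's workflow EL also provides wf:actionData(String node), which returns the captured key/value pairs as a map.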
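To make the argument and env-var elements concrete, here is a minimal sketch of a script body that reports what it received. The function name describe_invocation and the variable MY_VAR are hypothetical; MY_VAR stands for a variable assumed to be set through an env-var element such as MY_VAR=some-value.

```shell
#!/bin/sh
# Hypothetical script body: report the arguments and one environment variable.
# MY_VAR is an assumed name, as if it had been passed through env-var.
describe_invocation() {
    echo "arg_count=$#"
    i=1
    for arg in "$@"; do
        echo "arg_$i=$arg"
        i=$((i + 1))
    done
    # Inside the script, ${MY_VAR} is ordinary shell syntax. The EL caveat
    # applies only to text written in the workflow XML.
    echo "my_var=${MY_VAR:-unset}"
}

describe_invocation "$@"
```

The output is written as key=value lines so that it would also be usable with capture-output.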
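Example:
    <workflow-app xmlns="uri:oozie:workflow:0.3" name="shell-wf">
        <start to="shell1"/>
        <action name="shell1">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
                <exec>${EXEC}</exec>
                <argument>A</argument>
                <argument>B</argument>
                <file>${EXEC}#${EXEC}</file> <!--Copy the executable to compute node's current working directory -->
            </shell>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>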
The job properties file for this workflow is:
    oozie.wf.application.path=hdfs://localhost:8020/user/kamrul/workflows/script

    #Execute is expected to be in the Workflow directory.
    #Shell Script to run
    EXEC=script.sh
    #CPP executable. Executable should be binary compatible to the compute node OS.
    #EXEC=hello
    #Perl script
    #EXEC=script.pl

    jobTracker=localhost:8021
    nameNode=hdfs://localhost:8020
    queueName=default
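Running a Java program packaged in a jar:
    <workflow-app xmlns="uri:oozie:workflow:0.3" name="shell-wf">
        <start to="shell1"/>
        <action name="shell1">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
                <exec>java</exec>
                <argument>-classpath</argument>
                <argument>./${EXEC}:$CLASSPATH</argument>
                <argument>Hello</argument>
                <file>${EXEC}#${EXEC}</file> <!--Copy the jar to compute node current working directory -->
            </shell>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>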
The corresponding properties file is:
    oozie.wf.application.path=hdfs://localhost:8020/user/kamrul/workflows/script

    #Hello.jar file is expected to be in the Workflow directory.
    EXEC=Hello.jar

    jobTracker=localhost:8021
    nameNode=hdfs://localhost:8020
    queueName=default
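2. Shell Action Logging
The stdout and stderr of a shell action are redirected to the stdout of the Oozie launcher map-reduce task that runs the script.
Only a small part of the log is visible in the Oozie web console. The detailed log is available in the Hadoop job-tracker web console.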
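Since both streams end up in the launcher task's log, a script can still keep them apart at the source. The sketch below shows one common convention, not something the Oozie documentation prescribes: diagnostics go to stderr and stdout is left for key=value output. The function name log_msg and the status key are hypothetical.

```shell
#!/bin/sh
# Hypothetical logging helper: prefix each message with a level and write it
# to stderr, leaving stdout free for any key=value output.
log_msg() {
    level=$1
    shift
    printf '[%s] %s\n' "$level" "$*" >&2
}

log_msg INFO "starting step"
echo "status=started"
```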
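3. Shell Action Limitations
Although a shell action can run any shell command, the following limitations apply:
- Interactive commands are not supported.
- Commands cannot be run as a different user through sudo.
- The user must explicitly upload any required third-party libraries. Oozie uploads, tags, and uses them through Hadoop's distributed cache.
- The shell command runs on an arbitrary Hadoop compute node, and the tools installed by default may differ from node to node. Most common Unix utilities are usually present on every node. The key point is that Oozie can only run commands that are installed on the compute nodes or uploaded through the distributed cache. In other words, any file the command needs must be shipped with the file element.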
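Because the installed tools may differ between compute nodes, a script can check for the commands it depends on before doing any work. This is a minimal sketch; the function name require_cmds is hypothetical, and the commands checked at the end are only examples.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify that every named command can be
# found on this node, and report the ones that cannot.
require_cmds() {
    missing=""
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
    done
    if [ -n "$missing" ]; then
        echo "missing commands:$missing" >&2
        return 1
    fi
}

# Example use: stop early if a needed tool is absent on this node.
require_cmds awk sed || exit 1
```

Failing early with a clear message on stderr makes a missing tool easier to spot in the job-tracker log than a failure halfway through the script.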
http://archive.cloudera.com/cdh/3/oozie/DG_ShellActionExtension.html
When reposting, please credit the source: http://jyd.me/
Permalink: Oozie Shell Action Configuration
