Home Database Mysql Tutorial 用Sqoop把数据从HDFS导入到关系型数据库

用Sqoop把数据从HDFS导入到关系型数据库

Jun 07, 2016 pm 03:36 PM
hdfs Relational import Work database

由于工作的需求,需要把HDFS中处理之后的数据转移至关系型数据库中成为对应的Table,在网上寻找有关的资料良久,发现各个说法不一,下面是本人自身测试过程: 使用Sqoop来实现这一需求,首先要明白Sqoop是什么? Sqoop是一个用来将Hadoop和关系型数据库中的

由于工作的需求,需要把HDFS中处理之后的数据转移至关系型数据库中成为对应的Table,在网上寻找有关的资料良久,发现各个说法不一,下面是本人自身测试过程:

使用Sqoop来实现这一需求,首先要明白Sqoop是什么?

<em> Sqoop是一个用来将Hadoop和关系型数据库中的数据相互转移的工具,可以将一个关系型数据库(例如 : MySQL ,Oracle ,Postgres等)中的数据导进到Hadoop的HDFS中,也可以将HDFS的数据导进到关系型数据库中。</em>
Copy after login


首先需以下要准备:

第一:hadoop的NameNode节点下lib文件夹中要有相应数据库驱动的jar包和sqoop的jar包。

第二:预先在相应的数据库创建Table,注:在HDFS的某个目录上的数据格式要和相应的表中的字段数量一致


由于我这里使用的是Oracle数据库并且是使用Java来操作的。所以下面的代码以及截图都是以Java的例子:

首先标准化HDFS中文件格式,如下图:

用Sqoop把数据从HDFS导入到关系型数据库


Java代码如下:

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://192.168.115.5:9000");
conf.set("hadoop.job.ugi", "hadooper,hadoopgroup");
conf.set("mapred.job.tracker", "192.168.115.5:9001");


ArrayList list = new ArrayList(); // 定义一个list
list.add("--table");
list.add("A_BAAT_CLIENT"); // Oracle中的表。将来数据要导入到这个表中。
list.add("--export-dir");
list.add("/home/hadoop/traffic/capuse/near7date/activeUser/capuse_near7_activeUser_2013-02-06.log"); // hdfs上的目录。这个目录下的数据要导入到a_baat_client这个表中。
list.add("--connect");
list.add("jdbc:oracle:thin:@10.18.96.107:1521:life"); // Oracle的链接
list.add("--username");
list.add("TRAFFIC"); // Oracle的用户名
list.add("--password");
list.add("TRAFFIC"); // Oracle的密码
list.add("--input-fields-terminated-by");
list.add("|"); // 数据分隔符号
list.add("-m");
list.add("1");// 定义mapreduce的数量。


String[] arg = new String[1];
ExportTool exporter = new ExportTool();
Sqoop sqoop = new Sqoop(exporter);
sqoop.setConf(conf);
arg = list.toArray(new String[0]);
int result = Sqoop.runSqoop(sqoop, arg);
System.out.println("res:" + result); // 打印执行结果。


最后再在Main方法中运行即可,生成后表数据如下图所示:

用Sqoop把数据从HDFS导入到关系型数据库

通过上面的操作以及代码即可在Java中实现把HDFS数据生成对应的表数据;

不过除了可以用Java来实现,使用基本的命令也是可以的,命令如下:

在Hadoop bin目录中:

sqoop export --connect jdbc:oracle:thin:@10.18.96.107:1521:life \

--table A_BAAT_CLIENT --username TRAFFIC --password TRAFFIC \
--input-fields-terminated-by '|' \
--export-dir /home/hadoop/traffic/capuse/near7date/activeUser/test.log  -m 1

意思和上面Java中代码一样。


注意:

1、数据库表名、用户名、密码使用大写(这有可能会出现问题,因为我在测试过程中,使用小写时出现错误,出现No Columns这个经典错误。所以推荐大写,当然这不是必须);

2、预先建好相应的Table;


好了上面的代码实际上很是简单,不过如果是从未接触过此,那么在做的过程中会发现很多问题,而且网上的资料很是繁杂,在此个人作此篇一是为了自己做个Memo;同时也希望给需要的道友一份帮助。当然过程中也许还有很多问题,望高手斧正!!!


Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How does Go language implement the addition, deletion, modification and query operations of the database? How does Go language implement the addition, deletion, modification and query operations of the database? Mar 27, 2024 pm 09:39 PM

Go language is an efficient, concise and easy-to-learn programming language. It is favored by developers because of its advantages in concurrent programming and network programming. In actual development, database operations are an indispensable part. This article will introduce how to use Go language to implement database addition, deletion, modification and query operations. In Go language, we usually use third-party libraries to operate databases, such as commonly used sql packages, gorm, etc. Here we take the sql package as an example to introduce how to implement the addition, deletion, modification and query operations of the database. Assume we are using a MySQL database.

iOS 18 adds a new 'Recovered' album function to retrieve lost or damaged photos iOS 18 adds a new 'Recovered' album function to retrieve lost or damaged photos Jul 18, 2024 am 05:48 AM

Apple's latest releases of iOS18, iPadOS18 and macOS Sequoia systems have added an important feature to the Photos application, designed to help users easily recover photos and videos lost or damaged due to various reasons. The new feature introduces an album called "Recovered" in the Tools section of the Photos app that will automatically appear when a user has pictures or videos on their device that are not part of their photo library. The emergence of the "Recovered" album provides a solution for photos and videos lost due to database corruption, the camera application not saving to the photo library correctly, or a third-party application managing the photo library. Users only need a few simple steps

How does Hibernate implement polymorphic mapping? How does Hibernate implement polymorphic mapping? Apr 17, 2024 pm 12:09 PM

Hibernate polymorphic mapping can map inherited classes to the database and provides the following mapping types: joined-subclass: Create a separate table for the subclass, including all columns of the parent class. table-per-class: Create a separate table for subclasses, containing only subclass-specific columns. union-subclass: similar to joined-subclass, but the parent class table unions all subclass columns.

An in-depth analysis of how HTML reads the database An in-depth analysis of how HTML reads the database Apr 09, 2024 pm 12:36 PM

HTML cannot read the database directly, but it can be achieved through JavaScript and AJAX. The steps include establishing a database connection, sending a query, processing the response, and updating the page. This article provides a practical example of using JavaScript, AJAX and PHP to read data from a MySQL database, showing how to dynamically display query results in an HTML page. This example uses XMLHttpRequest to establish a database connection, send a query and process the response, thereby filling data into page elements and realizing the function of HTML reading the database.

Detailed tutorial on establishing a database connection using MySQLi in PHP Detailed tutorial on establishing a database connection using MySQLi in PHP Jun 04, 2024 pm 01:42 PM

How to use MySQLi to establish a database connection in PHP: Include MySQLi extension (require_once) Create connection function (functionconnect_to_db) Call connection function ($conn=connect_to_db()) Execute query ($result=$conn->query()) Close connection ( $conn->close())

How to handle database connection errors in PHP How to handle database connection errors in PHP Jun 05, 2024 pm 02:16 PM

To handle database connection errors in PHP, you can use the following steps: Use mysqli_connect_errno() to obtain the error code. Use mysqli_connect_error() to get the error message. By capturing and logging these error messages, database connection issues can be easily identified and resolved, ensuring the smooth running of your application.

Tips and practices for handling Chinese garbled characters in databases with PHP Tips and practices for handling Chinese garbled characters in databases with PHP Mar 27, 2024 pm 05:21 PM

PHP is a back-end programming language widely used in website development. It has powerful database operation functions and is often used to interact with databases such as MySQL. However, due to the complexity of Chinese character encoding, problems often arise when dealing with Chinese garbled characters in the database. This article will introduce the skills and practices of PHP in handling Chinese garbled characters in databases, including common causes of garbled characters, solutions and specific code examples. Common reasons for garbled characters are incorrect database character set settings: the correct character set needs to be selected when creating the database, such as utf8 or u

How to use database callback functions in Golang? How to use database callback functions in Golang? Jun 03, 2024 pm 02:20 PM

Using the database callback function in Golang can achieve: executing custom code after the specified database operation is completed. Add custom behavior through separate functions without writing additional code. Callback functions are available for insert, update, delete, and query operations. You must use the sql.Exec, sql.QueryRow, or sql.Query function to use the callback function.

See all articles