MySQL向Hive/HBase的迁移工具
Apache Hive是目前大型数据仓库的免费首选产品之一,使用Apache Hive的人是不会期望在小数据量上做什么文章,例如把MySQL中的数据搬到Hive/HBase中去,那样的话原先很快能执行完毕的SQL,估计在Hive上运行跟原来相比时间延长10倍都不止。但如果你有MySQL数据
Apache Hive是目前大型数据仓库的免费首选产品之一,使用Apache Hive的人是不会期望在小数据量上做什么文章,例如把MySQL中的数据搬到Hive/HBase中去,那样的话原先很快能执行完毕的SQL,估计在Hive上运行跟原来相比时间延长10倍都不止。但如果你有MySQL数据可以把大量的数据向Hive导入,如果上亿条的数据量再加上复杂的SQL查询条件对于MySQL来说是一件比较头疼的事情,此时相比而言对于Hive来说还算比较easy没有那么非常的头痛,但是两者之间缺少一个沟通的桥梁。
而然伟大的云计算公司cloudera.com也是Hadoop强力支持者推出了Sqoop,Sqoop顾名思义SQL-to-Hadoop,在sqoop中通过 ManagerFactory 抽象类对多种数据库类型进行了抽象,可以做到 Hsqldb、MySQL、Oracle、PostgreSQL 这些数据库中的数据可以向Hive中写入。
从导出/导入所有数据一条命令即可,而且可以对表和数据的筛选,开发的效率提升和配置的简洁是这个工具的特色所在,同样的机器配置、机器数量、数据量和数据内容,但是换了不同的环境得到了不同的执行效率,通过对RMDBS到Hadoop的迁移,带来了性能的提升,所以就体现了sqoop的价值。
在一次开发大会上提到的Sqoop主要功能
JDBC-based implementation
? Works with many popular database vendors
Auto-generation of tedious user-side code
? Write MapReduce applications to work with your data, faster
Integration with Hive
? Allows you to stay in a SQL-based environment
Extensible backend
? Database-specific code paths for better performance
具体操作手册相见:
http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html (官方)
相关文章:
Hive入门3–Hive与HBase的整合
Apache Hive入门2
Apache Hive入门1
Apache Pig入门1 –介绍/基本架构/与Hive对比
–end–
原文地址:MySQL向Hive/HBase的迁移工具, 感谢原作者分享。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











Efficient methods for batch inserting data in MySQL include: 1. Using INSERTINTO...VALUES syntax, 2. Using LOADDATAINFILE command, 3. Using transaction processing, 4. Adjust batch size, 5. Disable indexing, 6. Using INSERTIGNORE or INSERT...ONDUPLICATEKEYUPDATE, these methods can significantly improve database operation efficiency.

With the popularization and development of digital currency, more and more people are beginning to pay attention to and use digital currency apps. These applications provide users with a convenient way to manage and trade digital assets. So, what kind of software is a digital currency app? Let us have an in-depth understanding and take stock of the top ten digital currency apps in the world.

Methods for configuring character sets and collations in MySQL include: 1. Setting the character sets and collations at the server level: SETNAMES'utf8'; SETCHARACTERSETutf8; SETCOLLATION_CONNECTION='utf8_general_ci'; 2. Create a database that uses specific character sets and collations: CREATEDATABASEexample_dbCHARACTERSETutf8COLLATEutf8_general_ci; 3. Specify character sets and collations when creating a table: CREATETABLEexample_table(idINT

To safely and thoroughly uninstall MySQL and clean all residual files, follow the following steps: 1. Stop MySQL service; 2. Uninstall MySQL packages; 3. Clean configuration files and data directories; 4. Verify that the uninstallation is thorough.

MySQL functions can be used for data processing and calculation. 1. Basic usage includes string processing, date calculation and mathematical operations. 2. Advanced usage involves combining multiple functions to implement complex operations. 3. Performance optimization requires avoiding the use of functions in the WHERE clause and using GROUPBY and temporary tables.

MySQL and phpMyAdmin can be effectively managed through the following steps: 1. Create and delete database: Just click in phpMyAdmin to complete. 2. Manage tables: You can create tables, modify structures, and add indexes. 3. Data operation: Supports inserting, updating, deleting data and executing SQL queries. 4. Import and export data: Supports SQL, CSV, XML and other formats. 5. Optimization and monitoring: Use the OPTIMIZETABLE command to optimize tables and use query analyzers and monitoring tools to solve performance problems.

In MySQL, add fields using ALTERTABLEtable_nameADDCOLUMNnew_columnVARCHAR(255)AFTERexisting_column, delete fields using ALTERTABLEtable_nameDROPCOLUMNcolumn_to_drop. When adding fields, you need to specify a location to optimize query performance and data structure; before deleting fields, you need to confirm that the operation is irreversible; modifying table structure using online DDL, backup data, test environment, and low-load time periods is performance optimization and best practice.

The main differences between Laravel and Yii are design concepts, functional characteristics and usage scenarios. 1.Laravel focuses on the simplicity and pleasure of development, and provides rich functions such as EloquentORM and Artisan tools, suitable for rapid development and beginners. 2.Yii emphasizes performance and efficiency, is suitable for high-load applications, and provides efficient ActiveRecord and cache systems, but has a steep learning curve.
