Table of Contents
步骤1:在PG中为元数据增加用户的DB
步骤2:下载PGJDBC驱动
步骤3:修改HIVE配置文件
步骤4:初始化元数据表
小结
Home Database Mysql Tutorial 配置apache HIVE元数据DB为PostgreSQL

配置apache HIVE元数据DB为PostgreSQL

Jun 07, 2016 pm 03:15 PM
apache hive postgresql data Configuration

本文出处:http://amutu.com/blog/2013/06/hive-metastore-db-postgresql/ HIVE 的元数据默认使用 derby 作为存储 DB , derby 作为轻量级的 DB ,在开发、测试过程中使用比较方便,但是在实际的生产环境中,还需要考虑易用性、容灾、稳定性以及各种监控、运

本文出处:http://amutu.com/blog/2013/06/hive-metastore-db-postgresql/

HIVE的元数据默认使用derby作为存储DBderby作为轻量级的DB,在开发、测试过程中使用比较方便,但是在实际的生产环境中,还需要考虑易用性、容灾、稳定性以及各种监控、运维工具等,这些都是derby缺乏的。MySQLPostgreSQL是两个比较常用的开源数据库系统,在生产环境中比较多的用来替换derby。配置MySQL在网上的文章比较多,这里不再赘述,本文主要描述配置HIVE元数据DBPostgreSQL的方法。

HIVE版本:HIVE 0.7-snapshotHIVE 0.8-snapshot

步骤1:在PG中为元数据增加用户的DB

首先在PostgreSQL中为HIVE的元数据建立帐号和DB

 

--以管理员身份登入PG

psql postgres -U postgres

 

--创建用户hive_user:

Create user hive_user;

 

--创建DB metastore_dbownerhive_user:

Create database metastore_db with owner=hive_user;

 

--设置hive_user的密码:

/password hive_user

 

完成以上步骤以后,还要确保PostgreSQLpg_hba.conf中的配置允许HIVE所在的机器ip可以访问PG

步骤2:下载PGJDBC驱动

HIVE_HOME目录下创建auxlib目录:

mkdir auxlib

此时HIVE_HOME目录中应该有binlibauxlibconf等目录。

 

下载PGJDBC驱动

Wget http://jdbc.postgresql.org/download/postgresql-9.0-801.jdbc4.jar

 

将下载到的postgresql-9.0-801.jdbc4.jar放到auxlib中。

步骤3:修改HIVE配置文件

HIVE_HOME中新建hive-site.xml 文件,内容如下,蓝色字体按照PG server的相关信息进行修改。

 

  javax.jdo.option.ConnectionURL

  jdbc:postgresql://pg_server_ip:pg_server_port/metastore_db?

  JDBC connect string for a JDBC metastore

 

  javax.jdo.option.ConnectionDriverName

  org.postgresql.Driver

  Driver class name for a JDBC metastore

 

  javax.jdo.option.ConnectionUserName

  hive_user

  username to use against metastore database

 

  javax.jdo.option.ConnectionPassword

  hive_user_pass

  password to use against metastore database

 

 

步骤4:初始化元数据表

元数据库metastore中默认没有表,当HIVE第一次使用某个表的时候,如果发现该表不存在就会自动创建。对derbymysql,这个过程没有问题,因此derbymysql作为元数据库不需要这一步。

PostgreSQL在初始化的时候,会遇到一些问题,导致PG数据库死锁。例如执行以下HIVE语句:

>Create table kv (key,int,value string) partitioned by (ds string);

OK

>Alter table kv add partition (ds = '20110101');

执行这一句的时候,HIVE会一直停在这。

 

查看PG数据库,发现有两个连接在进行事务操作,其中一个是:

 in transaction

此时处于事务中空闲,另外一个是:

ALTER TABLE "PARTITIONS" ADD CONSTRAINT "PARTITIONS_FK1" FOREIGN KEY ("SD_ID") REFERENCES "SDS" ("SD_ID") INITIALLY DEFERRED 

处于等待状态。

 

进一步查看日志,发现大致的过程是这样的:

 

HIVE发起Alter table kv add partition (ds = '20110101')语句,此时DataNucleus接口发起第一个isolationSERIALIZABLE的事务,锁定了TBLS等元数据表。在这个的事务进行过程中,DataNucleu发现PARTITIONS等表没有,则要自动创建。于是又发起了另外一个isolationSERIALIZABLE的事务,第一个事务变为 in transaction。第二个事务创建了PARTITIONS的表后,还要给它增加约束条件,这时,它需要获得它引用的表SDS的排他锁,但这个锁已经被第一个事务拿到了,因此需要等待第一个事务结束。而第一个事务也在等待第二个事务结束。这样就造成了死锁。

 

类似的情况出现在:

>create test(key int);

OK

>drop table test;

drop table时会去drop它的index,而此时没有index元数据表,它去键,然后产生死锁。

 

有三种方法可以解决这个死锁问题:

 

第一种方法:

使用PGpg_terminate_backend()将第一个事务结束掉,这样可以保证第二个事务完成下去,将元数据表键成功。

第二种方法:

使HIVE将创建元数据表的过程和向元数据表中添加数据的过程分离:

>Create table kv (key,int,value string) partitioned by (ds string);

OK

>show partitions kv;

OK

>Alter table kv add partition (ds = '20110101');

OK

执行以上语句时就不会发生死锁,因为在执行show partitions kv语句时,它是只读语句,不会加锁。当这个语句发现PARTITIONS等表不在时,创建这些表不会发生死锁。

同样对于index表,使用

>Show index on kv;

可以将IDXS表建好。

第三种方法:

使用DataNucleu提供的SchemaTool,将HIVEmetastore/src/model/package.jdo文件作为输入,这个工具可以自动创建元数据中的表。具体的使用方法见:

http://www.datanucleus.org/products/accessplatform_2_0/rdbms/schematool.html

小结

本文给出了使用PostgreSQL作为HIVE元数据DB的配置方法,以及遇到的死锁问题的解决办法,希望对使用HIVEPostgreSQL的朋友有帮助。

 

 

 

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

Repo: How To Revive Teammates
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Slow Cellular Data Internet Speeds on iPhone: Fixes Slow Cellular Data Internet Speeds on iPhone: Fixes May 03, 2024 pm 09:01 PM

Facing lag, slow mobile data connection on iPhone? Typically, the strength of cellular internet on your phone depends on several factors such as region, cellular network type, roaming type, etc. There are some things you can do to get a faster, more reliable cellular Internet connection. Fix 1 – Force Restart iPhone Sometimes, force restarting your device just resets a lot of things, including the cellular connection. Step 1 – Just press the volume up key once and release. Next, press the Volume Down key and release it again. Step 2 – The next part of the process is to hold the button on the right side. Let the iPhone finish restarting. Enable cellular data and check network speed. Check again Fix 2 – Change data mode While 5G offers better network speeds, it works better when the signal is weaker

The U.S. Air Force showcases its first AI fighter jet with high profile! The minister personally conducted the test drive without interfering during the whole process, and 100,000 lines of code were tested for 21 times. The U.S. Air Force showcases its first AI fighter jet with high profile! The minister personally conducted the test drive without interfering during the whole process, and 100,000 lines of code were tested for 21 times. May 07, 2024 pm 05:00 PM

Recently, the military circle has been overwhelmed by the news: US military fighter jets can now complete fully automatic air combat using AI. Yes, just recently, the US military’s AI fighter jet was made public for the first time and the mystery was unveiled. The full name of this fighter is the Variable Stability Simulator Test Aircraft (VISTA). It was personally flown by the Secretary of the US Air Force to simulate a one-on-one air battle. On May 2, U.S. Air Force Secretary Frank Kendall took off in an X-62AVISTA at Edwards Air Force Base. Note that during the one-hour flight, all flight actions were completed autonomously by AI! Kendall said - "For the past few decades, we have been thinking about the unlimited potential of autonomous air-to-air combat, but it has always seemed out of reach." However now,

Tesla robots work in factories, Musk: The degree of freedom of hands will reach 22 this year! Tesla robots work in factories, Musk: The degree of freedom of hands will reach 22 this year! May 06, 2024 pm 04:13 PM

The latest video of Tesla's robot Optimus is released, and it can already work in the factory. At normal speed, it sorts batteries (Tesla's 4680 batteries) like this: The official also released what it looks like at 20x speed - on a small "workstation", picking and picking and picking: This time it is released One of the highlights of the video is that Optimus completes this work in the factory, completely autonomously, without human intervention throughout the process. And from the perspective of Optimus, it can also pick up and place the crooked battery, focusing on automatic error correction: Regarding Optimus's hand, NVIDIA scientist Jim Fan gave a high evaluation: Optimus's hand is the world's five-fingered robot. One of the most dexterous. Its hands are not only tactile

70B model generates 1,000 tokens in seconds, code rewriting surpasses GPT-4o, from the Cursor team, a code artifact invested by OpenAI 70B model generates 1,000 tokens in seconds, code rewriting surpasses GPT-4o, from the Cursor team, a code artifact invested by OpenAI Jun 13, 2024 pm 03:47 PM

70B model, 1000 tokens can be generated in seconds, which translates into nearly 4000 characters! The researchers fine-tuned Llama3 and introduced an acceleration algorithm. Compared with the native version, the speed is 13 times faster! Not only is it fast, its performance on code rewriting tasks even surpasses GPT-4o. This achievement comes from anysphere, the team behind the popular AI programming artifact Cursor, and OpenAI also participated in the investment. You must know that on Groq, a well-known fast inference acceleration framework, the inference speed of 70BLlama3 is only more than 300 tokens per second. With the speed of Cursor, it can be said that it achieves near-instant complete code file editing. Some people call it a good guy, if you put Curs

AI startups collectively switched jobs to OpenAI, and the security team regrouped after Ilya left! AI startups collectively switched jobs to OpenAI, and the security team regrouped after Ilya left! Jun 08, 2024 pm 01:00 PM

Last week, amid the internal wave of resignations and external criticism, OpenAI was plagued by internal and external troubles: - The infringement of the widow sister sparked global heated discussions - Employees signing "overlord clauses" were exposed one after another - Netizens listed Ultraman's "seven deadly sins" Rumors refuting: According to leaked information and documents obtained by Vox, OpenAI’s senior leadership, including Altman, was well aware of these equity recovery provisions and signed off on them. In addition, there is a serious and urgent issue facing OpenAI - AI safety. The recent departures of five security-related employees, including two of its most prominent employees, and the dissolution of the "Super Alignment" team have once again put OpenAI's security issues in the spotlight. Fortune magazine reported that OpenA

How to add a server in eclipse How to add a server in eclipse May 05, 2024 pm 07:27 PM

To add a server to Eclipse, follow these steps: Create a server runtime environment Configure the server Create a server instance Select the server runtime environment Configure the server instance Start the server deployment project

How to conduct concurrency testing and debugging in Java concurrent programming? How to conduct concurrency testing and debugging in Java concurrent programming? May 09, 2024 am 09:33 AM

Concurrency testing and debugging Concurrency testing and debugging in Java concurrent programming are crucial and the following techniques are available: Concurrency testing: Unit testing: Isolate and test a single concurrent task. Integration testing: testing the interaction between multiple concurrent tasks. Load testing: Evaluate an application's performance and scalability under heavy load. Concurrency Debugging: Breakpoints: Pause thread execution and inspect variables or execute code. Logging: Record thread events and status. Stack trace: Identify the source of the exception. Visualization tools: Monitor thread activity and resource usage.

Application of algorithms in the construction of 58 portrait platform Application of algorithms in the construction of 58 portrait platform May 09, 2024 am 09:01 AM

1. Background of the Construction of 58 Portraits Platform First of all, I would like to share with you the background of the construction of the 58 Portrait Platform. 1. The traditional thinking of the traditional profiling platform is no longer enough. Building a user profiling platform relies on data warehouse modeling capabilities to integrate data from multiple business lines to build accurate user portraits; it also requires data mining to understand user behavior, interests and needs, and provide algorithms. side capabilities; finally, it also needs to have data platform capabilities to efficiently store, query and share user profile data and provide profile services. The main difference between a self-built business profiling platform and a middle-office profiling platform is that the self-built profiling platform serves a single business line and can be customized on demand; the mid-office platform serves multiple business lines, has complex modeling, and provides more general capabilities. 2.58 User portraits of the background of Zhongtai portrait construction

See all articles