Home Database Mysql Tutorial HBase在处理中文字符串时的问题

HBase在处理中文字符串时的问题

Jun 07, 2016 pm 05:27 PM
hbase

文中可能涉及到的API: Hadoop/HDFS:http://hadoop.apache.org/common/docs/current/api/ HBase: http://hbase.apache.org/apido

文中可能涉及到的API:

Hadoop/HDFS:

HBase: ?overview-summary.html

Begin!

 

在设置scan的startRowKey与endRowKey时,经常需要在某个条件字符串后面补充出一个范围。(再比如SingleColumnValueFilter也会用到)

比如:我的条件字符串是“abc”,scan时我需要将下述内容都囊括到我scan的范围内。

abc123

abcdabc

abccca

....

这时候我startRowKey使用“abc”即可,,上述字符串按字典序都比“abc”要大,“abc”串c之后的值是0嘛~

而endRowKey最初我使用了“abc~”,因为我查ASCII码表时‘~’是倒数第二个,值为127,足够大,肯定大于上述串中的1、d、c等字符。

这样做,在处理英文数据时就足够了,系统运行正常。

但当我处理中文数据时,中文一般都是以UTF-8格式处理的,一个汉字表示出来类似“0xe6,0xc2,0xe1”。0xe6大于127。所以使用‘~’遇到中文必然悲催。

我的解决方法:

使用UltraEdit,进入十六进制编辑模式,将值改为FF。然后回到文本模式,将刚才的字符复制下来。这个字符应该是一个不可显示的字符,看着好像两个空格的长度。

然后在设置endRowKey时

new String(name + " "); //这里只是示例,引号间就是刚才复制的那个字符。将这个字符串作为endRowKey,果然所有的中文字符就囊括在内了。

另外一定要注意:使用HBase API时不要使用str.getBytes将String转化为byte[] ,而应该使用Bytes.toBytes(str);同样使用Bytes.toString(bytes);完成逆向转换。

linux

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Repo: How To Revive Teammates
1 months ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Using Hadoop and HBase in Beego for big data storage and querying Using Hadoop and HBase in Beego for big data storage and querying Jun 22, 2023 am 10:21 AM

With the advent of the big data era, data processing and storage have become more and more important, and how to efficiently manage and analyze large amounts of data has become a challenge for enterprises. Hadoop and HBase, two projects of the Apache Foundation, provide a solution for big data storage and analysis. This article will introduce how to use Hadoop and HBase in Beego for big data storage and query. 1. Introduction to Hadoop and HBase Hadoop is an open source distributed storage and computing system that can

How to integrate hbase in springboot How to integrate hbase in springboot May 30, 2023 pm 04:31 PM

Dependency: org.springframework.dataspring-data-hadoop-hbase2.5.0.RELEASEorg.apache.hbasehbase-client1.1.2org.springframework.dataspring-data-hadoop2.5.0.RELEASE The official way to add configuration is through xml, which is simple After rewriting, it is as follows: @ConfigurationpublicclassHBaseConfiguration{@Value("${hbase.zooke

Use HBase in Go language to implement efficient NoSQL database applications Use HBase in Go language to implement efficient NoSQL database applications Jun 15, 2023 pm 08:56 PM

With the advent of the big data era, the storage and processing of massive data has become particularly important. In terms of NoSQL databases, HBase is currently a widely used solution. As a statically strongly typed programming language, Go language is increasingly used in fields such as cloud computing, website development, and data science due to its simple syntax and excellent performance. This article will introduce how to use HBase in Go language to implement efficient NoSQL database applications. HBase introduction HBase is a highly scalable, highly reliable, basic

How to use Java to develop a NoSQL database application based on HBase How to use Java to develop a NoSQL database application based on HBase Sep 20, 2023 am 08:39 AM

How to use Java to develop a NoSQL database application based on HBase Introduction: With the advent of the big data era, NoSQL databases have become one of the important tools for processing massive data. HBase, as an open source distributed NoSQL database system, has extensive applications in the field of big data. This article will introduce how to use Java to develop NoSQL database applications based on HBase and provide specific code examples. 1. Introduction to HBase: HBase is a distribution system based on Hadoop.

Using HBase for data storage and query in Beego Using HBase for data storage and query in Beego Jun 22, 2023 am 11:58 AM

Using HBase for data storage and query in Beego framework With the continuous development of the Internet era, data storage and query have become more and more critical. With the advent of the big data era, various data sources occupy an important position in their respective fields. Non-relational databases are a database with obvious advantages in data storage and query, and HBase is a distributed non-relational database based on Hadoop. Relational Database. This article will introduce how to use HBase for data storage and query in the Beego framework. 1.H

PHP and Apache HBase integrate to implement NoSQL database and distributed storage PHP and Apache HBase integrate to implement NoSQL database and distributed storage Jun 25, 2023 pm 06:01 PM

With the continuous growth of Internet applications and data volume, traditional relational databases can no longer meet the needs of storing and processing massive data. As a new type of database management system, NoSQL (NotOnlySQL) has significant advantages in massive data storage and processing, and has received more and more attention and applications. Among NoSQL databases, ApacheHBase is a very popular open source distributed database. It is designed based on Google’s BigTable idea and has

How to use HBase for data storage and query in Workerman How to use HBase for data storage and query in Workerman Nov 07, 2023 am 08:30 AM

Workerman is a high-performance PHPsocket framework that can host a large number of concurrent connections. Unlike traditional PHP frameworks, Workerman does not rely on web servers such as Apache or Nginx. Instead, it runs the entire application by itself by starting a PHP process. Workerman has extremely high operating efficiency and better load capacity. At the same time, HBase is a distributed NoSQL database system that is widely used in big data

Learn about HBase caching technology Learn about HBase caching technology Jun 20, 2023 pm 07:15 PM

HBase is a Hadoop-based distributed storage system designed to store and process large-scale structured data. In order to optimize its read and write performance, HBase provides a variety of caching mechanisms, which can improve query efficiency and reduce read and write delays through reasonable configuration. This article will introduce HBase caching technology and how to configure it. HBase cache types HBase provides two basic cache mechanisms: block cache (BlockCache) and MemStore cache (also called write cache). The block cache is in

See all articles