
HBase intra row scanning

Jun 07, 2016 pm 04:26 PM


By Lars Hofhansl

Updated (again) Wednesday, January 25th, 2012.

As I painfully worked through HBASE-5229 I realized that HBase already has all the building blocks needed for complex (local) transactions.

What's important here is that (see my introduction to HBase):
  1. HBase ensures atomicity for operations on the same row key
  2. HBase keys have internal structure: (row-key, column family, column, ...)
The missing piece was ColumnRangeFilter. With this filter it is possible to retrieve all columns whose identifier starts with "abc", or all columns whose identifier sorts at or after "test". For example:

// all columns whose identifier starts with "abc"
Filter prefixFilter = new ColumnRangeFilter(Bytes.toBytes("abc"), true,
                                            Bytes.toBytes("abd"), false);

// all columns whose identifier sorts at or after "test"
// (a null maximum means there is no upper bound)
Filter openEndedFilter = new ColumnRangeFilter(Bytes.toBytes("test"), true,
                                               null, true);
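
The upper bound "abd" in the first filter is simply the prefix "abc" with its last byte incremented. A minimal sketch of that calculation in plain Java follows. `PrefixRange` and `stopForPrefix` are hypothetical names used for illustration only and are not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixRange {
    /**
     * Returns the smallest byte string that sorts after every byte string
     * starting with the given prefix, or null if there is no such bound
     * (empty prefix, or a prefix made only of 0xFF bytes).
     */
    static byte[] stopForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return null; // no upper bound exists
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopForPrefix("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "abd"
    }
}
```

The result would be passed as the exclusive maximum column of the filter, and a null result corresponds to a range with no upper bound.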


This makes it possible to search (scan) inside a row by column identifier, just as HBase allows searching by row key.
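
As an in-memory analogy only (this is not HBase code), a single row can be pictured as a sorted map from column identifier to value, and a column range as a sub-map of it. The sketch below uses a `TreeMap` with string keys, which for plain ASCII identifiers sorts the same way as the byte order HBase uses. `RowSliceModel` and `slice` are hypothetical names.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

final class RowSliceModel {
    /** Returns the columns in [min, max), mimicking a column range inside one row. */
    static NavigableMap<String, String> slice(NavigableMap<String, String> row,
                                              String min, String max) {
        return row.subMap(min, true, max, false);
    }

    public static void main(String[] args) {
        // One "row": column identifiers kept in sorted order.
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("abb", "1");
        row.put("abc", "2");
        row.put("abcx", "3");
        row.put("abd", "4");
        // All columns whose identifier starts with "abc".
        System.out.println(slice(row, "abc", "abd").keySet()); // prints [abc, abcx]
    }
}
```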

A client application can exploit this to achieve transactions by grouping all entities that can participate in the same transaction into a single row (and single column family).
Prefixes of the column identifiers can then be used to define rows inside that group. Basically, the search criterion for keys is moved one level down, to the column identifier.

Say we wanted to implement a store with transactional tables that contain rows and columns. One way of doing this with HBase is as follows:
  • the HBase row-key/column-family maps to a "table"
  • a prefix of the HBase column identifier maps to a "row"
  • the rest of the HBase column identifier identifies the "column"
This is in fact similar to what Google's Megastore (pdf) does.
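
The mapping in the list above could be sketched as follows. This is only an illustration: the "|" separator, the class name `LogicalKey`, and its methods are assumptions, and a real application would need its own rule for keeping the separator out of row names.

```java
final class LogicalKey {
    // Assumed separator between the logical row and the logical column.
    static final char SEPARATOR = '|';

    /** Builds the HBase column identifier for a logical (row, column) pair. */
    static String qualifier(String row, String column) {
        if (row.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("row name must not contain " + SEPARATOR);
        }
        return row + SEPARATOR + column;
    }

    /** Extracts the logical row from a column identifier. */
    static String rowOf(String qualifier) {
        return qualifier.substring(0, qualifier.indexOf(SEPARATOR));
    }

    /** Extracts the logical column from a column identifier. */
    static String columnOf(String qualifier) {
        return qualifier.substring(qualifier.indexOf(SEPARATOR) + 1);
    }

    public static void main(String[] args) {
        String q = qualifier("rowA", "column1");
        System.out.println(q);           // prints "rowA|column1"
        System.out.println(rowOf(q));    // prints "rowA"
        System.out.println(columnOf(q)); // prints "column1"
    }
}
```

With a separator like this, the column range for one logical row can start at the row name plus "|" (inclusive) and end at the row name plus "}" (exclusive), because "}" is the byte that follows "|" in ASCII. A row whose name merely begins with the same characters then falls outside the range.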

This leads to potentially wide HBase rows with many columns. The missing piece is allowing a Scan to efficiently retrieve a slice of a wide row.

This is where ColumnRangeFilter comes into play. The filter seeks efficiently into the row by seeking ahead to the first HBase block that contains the first KeyValue (or cell) for the first column in the requested range.

Let's model a table "pets" this way. And let's say a pet has a name and a species. The HBase key for entries would look like this:
(table, CF1, rowA|column1) -> value for column1 in rowA
The code would look something like this:
(apologies for the initial incorrect code that I had posted here)

HTable t = ...;
Scan s = ...;
// start row == stop row: restrict the scan to the single HBase row "pets"
s.setStartRow(Bytes.toBytes("pets"));
s.setStopRow(Bytes.toBytes("pets"));
// get all columns for my pet "fluffy".
Filter f = new ColumnRangeFilter(Bytes.toBytes("fluffy"), true,
                                 Bytes.toBytes("fluffz"), false);
s.setFilter(f);
s.setBatch(20); // return at most 20 columns per Result instead of the whole HBase row
ResultScanner rs = t.getScanner(s);
try {
  for (Result r = rs.next(); r != null; r = rs.next()) {
    // r now holds up to 20 HBase columns that start with "fluffy";
    // together these columns represent a single logical row
    for (KeyValue kv : r.raw()) {
      // each kv represents the latest version of a column
    }
  }
} finally {
  rs.close(); // release the scanner's resources
}
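
Because the scan is batched, one logical row with many columns may arrive spread over several results, so a client may want to merge them again. The sketch below models this in plain Java without the HBase client. Each batch is a map from column identifier to value, identifiers are assumed to have the form "row|column", and `BatchMerge` and `merge` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BatchMerge {
    /**
     * Merges batches of (column identifier -> value) cells into one map per
     * logical row. Every identifier is assumed to contain a '|' separator.
     */
    static Map<String, Map<String, String>> merge(List<Map<String, String>> batches) {
        Map<String, Map<String, String>> rows = new TreeMap<>();
        for (Map<String, String> batch : batches) {
            for (Map.Entry<String, String> cell : batch.entrySet()) {
                String qualifier = cell.getKey();
                int sep = qualifier.indexOf('|');
                String row = qualifier.substring(0, sep);
                String column = qualifier.substring(sep + 1);
                // Collect all columns of the same logical row in one map.
                rows.computeIfAbsent(row, k -> new TreeMap<>()).put(column, cell.getValue());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> first = new TreeMap<>();
        first.put("fluffy|name", "Fluffy");
        Map<String, String> second = new TreeMap<>();
        second.put("fluffy|species", "cat");
        System.out.println(merge(Arrays.asList(first, second)));
        // prints {fluffy={name=Fluffy, species=cat}}
    }
}
```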

The downside of this is that HBase achieves atomicity by collocating all cells with the same row-key, so the entire "table" has to be hosted by a single region server.