HBase MVCC and built-in Atomic Operations

Jun 07, 2016, 04:28 PM

By Lars Hofhansl

(This is a follow-up to my ACID in HBase post from March of this year.)
HBase has a few special atomic operations:
  • checkAndPut, checkAndDelete - these simply check the value of a column as a precondition and then apply the Put or Delete if the check succeeds.
  • Increment, Append - these allow an atomic add to a column value interpreted as an integer, or an atomic append to the end of a column value, respectively.
checkAndPut and checkAndDelete are idempotent in the sense that they can safely be applied multiple times (but note that their outcome might differ when other changes are applied between the retries).
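
To make the retry behavior concrete, here is a minimal sketch of a check-then-write operation on a toy in-memory row. ToyRow and its methods are hypothetical names invented for illustration; this is not the HBase client API, only a model of "check a precondition, then apply the write" under a single lock.

```python
import threading


class ToyRow:
    """Hypothetical single-row store; a model, not the HBase client API."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the row lock
        self._cells = {}

    def check_and_put(self, column, expected, new_value):
        # The check and the write happen under one lock, so no other
        # writer can slip in between them.
        with self._lock:
            if self._cells.get(column) != expected:
                return False
            self._cells[column] = new_value
            return True

    def get(self, column):
        with self._lock:
            return self._cells.get(column)


row = ToyRow()
first = row.check_and_put("state", None, "claimed")  # precondition holds
retry = row.check_and_put("state", None, "claimed")  # precondition no longer holds
```

The retry is safe to issue because it leaves the row unchanged, but it reports a different outcome than the first attempt, which is the caveat noted in parentheses above.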

Increment and Append are not idempotent. They are the only non-repeatable operations in HBase. Increment and Append are also the only operations for which the snapshot isolation provided by HBase's MVCC model is not sufficient... More on that later.

It turns out that checkAndPut and checkAndDelete are not as atomic as expected (the issue was raised by Gregory, and although I did not believe it at first, he is right - see HBASE-7051).

A look at the code makes this quite obvious:
Some of the Put optimizations (HBASE-4528) allow releasing the row lock before the changes are sync'ed to the WAL. This also requires the lock to be released before the MVCC changes are committed, so that changes are not visible to other transactions before they are guaranteed to be durable.
Another operation (such as checkAndXXX) that acquires the row lock to make atomic changes may in fact not see the current picture of things despite holding the row lock, as there could still be outstanding MVCC changes that only become visible after the row lock was released and re-acquired. So it might operate on stale data. Holding the row lock is no longer good enough after HBASE-4528.

Increment and Append have the same issue.

The fix for this part is relatively simple: we need an "MVCC barrier" of sorts. Instead of only completing a single MVCC transaction at the end of the update phase (which waits for all prior transactions to finish), we wait a little earlier: prior transactions must finish before we start the check or get phase of the atomic operation. This reduces concurrency only slightly, since we have to await all prior transactions before the end of the operation anyway. HBASE-7051 does exactly that for the checkAndXXX operations.
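
The stale read and the barrier can be sketched with a deterministic, single-threaded toy model. Everything here is an assumption made for illustration: ToyRegion, its fields, and the rule that a cell is visible when its write number is at or below the read point are stand-ins, not HBase's actual classes. Real waiting is replaced by finish_pending(), which simply lets the outstanding write complete.

```python
class ToyRegion:
    """Hypothetical model of one column guarded by a read point."""

    def __init__(self):
        self.cells = [(0, "a")]  # (write number, value)
        self.read_point = 0
        self.pending = []        # writes in the memstore, not yet committed

    def put_without_commit(self, write_number, value):
        # Models a Put that released the row lock before its WAL sync
        # finished: the cell exists but is not yet visible.
        self.cells.append((write_number, value))
        self.pending.append(write_number)

    def finish_pending(self):
        # Stands in for "wait until all prior transactions completed".
        if self.pending:
            self.read_point = max(self.read_point, max(self.pending))
            self.pending.clear()

    def visible_value(self):
        return max(c for c in self.cells if c[0] <= self.read_point)[1]

    def check_and_put(self, expected, new_value, barrier):
        if barrier:
            self.finish_pending()        # barrier before the check phase
        if self.visible_value() != expected:
            return False
        write_number = max(c[0] for c in self.cells) + 1
        self.cells.append((write_number, new_value))
        self.finish_pending()            # the wait that happens at the end anyway
        self.read_point = write_number
        return True


stale = ToyRegion()
stale.put_without_commit(1, "b")
stale_result = stale.check_and_put("a", "c", barrier=False)

fixed = ToyRegion()
fixed.put_without_commit(1, "b")
fixed_result = fixed.check_and_put("a", "c", barrier=True)
```

Without the barrier the check still sees "a" and succeeds even though "b" was already written; with the barrier the check sees "b" and correctly fails.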

In addition - as mentioned above - Increment and Append have another issue: they need to be serializable transactions. Snapshot isolation is not good enough.
For example: if you start with 0 and issue an increment of 1 and another increment of 2, the outcome must always be 3. If both could start with the same start value (a snapshot), the outcome could be 1 or 2, depending on which one finishes first.
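
The arithmetic of that example can be replayed with a small sketch. The function apply_increments is a hypothetical helper, and the snapshot branch assumes the increments write back in list order so that the last writer wins.

```python
def apply_increments(start, amounts, serialized):
    """Toy model: each increment reads a value, adds to it, writes it back."""
    if serialized:
        value = start
        for amount in amounts:
            value = value + amount  # each read sees the previous write
        return value
    # Snapshot case: every increment reads the same starting snapshot,
    # and the one that writes last overwrites the others.
    results = [start + amount for amount in amounts]
    return results[-1]


serial = apply_increments(0, [1, 2], serialized=True)
snapshot_a = apply_increments(0, [1, 2], serialized=False)  # +2 finishes last
snapshot_b = apply_increments(0, [2, 1], serialized=False)  # +1 finishes last
```

Only the serialized run yields 3; the two snapshot runs yield 2 and 1, so one increment is lost either way.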

Increment and Append currently skirt the issue with an ugly "hack": when they write their changes into the memstore they set the memstoreTS of all new KeyValues to 0! The effect is that they are made visible to other transactions immediately, violating HBase's MVCC. Again, see ACID in HBase for an explanation of the memstoreTS.
This guarantees the correct outcome of concurrent Increment and Append operations, but the visibility to concurrent scanners is not what you would expect. An incremented or appended value, even for partial rows, can become visible to any scanner at any time, even though the scanner should see an earlier snapshot of the data.
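
A rough sketch of why a memstoreTS of 0 leaks through a scanner's snapshot follows. It assumes the visibility rule that a scanner sees a KeyValue only if its memstoreTS is at or below the scanner's read point; the KV tuple and scan function are hypothetical simplifications, not HBase code.

```python
from collections import namedtuple

KV = namedtuple("KV", "column value memstore_ts")


def scan(memstore, read_point):
    """Return the newest visible value per column (later entries are newer)."""
    seen = {}
    for kv in memstore:
        if kv.memstore_ts <= read_point:
            seen[kv.column] = kv.value
    return seen


memstore = [KV("count", 10, 3), KV("name", "old", 3)]
scanner_read_point = 3  # a scanner opens at this point

memstore.append(KV("name", "new", 5))  # ordinary write, tagged with its write number
memstore.append(KV("count", 11, 0))    # increment, memstoreTS forced to 0

snapshot = scan(memstore, scanner_read_point)
```

The scanner still sees the old "name", as snapshot isolation requires, but it already sees the incremented "count" because 0 is at or below every read point.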
Increment and Append are also designed for very high throughput, so they actually manipulate HBase's memstore to remove older versions of the columns just modified. Thus you lose the version history of the changes in exchange for avoiding a memstore exploding with versions from the many Increments or Appends. This is called "upsert" in HBase. Upsert is nice in that it prevents the memstore from being filled with a lot of old values if nobody cares about them. The downside is that it is a special operation on the memstore, and hard to get right w.r.t. MVCC. It also does not work with mslab (see this Cloudera blog post for an explanation of mslab).

If you don't care about visibility this is a simple problem, since you can just look through the memstore and remove old values. If you care about MVCC, though, you first have to prove that it is safe to remove a KV.

I tried to fix this almost exactly a year ago (HBASE-4583), but after some discussions with my fellow committers we collectively gave up on that.

A few days ago I reopened HBASE-4583 and started with a radical patch that gets rid of all upsert-type logic (which set the memstoreTS to 0) and just awaits prior transactions before commencing the Increment/Append. Then I rely on my changes from HBASE-4241 to only flush the versions of columns needed when it is time to flush the memstore to disk. It turns out this is still quite a bit slower (10-15%), since it needs to flush the memstore frequently even though that leads to mostly empty files. Still, that was a nice try, as it gets rid of a lot of special code and turns Increment and Append into normal HBase citizens.

A second, less radical version makes upsert MVCC-aware.

That is actually not as easy as it looks. In order to remove a version of a column (a KeyValue) from the memstore you have to prove that it is not and will not be seen by any concurrent or future scanner. That means we have to find the earliest readpoint of any scanner and ensure that there is at least one version of the KV older than that smallest readpoint; then we can safely remove any older versions of that KV from the memstore - because any scanner is guaranteed to see a newer version of the KV.
The "less radical" patch does exactly that.
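
The pruning rule can be sketched as follows. This is a simplified illustration rather than the committed patch: mvcc_aware_upsert is a hypothetical function, versions are represented only by their memstoreTS, and a version is assumed visible to a scanner when its memstoreTS is at or below the scanner's read point.

```python
def mvcc_aware_upsert(versions, new_ts, smallest_read_point):
    """Add new_ts to one column's versions and drop those no scanner can need.

    versions: memstoreTS values already in the memstore for this column.
    smallest_read_point: earliest read point among all open scanners.
    """
    versions = sorted(versions + [new_ts])
    visible_to_all = [ts for ts in versions if ts <= smallest_read_point]
    if not visible_to_all:
        return versions             # nothing can be proven safe to remove
    keep_from = visible_to_all[-1]  # newest version every scanner can see
    # Anything older than keep_from is shadowed for every scanner.
    return [ts for ts in versions if ts >= keep_from]


pruned = mvcc_aware_upsert([1, 2, 4], new_ts=6, smallest_read_point=3)
untouched = mvcc_aware_upsert([1, 2, 4], new_ts=6, smallest_read_point=0)
```

With the earliest scanner at read point 3, version 2 is the newest one every scanner can see, so version 1 is dropped while 2, 4, and 6 stay. With a scanner at read point 0, no version can be proven safe to remove.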

In the end, the patch I ended up committing with HBASE-4583 does both:
If the column family that has the column to be incremented or appended to has VERSIONS set to 1, we perform the MVCC-aware upsert added by the patch. If VERSIONS is > 1, we use the usual logic to add a KeyValue to the memstore. So now this behaves as expected in all cases. If multiple versions are requested, they are retained and time range queries will work even with Increment and Append; and it also (mostly) keeps the performance characteristics when VERSIONS is set to 1.
