Coprocessor access to HBase internals
By Lars Hofhansl
Most folks familiar with HBase have heard of coprocessors.
Coprocessors come in two flavors: Observers and Endpoints.
An Observer is similar to a database trigger; an Endpoint can be likened to a stored procedure.
This analogy only goes so far, though.
While triggers and stored procedures are (typically) sandboxed and expressed in a high-level language (typically SQL with procedural extensions), coprocessors are written in Java and are designed to extend HBase directly (in the sense that you no longer need to subclass HRegionServer in order to extend it). Code in a coprocessor will happily shut down a region server by calling System.exit(...)!
On the other hand, coprocessors are strangely limited. Before HBASE-6522 they had no access to a RegionServer's locks and leases, and hence it was impossible to implement check-and-set type operations as a coprocessor (because the row being modified would need to be locked), or to time out expensive server-side data structures (via leases).
HBASE-6522 makes some trivial changes to remedy that.
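To make the check-and-set case concrete, here is a minimal sketch of the pattern in plain Java. It does not use the HBase API: RowStore and checkAndSet are hypothetical names, and a ReentrantLock per row stands in for a region's row lock. The point is only that the read, the comparison, and the write all have to happen while the row is locked.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical in-memory row store; illustrates the locking pattern only. */
final class RowStore {
    private final ConcurrentMap<String, byte[]> rows = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    /**
     * Writes newValue (must be non-null) only if the row currently equals
     * expected. Passing null for expected means "the row must not exist yet".
     */
    boolean checkAndSet(String row, byte[] expected, byte[] newValue) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();               // serialize all checkAndSet calls on this row
        try {
            byte[] current = rows.get(row);
            if (!Arrays.equals(current, expected)) {
                return false;      // the row changed since the caller read it
            }
            rows.put(row, newValue);
            return true;
        } finally {
            lock.unlock();         // always release, even if an exception is thrown
        }
    }

    byte[] get(String row) {
        return rows.get(row);
    }
}
```

Without the lock, two callers could both pass the comparison and then overwrite each other, which is exactly why a coprocessor needs access to the row lock to offer this kind of operation.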
It was also hard to maintain any kind of shared state in coprocessors.
Keep in mind that region coprocessors are loaded per region and there might be hundreds of regions for a given region server.
Static members won't work reliably, because coprocessor classes are loaded by special classloaders.
HBASE-6505 fixes that too. Now the RegionCoprocessorEnvironment provides a getSharedData() method, which returns a ConcurrentMap. The coprocessor host holds this map only through a weak reference (in a special map with strongly referenced keys and weakly referenced values), while the environment that manages each coprocessor holds it strongly.
That way, if the coprocessor is blacklisted (due to throwing an unexpected exception), the coprocessor's environment is removed, and any shared data is immediately available for garbage collection, thus avoiding ugly and error-prone reference counting (maybe this warrants a separate post).
This shared data is per coprocessor class and per region server. As long as at least one region observer or endpoint of that class is active, the shared data is not garbage collected and can be used to share state between the remaining coprocessors of the same class.
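The ownership scheme described above can be sketched with nothing but the Java standard library. This is not the HBase implementation: SharedDataRegistry and RegionEnv are hypothetical stand-ins for the host-side map and the per-region environment, and the class names used as keys are made up. The sketch only shows who holds the map strongly and who holds it weakly.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the host-side registry; not HBase code. */
final class SharedDataRegistry {
    // Strong keys (coprocessor class names), weak values (the shared maps).
    private final Map<String, WeakReference<ConcurrentMap<String, Object>>> byClass =
            new HashMap<>();

    /** Returns the live shared map for a class, creating one if none is reachable. */
    synchronized ConcurrentMap<String, Object> acquire(String coprocessorClass) {
        WeakReference<ConcurrentMap<String, Object>> ref = byClass.get(coprocessorClass);
        ConcurrentMap<String, Object> shared = (ref == null) ? null : ref.get();
        if (shared == null) {
            // No environment holds the old map any more (or none ever existed).
            shared = new ConcurrentHashMap<>();
            byClass.put(coprocessorClass, new WeakReference<>(shared));
        }
        return shared;
    }
}

/** Hypothetical per-region environment: it holds the strong reference. */
final class RegionEnv {
    private final ConcurrentMap<String, Object> sharedData;

    RegionEnv(SharedDataRegistry registry, String coprocessorClass) {
        this.sharedData = registry.acquire(coprocessorClass);
    }

    ConcurrentMap<String, Object> getSharedData() {
        return sharedData;
    }
}
```

Two RegionEnv instances created for the same class name see the same map, so a value put by one region is visible to the other. Once every RegionEnv for that class is dropped, only the weak reference remains and the map can be collected without any reference counting.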
These changes allow coprocessors to be used for a variety of use cases.
State can be shared across them, allowing coordination between many regions, for example for coordinated queries.
Row locks can be created and released - allowing for check-and-set type operations.
And leases can be used to safely expire expensive data structures or to time out locks, among other uses.
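For readers who have not worked with leases, the idea can be sketched as follows. This is a simplified illustration, not HBase's lease machinery: LeaseTable and its methods are hypothetical, and the current time is passed in explicitly so that the behavior is easy to follow. A real server would run the expiry check from a background thread.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical lease table; HBase's own lease handling is not shown here. */
final class LeaseTable<T> {
    private static final class Lease<R> {
        final R resource;
        volatile long expiresAtMillis;

        Lease(R resource, long expiresAtMillis) {
            this.resource = resource;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease<T>> leases = new ConcurrentHashMap<>();
    private final long periodMillis;
    private final Consumer<T> onExpire;

    LeaseTable(long periodMillis, Consumer<T> onExpire) {
        this.periodMillis = periodMillis;
        this.onExpire = onExpire;
    }

    /** Registers a resource that must be renewed within periodMillis. */
    void create(String name, T resource, long nowMillis) {
        leases.put(name, new Lease<>(resource, nowMillis + periodMillis));
    }

    /** Extends a lease; returns false if it is unknown or was already removed. */
    boolean renew(String name, long nowMillis) {
        Lease<T> lease = leases.get(name);
        if (lease == null) {
            return false;
        }
        lease.expiresAtMillis = nowMillis + periodMillis;
        return true;
    }

    /** Removes every lease whose deadline has passed and runs the cleanup callback. */
    int expire(long nowMillis) {
        int expired = 0;
        Iterator<Map.Entry<String, Lease<T>>> it = leases.entrySet().iterator();
        while (it.hasNext()) {
            Lease<T> lease = it.next().getValue();
            if (lease.expiresAtMillis <= nowMillis) {
                it.remove();
                onExpire.accept(lease.resource);  // e.g. free a cache or release a lock
                expired++;
            }
        }
        return expired;
    }
}
```

A client that keeps using its resource keeps renewing the lease; a client that disappears simply stops renewing, and the next expiry pass cleans up after it.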
Update:
I should also mention that RegionObservers already have access to a region's MVCC.