HBase数据库性能调优
HBase数据库性能调优,因官方Book Performance Tuning部分章节 没有按配置项进行索引,不能达到快速查阅的效果。所以我以配置项驱
因官方Book Performance Tuning部分章节 没有按配置项进行索引,不能达到快速查阅的效果。所以我以配置项驱动,重新整理了原文,并补充一些自己的理解,如有错误,欢迎指正。
配置优化zookeeper.session.timeout
默认值:3分钟(180000ms)
说明:RegionServer与Zookeeper间的连接超时时间。当超时时间到后,ReigonServer会被Zookeeper从RS集群清单中移除,HMaster收到移除通知后,会对这台server负责的regions重新balance,让其他存活的RegionServer接管.
调优:
这个timeout决定了RegionServer是否能够及时的failover。设置成1分钟或更低,可以减少因等待超时而被延长的failover时间。
不过需要注意的是,对于一些Online应用,RegionServer从宕机到恢复时间本身就很短的(网络闪断,crash等故障,运维可快速介入),,如果调低timeout时间,反而会得不偿失。因为当ReigonServer被正式从RS集群中移除时,HMaster就开始做balance了 (让其他RS根据故障机器记录的WAL日志进行恢复)。当故障的RS在人工介入恢复后,这个balance动作是毫无意义的,反而会使负载不均匀,给RS 带来更多负担。特别是那些固定分配regions的场景。
hbase.regionserver.handler.count
默认值:10
说明:RegionServer的请求处理IO线程数。
调优:
这个参数的调优与内存息息相关。
较少的IO线程,适用于处理单次请求内存消耗较高的Big PUT场景(大容量单次PUT或设置了较大cache的scan,均属于Big PUT)或ReigonServer的内存比较紧张的场景。 【Linux公社 】
较多的IO线程,适用于单次请求内存消耗低,TPS要求非常高的场景。设置该值的时候,以监控内存为主要参考。
这里需要注意的是如果server的region数量很少,大量的请求都落在一个region上,因快速充满memstore触发flush导致的读写锁会影响全局TPS,不是IO线程数越高越好。
压测时,开启Enabling RPC-level logging ,可以同时监控每次请求的内存消耗和GC的状况,最后通过多次压测结果来合理调节IO线程数。
这里是一个案例 Hadoop and HBase Optimization for Read Intensive Search Applications ,作者在SSD的机器上设置IO线程数为100,仅供参考。
hbase.hregion.max.filesize
默认值:256M
说明:在当前ReigonServer上单个Reigon的最大存储空间,单个Region超过该值时,这个Region会被自动split成更小的region。
调优:
小region对split和compaction友好,因为拆分region或compact小region里的storefile速度很快,内存占用低。缺点是split和compaction会很频繁。
特别是数量较多的小region不停地split, compaction,会导致集群响应时间波动很大,region数量太多不仅给管理上带来麻烦,甚至会引发一些Hbase的bug。
一般512以下的都算小region。
大region,则不太适合经常split和compaction,因为做一次compact和split会产生较长时间的停顿,对应用的读写性能冲击非常大。此外,大region意味着较大的storefile,compaction时对内存也是一个挑战。
当然,大region也有其用武之地。如果你的应用场景中,某个时间点的访问量较低,那么在此时做compact和split,既能顺利完成split和compaction,又能保证绝大多数时间平稳的读写性能。
既然split和compaction如此影响性能,有没有办法去掉?
compaction是无法避免的,split倒是可以从自动调整为手动。
只要通过将这个参数值调大到某个很难达到的值,比如100G,就可以间接禁用自动split(RegionServer不会对未到达100G的region做split)。
再配合RegionSplitter这个工具,在需要split时,手动split。
手动split在灵活性和稳定性上比起自动split要高很多,相反,管理成本增加不多,比较推荐online实时系统使用。
内存方面,小region在设置memstore的大小值上比较灵活,大region则过大过小都不行,过大会导致flush时app的IO wait增高,过小则因store file过多影响读性能。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

The retention period of Oracle database logs depends on the log type and configuration, including: Redo logs: determined by the maximum size configured with the "LOG_ARCHIVE_DEST" parameter. Archived redo logs: Determined by the maximum size configured by the "DB_RECOVERY_FILE_DEST_SIZE" parameter. Online redo logs: not archived, lost when the database is restarted, and the retention period is consistent with the instance running time. Audit log: Configured by the "AUDIT_TRAIL" parameter, retained for 30 days by default.

The amount of memory required by Oracle depends on database size, activity level, and required performance level: for storing data buffers, index buffers, executing SQL statements, and managing the data dictionary cache. The exact amount is affected by database size, activity level, and required performance level. Best practices include setting the appropriate SGA size, sizing SGA components, using AMM, and monitoring memory usage.

Oracle database server hardware configuration requirements: Processor: multi-core, with a main frequency of at least 2.5 GHz. For large databases, 32 cores or more are recommended. Memory: At least 8GB for small databases, 16-64GB for medium sizes, up to 512GB or more for large databases or heavy workloads. Storage: SSD or NVMe disks, RAID arrays for redundancy and performance. Network: High-speed network (10GbE or higher), dedicated network card, low-latency network. Others: Stable power supply, redundant components, compatible operating system and software, heat dissipation and cooling system.

The amount of memory required for an Oracle database depends on the database size, workload type, and number of concurrent users. General recommendations: Small databases: 16-32 GB, Medium databases: 32-64 GB, Large databases: 64 GB or more. Other factors to consider include database version, memory optimization options, virtualization, and best practices (monitor memory usage, adjust allocations).

To create a scheduled task in Oracle that executes once a day, you need to perform the following three steps: Create a job. Add a subjob to the job and set its schedule expression to "INTERVAL 1 DAY". Enable the job.

Apple's latest releases of iOS18, iPadOS18 and macOS Sequoia systems have added an important feature to the Photos application, designed to help users easily recover photos and videos lost or damaged due to various reasons. The new feature introduces an album called "Recovered" in the Tools section of the Photos app that will automatically appear when a user has pictures or videos on their device that are not part of their photo library. The emergence of the "Recovered" album provides a solution for photos and videos lost due to database corruption, the camera application not saving to the photo library correctly, or a third-party application managing the photo library. Users only need a few simple steps

How to use MySQLi to establish a database connection in PHP: Include MySQLi extension (require_once) Create connection function (functionconnect_to_db) Call connection function ($conn=connect_to_db()) Execute query ($result=$conn->query()) Close connection ( $conn->close())

Oracle listeners are used to manage client connection requests. Startup steps include: Log in to the Oracle instance. Find the listener configuration. Use the lsnrctl start command to start the listener. Use the lsnrctl status command to verify startup.
